File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/90/c90-3048_abstr.xml
Size: 11,504 bytes
Last Modified: 2025-10-06 13:46:58
<?xml version="1.0" standalone="yes"?> <Paper uid="C90-3048"> <Title>Machine Translation without a source text</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper concerns an approach to Machine Translation which differs from the typical 'standard' approaches crucially in that it does not rely on the prior existence of a source text as a basis of the translation. Our approach can be characterised as an 'intelligent secretary with knowledge of the foreign language', which helps monolingual users to formulate the desired target-language text in the context of a (keyboard) dialogue translation system.</Paragraph> <Paragraph position="1"> Keywords: Machine translation; natural language interface; dialogue Introduction Machine Translation (MT) or natural language translation in general is a typical example of the 'under-constrained' problems which we often encounter in the field of artificial intelligence 1. That is to say, the same 'messages' can and should be translated differently depending on the surrounding contexts (where and when they are used), and on the speakers' intention (what they really want to express) etc. It is all too often the case that this information, which is necessary for the selection of the appropriate overall target text structure, is not made explicit in source texts prepared for translation. The author of the source text naturally follows the 'rules' of the source language in preparation of source texts and assumes that the factors which will affect the selection of target expressions are self-evident.</Paragraph> <Paragraph position="2"> MT systems developed so far or being developed have been trying to compensate for this genuine property of language translation by extending the units of translation from sentences to texts (e.g. 
Rothkegel 1986, Weber 1987) or [Footnote 1: The authors would like to acknowledge the contribution to this work of the other members of the project team: Bill Black, Jeremy Carroll, Anna Gianetti, Makoto Hirai, Natsuko Holden, John Phillips and Kenji Yoshimura.]</Paragraph> <Paragraph position="3"> by introducing 'understanding' based on 'domain specific knowledge' (as in the 'sublanguage' approach - cf. Kosaka et al.</Paragraph> <Paragraph position="4"> 1988, Lehrberger & Bourbeau 1988). This course of research would be inevitable if we were to confine ourselves to translation of prepared texts which already exist before translation. In such cases, we have to recover, from the text itself or by using extra 'knowledge', the implicit information which is necessary for formulating target expressions.</Paragraph> <Paragraph position="5"> However, we can imagine a quite different course of research for developing a different type of MT system, i.e. an 'expert' system which can play the role of an 'intelligent secretary with knowledge of the foreign language'. Such a system does not require the user (the writer) to prepare full source texts in advance. It starts from rough sketches of what the writer wants to say and gathers the information necessary for formulating target texts by asking the writer questions, because the writer is the person who really intends to communicate and has a clear idea about what s/he wants to say. We can get much richer information through such interactions than in the usual written text translation by professional translators. Through interaction, we can get information concerned with, for example, the user's intention which is not explicitly expressed in the 'text' to translate but which is nonetheless necessary for producing quality target texts.</Paragraph> <Paragraph position="6"> This sort of system is different from the widely promoted 'Translator's Workbench' idea (e.g. 
Kay 1980, Melby 1982), the main aim of which is to help translators to translate texts.</Paragraph> <Paragraph position="7"> In this scenario, both the system and the user have knowledge about both source and target language, and it is sometimes difficult to see where the most appropriate division of labour should occur: indeed, there is sometimes a conflict between what the system offers the translator-user, and what the user already knows, or between the extent to which the system or the user should take the initiative, which might differ from occasion to occasion.</Paragraph> <Paragraph position="8"> On the other hand, in the proposed expert system scenario, the partition of knowledge is clear: the system knows mainly about translation, the writer knows only about the desired communicative content of the message.</Paragraph> <Paragraph position="9"> There is no conflict between what the system assumes to be the extent of the writer's (the user's) knowledge, nor in the writer's expectations. In this respect we are following the line taken by Johnson & Whitelock (1987), and the work here at UMIST on the ENtran project (Whitelock et al. 1986, Wood & Chandler 1988) developing an MT system for a monolingual user.</Paragraph> <Paragraph position="10"> MT systems so far have been developed based on the implicit assumption that source texts contain all (or almost all) the information necessary for translation. We take as a starting point that this assumption is not necessarily true, especially when we consider pairs of unrelated languages where cultural as well as linguistic differences contribute to this problem. Notice that the concept of 'source text' in the above is quite different from that in the normal context of MT. 
That is, we do not have a source text to translate as such, but instead, the user has his/her communicative goals and the translation system can help to formulate the most appropriate target linguistic forms by gathering information necessary to accomplish these goals through 'clarification dialogues'.</Paragraph> <Paragraph position="11"> It could be argued that this generation of a target text on the basis of something other than a source text is not 'real translation'. Such an argument might derive from an overly traditional view of translation where a translator gets some text (say, in the post) and sits at a desk with a bilingual dictionary and translates 'blind' i.e. with no actual knowledge of the writer's intentions, goals, etc. There is a sense in which second generation MT systems simply reflect this scenario of a translator. Of course, the best translations are done by a translator who can ask the original author &quot;What did you mean when you said...?&quot;; by the same token we believe we can build a better translation system if we can elicit such information from the originator of the 'text' at the time of 'writing'. General background to the research This research is undertaken in the context of the more general activities of the Japanese ATR research programme into automatic interpretation between English and Japanese of telephone conversations. As such it is oriented towards translation of dialogues. One approach to dialogue translation has been the 'phrasebook' approach of Steer & Stentiford (1989). In this speech translation prototype system, set phrases are stored, as in a holidaymaker's phrasebook; they are retrieved by the fairly crude, though effective, technique of recognising keywords in a particular order in the input speech signal. 
The main disadvantage of this system is its inflexibility: if the phrase you want is not in the phrasebook, you cannot say anything.</Paragraph> <Paragraph position="12"> In the research programme to be reported here, we are not concerned with speech processing per se, and we assume the context of an on-line keyboard conversation function such as talk in UNIX(TM) (cf. Miike et al. 1988). It has been found that keyboard conversations have the same fundamental features as telephone conversations, notwithstanding the obvious differences between written and spoken language (Arita et al. 1987, Iida 1987).</Paragraph> <Paragraph position="13"> Furthermore, we restrict ourselves to goal-oriented dialogues, i.e. dialogues where one participant is seeking information from the other: our experimental domain is dialogues for a conference registration and hotel reservation system.</Paragraph> <Paragraph position="14"> When such conversations are subjected to the additional distortion of being transmitted via a traditional MT system, several further problems accrue, as the talk experiment mentioned above showed, notably when mistranslation occurs.</Paragraph> <Paragraph position="15"> The problem of human-machine interaction in the specific area of clarification dialogues for MT must be studied. The need to incorporate different types of clarification dialogue has general implications for the question of system architectures for interactive MT systems. This aspect is discussed in detail below.</Paragraph> <Paragraph position="16"> In the above scenario, the system tries to gather information necessary for formulating target texts through interactions. This means the system formulates target texts by adding information to 'source texts' (in the conventional sense). We can extend this idea further. In the extreme case, we can imagine a system which has stereotypical target texts in certain restricted domains (e.g. 
business correspondence in specific areas), retrieves appropriate texts through dialogues with users and reformulates them to fulfill the specific requirements expressed by users. In this scenario, the MT system becomes a kind of multilingual text generation system and adds a lot of information not contained in the 'source text' at all. This idea has been investigated here at UMIST in the context of a research programme for British Telecom (Jones & Tsujii 1990), and has significantly influenced the research reported here (a similar idea for 'automated text composition' in Japanese has been suggested by Saito & Tomita 1986).</Paragraph> <Paragraph position="17"> Dialogue MT It is important to emphasize that there is a basic difference between Dialogue Machine Translation (DMT) 2 systems on the one hand and conventional MT systems on the other, namely the difference of user types. In DMT, users are dialogue participants who actually have their respective communicative goals and who really know what they want to say. On the other hand, the users of conventional MT are typically translators who, though they have enough knowledge about both languages, lack 'complete understanding' of texts to be translated. This difference in user-types leads to different characterizations of interactions between MT systems and their users. We have to take into account what this difference implies in designing actual DMT systems. The main implications can be summarized as follows.</Paragraph> <Paragraph position="18"> In DMT, the system can ask in theory any questions to elicit the information necessary for translation which is not explicitly expressed in the 'source text'. This is impossible in conventional MT, because the users do not have 'complete understanding' of the context in which the texts are prepared, and the users (who are translators) simply could not answer such questions. 
(It is often the case that even human translators would like to consult the authors of the original texts in order to produce a good translation.) In order to exploit this advantage in DMT, however, we have to overcome several related difficulties.</Paragraph> </Section> </Paper>