<?xml version="1.0" standalone="yes"?> <Paper uid="C90-3028"> <Title>Translation by Abduction</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> Artificial Intelligence Center SRI International </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="160" type="metho"> <SectionTitle> Megumi Kameyama </SectionTitle> <Paragraph position="0"> Center for the Study of Language</Paragraph> <Section position="1" start_page="0" end_page="160" type="sub_section"> <SectionTitle> and Information Stanford University </SectionTitle> <Paragraph position="0"> Stanford, California.</Paragraph> <Paragraph position="1"> Machine Translation and World Knowledge. Many existing approaches to machine translation take for granted that the information presented in the output is found somewhere in the input, and, moreover, that such information should be expressed at a single representational level, say, in terms of the parse trees or of &quot;semantic&quot; assertions. Languages, however, not only express the equivalent information by drastically different linguistic means, but also often disagree in what distinctions should be expressed linguistically at all. For example, in translating from Japanese to English, it is often necessary to supply determiners for noun phrases, and this in general cannot be done without deep understanding of the source text. Similarly, in translating from English to Japanese, politeness considerations, which in English are implicit in the social situation and explicit in very diffuse ways in, for example, the heavy use of hypotheticals, must be realized grammatically in Japanese. Machine translation therefore requires that the appropriate inferences be drawn and that the text be interpreted to some depth (see Oviatt, 1988). 
Recently, an elegant approach to inference in discourse interpretation has been developed at a number of sites (e.g., Hobbs et al., 1988; Charniak and Goldman, 1988; Norvig, 1987), all based on the notion of abduction, and we have begun to explore its potential application to machine translation. We argue that this approach provides the possibility of deep reasoning and of mapping between the languages at a variety of levels. (See also Kaplan et al., 1988, on the latter point.) [1] Interpretation as Abduction. Abductive inference is inference to the best explanation. The easiest way to understand it is to compare it with two words it rhymes with: deduction and induction. Deduction is when from a specific fact p(A) and a general rule (∀x) p(x) ⊃ q(x) we conclude q(A). [Footnote 1: The authors have profited from discussions about this work with Mark Stickel and with the participants in the Translation Group at CSLI. The research was funded by the Defense Advanced Research Projects Agency under Office of Naval Research contract N00014-85-C-0013, and by a gift from the Systems Development Foundation.]</Paragraph> <Paragraph position="2"> Induction is when from a number of instances of p(A) and q(A) and perhaps other factors, we conclude (∀x) p(x) ⊃ q(x). Abduction is the third possibility. It is when from q(A) and (∀x) p(x) ⊃ q(x), we conclude p(A). Think of q(A) as some observational evidence, of (∀x) p(x) ⊃ q(x) as a general law that could explain the occurrence of q(A), and of p(A) as the hidden, underlying specific cause of q(A). Much of the way we interpret the world in general can be understood as a process of abduction.</Paragraph> <Paragraph position="3"> When the observational evidence, the thing to be interpreted, is a natural language text, we must provide the best explanation of why the text would be true. 
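The contrast between deduction and abduction can be illustrated with a minimal Python sketch. It is not taken from the TACITUS system: the rule encoding, the function names deduce and abduce, and the rain/sprinkler rules are assumptions made purely for illustration. A rule (∀x) p(x) ⊃ q(x) is stored as the pair (p, q); deduction runs the rule forward, while abduction runs it backward to propose candidate explanations.

```python
# Illustrative sketch only. A rule (p, q) is read as: for all x, p(x) implies q(x).
# A fact is a pair (predicate, individual). The example rules are hypothetical.

RULES = [("rain", "wet_grass"), ("sprinkler", "wet_grass")]


def deduce(fact, rules):
    """Deduction: from p(A) and 'p implies q', conclude q(A)."""
    pred, arg = fact
    return [(q, arg) for (p, q) in rules if p == pred]


def abduce(observation, rules):
    """Abduction: from q(A) and 'p implies q', hypothesize p(A)."""
    pred, arg = observation
    return [(p, arg) for (p, q) in rules if q == pred]


if __name__ == "__main__":
    print(deduce(("rain", "A"), RULES))       # consequences of rain(A)
    print(abduce(("wet_grass", "A"), RULES))  # candidate explanations of wet_grass(A)
```

Abduction in this sketch returns every candidate explanation; selecting the best one requires a preference measure, which the sketch leaves out. 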
In the TACITUS Project at SRI, we have developed a scheme for abductive inference that yields a significant simplification in the description of interpretation processes and a significant extension of the range of phenomena that can be captured. It has been implemented in the TACITUS System (Hobbs et al., 1988, 1990; Stickel, 1989) and has been applied to several varieties of text. The framework suggests the integrated treatment of syntax, semantics, and pragmatics described below. Our principal aim in this paper is to examine the utility of this framework as a model for translation.</Paragraph> <Paragraph position="4"> In the abductive framework, what the interpretation of a sentence is can be described very concisely: To interpret a sentence: (1) Prove the logical form of the sentence, together with the constraints that predicates impose on their arguments, allowing for coercions, Merging redundancies where possible, Making assumptions where necessary.</Paragraph> <Paragraph position="5"> By the first line we mean &quot;prove from the predicate calculus axioms in the knowledge base, the logical</Paragraph> <Paragraph position="7"> form that has been produced by syntactic analysis and semantic translation of the sentence.&quot; In a discourse situation, the speaker and hearer both have their sets of private beliefs, and there is a large overlapping set of mutual beliefs. An utterance stands with one foot in mutual belief and one foot in the speaker's private beliefs. It is a bid to extend the area of mutual belief to include some private beliefs of the speaker's. It is anchored referentially in mutual belief, and when we prove the logical form and the constraints, we are recognizing this referential anchor. This is the given information, the definite, the presupposed. Where it is necessary to make assumptions, the information comes from the speaker's private beliefs, and hence is the new information, the indefinite, the asserted. 
Merging redundancies is a way of getting a minimal, and hence a best, interpretation.</Paragraph> <Paragraph position="8"> An Example. This characterization, elegant though it may be, would be of no interest if it did not lead to the solution of the discourse problems we need to have solved. A brief example will illustrate that it indeed does.</Paragraph> <Paragraph position="9"> (2) The Tokyo office called.</Paragraph> <Paragraph position="10"> This example illustrates three problems in &quot;local pragmatics&quot;: the reference problem (What does &quot;the Tokyo office&quot; refer to?), the compound nominal interpretation problem (What is the implicit relation between Tokyo and the office?), and the metonymy problem (How can we coerce from the office to the person at the office who did the calling?).</Paragraph> <Paragraph position="11"> Let us put these problems aside, and interpret the sentence according to characterization (1). The logical form is something like (3) (∃ e, x, o, t) call'(e, x) ∧ person(x) ∧ rel(x, o) ∧ office(o) ∧ nn(t, o) ∧ Tokyo(t)</Paragraph> <Paragraph position="13"> That is, there is a calling event e by a person x related somehow (possibly by identity) to the explicit subject of the sentence o, which is an office and bears some unspecified relation nn to t, which is Tokyo.</Paragraph> <Paragraph position="14"> Suppose our knowledge base consists of the following facts: We know that there is a person John who works for O, which is an office in Tokyo T.</Paragraph> <Paragraph position="15"> (4) person(J), work-for(J, O), office(O), in(O, T), Tokyo(T) Suppose we also know that work-for is a possible coercion relation, (5) (∀ x, y) work-for(x, y) ⊃ rel(x, y) and that in is a possible implicit relation in compound nominals, (6) (∀ y, z) in(y, z) ⊃ nn(z, y) Then the proof of all but the first conjunct of (3) is straightforward. We thus assume (∃ e) call'(e, J), and this constitutes the new information.</Paragraph> <Paragraph position="16"> Notice now that all of our local pragmatics problems have been solved. 
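The proof just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the TACITUS implementation: the tuple encoding of literals, the ?-prefixed variable names, and the helper names (walk, unify, solve, interpret) are assumptions introduced here, and a simple count of assumptions stands in for the weighted costs of the actual system. The sketch proves the conjuncts of (3) from the facts in (4) and axioms (5) and (6) wherever it can, and assumes whatever remains.

```python
# Hypothetical sketch of "prove what you can, assume the rest" for sentence (2).
# Variables are strings starting with "?"; every other argument is a constant.

FACTS = [("person", "J"), ("work-for", "J", "O"), ("office", "O"),
         ("in", "O", "T"), ("Tokyo", "T")]                       # knowledge base (4)
RULES = [(("rel", "?x", "?y"), [("work-for", "?x", "?y")]),       # axiom (5)
         (("nn", "?z", "?y"), [("in", "?y", "?z")])]              # axiom (6)
GOAL = [("call'", "?e", "?x"), ("person", "?x"), ("rel", "?x", "?o"),
        ("office", "?o"), ("nn", "?t", "?o"), ("Tokyo", "?t")]    # logical form (3)


def walk(term, env):
    """Follow bindings until reaching a constant or an unbound variable."""
    while term in env:
        term = env[term]
    return term


def unify(a, b, env):
    """Unify two literals; return an extended environment, or None on failure."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    env = dict(env)
    for s, t in zip(a[1:], b[1:]):
        s, t = walk(s, env), walk(t, env)
        if s == t:
            continue
        if s.startswith("?"):
            env[s] = t
        elif t.startswith("?"):
            env[t] = s
        else:
            return None
    return env


def rename(literal, tag):
    """Rename the variables of a rule literal apart from the goal's variables."""
    return tuple(a + tag if a.startswith("?") else a for a in literal)


def solve(goals, env, assumed, depth=0):
    """Yield (env, assumed) for every way of proving or assuming the goals."""
    if not goals:
        yield env, assumed
        return
    first, rest = goals[0], goals[1:]
    for fact in FACTS:                                    # prove from a known fact
        new_env = unify(first, fact, env)
        if new_env is not None:
            yield from solve(rest, new_env, assumed, depth)
    for head, body in RULES:                              # backchain on an axiom
        tag = "_" + str(depth)
        new_env = unify(first, rename(head, tag), env)
        if new_env is not None:
            subgoals = [rename(b, tag) for b in body]
            yield from solve(subgoals + rest, new_env, assumed, depth + 1)
    yield from solve(rest, env, assumed + [first], depth)  # otherwise assume it


def interpret(goals):
    """Return variable bindings and assumptions for a cheapest interpretation."""
    env, assumed = min(solve(goals, {}, []), key=lambda sol: len(sol[1]))
    bindings = {v: walk(v, env) for lit in goals for v in lit[1:] if v.startswith("?")}
    return bindings, [tuple(walk(a, env) for a in lit) for lit in assumed]


if __name__ == "__main__":
    print(interpret(GOAL))
```

Under these assumptions, interpret(GOAL) binds x to J, o to O, and t to T, and leaves only the calling event call'(e, J) as an assumption. 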
&quot;The Tokyo office&quot; has been resolved to O. The implicit relation between Tokyo and the office has been determined to be the in relation. &quot;The Tokyo office&quot; has been coerced into &quot;John, who works for the Tokyo office.&quot; This is of course a simple example. More complex examples and arguments are given in Hobbs et al. (1990). A more detailed description of the method of abductive inference, particularly the system of weights and costs for choosing among possible interpretations, is given in that paper and in Stickel (1989).</Paragraph> <Paragraph position="17"> The Integrated Framework. The idea of interpretation as abduction can be combined with the older idea of parsing as deduction (Kowalski, 1980, pp. 52-53; Pereira and Warren, 1983). Consider a grammar written in Prolog style just big enough to handle sentence (2).</Paragraph> <Paragraph position="18"> (7) (∀ i, j, k) np(i, j) ∧ v(j, k) ⊃ s(i, k) (8) (∀ i, j, k, l) det(i, j) ∧ n(j, k) ∧ n(k, l) ⊃ np(i, l)</Paragraph> <Paragraph position="20"> That is, if we have a noun phrase from &quot;inter-word point&quot; i to point j and a verb from j to k, then we have a sentence from i to k, and similarly for rule (8).</Paragraph> <Paragraph position="21"> We can integrate this with our abductive framework by moving the various pieces of expression (3) into these rules for syntax, as follows: (9) (∀ i, j, k, e, x, y, p) np(i, j, y) ∧ v(j, k, p) ∧ p'(e, x) ∧ Req(p, x) ∧ rel(x, y) ⊃ s(i, k, e) That is, if we have a noun phrase from i to j referring to y and a verb from j to k denoting predicate p, if there is an eventuality e which is the condition of p being true of some entity x (this corresponds to call'(e, x) in (3)), if x satisfies the selectional requirement p imposes on its argument (this corresponds to person(x)), and if x is somehow related to, or coercible from, y, then there is an interpretable sentence from i to k describing eventuality e.</Paragraph> <Paragraph position="23"> (10) (∀ i, j, k, l, w1, w2, z, y) det(i, j, the) ∧ n(j, k, w1) ∧ n(k, l, w2) ∧ w1(z) ∧ w2(y) ∧ nn(z, y) ⊃ np(i, l, y) That is, if there is the determiner 
&quot;the&quot; from i to j, a noun from j to k denoting predicate w1, and another noun from k to l denoting predicate w2, if there is a z that w1 is true of and a y that w2 is true of, and if there is an nn relation between z and y, then there is an interpretable noun phrase from i to l denoting y.</Paragraph> <Paragraph position="24"> These rules incorporate the syntax in the literals like v(j, k, p), the pragmatics in the literals like p'(e, x), and the compositional semantics in the way the pragmatics expressions are constructed out of the information provided by the syntactic expressions.</Paragraph> <Paragraph position="25"> To parse with a grammar in the Prolog style, we prove s(0, N) where N is the number of words in the sentence. To parse and interpret in the integrated framework, we prove (∃ e) s(0, N, e). An appeal of such declarative frameworks is their usability for generation as well as interpretation (Shieber, 1988). Axioms (9) and (10) can be used for generation as well. In generation, we are given an eventuality E, and we need to find a sentence with some number n of words that describes it. Thus, we need to prove (∃ n) s(0, n, E). Whereas in interpretation it is the new information that is assumed, in generation it is the terminal nodes, like v(j, k, p), that are assumed. Assuming them constitutes uttering them.</Paragraph> <Paragraph position="26"> Translation is a matter of interpreting in the source language (say, English) and generating in the target language (say, Japanese). Thus, it can be characterized as proving for a sentence with N words the expression (11) (∃ e, n) sE(0, N, e) ∧ sJ(0, n, e)</Paragraph> <Paragraph position="28"> where sE is the root node of the English grammar and sJ is the root node of the Japanese.</Paragraph> <Paragraph position="29"> Actually, this is not quite true. 
Missing in the logical form in (3) and in the grammar of (9) and (10) are the &quot;relative mutual identifiability&quot; relations that are encoded in the syntactic structure of sentences. For example, the office in (2) should be mutually identifiable once Tokyo is identified. In the absence of these conditions, the generation conjunct of (11) only says to express something true of e, not something that will enable the hearer to identify it. Nevertheless, the framework as it is developed so far will allow us to address some nontrivial problems in translation.</Paragraph> <Paragraph position="30"> This point exhibits a general problem in translation, machine or human, namely, how literal a translation should be produced. We may think of this as a scale. At one pole is what our current formalization yields: a translation that merely says something true about the eventuality asserted in the source sentence.</Paragraph> <Paragraph position="31"> At the other pole is a translation that translates explicitly every property that is explicit in the source sentence. Our translation below of example (2) lies somewhere in between these two poles. Ideally, the translation should be one that will lead the hearer to the same underlying situation as an interpretation. It is not yet clear how this can be specified formally.</Paragraph> <Paragraph position="32"> The Example Translated. An idiomatic trans- pp(i, j, e) means that there is a particle phrase from i to j with the missing argument e. part is a particle and the predicate it encodes.</Paragraph> <Paragraph position="33"> If we are going to translate between the two languages, we need axioms specifying the transfer relations. Let us suppose &quot;denwa&quot; is lexically ambiguous between the telephone instrument denwa1 and the calling event denwa2. This can be encoded in the</Paragraph> <Paragraph position="35"> 
This can be encoded in the</Paragraph> <Paragraph position="35"> Lexical disambiguation occurs &s a byproduct of interpretation in this framework, when the proof of the logical form uses one or the other of these axioms.</Paragraph> <Paragraph position="36"> &quot;i)enwa ga aru&quot; is an idiomatic way of expressing a calling event in Japanese. This can be expressed by of the senl.ences are given; in practice, this can be carried down to the level of characters.</Paragraph> <Paragraph position="38"> We will need an axiom i, hat coarsens the granularity of l.he source. If Jolm is in Tokyo when he calls, then Tokyo as well as aolln is the source.</Paragraph> <Paragraph position="40"> If x works for y, then x is in y: (23) (V x, y)work-for(z, y) D in(z, y.) Finally, we will need axioms specifying the equivalence of the particle &quot;karl&quot; with the deep cruse Source (24) and the equivalence between tile particle. <<no&quot; and the implicit relation in English compound nolniuals (2r,) (v v) - .,D Note that these &quot;transfer&quot; axioms encode world knowledge (22 and 23), lexical ambiguities (18 and 19), direct relations between tile two languages (20 and 25), and relations between the lang,\[ages and deep &quot;interlingnal&quot; predicates (21 and 24).</Paragraph> <Paragraph position="41"> 'the proof of expression (11), using the English grammar of (9)-(10), tile knowledge base of (4)-(6), tile Japanese grammar and lexicon of (14)-(19), and the transfer a.xioms of (20)-(25), is shbwn in Figure 1. Boxes are drawn a.round the expressions that need to be assmned, namely, the new information in the interpretation and the occurrence of lexical it.eros in the generation.</Paragraph> <Paragraph position="42"> The axioms occnr at a variety of levels, from tile very superficial (axiom 25), to very langnage-pair specific transfer rules (axiom 20), to deep relations at the interlingual level (axioms 21-24). 
This approach thus permits mixing in one framework both transfer and interlingual approaches to translation. One can state transfer rules between two languages at various levels of linguistic abstraction, and between different levels of the respective languages. Such freedom in transfer is exactly what is needed for translation, especially for such typologically dissimilar languages as English and Japanese. It is thus possible to build a single system for translating among more than two languages in this framework, incorporating the labor savings of interlingual approaches while allowing the convenient specificities of transfer approaches.</Paragraph> <Paragraph position="43"> We should note that other translations for sentence (2) are possible in different contexts. Two other possibilities are the following: (26) Tokyo no office ga denwa shimashita.</Paragraph> <Paragraph position="44"> Tokyo's office Subj call did-Polite The Tokyo office made {a/the} call.</Paragraph> <Paragraph position="45"> (27) Tokyo no office kara no denwa ga arimashita.</Paragraph> <Paragraph position="46"> Tokyo's office from's call Subj existed-Polite There was the call from the Tokyo office (that we were expecting).</Paragraph> <Paragraph position="47"> The difference between (12) and (26) is the speaker's viewpoint. The speaker takes the receiver's viewpoint in (12), while it is neutral between the caller and the receiver in (26). (27) is a more specific version of (12) where the call is mutually identifiable. All of (12), (26) and (27) are polite with the suffix &quot;-masu&quot;. 
Non-polite variants are also possible translations.</Paragraph> <Paragraph position="48"> On the other hand, in the following sentence (28) Tokyo no office kara denwa shimashita.</Paragraph> <Paragraph position="49"> Tokyo's office from call did-Polite [I] made {a/the} call from the Tokyo office.</Paragraph> <Paragraph position="50"> there is a strong inference that the caller is the speaker or someone else who is very salient in the current context. The use of &quot;shimashita&quot; (&quot;did&quot;) in (26) and (28) indicates the description from a neutral point of view of an event of some agent in the Tokyo office causing a telephone call to occur at the recipient's end. This neutral point of view is expressed in (26). In (28), the subject is omitted and hence must be salient, and consequently, the sentence is told from the caller's point of view. In (12) &quot;ari-mashita&quot; (&quot;existed&quot;) is used, and since the telephone call exists primarily, or only, at the recipient's end, it is assumed the speaker, at least in point of view, is at the receiver's end.</Paragraph> <Paragraph position="51"> Although we have not done it here, it looks as though these kinds of considerations can be formalized in our framework as well.</Paragraph> <Paragraph position="52"> Hard Problems. If a new approach to machine translation is to be compelling, it must show promise of being able to handle some of the hard problems.</Paragraph> <Paragraph position="53"> We have identified four especially hard problems in translating between English and Japanese.</Paragraph> <Paragraph position="54"> 1. The lexical differences (that occur between any two languages).</Paragraph> <Paragraph position="55"> 2. Honorifics.</Paragraph> <Paragraph position="56"> 3. Definiteness and number.</Paragraph> <Paragraph position="57"> 4. The context-dependent &quot;information structure&quot;. 
The last of these includes the use of &quot;wa&quot; versus &quot;ga&quot;, the order of noun phrases, and the omission of arguments.</Paragraph> <Paragraph position="59"> These are the areas where one language's morphosyntax requires distinctions that are only implicit in the commonsense knowledge or context in the other language. Such problems cannot be handled by existing sentence-by-sentence translation systems without unnecessarily complicating the representations for each language.</Paragraph> <Paragraph position="60"> In this short paper, we can only give the briefest indication of why we think our framework will be productive in investigating the first three of these problems. Lexical Differences. Lexical differences, where they can be specified precisely, can be encoded axiomatically:</Paragraph> <Paragraph position="62"> Information required for supplying Japanese numeral classifiers can be specified similarly. Thus the equivalence between the English &quot;two trees&quot; and the Japanese &quot;ni hon no ki&quot; can be captured by the axiom</Paragraph> <Paragraph position="64"> Honorifics. Politeness is expressed in very different ways in English and Japanese. In Japanese it is grammaticized and lexicalized in sometimes very elaborate ways in the form of honorifics. One might think that the problem of honorifics does not arise in most practical translation tasks, such as translating computer manuals. English lacks honorifics and in Japanese technical literature they are conventionalized. But if we are translating business letters, this aspect of language becomes very important. It is realized in English, but in a very different way. When one is writing to one's superiors, there is, for example, much more embedding of requests in hypotheticals.</Paragraph> <Paragraph position="65"> Consider for example the following English sentence and its most idiomatic translation: Would it perhaps be possible for you to lend me your book? 
Go-hon o kashite-itadak-e-masu ka.</Paragraph> <Paragraph position="66"> Honorific-book Obj lending-receive-can-Polite ? In Japanese, the object requested is preceded by the honorific particle &quot;go&quot;, &quot;itadak&quot; is a verb used for a receiving by a lower status person from a higher status person, and &quot;masu&quot; is a politeness ending for verbs. In English, by contrast, the speaker embeds the request in various modals, &quot;would&quot;, &quot;perhaps&quot;, and &quot;possible&quot;, and uses a more formal register than normal, in his choice, for example, of &quot;perhaps&quot; rather than &quot;maybe&quot;.</Paragraph> <Paragraph position="67"> The facts about the use of honorifics can be encoded axiomatically, with predicates such as HigherStatus, where this information is known. Since all knowledge in this framework is expressed uniformly in predicate calculus axioms, it is straightforward to combine information from different &quot;knowledge sources&quot;, such as syntax and the speech act situation, into single rules. It is therefore relatively easy to write axioms that, for example, restrict the use of certain verbs, depending on the relative status of the agent and object, or the speaker and hearer. For example, &quot;to give&quot; is translated into the Japanese verb &quot;kudasaru&quot; if the giver is of higher status than the recipient, but into the verb &quot;sashiageru&quot; if the giver is of lower status. Similarly, the grammatical fact about the use of the suffix &quot;-masu&quot; and the fact about the speech act situation that the speaker wishes to be polite may also be expressed in the same axiom.</Paragraph> <Paragraph position="68"> We can also express the facts concerning the use of the honorific particle &quot;o&quot; (or &quot;go&quot;) before nouns. There seem to be three classes of nouns in this respect. Some nouns, such as &quot;cha&quot; (&quot;tea&quot;), always take the particle (&quot;o-cha&quot;). 
Some nouns, especially loan words like &quot;kōhī&quot; (&quot;coffee&quot;), never take the particle. Other nouns, such as &quot;bōshi&quot; (&quot;hat&quot;), take the honorific prefix if the entity referred to belongs to someone of higher status. For this class of nouns we can state the condition formally.</Paragraph> <Paragraph position="69"> (∀ i, j, k, p, x, y) Honorific(i, j, o) ∧ Noun(j, k, p) ∧ p(x) ∧ Poss(y, x) ∧ HigherStatus(y, Speaker) ⊃ NP(i, k, x) That is, if the honorific particle &quot;o&quot; occurs from point i to point j, the noun denoting the predicate p occurs from point j to point k, and p is true of some entity x where someone y possesses x and y is of higher status than the speaker, then there is an interpretable noun phrase from point i to point k referring to x.</Paragraph> <Paragraph position="70"> Definiteness and Number. The definiteness and number problem is illustrated by the fact that the Japanese word &quot;ki&quot; can be translated into &quot;the tree&quot; or &quot;a tree&quot; or &quot;the trees&quot; or &quot;trees&quot;. It is not so straightforward to deal with this problem axiomatically. Nevertheless, our framework, based as it is on deep interpretation and on the distinction between given and new information, provides us with what we need to begin to address the problem. A first approximation of a method for translating Japanese NPs into English NPs is as follows: 1. Resolve deep, i.e., find the referent of the Japanese NP.</Paragraph> <Paragraph position="71"> 2. Does the Japanese NP refer to a set of two or more? If so, translate it as a plural, otherwise as a singular.</Paragraph> <Paragraph position="72"> 3. Is the entity (or set) &quot;mutually identifiable&quot;? 
If so, then translate it as a definite, otherwise as an indefinite.</Paragraph> <Paragraph position="73"> &quot;Mutually identifiable&quot; means first of all that the description provided by the Japanese NP is mutually known, and secondly that there is a single most salient such entity. &quot;Most salient&quot; means that there are no other equally high-ranking interpretations of the Japanese sentence that resolve the NP in some other way. (Generic definite noun phrases are beyond the scope of this paper.) Conclusion. We have sketched our solutions to the various problems in translation with a fairly broad brush in this short paper. We recognize that many details need to be worked out, and that in fact most of the work in machine translation is in working out the details. But we felt that in proposing a new formalism for translation research, it was important to stand back and get a view of the forest before moving in to examine the individual trees.</Paragraph> <Paragraph position="74"> Most machine translation systems today map the source language text into a logical form that is fairly close to the source language text, transform it into a logical form that is fairly close to a target language text, and generate the target language text. What is needed is first of all the possibility of doing deep interpretation when that is what is called for, and secondly the possibility of translating from the source to the target language at a variety of levels, from the most superficial to levels requiring deep interpretation and access to knowledge about the world, the context, and the speech act situation. This is precisely what the framework we have presented here makes possible.</Paragraph> </Section> </Section> </Paper>