<?xml version="1.0" standalone="yes"?> <Paper uid="P94-1031"> <Title>Tricolor DAGs for Machine Translation</Title> <Section position="1" start_page="0" end_page="232" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> Machine translation (MT) has recently been formulated in terms of constraint-based knowledge representation and unification theories, but it is becoming more and more evident that it is not possible to design a practical MT system without an adequate method of handling mismatches between semantic representations in the source and target languages. In this paper, we introduce the idea of &quot;information-based&quot; MT, which is considerably more flexible than interlingual MT or the conventional transfer-based MT.</Paragraph> <Paragraph position="1"> Introduction With the intensive exploration of contemporary theories on unification grammars[6, 15, 13] and feature structures[7, 19] in the last decade, the old image of machine translation (MT) as a brutal form of natural language processing has given way to that of a process based on a uniform and reversible architecture[16, 1, 27].</Paragraph> <Paragraph position="2"> The developers of MT systems based on the constraint-based formalism found a serious problem in &quot;language mismatching,&quot; namely, the difference between semantic representations in the source and target languages. 1 Attempts to design a pure interlingual MT system were therefore abandoned, 2 and the notion of &quot;semantic transfer&quot;[24, 22] came into focus as a practical solution to the problem of handling the language mismatching. 
The constraint-based formalism[2] seemed promising as a formal definition of transfer, but pure constraints are too rigid to be precisely imposed on target-language sentences.</Paragraph> <Paragraph position="3"> system.</Paragraph> <Paragraph position="4"> the concept of defeasible reasoning in order to formalize what is missing from a pure constraint-based approach, and control mechanisms for such reasoning have also been proposed[5, 3]. With this additional mechanism, we can formulate the &quot;transfer&quot; process as a mapping from a set of constraints into another set of mandatory and defeasible constraints. This idea leads us further to the concept of &quot;information-based&quot; MT, which means that, with an appropriate representation scheme, a source sentence can be represented by a set of constraints that it implies and that, given a target sentence, the set Cs of constraints can be divided into three disjoint subsets: * The subset C0 of constraints that is also implied by the target sentence * The subset C+ of constraints that is not implied by, but is consistent with, the translated sentence * The subset C- of constraints that is violated by the target sentence The target sentence may also imply another set Cnew of constraints, none of which is in Cs. That is, the set Ct of constraints implied by the target sentence is the union of C0 and Cnew, while Cs = C0 ∪ C+ ∪ C-. When Cs = C0 = Ct, we have a fully interlingual translation of the source sentence. If C+ ≠ ∅, C- = ∅, and Cnew = ∅, the target sentence is said to be under-generated, while it is said to be over-generated when C+ = ∅, C- = ∅, and Cnew ≠ ∅. 3 In either case, C- must be empty if a consistent translation is required. Thus, the goal of machine translation is to find an optimal pair of source and target sentences that minimizes C+, C-, and Cnew. Intuitively, C0 corresponds to essential information, and C+ and Cnew can be viewed as language-dependent supportive information. 
3 The notions of completeness and coherence in LFG[6] have been employed by Wedekind[25] to avoid over- and under-generation.</Paragraph> <Paragraph position="5"> C- might be the inconsistency between the assumptions of the source- and target-language speakers.</Paragraph> <Paragraph position="6"> In this paper, we introduce tricolor DAGs to represent the above constraints, and discuss how tricolor DAGs are used for practical MT systems. In particular, we give a generation algorithm that incorporates the notion of semantic transfer by gradually approaching the optimal target sentence through the use of tricolor DAGs, when a fully interlingual translation fails. Tricolor DAGs give a graph-algorithmic interpretation of the constraints, and the distinctions between the types of constraint mentioned above allow us to adjust the margin between the current and optimal solution effectively.</Paragraph> <Paragraph position="7"> Tricolor DAGs A tricolor DAG (TDAG, for short) is a rooted, directed, acyclic 4 graph with a set of three colors (red, yellow, and green) for nodes and directed arcs. It is used to represent a feature structure of a source or target sentence. Each node represents either an atomic value or a root of a DAG, and each arc is labeled with a feature name. The only difference between the familiar usage of DAGs in unification grammars and that of TDAGs is that the color of a node or arc represents its degree of importance: 1. Red shows that a node (arc) is essential.</Paragraph> <Paragraph position="8"> 2. Yellow shows that a node (arc) may be ignored, but must not be violated.</Paragraph> <Paragraph position="9"> 3. Green shows that a node (arc) may be violated. For practical reasons, the above distinctions are interpreted as follows: 1. Red shows that a node (arc) is derived from lexicons and grammatical constraints.</Paragraph> <Paragraph position="10"> 2. 
Yellow shows that a node (arc) may be inferred from a source or a target sentence by using domain knowledge, common sense, and so on.</Paragraph> <Paragraph position="11"> 3. Green shows that a node (arc) is defeasibly inferred, specified as a default, or heuristically specified.</Paragraph> <Paragraph position="12"> When all the nodes and arcs of TDAGs are red, TDAGs are basically the same as the feature structures 5 of grammar-based translation[25, 17]. 4 Acyclicity is not crucial to the results in this paper, but it significantly simplifies the definition of the tricolor DAGs and semantic transfer.</Paragraph> <Paragraph position="13"> 5 We will only consider the semantic portion of the feature structure, although the theory of tricolor DAGs for representing entire feature structures is an interesting topic. A TDAG is well-formed iff the following conditions are satisfied:</Paragraph> <Paragraph position="14"> 1. The root is a red node.</Paragraph> <Paragraph position="15"> 2. Each red arc connects two red nodes.</Paragraph> <Paragraph position="16"> 3. Each red node is reachable from the root through the red arcs and red nodes.</Paragraph> <Paragraph position="17"> 4. Each yellow node is reachable from the root through the arcs and nodes that are red and/or yellow.</Paragraph> <Paragraph position="18"> 5. Each yellow arc connects red and/or yellow nodes.</Paragraph> <Paragraph position="19"> 6. No two arcs start from the same node, and have the same feature name.</Paragraph> <Paragraph position="20"> Conditions 1 to 3 require that all the red nodes and red arcs between them make a single, connected DAG. Conditions 4 and 5 state that a defeasible constraint must not be used to derive an imposed constraint. In the rest of this paper, we will consider only well-formed TDAGs. 
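As an illustration only, the six well-formedness conditions can be sketched in Python as follows. The TDAG class, its field names, and the toy graph (loosely modeled on a relaxed structure for John wished to walk) are assumptions introduced here for the sketch; they are not the data structures of the system described in this paper.

```python
from dataclasses import dataclass, field

RED, YELLOW, GREEN = "red", "yellow", "green"


@dataclass
class TDAG:
    """Hypothetical container: node colors plus colored, labeled arcs."""
    root: str
    node_color: dict                           # node id to color
    arcs: list = field(default_factory=list)   # (source, feature, target, color)


def reachable(tdag, allowed):
    """Nodes reachable from the root through nodes and arcs whose color is allowed."""
    if tdag.node_color[tdag.root] not in allowed:
        return set()
    seen, stack = {tdag.root}, [tdag.root]
    while stack:
        node = stack.pop()
        for src, _feature, dst, arc_color in tdag.arcs:
            if (src == node and arc_color in allowed
                    and tdag.node_color[dst] in allowed and dst not in seen):
                seen.add(dst)
                stack.append(dst)
    return seen


def is_well_formed(tdag):
    """Check conditions 1 to 6 as stated in the text."""
    color = tdag.node_color
    if color[tdag.root] != RED:                                   # condition 1
        return False
    for src, _feature, dst, arc_color in tdag.arcs:
        ends = (color[src], color[dst])
        if arc_color == RED and ends != (RED, RED):               # condition 2
            return False
        if arc_color == YELLOW and GREEN in ends:                 # condition 5
            return False
    red_part = reachable(tdag, {RED})
    if any(c == RED and n not in red_part for n, c in color.items()):      # condition 3
        return False
    warm_part = reachable(tdag, {RED, YELLOW})
    if any(c == YELLOW and n not in warm_part for n, c in color.items()):  # condition 4
        return False
    keys = [(src, feature) for src, feature, _dst, _color in tdag.arcs]
    return len(keys) == len(set(keys))                            # condition 6


# Toy graph (an assumption): the agent arc of walk is yellow, all else is red.
tdag2 = TDAG(
    root="wish",
    node_color={"wish": RED, "walk": RED, "john": RED},
    arcs=[("wish", "agent", "john", RED),
          ("wish", "theme", "walk", RED),
          ("walk", "agent", "john", YELLOW)],
)
print(is_well_formed(tdag2))
```

The checker treats a TDAG as plain node and arc tables so that each condition maps onto one test; a real implementation would more likely fold these checks into the feature-structure representation itself. 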
Furthermore, since only the semantic portions of TDAGs are used for machine translation, we will not discuss syntactic features.</Paragraph> <Paragraph position="21"> The subsumption relationship among the TDAGs is defined as the usual subsumption over DAGs, with the following extensions.</Paragraph> <Paragraph position="22"> * A red node (arc) subsumes only a red node (arc). * A yellow node (arc) subsumes a red node (arc) and a yellow node (arc).</Paragraph> <Paragraph position="23"> * A green node (arc) subsumes a node (arc) with any color.</Paragraph> <Paragraph position="24"> The unification of TDAGs is similarly defined. The colors of unified nodes and arcs are specified as follows: * Unification of a red node (arc) with another node (arc) makes a red node (arc).</Paragraph> <Paragraph position="25"> * Unification of a yellow node (arc) with a yellow or green node (arc) makes a yellow node (arc). * Unification of two green nodes (arcs) makes a green node (arc).</Paragraph> <Paragraph position="26"> Since the green nodes and arcs represent defeasible constraints, unification of a green node (either a root of a TDAG or an atomic node) with a red or yellow node always succeeds, and results in a red or yellow node. When two conflicting green nodes are to be unified, the result is indefinite, or a single non-atomic green node. 6 An alternative definition is that one green node has precedence over the other[14]. Practically, such a conflicting unification should be postponed until no other possibility is found.</Paragraph> <Paragraph position="27"> Now, the problem is that a red node/arc in a source TDAG (the TDAG for a source sentence) may not always be a red node/arc in the target TDAG (the TDAG for a target sentence). 
For example, the functional control of the verb &quot;wish&quot; in the English sentence &quot;John wished to walk&quot; may produce TDAG1 in Figure 1, but the red arc corresponding to the agent of the *WALK predicate may not be preserved in a target TDAG2. 7 This means that the target sentence alone cannot convey the information that it is John who wished to walk, even if this information can be understood from the context. Hence the red arc is relaxed into a yellow one, and any target TDAG must have an agent of *WALK that is consistent with *JOHN. This relaxation will help the sentence generator in two ways. First, it can prevent generation failure (or non-termination in the worst case). Second, it retains important information for choosing a correct translation of the verb &quot;walk&quot;. 8 7 For example, the Japanese counterpart of the verb &quot;wish&quot; only takes a sentential complement, and no functional control is observed.</Paragraph> <Paragraph position="28"> 8 Whether or not the subject of the verb is human is often crucial information for making an appropriate choice between the verb's two Japanese counterparts.</Paragraph> <Paragraph position="29"> Another example is the problem of identifying number and determiner in Japanese-to-English translation. This type of information is rarely available from a syntactic representation of a Japanese noun phrase, and a set of heuristic rules[11] is the only known basis for making a reasonable guess. Even if such contextual processing could be integrated into a logical inference system, the obtained information should be defeasible, and hence should be represented by green nodes and arcs in the TDAGs. Pronoun resolution can be similarly represented by using green nodes and arcs.</Paragraph> <Paragraph position="30"> It is worth looking at the source and target TDAGs in the opposite direction. 
From the Japanese sentence glossed as &quot;John +subj walk +nom +obj wished,&quot; we get the source TDAG3 in Figure 1, where functional control and number information are missing. With the help of contextual processing, we get the target TDAG4, which can be used to generate the English sentence &quot;John wished to walk.&quot; Semantic Transfer As illustrated in the previous section, it is often the case that we have to solve mismatches between source and target TDAGs in order to obtain successful translations. Syntactic/semantic transfer has been formulated by several researchers[18, 27] as a means of handling situations in which fully interlingual translation does not work. It is not enough, however, to capture only the equivalence relationship between source and target semantic representations: this is merely a mapping among red nodes and arcs in TDAGs. What is missing in the existing formulation is the provision of some margin between what is said and what is translated. The semantic transfer in our framework is defined as a set of successive operations on TDAGs for creating a sequence of TDAGs t0, t1, ..., tk such that t0 is a source TDAG and tk is a target TDAG that is a successful input to the sentence generator.</Paragraph> <Paragraph position="31"> Powerful contextual processing and a domain knowledge base can be used to infer additional facts and constraints, which correspond to the addition of yellow nodes and arcs. Default inheritance, proposed by Russell et al.[14], provides an efficient way of obtaining further information necessary for translation, which corresponds to the addition of green nodes and arcs. A set of well-known heuristic rules, which we will describe later in the &quot;Implementation&quot; Section, can also be used to add green nodes and arcs. To complete the model of semantic transfer, we have to introduce a &quot;painter.&quot; A painter maps a red node to either a yellow or a green node, a yellow node to a green node, and so on. 
It is used to loosen the constraints imposed by the TDAGs. Every application of the painter monotonically loses some information in a TDAG, and only a finite number of applications of the painter are possible before the TDAG consists entirely of green nodes and arcs except for a red root node. Note that the painter never removes a node or an arc from a TDAG; it simply weakens the constraints imposed by the nodes and arcs.</Paragraph> <Paragraph position="32"> Formally, semantic transfer is defined as a sequence of the following operations on TDAGs: * Addition of a yellow node (and a yellow arc) to a given TDAG. The node must be connected to a node in the TDAG by a yellow arc.</Paragraph> <Paragraph position="33"> * Addition of a yellow arc to a given TDAG. The arc must connect two red or yellow nodes in the TDAG.</Paragraph> <Paragraph position="34"> * Addition of a green node (and a green arc) to a given TDAG. The node must be connected to a node in the TDAG by the green arc.</Paragraph> <Paragraph position="35"> * Addition of a green arc to a given TDAG. The arc can connect two nodes of any color in the TDAG.</Paragraph> <Paragraph position="36"> * Replacement of a red node (arc) with a yellow one, as long as the well-formedness is preserved. * Replacement of a yellow node (arc) with a green one, as long as the well-formedness is preserved. The first two operations define the logical implications (possibly with common sense or domain knowledge) of a given TDAG. The next two operations define the defeasible (or heuristic) inference from a given TDAG. The last two operations define the painter. The definition of the painter specifies that it can only gradually relax the constraints. 
That is, when a red or yellow node (or arc) X has other red or yellow nodes that are only connected through X, X cannot be &quot;painted&quot; until each of the connected red and yellow nodes is painted yellow or green to maintain the reachability through X.</Paragraph> <Paragraph position="37"> In the sentence analysis phase, the first four operations can be applied for obtaining a source TDAG as a reasonable semantic interpretation of a sentence. The application of these operations can be controlled by &quot;weighted abduction&quot;[5], default inheritance, and so on. These operations can also be applied at semantic transfer for augmenting the TDAG with common-sense knowledge of the target language. On the other hand, these operations are not applied to a TDAG in the generation phase, as we will explain in the next section. This is because the lexicon and grammatical constraints are only applied to determine whether red nodes and arcs are exactly derived. If they are not exactly derived, we will end up with either over- or under-generation beyond the permissible margin.</Paragraph> <Paragraph position="38"> Semantic transfer is applied to a source TDAG as many times 9 as necessary until a successful generation is made. Recall the sample sentence in Figure 1, where two painter calls were made to change two red arcs in TDAG1 into yellow ones in TDAG2. These are examples of the first replacement operation shown above. An addition of a green node and a green arc, followed by an addition of a green arc, was applied to TDAG3 to obtain TDAG4. These additions are examples of the third and fourth addition operations.</Paragraph> <Section position="1" start_page="228" end_page="230" type="sub_section"> <SectionTitle> Sentence Generation Algorithm </SectionTitle> <Paragraph position="0"> Before describing the generation algorithm, let us look at the representation of lexicons and grammars for machine translation. 
A lexical rule is represented by a set of equations, which introduce red nodes and arcs into a source TDAG. 10 A phrasal rule is similarly defined by a set of equations, which also introduce red nodes and arcs for describing a syntactic head and its complements.</Paragraph> <Paragraph position="1"> For example, if we use Shieber's PATR-II[15] notation, the lexical rule for &quot;wished&quot; can be represented as follows:</Paragraph> <Paragraph position="3"> (V pred theme agent) = (V subj pred) The last four equations are semantic equations. The TDAG representation of this rule is shown in Figure 2. It would be more practical to further assume that such a lexical rule is obtained from a type inference system, 11 which makes use of a syntactic class hierarchy so that each lexical class can inherit general properties of its superclasses.</Paragraph> <Paragraph position="4"> Similarly, semantic concepts such as *WISH and *WALK should be separately defined in an ontological hierarchy together with necessary domain</Paragraph> <Paragraph position="6"> fillers and part-of relationships. See KBMT-89[8].) A unification grammar is used for both analysis and generation. Let us assume that we have two unification grammars for English and Japanese.</Paragraph> <Paragraph position="7"> Analyzing a sentence yields a source TDAG with red nodes and arcs. Semantic interpretation resolves possible ambiguity and the resulting TDAG may include all kinds of nodes and arcs. For example, the sentence 12 &quot;The Boston office called&quot; would give the source TDAG in Figure 3. By utilizing the domain knowledge, the node labeled *PERSON is introduced into the TDAG as a real caller of the action *CALL, and two arcs representing *PERSON work-for *OFFICE and *OFFICE in *BOSTON are abductively inferred.</Paragraph> <Paragraph position="8"> Our generation algorithm is based on Wedekind's DAG traversal algorithm[25] for LFG. 
13 The algorithm runs with an input TDAG by traversing the nodes and arcs that were derived from the lexicon and grammar rules. The termination conditions are as follows: * Every red node and arc in the TDAG was derived. * No new red node (arc) is to be introduced into the TDAG if there is no corresponding node (arc) of any color in the TDAG. That is, the generator can change the color of a node (arc) to red, but cannot add a new node (arc).</Paragraph> <Paragraph position="9"> * For each set of red paths (i.e., the sequence of red arcs) that connects the same pair of nodes, the reentrancy was also derived.</Paragraph> <Paragraph position="10"> These conditions are identical to those of Wedekind except that yellow (or green) nodes and arcs may or may not be derived. For example, the sentence &quot;The Boston Office called&quot; in Figure 3 can be translated into Japanese by the following sequence of semantic transfer and sentence generation.</Paragraph> <Paragraph position="11"> 1. Apply the painter to change the yellow of the definite node and the def arc to green.</Paragraph> <Paragraph position="12"> 2. Apply the painter to change the yellow of the singular node and the num arc to green. The resulting TDAG is shown in Figure 4.</Paragraph> <Paragraph position="13"> 3. Run the sentence generator with an input feature structure, which has a root and an arc pred connecting to the given TDAG. (See the node marked &quot;1&quot; in Figure 4.) 4. The generator applies a phrasal rule, say S → NP VP, which derives the subj arc connecting to the subject NP (marked &quot;2&quot;), and the agent arc.</Paragraph> <Paragraph position="14"> 5. The generator applies a phrasal rule, say NP → MOD NP, 14 which derives the npmod arc to the 14 There are several phrasal rules for deriving this LHS NP in Japanese: (1) a noun-noun compound, (2) a noun, copula, and a noun, and (3) a noun, postpositional particle, and a noun. 
These three rules roughly correspond to the forms (1) Boston Office, (2) office of Boston, and (3) office in Boston. Inference of the &quot;*OFFICE in *BOSTON&quot; relation is easiest if rule (3) modifier of the NP (marked &quot;3&quot;) and the mod arc.</Paragraph> <Paragraph position="15"> 6. Lexical rules are applied and all the semantic nodes, *CALL, *OFFICE, and *BOSTON are derived.</Paragraph> <Paragraph position="16"> The annotated sample run of the sentence generator is shown in Figure 5. The input TDAG in the sample run is embedded in the input feature structure as a set of PRED values, but the semantic arcs are not shown in the figure. The input feature structure has syntactic features that were specified in the lexical rules. The feature value *UNDEFINED* is used to show that the node has been traversed by the generator.</Paragraph> <Paragraph position="17"> The basic property of the generation algorithm is as follows: Let t be a given TDAG, tmin be the connected subgraph including all the red nodes and arcs in t, and tmax be the connected subgraph of t obtained by changing all the colors of the nodes and arcs to red. Then, any successful generation with the derived TDAG tg satisfies the condition that tmin subsumes tg, and tg subsumes tmax.</Paragraph> <Paragraph position="18"> The proof is immediately obtained from the definition of successful generation and the fact that the generator never introduces a new node or a new arc into an input TDAG. The TDAGs can also be employed by the semantic head-driven generation algorithm[17] while retaining the above property. Semantic monotonicity always holds for a TDAG, since red nodes must be connected. 
It has been shown by Takeda[21] that semantically non-monotonic representations can also be handled by introducing a functional semantic class.</Paragraph> </Section> <Section position="2" start_page="230" end_page="232" type="sub_section"> <SectionTitle> Implementation </SectionTitle> <Paragraph position="0"> We have been developing a prototype English-to-Japanese MT system, called Shalt2[22], with a lexicon for a computer-manual domain including about 24,000 lexemes each for English and Japanese, and a general lexicon including about 50,000 English words and their translations. A sample set of 736 sentences was collected from the &quot;IBM AS/400 Getting Started&quot; manual, and was tested with the above semantic transfer and generation algorithm. 15 The result of the syntactic analysis by the English parser is mapped to a TDAG using a set of semantic equations 16 obtained from the lexicons. 14 ... is used, but the noun-noun compound is probably the best translation. 15 We used McCord's English parser based on his English Slot Grammar[10], which covered more than 93% of the sentences.</Paragraph> <Paragraph position="1"> 16 We call such a set of semantic equations mapping rules (see Shalt2[20] or KBMT-89[8]).</Paragraph> <Paragraph position="2"> We have a very shallow knowledge base for the computer domain, and no logical inference system was used to derive further constraints from the given source sentences. The Japanese grammar is similar to the one used in KBMT-89, which is written in pseudo-unification[23] equations, but we have added several new types of equation for handling coordinated structures. The Japanese grammar can generate sentences from all the successful TDAGs for the sample English sentences.</Paragraph> <Paragraph position="3"> It turned out that there were a few collections of semantic transfer sequences which contributed very strongly to the successful generation. 
Other kinds of semantic transfer are rather idiosyncratic, and are usually triggered by a particular lexical rule. Some of the sample sentences used for the translations are as follows: 18 &quot;Make sure you are using the proper edition for the level of the product.&quot; (gloss: user +subj product +pos level +for proper edition +obj use +prog +nom +obj confirm +imp) &quot;Publications are not stocked at the address&quot; (gloss: publication +subj following +loc provide address +loc stock +passive +neg) &quot;This publication could contain technical inaccuracies or typographical errors.&quot;</Paragraph> <Paragraph position="4"> (gloss: this publication +subj technical inaccuracy or typographical error +obj contain +ability +past) 17 We decided to include the passivization feature in the semantic representation in order to determine the proper word ordering in Japanese.</Paragraph> <Paragraph position="5"> 18 The Japanese translation reflects the errors made in English analysis. For example, the auxiliary verb &quot;could&quot; is misinterpreted in the last sample sentence. The overall accuracy of the translated sentences was about 63%. The main reason for translation errors was the occurrence of errors in lexical and structural disambiguation by the syntactic/semantic analyzer. We found that the accuracy of semantic transfer and sentence generation was practically acceptable.</Paragraph> <Paragraph position="6"> Though there were few serious errors, some occurred when a source TDAG had to be completely &quot;paraphrased&quot; into a different TDAG. For example, the sentence &quot;Let's get started.&quot;</Paragraph> <Paragraph position="7"> was very hard to translate into a natural Japanese sentence. Therefore, a TDAG had to be paraphrased into a totally different TDAG, which is another important role of semantic transfer. Other serious errors were related to the ordering of constituents in the TDAG. It might be generally acceptable to assume that the ordering of nodes in a DAG is immaterial. 
However, the different ordering of adjuncts sometimes resulted in a misleading translation, as did the ordering of members in a coordinated structure. These subtle issues have to be taken into account in the framework of semantic transfer and sentence generation.</Paragraph> <Paragraph position="8"> Conclusions In this paper, we have introduced tricolor DAGs to represent various degrees of constraint, and defined the notions of semantic transfer and sentence generation as operations on TDAGs. This approach proved to be so practical that nearly all of the source sentences that were correctly parsed were translated into readily acceptable sentences. Without semantic transfer, the translated sentences would include greater numbers of incorrectly selected words, or in some cases the generator would simply fail. 19 Extension of TDAGs for disjunctive information and a set of feature structures must be fully incorporated into the framework. Currently only a limited range of the cases is implemented. Optimal control of semantic transfer is still unknown. Integration of the constraint-based formalism, defeasible reasoning, and practical heuristic rules is also important for achieving high-quality translation. The ability to process and represent various levels of knowledge in TDAGs by using a uniform architecture is desirable, but there appears to be some efficient procedural knowledge that is very hard to represent declaratively. 19 The Essential Arguments Algorithm[9] might be an alternative method for finding a successful generation path.</Paragraph> <Paragraph position="9"> For example, the negative determiner &quot;no&quot; modifying a noun phrase in English has to be procedurally transferred into the negation of the verb governing the noun phrase in Japanese. 
Translation of &quot;any&quot;, &quot;yet&quot;, &quot;only&quot;, and so on involves similar problems.</Paragraph> <Paragraph position="10"> While TDAGs reflect three discrete types of constraints, it is possible to generalize the types into continuous, numeric values such as potential energy[4]. This approach will provide a considerably more flexible margin that defines a set of permissible translations, but it is not clear whether we can successfully define a numeric value for each lexical rule in order to obtain acceptable translations.</Paragraph> </Section> </Section> </Paper>