<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1060">
  <Title>Ambiguity Preserving Machine Translation using Packed Representations*</Title>
  <Section position="4" start_page="366" end_page="367" type="metho">
    <SectionTitle>
3 From F-structures to Term Sets
</SectionTitle>
    <Paragraph position="0"> F-structures encode information in a hierarchical manner by recursively embedding substructures.</Paragraph>
    <Paragraph position="1"> By nature they provide only outside-in references, whereas transfer frequently requires inside-out access. Hence, information access for transformation processes like transfer is less straightforward than with flat set representations (Beaven, 1992; Whitelock, 1992). Set representations can be seen as a pool of constraints in which co-references between the constraints, i.e. the set elements, encode the same embedding that f-structures provide.</Paragraph>
    <Paragraph position="2"> Therefore, the structural embedding that is part of the f-structures themselves is instead captured by the interpretation of the constraint sets. Furthermore, sets come with very simple test and manipulation operations such as membership tests and set union.</Paragraph>
    <Paragraph position="3"> In the following we define a correspondence between f-structures and sets of terms. We restrict the f-structures to transfer-relevant information such as PREDs, grammatical functions, etc. Feature structure constraints are encoded as relational constraints using Prolog syntax (cf. Johnson (1991)). Examples of such sets of terms are (5) and (6), which correspond to f-structures (3) and (4), respectively.</Paragraph>
    <Paragraph position="5"> The 2-place relation trans given below translates between f-structures and (sets of) terms.</Paragraph>
    <Paragraph position="6"> Boxed indices [i] are references to f-structures, which are mapped to nodes i used in terms. F are features, Π(...) describe predicates, v stands for atomic values, and φ are complex f-structures. Co-occurring parts of f-structures are translated only once.</Paragraph>
    <Paragraph position="7">  1. (atomic values) trans&lt; [i][F v], F(i,v) &gt; 2. (predicate values) trans&lt; [i][PRED Π(...)], Π(i) &gt; 3. (complex f-structure values)</Paragraph>
    <Paragraph position="9"> trans is bidirectional, i.e. we can translate f-structures into terms to use them as transfer input, process the terms in transfer, and convert the transfer output back to f-structures, which are the appropriate representations for the generator.</Paragraph>
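    <Paragraph> The forward direction of this correspondence can be illustrated with a minimal Python sketch. It is not the system's implementation, which uses Prolog. The nested-dict encoding of f-structures, the function name fstructure_to_terms, the depth-first node numbering, and the treatment of ADJN as a single embedded structure rather than a set are all assumptions made for illustration. Terms are written as tuples, so subj(1,2) becomes ("subj", 1, 2).

```python
from itertools import count


def fstructure_to_terms(fs):
    """Flatten a nested-dict f-structure into a set of term tuples."""
    terms = set()
    fresh = count(1)   # hypothetical node numbering: 1, 2, 3, ... in depth-first order
    seen = {}          # shared substructures are translated only once

    def visit(sub):
        if id(sub) in seen:
            return seen[id(sub)]
        node = next(fresh)
        seen[id(sub)] = node
        for feature, value in sub.items():
            if feature == "PRED":
                terms.add((value, node))                    # predicate value
            elif isinstance(value, dict):
                child = visit(value)                        # complex f-structure value
                terms.add((feature.lower(), node, child))
            else:
                terms.add((feature.lower(), node, value))   # atomic value
        return node

    visit(fs)
    return terms


# Assumed dict encoding of the English f-structure for
# "We will meet the colleagues in Berlin" with VP attachment.
EXAMPLE = {
    "PRED": "meet",
    "SUBJ": {"PRED": "pro", "NUM": "pl"},
    "OBJ": {"PRED": "colleague", "NUM": "pl", "SPEC": "def"},
    "ADJN": {"PRED": "in", "OBJ": {"PRED": "Berlin"}},
}

TERMS = fstructure_to_terms(EXAMPLE)
```

 Because references to embedded structures become shared node numbers, a term such as ("obj", 4, 5) can be found directly by set membership, without descending from the root.</Paragraph>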
  </Section>
  <Section position="5" start_page="367" end_page="367" type="metho">
    <SectionTitle>
4 F-structure Transfer
</SectionTitle>
    <Paragraph position="0"> Transfer works on source language (SL) and target language (TL) sets of terms representing predicates, roles, etc. like the ones shown in (5) and (6). The mapping is encoded in transfer rules as in (7). For a rule to be applied, the set on the SL side must be a matching subset of the SL input set. If this is the case, we remove the covering set from the input and add the set on the other side of the rule to the TL output.</Paragraph>
    <Paragraph position="1">  Transfer is complete when the SL set is empty.</Paragraph>
    <Paragraph position="2"> (7) a. treffen(E) &lt;-&gt; meet(E).</Paragraph>
    <Paragraph position="3"> b. kollege(X) &lt;-&gt; colleague(X).</Paragraph>
    <Paragraph position="4"> c. Berlin(X) &lt;-&gt; Berlin(X).</Paragraph>
    <Paragraph position="5"> d. in(X) &lt;-&gt; in(X).</Paragraph>
    <Paragraph position="6"> e. pro(X) &lt;-&gt; pro(X).</Paragraph>
    <Paragraph position="7"> f. subj(X,Y) &lt;-&gt; subj(X,Y).</Paragraph>
    <Paragraph position="8"> g. obj(X,Y) &lt;-&gt; obj(X,Y).</Paragraph>
    <Paragraph position="9"> h. adjn(X,Y) &lt;-&gt; adjn(X,Y).</Paragraph>
    <Paragraph position="10">  The transfer operator &lt;-&gt; is bidirectional. Upper case letters in argument positions are logical variables which are bound to nodes at runtime. Because variables are shared between both sides of a rule, both sides refer to the same nodes of a graph. Hence, the overall mechanism can be formalized as a graph rewriting process.
      (8) a. meet(1), subj(1,2), pro(2), num(2,pl), obj(1,3), colleague(3), num(3,pl), spec(3,def), adjn(1,4), in(4), obj(4,5), Berlin(5)
          b. [1][PRED meet&lt;[2],[3]&gt;, SUBJ [2][PRED pro, NUM pl], OBJ [3][PRED colleague, NUM pl, SPEC def], ADJN {[4][PRED in&lt;[5]&gt;, OBJ [5][PRED Berlin]]}]
      Applying the rule set in (7) to (5) yields the result in (8a). Using the correspondence between f-structures and term representations, it is possible to translate back to the TL f-structure in (8b). This f-structure is passed on to the generator, which produces the utterance in (2a) as one of the possible paraphrases. The transfer rules in (7c-h), which are defined as the identity transformation between SL and TL, are actually redundant. They can be replaced by a general metarule which passes on all singleton sets that are not covered by any explicit transfer rule. The same metarule also transfers morpho-syntactic information like number and definiteness.</Paragraph>
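    <Paragraph> The consume-and-produce behaviour of rule application, together with the metarule, might be sketched as follows. This is a simplification under stated assumptions: only rules with a single unary term on each side are handled, the function name transfer and the tuple encoding of terms are hypothetical, and the SL input is assumed to mirror (8a) with German predicates, since (5) is not reproduced in this section.

```python
# Rules (7a) and (7b); the identity rules (7c-h) are left to the metarule.
LEXICAL_RULES = {"treffen": "meet", "kollege": "colleague"}

# Assumed SL term set, modelled on (8a) with the German predicates.
SL_TERMS = {
    ("treffen", 1), ("subj", 1, 2), ("pro", 2), ("num", 2, "pl"),
    ("obj", 1, 3), ("kollege", 3), ("num", 3, "pl"), ("spec", 3, "def"),
    ("adjn", 1, 4), ("in", 4), ("obj", 4, 5), ("Berlin", 5),
}


def transfer(sl_terms, rules):
    """Move every SL term to the TL side, rewriting predicates covered by a rule."""
    remaining = set(sl_terms)
    tl_terms = set()
    while remaining:                       # transfer is complete when the SL set is empty
        term = remaining.pop()
        functor, args = term[0], term[1:]
        if len(args) == 1 and functor in rules:
            # The node argument is shared by both sides of the rule.
            tl_terms.add((rules[functor],) + args)
        else:
            # Metarule: pass on singleton sets not covered by an explicit rule.
            tl_terms.add(term)
    return tl_terms


TL_TERMS = transfer(SL_TERMS, LEXICAL_RULES)
```

 Since the node numbers are carried over unchanged, the TL set describes the same graph as the SL set with only the predicate labels rewritten.</Paragraph>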
  </Section>
  <Section position="6" start_page="367" end_page="368" type="metho">
    <SectionTitle>
5 Packed Representations
</SectionTitle>
    <Paragraph position="0"> The example in (9) provides a packed f-structure representation for the German sentence in (1). The ambiguous attachment of the 'in' PP is represented via a local disjunction (X=1 ∨ X=3) which binds the external variable X of the adjunct relation to either node 1 or node 3, representing the VP or NP attachment, respectively.</Paragraph>
    <Paragraph position="2"> Applying the very same transfer rules in (7) to the input in (9) produces the result in (10), which fully preserves the ambiguity between source and target language.</Paragraph>
    <Paragraph position="4"> If the generator takes the corresponding f-structure for this packed description as input, it will generate (1) repeated in (11) and not any of the paraphrases in (2), because they would not cover both ambiguities at the same time.</Paragraph>
    <Paragraph position="5"> (11) We will meet the colleagues in Berlin.
      The local disjunction is not affected by the application of the transfer rule that maps the adjunct relation to the target language, because there is no interaction between the variable X and any other predicate.</Paragraph>
  </Section>
  <Section position="7" start_page="368" end_page="369" type="metho">
    <SectionTitle>
6 Local Disambiguation
</SectionTitle>
    <Paragraph position="0"> If it is not possible to fully preserve the attachment ambiguities between source and target language, we need to partially resolve the relevant ambiguity. For example, this would be the case if we translated (1) into Japanese.</Paragraph>
    <Paragraph position="1"> Depending on whether we attach to the NP 'the colleagues' or to the VP, we have to choose between two different postpositions, 'de' (location) vs. 'no' (adnominal modification). The two sentences in (12) show the Japanese translations together with their English glosses.</Paragraph>
    <Paragraph position="2"> (12) a. watashi tachi -ga berurin -de dooryoo -to aimasu
               we NOM Berlin LOC colleagues COM will meet
               (In Berlin we will meet the colleagues.)
            b. watashi tachi -ga berurin -no dooryoo -to aimasu
               we NOM Berlin MOD colleagues COM will meet
               (We will meet the colleagues from Berlin.)
      The choice of the postposition could be triggered via selectional restrictions in the condition part of the transfer rules. The rules in (13) show two components on their left-hand sides: the part to the right of # is a test on a copy of the original input. The test matches an adjunct relation where the variable Y is bound to the internal argument. Y is coindexed with the node of the SL preposition 'in'. The variable X is bound to the external argument node where the adjunct is attached. The second element of the test checks the selectional restriction of this attachment.
      Footnote 2: Instead of using explicit predicates for testing selectional restrictions, the real system uses a sort system. The test on explicit predicates is replaced with a more general sortal subsumption test, e.g. sort(X)&lt;event vs.</Paragraph>
    <Paragraph position="4"> no(Y).</Paragraph>
    <Paragraph position="5"> The Japanese distinction is parallel to the case where the German preposition 'in' would be translated either with the English preposition 'in' or with the preposition 'from', depending on which of the two meanings is taken. Hence, for ease of exposition, we will apply the two equivalent transfer rules in (14) for the translation of 'in' instead of the Japanese ones. (14) a. in(Y) # adjn(X,Y),treffen(X) -&gt; in(Y).</Paragraph>
    <Paragraph position="6"> b. in(Y) # adjn(X,Y),kollege(X) -&gt; from(Y).</Paragraph>
    <Paragraph position="7">  Since the external argument of the adjunct relation takes part in the local disjunction (X=1 ∨ X=3), the application of transfer rule (14a) triggers a local resolution. This is done by applying the distributive law so that the selectional restriction can be tested for each disjunct. For the first disjunct the test succeeds, whereas it fails for the second disjunct. Rule (14b) is treated in the same way, where only the test on the second disjunct can be satisfied. Both results are joined together and are associated with the very same disjunction: (X=1, in(4) ∨ X=3, from(4)).</Paragraph>
    <Paragraph position="8">  (15) a. meet(1),  As a final result we get the packed representation in (15), where the two prepositions are distributed into the local disjunction without converting to disjunctive normal form.</Paragraph>
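    <Paragraph> The local resolution step can be sketched in Python as follows. This is only an illustration under assumptions: the reduced SL term set, the encoding of the disjunction as a mapping from the variable "X" to its candidate nodes, the pair encoding of rules (14a) and (14b), and the function name resolve_locally are hypothetical and do not reflect the contexted-constraint machinery of the real system.

```python
# Hypothetical, reduced SL term set for sentence (1); "X" is the packed variable.
SL_TERMS = {("treffen", 1), ("kollege", 3), ("in", 4), ("adjn", "X", 4)}

# The local disjunction: X=1 or X=3.
ALTERNATIVES = {"X": (1, 3)}

# Rules (14a) and (14b) as (predicate required of the attachment node, TL preposition).
RULES_14 = (("treffen", "in"), ("kollege", "from"))


def resolve_locally(terms, alternatives, rules):
    """Distribute the rules for 'in' over each disjunct of the adjunct's variable."""
    resolved = {}
    for term in terms:
        if term[0] != "adjn":
            continue
        _, external, internal = term
        if ("in", internal) not in terms:
            continue
        # Try every disjunct of the external argument separately.
        for node in alternatives.get(external, (external,)):
            for required, target in rules:
                if (required, node) in terms:     # selectional test for this disjunct
                    resolved[(external, node)] = (target, internal)
    return resolved


PACKED = resolve_locally(SL_TERMS, ALTERNATIVES, RULES_14)
```

 The resulting mapping pairs X=1 with in(4) and X=3 with from(4). Only the preposition is split by disjunct, and all other terms stay shared, which is the sense in which no disjunctive normal form is built.</Paragraph>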
    <Paragraph position="9">  The transferred packed representation corresponds to the two possible utterances in (16). It would be left to the (human) negotiator to decide which of the two sentences is more appropriate in a given context. Because the disjunctions are local, they can be handed over to an additional resolution component for disambiguation or, if discourse and world knowledge are not sufficient to disambiguate them, left as choices for the human translator.</Paragraph>
    <Paragraph position="10"> (16) a. we will meet the colleagues in Berlin
           b. we will meet the colleagues from Berlin
      The main advantage of such an approach is that the transfer rules are independent of whether they are applied to packed representations or not. Unpacking is done only locally and only as much as necessary. Only the internal processing needs to be adapted in order to keep track of which local disjuncts are being processed. This is done with a simple book-keeping mechanism which records, for each individual term, the local disjunct to which it belongs. Technically, it is done by using contexted constraints as described in Maxwell III and Kaplan (1991). Hence the whole mechanism can be kept fully transparent for the transfer rule writer. All of the complexity can be dealt with internally in the transfer rule compiler, which compiles the external transfer rule format into an executable Prolog program that propagates the necessary variable sharings.</Paragraph>
    <Paragraph position="11"> In order to avoid duplicated work while trying to apply all possible transfer rule combinations, the transfer system uses an internal chart to store all successful rule applications. Each predicate in the input set is assigned a unique bit in a bit vector, so that it can be checked easily that no predicate is covered more than once when different edges in the chart are combined. With this scheme it is also possible to identify the final edges, because they are the ones where all bits are set. The overall processing scheme using an agenda and the data structures are very similar to the chart representation proposed for chart-based generation from ambiguous input (cf. Kay (1996) and Shemtov (1996)). The main difference stems from the lack of explicit context-free grammar rules. Instead, in the proposed setup, the left-hand sides of transfer rules are interpreted as immediate dominance rules, as they are used for describing free word order languages, supplemented with a single binary context-free rule which recursively tries to combine all possible subsets of terms for which no explicit transfer rule exists.</Paragraph>
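    <Paragraph> The bit-vector book-keeping might look as follows in a small Python sketch. It is an assumption-laden illustration rather than the system's Prolog code: an edge is modelled as a pair of an integer bit vector and a frozen set of TL terms, the function name complete_edges is hypothetical, and the contexts of packed disjunctions are ignored.

```python
def complete_edges(initial_edges, n_terms):
    """Combine edges with disjoint coverage; return outputs of edges covering every term."""
    full = 2 ** n_terms - 1                 # bit vector with all n_terms bits set
    chart = set()
    agenda = list(initial_edges)
    while agenda:
        edge = agenda.pop()
        if edge in chart:
            continue                        # this rule application is already stored
        mask, output = edge
        for other_mask, other_output in chart:
            # Two bit vectors share no set bit exactly when OR equals the arithmetic sum.
            if (mask | other_mask) == mask + other_mask:
                agenda.append((mask | other_mask, output | other_output))
        chart.add(edge)
    return {output for mask, output in chart if mask == full}


# Hypothetical input terms: bit 0 = treffen(1), bit 1 = kollege(3), bit 2 = in(4).
EDGES = [
    (0b001, frozenset({("meet", 1)})),
    (0b010, frozenset({("colleague", 3)})),
    (0b100, frozenset({("in", 4)})),
    (0b100, frozenset({("from", 4)})),
]

RESULTS = complete_edges(EDGES, 3)
```

 The two edges for in(4) and from(4) both set bit 2, so they are never combined with each other. Each of them joins the other two edges to give one final edge with all three bits set.</Paragraph>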
  </Section>
</Paper>