<?xml version="1.0" standalone="yes"?> <Paper uid="C90-2052"> <Title>Reversible Unification Based Machine Translation</Title> <Section position="2" start_page="0" end_page="299" type="metho"> <SectionTitle> 2 Unification-based Transfer </SectionTitle> <Paragraph position="0"> In this section I will give some examples of the use of a unification grammar (in PATR II \[17\] notation) to define the relation between language dependent logical forms. For illustrative purposes I will assume logical forms are represented by feature structures consisting of the attributes pred, arg1, arg2 together with some attributes representing 'universal' meanings such as tense, aspect, number and person; I will not touch upon issues such as quantification and modification.</Paragraph> <Paragraph position="1"> The logical forms of English and Spanish are labeled by the attributes gb and sp respectively. As an example</Paragraph> <Paragraph position="3"> Such feature structures will often be related in a straightforward way to a Spanish equivalent, except for the value of the pred attributes. A very simple rule in PATR II style may look as in figure 2. This rule simply states that the</Paragraph> <Paragraph position="5"> translation of a logical form is composed of the translation of its arguments. If the rule applies to the feature structure in 1, the three daughters of the rule will be instantiated as in figure 3, and the value of sp will be bound to the sp values of these daughters. An example</Paragraph> <Paragraph position="7"> of the rule for the first daughter will be a lexical entry, and looks as in figure 4. The simple English expression 'army' has to be translated as a complex expression in Spanish: 'fuerza militar'. The rule will look as in 5, where it is assumed that the construction is analyzed in Spanish as an ordinary noun-adjective construction, and where the logical form of the adjective takes the logical form of the noun as its argument. 
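To make the foregoing concrete, the following is a minimal sketch in Python of this style of lexical transfer, encoding feature structures as nested dictionaries. The `unify` helper and the particular attribute values are illustrative assumptions for exposition, not the paper's implementation.

```python
def unify(f1, f2):
    """Unify two feature structures encoded as nested dicts.
    Returns the most specific structure extending both, or None on a clash.
    (Reentrancy and cyclic structures are ignored in this sketch.)"""
    if not isinstance(f1, dict) or not isinstance(f2, dict):
        return f1 if f1 == f2 else None
    result = dict(f1)
    for attr, val in f2.items():
        if attr in result:
            sub = unify(result[attr], val)
            if sub is None:
                return None          # feature clash: unification fails
            result[attr] = sub
        else:
            result[attr] = val
    return result

# A lexical transfer entry: English 'army' is related to the complex
# Spanish expression 'fuerza militar', analyzed as a noun-adjective
# construction where the adjective takes the noun as its argument.
army_entry = {
    'gb': {'pred': 'army'},
    'sp': {'pred': 'militar',            # adjective ...
           'arg1': {'pred': 'fuerza'}},  # ... taking the noun as argument
}

# Applying the entry to an English logical form instantiates
# the Spanish side of the sign by unification.
english_lf = {'gb': {'pred': 'army'}}
sign = unify(english_lf, army_entry)
```

Because the rule is stated purely as constraints between the gb and sp values, the same entry can be applied in either direction.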
The translation for 'civilian' is defined in a similar rule (although the translation of 'number' is different). Note that this example of complex transfer is similar to the famous</Paragraph> <Paragraph position="9"> applications the feature structure in figure 1 will get instantiated to the feature structure in 6, from which the generator generates the string 'La fuerza militar rompio el fuego a la poblacion civil'.</Paragraph> <Paragraph position="11"> In the foregoing examples the relation between logical forms is rather straightforward. Note however that the full power of a unification grammar can be used to settle more difficult translation cases, because different attributes can be used to represent the 'translational syntax'. For instance we can build a tree as value of the attribute tree to represent the derivational history of the translation process. Or we can 'thread' information through different nodes to be able to make translations dependent on each other. Translation parameters such as style and subject field can be percolated as attributes of nodes to obtain consistent translations; but these attributes themselves need not be translated.</Paragraph> </Section> <Section position="3" start_page="299" end_page="302" type="metho"> <SectionTitle> 3 Reversible Unification Grammars </SectionTitle> <Paragraph position="0"> A unification grammar defined in formalisms such as PATR II and DCG \[12\] usually defines a relation between a string of words and a logical form. In sign-based approaches such as UCG \[26\] and HPSG \[14\] this string of words is not assigned a privileged status but is the value of one of the attributes of a feature structure. I will assume a formalism similar to PATR II, but without the context-free base; the string is represented as the value of one of the attributes of a feature structure. 
Thus, more generally, unification grammars define relations between the values of two (or more 1) attributes - for example the relation between the value of the attributes string and lf, or between the value of the attributes sp and gb; these relations are all relations between feature structures.</Paragraph> <Section position="1" start_page="300" end_page="301" type="sub_section"> <SectionTitle> 3.1 Reversibility </SectionTitle> <Paragraph position="0"> I will call a binary relation reversible if the relation is symmetric and computable. Both symmetry and computability will be explained in the following subsections. A grammar G is reversible for a relation R iff R is reversible and defined by G. For example, a grammar that relates strings to logical forms is reversible if both the parsing and generation problem are computable, and the relation between strings and logical forms is symmetric; the parsing problem is computable if for a given string all corresponding logical forms can be enumerated by some terminating procedure; such a procedure should halt if the given string does not have a corresponding logical form. Thus: reversible = symmetric + computable. Note that reversibility as defined here is different from bidirectionality. The latter merely says that grammars are to be used in two directions, but does not state how the two directions relate.</Paragraph> <Paragraph position="1"> It is easy to see that a composition of reversible relations is a reversible relation too; i.e. if some feature structure f1 is related to some feature structure fn via the reversible relations Ri(fi, fi+1), each defined by some reversible grammar Gi, then R'(f1, fn) is reversible. Thus an MT system that defines a relation R(ss, st) between source and target strings via the relations R1(ss, ls), R2(ls, lt) and R3(lt, st) is reversible if R1, R2 and R3 are reversible.</Paragraph> <Paragraph position="2"> A relation R ⊆ A × B is symmetric iff R(a, b) implies R(b, a') where a and a' are equivalent. 
For an MT system we want to define 'equivalence' in such a way that the translation relation is a symmetric relation between strings. However, strings are feature structures, so we must define equivalence for feature structures to obtain this effect.</Paragraph> <Paragraph position="3"> Unification grammars as they are commonly used implement a rather weak notion of equivalence between feature structures: feature structures a and b are equivalent iff they unify.</Paragraph> <Paragraph position="5"> Definition 1 (Weak equivalence) Two feature structures f1, f2 are weakly equivalent iff f1 ⊔ f2 exists.</Paragraph> <Paragraph position="6"> If feature structures are taken to stand for all their ground instances this yields an acceptable version of symmetry. Moreover, under the assumption that feature structures which represent strings are always ground (i.e. these feature structures cannot be extended), this results in a symmetric relation between (feature structures that represent) strings. (1 Note that it is possible to define a unification grammar that relates several language dependent logical forms; in this approach a multilingual transfer system consists of only one transfer grammar.)</Paragraph> <Paragraph position="7"> It is also possible to define a 'strong' notion of equivalence for feature structures that does not rely on this assumption.</Paragraph> <Paragraph position="8"> Definition 2 (Strong equivalence) Two feature structures f1, f2 are strongly equivalent (f1 ≡ f2) iff f1 ⊑ f2 and f2 ⊑ f1.</Paragraph> <Paragraph position="9"> A grammar that defines a computable relation between two attributes under the strong definition of equivalence might be called strongly reversible. Similarly a weakly reversible grammar is reversible under a weak definition of equivalence. Again these results can be generalized to a series of unification grammars. The strong version of equivalence can be motivated on the ground that it may be easier to obtain computability; this is the topic of the next subsection. 
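The two notions of equivalence can be sketched as follows, again assuming the nested-dictionary encoding of (acyclic, reentrancy-free) feature structures; the helpers `subsumes`, `weakly_equivalent` and `strongly_equivalent` are illustrative approximations of ⊑, ⊔-existence and ≡, not definitions from the paper.

```python
def subsumes(general, specific):
    """Approximate f ⊑ g for dict-encoded feature structures:
    every attribute-value constraint in `general` also holds in `specific`."""
    if not isinstance(general, dict):
        return general == specific
    if not isinstance(specific, dict):
        return False
    return all(attr in specific and subsumes(val, specific[attr])
               for attr, val in general.items())

def weakly_equivalent(f1, f2):
    """Weak equivalence: f1 ⊔ f2 exists, i.e. the structures are
    unifiable (no attribute is constrained to clashing atomic values)."""
    if not isinstance(f1, dict) or not isinstance(f2, dict):
        return f1 == f2
    return all(weakly_equivalent(f1[k], f2[k]) for k in f1.keys() & f2.keys())

def strongly_equivalent(f1, f2):
    """Strong equivalence: mutual subsumption, f1 ⊑ f2 and f2 ⊑ f1."""
    return subsumes(f1, f2) and subsumes(f2, f1)

f1 = {'pred': 'fire', 'tense': 'past'}
f2 = {'pred': 'fire'}
```

Here `f1` and `f2` unify, so they are weakly but not strongly equivalent: `f2` can still be extended with a tense value, which is exactly the situation the strong definition rules out.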
In section 3.2 I will discuss possible relaxations of the strong version of equivalence to obtain 'mildly' reversible grammars. A relation R ⊆ A × B is computable iff for a given a ∈ A the set {b ∈ B | R(a, b)} can be enumerated by some terminating procedure. To discuss computability it is useful to look a bit more carefully at the relations we are interested in. These relations are all binary relations between feature structures. However, in the case of the relation between strings and logical forms, strings will always be related to logical forms and logical forms will be related to strings. Similarly for the relation between Dutch and Spanish logical forms.</Paragraph> <Paragraph position="10"> Clearly, the domain and range of the relation is structured and can be partitioned into two sets A and B, for example the set of feature structures representing strings and the set of feature structures representing logical forms. The relation R ⊆ (A ∪ B) × (A ∪ B) can be partitioned similarly into the relations r ⊆ A × B and its inverse, r⁻¹ ⊆ B × A. The problem to compute R is now replaced by two problems: the computation of r and r⁻¹. For example the problem to compute the relation between logical forms and strings consists of the parsing and generation problem. It is now possible to incorporate the notion of equivalence, to obtain a definition of a parser, generator and translator. For example, an algorithm that computes the foregoing relation r will enumerate for a given feature structure f1 all feature structures f2, such that r(f3, f2) and f1 and f3 are equivalent. In the case of strong equivalence this implies that f1 ⊑ f3 (completeness) and f3 ⊑ f1 (coherence). In other words, the input should not be extended (coherence) and should be completely derived (completeness). This usage of the terms completeness and coherence was introduced in \[24\]. 
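Under the strong notion of equivalence, completeness and coherence thus reduce to two subsumption tests between the given input and the structure actually derived by the proof procedure. A sketch, reusing the hypothetical nested-dictionary encoding and a `subsumes` helper approximating ⊑:

```python
def subsumes(general, specific):
    """Approximate f ⊑ g: every constraint in `general` also holds in `specific`."""
    if not isinstance(general, dict):
        return general == specific
    if not isinstance(specific, dict):
        return False
    return all(k in specific and subsumes(v, specific[k])
               for k, v in general.items())

def complete(input_lf, derived_lf):
    # Completeness: the input is completely derived (input ⊑ derived).
    return subsumes(input_lf, derived_lf)

def coherent(input_lf, derived_lf):
    # Coherence: the proof procedure did not extend the input (derived ⊑ input).
    return subsumes(derived_lf, input_lf)

# A derivation that adds a tense value the input did not contain
# is complete but incoherent: the input got extended.
lf_in  = {'pred': 'fire', 'arg1': {'pred': 'army'}}
lf_out = {'pred': 'fire', 'arg1': {'pred': 'army'}, 'tense': 'past'}
```

The coherence test is what makes the third part of the generalized off-line parsability constraint discussed below checkable: any derivation step that would extend the input can be rejected outright.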
In the following I will discuss ways to obtain computability of one such partition.</Paragraph> <Paragraph position="11"> It is well known that relations defined by unrestricted unification grammars are not computable in general, as such grammars have Turing power \[13\]; it is thus not decidable whether the relation is defined for some given input. Usually some constraint on grammars is defined to remedy this. For example the off-line parsability constraint \[13, 5\] ensures that the recognition problem is solvable. Moreover this constraint also implies that the parsing problem as defined here is computable; i.e.</Paragraph> <Paragraph position="12"> the proof procedure will always terminate (because the constraint implies that there is a limit to the depth of possible parse trees for all strings of a given length). However the off-line parsability constraint assumes a context-free base of the formalism. A generalization of the off-line parsability constraint for any binary relation defined by unification grammars will consist of three parts; the first and third of these parts are usually implicit in the case of parsing.</Paragraph> <Paragraph position="13"> First, the value of the input must be built in a well-behaved compositional way. For example in the case of parsing: each daughter of a rule dominates part of the string dominated by the mother of that rule. Similarly for transfer and generation: each daughter of a rule has a value for lf that is part of the value of lf of the mother.</Paragraph> <Paragraph position="14"> Second, a special condition is defined for rules where the input value of the mother is the same as the input value of one of the daughters. For parsing such rules have exactly one daughter. A chain of applications of such rules is disallowed by some constraint or other; this is the core of most definitions of the off-line parsability constraint. 
For example in \[13\] such a chain is disallowed as the principal functor of a term may only occur once in a chain. For a slightly more general definition, cf. \[5\]. For generation and transfer a similar constraint can be defined. In the terminology of \[18, 19\] the 'head' of a rule is a daughter with the same logical form as its mother. A chain of these heads must be disallowed.</Paragraph> <Paragraph position="15"> Third, the input should not get extended during the proof procedure. In the case of parsing this is achieved easily because the input is ground 2. For generation and transfer this is not necessarily the case. This is the point where the usefulness of the coherence condition comes in; the coherence requirement explicitly states that extension of the input is not allowed. For this reason strong reversibility may be easier to obtain than weak reversibility. In the next subsection I will discuss two relaxations of strong symmetry that will not affect the computability properties discussed here.</Paragraph> <Paragraph position="16"> Generalizing the terminology introduced by \[13\], a proof procedure is strongly stable iff it always terminates for grammars that adhere to a generalized off-line parsability constraint. In \[15\] a general proof procedure for DCG based on head-driven generation \[18, 19, 22\] is defined that is strongly stable for a specific instantiation of the generalized off-line parsability constraint. 2 Note that this is the reason that most DCG parsers expect that the input value of the string has an atomic tail, i.e. 
parse(\[john, kisses, mary\], \_) will work fine, but parse(\[john, kisses, mary|X\], X) will cause problems.</Paragraph> </Section> <Section position="2" start_page="301" end_page="302" type="sub_section"> <SectionTitle> 3.2 Possible relaxations </SectionTitle> <Paragraph position="0"> It is easy to see that the completeness and coherence requirements make life hard for the rule writer, as she/he needs to know exactly what the possible values of inputs are for some component. It is possible to relax the completeness and coherence requirement in two ways that will not affect the reversibility properties between strings. The usefulness of these relaxations depends on the analyses a user wishes to define.</Paragraph> <Paragraph position="1"> The first relaxation assumes that there is a sort system defined for feature structures that makes it possible to make a distinction between cyclic and non-cyclic attributes (cf. \[5\]). For the moment a non-cyclic attribute may be defined as an attribute with a finite number of possible values (i.e. it is not recursive). For example the attributes arg1 and arg2 will be cyclic whereas number will be non-cyclic. The completeness and coherence condition is restricted to cyclic attributes. As the proof procedure can only further instantiate non-cyclic attributes, no termination problems occur because there are only a finite number of possibilities to do this. The definition of 'equivalence' for feature structures is now slightly changed. To define this properly it is necessary to define the notion non-cyclic extension. A non-cyclic extension of a feature structure only instantiates non-cyclic attributes. 
This results in the following definition of equivalence: Definition 3 (Non-cyclic equivalence) Two feature structures f1, f2 are non-cyclic equivalent iff f1' ≡ f2' where fi' are non-cyclic extensions of fi.</Paragraph> <Paragraph position="2"> It will be clear that the usefulness of this definition depends heavily on the style of grammar writing that is used. Note that it is of course also possible to declare for each non-cyclic attribute whether the completeness and coherence requirements hold.</Paragraph> <Paragraph position="3"> The second relaxation is not without ramifications for the organization of a transfer grammar. This relaxation has to do with reentrancies in feature structures. Some constructions such as control verbs and relative clauses may be represented using reentrancies; for example 'the soldiers tried to shoot the president' may be represented by a feature structure where the first argument of 'try' is reentrant with the first argument of 'shoot', cf. figure 7. The translation of such logical forms to Dutch equivalents can be defined as in rule 8.</Paragraph> <Paragraph position="4"> In this rule the reentrancy is explicitly mentioned for two reasons. The first reason is simply that in the case of different possible translations of arg1 we want the same translation for both arg1 and the embedded arg1. Note that the translation of 'soldier' into Dutch can be both 'soldaat' and 'militair'. If the reentrancy is not mentioned the system has to try to generate from four different Dutch logical forms, two of which have non-matching arg1's.</Paragraph> <Paragraph position="6"/> <Paragraph position="8"> The reentrancy is also mentioned because this is required by the completeness condition. It is possible to relax the completeness and coherence condition with respect to these reentrancies, again without affecting the reversibility properties of the system, by slightly changing the definition of equivalence. 
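In the nested-dictionary sketch used above, reentrancy can be modelled by object sharing: the controller's logical form is the very same Python object in both argument slots. This is an illustrative encoding, not the paper's notation, but it shows why an explicit reentrancy yields one consistent translation choice rather than four candidate logical forms.

```python
# The control structure for 'the soldiers tried to shoot the president':
# arg1 of 'try' is reentrant with arg1 of the embedded 'shoot'.
soldiers = {'pred': 'soldier', 'number': 'plural'}
try_lf = {
    'pred': 'try',
    'arg1': soldiers,                 # subject of 'try' ...
    'arg2': {'pred': 'shoot',
             'arg1': soldiers,        # ... shared with subject of 'shoot'
             'arg2': {'pred': 'president'}},
}

# A translation choice made for one occurrence automatically holds for
# the other: choosing Dutch 'militair' (rather than 'soldaat') for
# 'soldier' updates both slots at once, because they are one object.
try_lf['arg1']['pred'] = 'militair'
```

Without the shared object, each occurrence of arg1 could be instantiated independently, and the generator would have to consider the combinations with non-matching arg1's as well.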
There is a trade-off between the simplicity of the transfer grammar (in the presence of this relaxation) and the efficiency of the system. In the case of this relaxation the system will eventually find the correct translations, but it may take a while. On the other hand, if we are to mention all (possibly unbounded) reentrancies explicitly, then the transfer grammar will have to be complicated by a threading mechanism to derive such reentrancies. Again, the specific use of reentrancies in the logical forms that are defined will determine whether this relaxation is desired or not.</Paragraph> </Section> </Section> <Section position="4" start_page="302" end_page="302" type="metho"> <SectionTitle> 4 Final remarks </SectionTitle> <Paragraph position="0"> The objective to build a reversible MT system using a series of unification grammars is similar to the objective of the CRITTER system as expressed in \[3, 7\], and the work of Zajac in \[25\]. Instead of using unification grammars CRITTER uses logic grammars; Zajac uses a type system including an inheritance mechanism to define transfer-like rules. In these two approaches less attention is paid to an exact definition of reversibility, although our work may be compatible with these approaches.</Paragraph> <Paragraph position="1"> A somewhat different approach is advocated in \[9\].</Paragraph> <Paragraph position="2"> In that approach a system is described where an LFG grammar for some source language is augmented with equations that define (part of) the target level representations. A generator derives from this partial description a string according to some LFG grammar of the target language. Instead of a series of three grammars this architecture thus assumes two grammars, one of which both defines the source language and the relation with the target language. The translation relation is not only defined between logical forms but may relate all levels of representation (c-structure, f-structure, a-structure). 
Although in this approach monolingual grammars may be used in a bidirectional way, it is unclear whether the translation equations can be used bidirectionally 3. An important problem for the approach advocated here is the problem of logical form equivalence. Shieber \[16\] noted that unification grammars usually define a relation between strings and some canonical logical form. Depending on the nature of the logical forms that are being used, several representations of a logical form may have the same 'meaning'; just as in first order predicate calculus the formulas p ∨ q and q ∨ p are logically equivalent. A unification grammar will not know of these equivalences and, consequently, all equivalences have to be defined separately (if such equivalents are thought of as being translational equivalents); for example in a transfer grammar two rules may be defined to translate p ∨ q into both p' ∨ q' and q' ∨ p' if these formulas are thought of as being equivalent. Of course this technique can only be applied if the number of equivalences is finite; it is not possible to define that p is equivalent with ¬ ⋯ ¬ p for any even number of ¬'s.</Paragraph> <Paragraph position="3"> The approach discussed so far can be extended just as unification grammars for parsing and generation have been extended. Apart from equational constraints it will be useful to add others such as disjunction and negation. Moreover it seems useful to allow some version of universal constraints or some inheritance mechanism to be able to express generalizations and exceptions more easily.</Paragraph> </Section> </Paper>