<?xml version="1.0" standalone="yes"?> <Paper uid="W91-0110"> <Title>I \[ A Uniform Architecture for Parsing, Generation and Transfer</Title> <Section position="3" start_page="71" end_page="75" type="metho"> <SectionTitle> 2 A REWRITE MACHINE FOR TYPED FEATURE STRUCTURES </SectionTitle> <Paragraph position="0"> The basic motivation behind the Typed Feature Structure rewriting system is to provide a language which has the same deductive and logical properties of logic programming languages such as PROLOG, but which is based on feature terms instead of first order terms \[Ai't-Kaci 84, A~t-Kaci 86, Emele/Zajac 90a\]. Such a language has a different semantics than the Herbrand semantics: this semantics is based on the notion of approximation, which captures in a computational</Paragraph> <Paragraph position="2"> framework the idea that feature structures represent partial information \[Zajac 90b\] 3. Of course, as in PROLOG,i problems of completeness and el- L ficiency have to be addressed.</Paragraph> <Paragraph position="3"> The universe: of feature terms is structured in an inheritance hierarchy which defines a partial ordering on kinds of available information. The backbone of the hierarchy is defined by a partial order _< on !a set of type symbols T. To this set, we add two more symbols: T which represents completly underspecified information, and _l_ which represents inconsistent information. Two type symbols have a common most general sub-type (Greatest Lower Bound - GLB): this sub-type inherits ale information associated with all its super-types. We define a meet operation on two type symbols A and B as A A B = glb(A, B). Formally, a type hierarchy defined as a tuple (T, <, A) i is a meet semi-lattice. A technicality arises when two types A and B have more than one GLB: in that case, the set of GLBs is interpreted as a disjunction. null As different shts of attribute-value pairs make sense for differen~t kind of objects, we divide our feature terms into different types. Terms are closed in the sense that ~ach type defines a specific association of features iand restrictions on their possible values) which are I appropriate for it, expressed as a feature structure '(the definition of the type). Since types are organized in an inheritance hierarchy, a type inherits all the features and value restrictions from all its 'super-types. This type-discipline for feature structures enforces the following two constraints: a term cannot have a feature which is not appropriate for its type 4 and conversely, a pair of feature and value should always be defined for some type. Thus a feature term is always typed and it is not possible to introduce an arbitrary feature in a term (by unification): all features added to some term should be appropriate for its type.</Paragraph> <Paragraph position="4"> We use the attribute-value matrix (AVM) notation for feature terms and we write the type symbol for each feature term in front of the opening square bracket of the AVM. A type symbol which does not have any feature defined for it is atomic.</Paragraph> <Paragraph position="5"> All others types are complex.</Paragraph> <Paragraph position="6"> A type definition has the following form: the type symbol to be defined appears on the left-hand side of the equation. The right-hand side is an expression of conjunctions and disjunctions of typed feature terms (Figure 1). Conjunctions are interpreted as meets on typed feature terms (implemented using a typed unification algorithm \[Emele 91\]). 
<Paragraph position="7"> The definition may have conditional constraints expressed as a logical conjunction of feature terms and introduced by ':-'. The right-hand side feature term may contain the left-hand side type symbol in a subterm (or in the condition), thus defining a recursive type equation which gives the system the expressive power needed to describe complex linguistic structures.</Paragraph> <Paragraph position="8"> A subtype inherits all constraints of its super-types monotonically: the constraints expressed as an expression of feature terms are conjoined using unification; the conditions are conjoined using the logical and operation.</Paragraph> <Paragraph position="9"> 4 Checked at compile time.</Paragraph> <Paragraph position="10"> A set of type definitions defines an inheritance hierarchy of feature terms which specifies the available approximations. Such a hierarchy is compiled into a rewriting system as follows: each direct link between a type A and a subtype B generates a rewrite rule of the form A[a] → B[b], where [a] and [b] are the definitions of A and B, respectively.</Paragraph> <Paragraph position="11"> The interpreter is given a &quot;query&quot; (a feature term) to evaluate: this input term is already an approximation of the final solution, though a very rough one. The idea is to incrementally add more information to that term using the rewrite rules in order to get, step by step, closer to the solution: we stop when we have the best possible approximation.</Paragraph> <Paragraph position="12"> A rewrite step for a term t is defined as follows: if u is a subterm of t of type A and there exists a rewrite rule A[a] → B[b] such that A[a] ∧ u ≠ ⊥, the right-hand side B[b] is unified with the subterm u, giving a new term t' which is more specific than t. This rewrite step is applied non-deterministically everywhere in the term until no further rule is applicable5. Actually, the rewriting process stops either when all types are minimal types or when all subterms in a term correspond exactly to some approximation defined by a type in the hierarchy. A term is &quot;solved&quot; when any subterm is either more specific than the definition of a minimal type, or does not give more information than the definition of its type.</Paragraph> <Paragraph position="13"> This defines an if-and-only-if condition for a term to be a solved-form, where any addition of information will not bring anything new. It is implemented using a lazy rewriting strategy: the application of a rule A[a] → B[b] at a subterm u is actually triggered only when A[a] ∧ u ⊏ A[a], i.e. only when the subterm carries information strictly beyond the definition of its type.</Paragraph> <Paragraph position="14"> This lazy rewriting strategy implements a fully data-driven computation scheme and avoids useless branches of computation. Thus, there is no need for a special treatment to avoid what corresponds to the evaluation of un-instantiated goals in PROLOG, since a general treatment based on the semantics of the formalism itself is built into the evaluation strategy of the interpreter.</Paragraph> <Paragraph position="15"> 5 Conditions do not change this general scheme and are omitted from the presentation for the sake of simplicity. See for example [Dershowitz/Plaisted 88], and [Klop 90] for a survey.</Paragraph> <Paragraph position="16"> The choice of which subterm to rewrite is only partly driven by the availability of information (using the lazy rewriting scheme); a schematic sketch of a single rewrite step is given below.</Paragraph>
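<Paragraph> The following Python sketch models a single rewrite step under strong simplifying assumptions: terms are nested dicts with a 'type' key, the type meet is a stub table, and unification is the naive recursive kind. All names are invented for illustration; this is not the TFS interpreter, only the shape of its basic step, including the lazy trigger.</Paragraph>

```python
BOTTOM = "bottom"

def meet(a, b):
    """Stub type meet: equal types meet to themselves; a tiny
    hypothetical subtype table handles the rest; anything else
    is inconsistent (BOTTOM)."""
    SUBTYPES = {("sign", "phrase"): "phrase", ("sign", "word"): "word"}
    if a == b:
        return a
    return SUBTYPES.get((a, b)) or SUBTYPES.get((b, a)) or BOTTOM

def unify(t, u):
    """Naive unification of nested dicts: 'type' keys meet, shared
    features unify recursively, atoms must be equal.  Returns None
    for bottom (inconsistent information)."""
    if isinstance(t, dict) and isinstance(u, dict):
        out = dict(t)
        for feat, val in u.items():
            if feat == "type" and "type" in t:
                ty = meet(t["type"], val)
                if ty == BOTTOM:
                    return None
                out["type"] = ty
            elif feat in t:
                sub = unify(t[feat], val)
                if sub is None:
                    return None
                out[feat] = sub
            else:
                out[feat] = val
        return out
    return t if t == u else None

def rewrite_step(term, rules):
    """Try one rewrite A[a] -> B[b], outer-most subterm first;
    return the refined term, or None when no rule fires."""
    for lhs, rhs in rules:
        glb = unify(lhs, term)              # A[a] ∧ u
        # consistency (glb is not bottom) plus the lazy trigger:
        # fire only if the subterm adds information strictly
        # beyond the definition, i.e. A[a] ∧ u ⊏ A[a]
        if glb is not None and glb != lhs:
            refined = unify(term, rhs)      # unify B[b] into the subterm
            if refined is not None:
                return refined
    if isinstance(term, dict):              # otherwise descend into subterms
        for feat, val in term.items():
            if feat != "type":
                new = rewrite_step(val, rules)
                if new is not None:
                    return {**term, feat: new}
    return None

# Invented demo rule: refine any 'sign' that carries extra information.
rules = [({"type": "sign"}, {"type": "sign", "checked": True})]
print(rewrite_step({"type": "sign", "string": ["uther"]}, rules))
```

<Paragraph> Iterating this step until it returns None yields the fixed point that the solved-form condition above describes.</Paragraph>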
<Paragraph position="17"> When there are several subterms that could be rewritten, the computation rule is to choose the outer-most ones (inner-most strategies are usually non-terminating)6. Such an outer-most rewriting strategy has interesting termination properties, since there are problems where a TFS program will terminate while the corresponding PROLOG program will not7.</Paragraph> <Paragraph position="18"> For a given subterm, the choice of which rule to apply is made non-deterministically, and the search space is explored depth-first using a backtracking scheme. This strategy is not complete, though in combination with the outer-most rule and the lazy evaluation scheme, it seems to terminate on any &quot;well-defined&quot; problem, i.e. when the terms introduced by recursive definitions during execution are strictly decreasing according to some measure (for example, see the definition of guides in [Dymetman et al. 90] for the parsing and generation problems). A complete breadth-first search strategy is planned for debugging purposes.</Paragraph> <Paragraph position="19"> The interpreter described above is implemented8 and has been used to test several models such as LFG, HPSG, or DCG on toy examples [Emele/Zajac 90b, Emele et al. 90, Zajac 90a].</Paragraph> <Paragraph position="20"> 6 This outer-most rewriting strategy is similar to hyper-resolution in logic programming. The lazy evaluation mechanism is related to the 'freeze' predicate of, e.g., Prolog-II and SICStus Prolog, though in Prolog it has to be called explicitly. 7 E.g., the problem of left-recursive rules in naive PROLOG implementations of DCGs. 8 A prototype version is publicly available.</Paragraph> </Section> <Section position="4" start_page="75" end_page="76" type="metho"> <SectionTitle> 3 PARSING, GENERATION, AND BIDIRECTIONAL TRANSFER </SectionTitle> <Section position="1" start_page="75" end_page="76" type="sub_section"> <SectionTitle> 3.1 Parsing/generation </SectionTitle> <Paragraph position="0"> A grammar describes the relation between strings of words and linguistic structures. In order to implement a reversible grammar, we have to encode both kinds of structure using the same kind of data structure provided by the TFS language: typed feature structures. A linguistic structure will be encoded using features and values, and the set of valid linguistic structures has to be declared explicitly. A string of words will be encoded as a list of word forms, using the same kind of definitions as in Figure 1.</Paragraph> <Paragraph position="1"> To abolish the distinction between &quot;input&quot; and &quot;output&quot;, the relation between a string and a linguistic structure will be encoded in a single term with, for example, two features, string and syn, and we can call the type of such a structure SIGN9. The type SIGN is divided into several subtypes corresponding to different mappings between a string and a linguistic structure. We will have at least the classification between phrases and words.</Paragraph> <Paragraph position="2"> The definition of a phrase will recursively relate subphrases and substrings, and define the phrase as a composition of subphrases and its string as the concatenation of their substrings. The formalism does not impose constraints on how the relations between phrases and strings are defined, and the grammar writer has to define them explicitly; a sketch of one possible encoding is given below.</Paragraph>
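<Paragraph> Purely as an illustration of the data layout (the feature names, types, and the example sentence are invented, not taken from the paper's grammar), a SIGN with its string encoded as a list of word forms might be rendered as nested dicts like this:</Paragraph>

```python
# Hypothetical rendering of SIGN terms: 'type' holds the type
# symbol, the other keys are features.  A WORD pairs one word form
# with its category; a PHRASE relates its string to the
# concatenation of the strings of its subphrases.

word_sign = {
    "type": "word",
    "string": ["Uther"],              # a string is a list of word forms
    "syn": {"type": "np"},            # toy linguistic structure
}

phrase_sign = {
    "type": "phrase",
    "string": ["Uther", "sleeps"],    # concatenation of the substrings
    "syn": {"type": "s"},
    "np": word_sign,
    "vp": {"type": "word", "string": ["sleeps"], "syn": {"type": "vp"}},
}
```

<Paragraph> How the string feature of the phrase is tied to those of its subphrases is exactly what the grammar writer must state, e.g. with a concatenation condition.</Paragraph>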
<Paragraph position="3"> One possibility is to use context-free-like mappings, using for example the same kind of encoding as in DCGs for PATR-like grammars or HPSG [Emele/Zajac 90b]. But other possibilities are available as well: a kind of functional composition reminiscent of categorial grammars as in [Dymetman et al. 90], or linear precedence rules [Pollard/Sag 87, Reape 90].</Paragraph> <Paragraph position="4"> For example, a rule like the one of [Shieber 86]10 is encoded in TFS using a type S for the sentence type with two features np and vp for encoding the constituent structure, and similarly for NPs and VPs. The string associated with each constituent is encoded under the feature string. The string associated with the sentence is simply the concatenation of the string associated with the NP and the string associated with the VP: this constraint is expressed in a condition using the APPEND relation on lists (Figure 4).</Paragraph> <Paragraph position="5"> The difference between the parsing and the generation problem is then only in the form of the term given to the interpreter for evaluation. An underspecified term where only the string is given defines the parsing problem; an underspecified term where only the semantic form is given defines the generation problem.</Paragraph> <Paragraph position="6"> In both cases, the same interpreter uses the same set of rewrite rules to fill in &quot;missing information&quot; according to the grammar definitions. The result in both cases is exactly the same: a fully specified term containing the string, the semantic form, and also all other syntactic information like the constituent structure (Figure 5).</Paragraph> <Paragraph position="7"> The transfer problem for one direction or the other is stated in the same way as for parsing or generation: the input term is an under-specified &quot;bilingual sign&quot; where only one structure for one language is given. Using the contrastive grammar, the interpreter fills in missing information and builds a completely specified bilingual sign11.</Paragraph> </Section> <Section position="2" start_page="76" end_page="76" type="sub_section"> <SectionTitle> 3.2 Bi-directional transfer in MT </SectionTitle> <Paragraph position="0"> We have sketched above a very general framework for specifying mappings between a linguistic structure, encoded as a feature structure, and a string, also encoded as a feature structure. We apply a similar technique for specifying MT transfer rules, which we prefer to call &quot;contrastive rules&quot; since there is no directionality involved [Zajac 89, Zajac 90a].</Paragraph> <Paragraph position="1"> The idea is rather simple: assume we are working with linguistic structures similar to LFG's functional structures for English and French [Kaplan et al. 89]. We define a translation relation as a type TAU-LEX with two features, eng for the English structure and fr for the French structure. This &quot;bilingual sign&quot; is defined on the lexical structure: each subtype of TAU-LEX defines a lexical correspondence between a partial English lexical structure and a partial French lexical structure for a given lexical equivalence. Such a lexical contrastive definition also has to pair the arguments recursively, and this is expressed in the condition part of the definition (Figure 6).</Paragraph>
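<Paragraph> A hypothetical sketch of one such contrastive lexical entry, using the classic like/plaire argument switch as the equivalence (the entry, its feature names, and the condition encoding are invented; the actual definitions are the AVMs of Figure 6):</Paragraph>

```python
# Invented TAU-LEX-style entry rendered as a nested dict.  The
# condition part, which pairs the arguments of the two predicates
# recursively, is listed under 'conditions': each pair is itself
# a TAU-LEX problem for the interpreter.

tau_like = {
    "type": "tau-lex-like",
    "eng": {"pred": "like",   "subj": "E1", "obj": "E2"},
    "fr":  {"pred": "plaire", "subj": "F2", "obj": "F1"},
    # argument switching: the English subject pairs with the French
    # object and vice versa
    "conditions": [("tau-lex", "E1", "F1"), ("tau-lex", "E2", "F2")],
}
```

<Paragraph> An under-specified bilingual sign with only eng (or only fr) instantiated is then handed to the same interpreter, which completes the other half, exactly as for parsing and generation.</Paragraph>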
<Paragraph position="2"> The translation of syntactic features, like tense or determination, is also specified in the condition part, and these contrastive definitions are defined separately from the lexical definitions.</Paragraph> </Section> </Section> <Section position="5" start_page="76" end_page="77" type="metho"> <SectionTitle> 4 THE TERMINATION PROBLEM AND EFFICIENCY ISSUES </SectionTitle> <Paragraph position="0"> For parsing and generation, since no constraint is imposed on the kind of mapping between the string and the semantic form, termination has to be proved for each class of grammars and for the particular evaluation mechanism used for either parsing or generation with this grammar. If we restrict ourselves to classes of grammars for which terminating evaluation algorithms are known, we can implement those directly in TFS.</Paragraph> <Paragraph position="1"> However, the TFS evaluation strategy allows more naive implementations of grammars, and the outer-most evaluation of &quot;sub-goals&quot; terminates on a strictly larger class of programs than the corresponding logic programs implemented in a conventional PROLOG. Furthermore, thanks to the lazy evaluation mechanism, the grammar writer does not need to be, and actually should not be, aware of the control, which follows the shape of the input rather than a fixed strategy.</Paragraph> <Paragraph position="2"> HPSG-style grammars do not cause any problem: completeness and coherence as defined for LFG, and extended to the general case by [Wedekind 88], are implemented in HPSG using the &quot;subcategorization feature principle&quot; [Johnson 87]. Termination conditions for parsing are well understood in the framework of context-free grammars. For generation using feature structures, one of the problems is that the input could be &quot;extended&quot; during processing, i.e. arbitrary feature structures could be introduced in the semantic part of the input by unification with the semantic part of a rule. However, if the semantic part of the input is fully specified according to a set of type definitions describing the set of well-formed semantic structures (and this condition is easy to check), this cannot arise in a type-based system. A more general approach is described in [Dymetman et al. 90], who define sufficient properties for termination of parsing and generation for the class of &quot;Lexical Grammars&quot; implemented in PROLOG. These properties seem generalizable to other classes of grammars as well, and are also applicable to TFS implementations. The idea is relatively simple: for parsing, each rule must consume a non-empty part of the string, and for generation, each rule must consume a non-empty part of the semantic form. Since Lexical Grammars are implemented in PROLOG, left-recursion must be eliminated for parsing and for generation, but this does not apply to TFS implementations.</Paragraph> <Paragraph position="3"> Termination for reversible transfer grammars is discussed in [van Noord 90]. One of the problems mentioned is the extension of the &quot;input&quot;, as in generation, and the answer is similar (see above).
However, properties similar to the &quot;conservative guides&quot; of [Dymetman et al. 90] have to hold in order to ensure termination.</Paragraph> <Paragraph position="4"> The lazy evaluation mechanism has an almost optimal behavior on the class of problems that have an exponential complexity when using the &quot;generate and test&quot; method [van Hentenryck/Dincbas 87, Aït-Kaci/Meyer 90]. It is driven by the availability of information: as soon as some piece of information is available, the evaluation of the constraints in which this information appears is triggered. Thus, the search space is explored &quot;intelligently&quot;, never following branches of computation that would correspond to uninstantiated PROLOG goals. The lazy evaluation mechanism is not yet fully implemented in the current version of TFS, but with the partial implementation we have, a gain of 50% for parsing has already been achieved (in comparison with the previous implementation using only the outer-most rewriting strategy).</Paragraph> <Paragraph position="5"> The major drawback of the current implementation is the lack of an efficient indexing scheme for objects. Since the dictionaries are accessed using unification only, each entry is tried one after the other, leading to extremely inefficient behavior with large dictionaries. However, we think that a general indexing scheme based on a combination of methods used in PROLOG implementations and in object-oriented database systems is feasible.</Paragraph> </Section> </Paper>