File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/90/c90-2029_metho.xml
Size: 20,272 bytes
Last Modified: 2025-10-06 14:12:24
<?xml version="1.0" standalone="yes"?> <Paper uid="C90-2029"> <Title>Constraining Tree Adjoining Grammars by Unification</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 The Two Basic Formalisms TAG and PATR </SectionTitle> <Paragraph position="0"> In this section, the TAG formalism is motivated as an appropriate replacement for a context-free grammar in natural language description. The disadvantages that remain for TAGs are the same as for CFGs, so the same extension that is used for CFGs, Unification Grammars, has been chosen to resolve them for TAGs.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 A Short Outline of Tree Adjoining Grammars </SectionTitle> <Paragraph position="0"> In 1975, the formalism of Tree Adjoining Grammars (TAGs) was introduced by Aravind K. Joshi, Leon S.</Paragraph> <Paragraph position="1"> Levy and Masako Takahashi (see [Joshi et al. 75]).</Paragraph> <Paragraph position="2"> Since then, a wide variety of properties, formal as well as linguistically relevant ones, have been studied (see, e.g., [Joshi 85] for a good overview).</Paragraph> <Paragraph position="3"> The following example, which describes the crossed dependencies in Dutch, illustrates the formalism (see Figure 1; the node numbers written in slanted font should be ignored here, as they only make sense in combination with Figure 3). A TAG is a tree generation system. It consists of two different sets of trees, which can be combined. Intuitively, the set of initial trees can be seen as context-free derivation trees: the start symbol is the root node, all inner nodes are nonterminals and all leaves are terminals (e.g., tree α in Figure 1). 
The second set, the auxiliary trees, which can replace a node in an initial tree (possibly already modified by earlier adjoinings) during the process of adjoining, must have a form such that a derivation tree results again. The trees β1 and β2 demonstrate that restriction. A special leaf (the foot node) must exist, labelled with the same nonterminal as the root node. Further, an auxiliary tree must derive at least one terminal. The union of the initial and the auxiliary trees, so to speak the set of rules of a TAG, is called the set of elementary trees.</Paragraph> <Paragraph position="5"> [...] is an initial tree with an arbitrary number of adjoinings (here β1 is adjoined at the node S* in α and β2 in ... the process of adjoining). The most obvious property of TAG rules (elementary trees), which arises from their close relation to context-free derivation trees, with which linguists are familiar, is that such rules are easy to write and understand. The advantage over a context-free grammar producing the derivation trees is that related facts can be described in one rule. E.g., in Figure 1, each tree contains exactly the dependent pieces without any further processing (for more examples see, e.g., [Kroch, Joshi 85]).</Paragraph> <Paragraph position="6"> Given the close relation between TAGs and CFGs, one might think that both formalisms are equivalent. But TAGs are more powerful. How powerful a linguistic formalism should be is a matter of controversy in the linguistic community (see, e.g., [Pullum 84] or [Shieber 85]). TAGs are mildly context-sensitive, which means that they can describe some context-sensitive languages, but not all (e.g., not www with w ∈ {a,b}*, whereas ww is acceptable for a TAG). There is the thesis that natural language can be described very well by a mildly context-sensitive formalism. 
But this can only be confirmed empirically by describing difficult linguistic phenomena (here, the example in Figure 1 can only give an idea of the appropriateness of TAGs for natural language description).</Paragraph> <Paragraph position="7"> If an efficient implementation of a parser for TAGs is desired (e.g., in a natural language access system to an expert system), the existence of polynomial time acceptors for the word problem of TAGs becomes relevant (upper time bound O(n^4 log n), see [Harbusch 89]). On the basis of this efficient algorithm, the new definition has been implemented, as mentioned later in the summary.</Paragraph> <Paragraph position="8"> After this short impression of some advantages of the TAG formalism, the disadvantages should now be tackled. The main one, which has its roots in the close relation to context-free grammars, is the same problem with subcategorisation: further information encoded in the category name leads to a combinatorial explosion of the grammar. In the framework of CFGs, this disadvantage is removed by defining a Unification Grammar extending a context-free grammar.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 PATR Unification Briefly Revisited </SectionTitle> <Paragraph position="0"> A Unification Grammar U (briefly called UG or PATR grammar) consists of a CFG G, where each rule is extended by a possibly empty set of specification rules (for a good introduction to Unification Grammars see, e.g., [Shieber 87]). Such a rule consists of two paths which are unified. A path consists of a number uniquely referring to a constituent in the context-free rule together with a list of feature names and/or an atomic value.</Paragraph> <Paragraph position="1"> A pair (context-free rule, list of specification rules) is called a unification rule. 
E.g., ((S NP VP) (((0 fset)(1 fset)) ((1 fset syntax)(2 fset syntax)) ((2 fset syntax verbform) active))) is such a rule. Another representation of the specification rules is a DAG (directed acyclic graph). Figure 2 shows this representation for the example rule above. It is built by representing the numbers, feature names and values as nodes, which are connected in the way they are put together in the string representation. Common prefixes are represented only once.</Paragraph> <Paragraph position="2"> The process of recursion, called unification, is, loosely speaking, defined as a union operation on all specification rules according to a context-free derivation. More formally, the result of the unification of two DAGs x and y (UNIFY(x,y)) is defined inductively as a new DAG z, where (i) z = x, if x = y; (ii) z = x, if x is atomic and y is empty, and vice versa z = y, if y is atomic and x is empty; (iii) if neither x nor y is atomic, then ∀ features l such that [l,u] ∈ x and [l,v] ∈ y: [l,UNIFY(u,v)] ∈ z, and ∀ features l such that [l,w] ∈ (u ∪ v) - (u ∩ v): [l,w] ∈ z.</Paragraph> <Paragraph position="3"> It is easy to see that this extension of the context-free formalism allows us to introduce various kinds of additional information by specification rules. The main problem of the formalism is that it has Turing capacity, and with it all the disadvantages inherited from this power. Informally, this paper shows that the combination with TAGs restricts the power of unification.</Paragraph> </Section> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 The Two Different Definitions of TAGs with Unification </SectionTitle> <Paragraph position="0"> Although Turing capacity is not a desirable property, unification seems to be a good extension for TAGs as well. 
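The inductive definition of UNIFY given in section 2.2 can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: a non-atomic DAG is encoded as a dict from feature names to sub-DAGs, the empty DAG is the empty dict, an atomic value is a string, and re-entrancy (sub-DAGs shared between paths) is ignored. The names `unify` and `UnificationFailure` are hypothetical.

```python
class UnificationFailure(Exception):
    """Raised when two DAGs carry conflicting information."""


def unify(x, y):
    """Unify two feature structures along the three cases of section 2.2.

    Assumed encoding (not from the paper): dict = non-atomic DAG,
    {} = empty DAG, str = atomic value. Shared sub-DAGs are not modelled.
    """
    if x == y:                      # case (i): identical DAGs
        return x
    if isinstance(x, str):          # case (ii): atomic x, empty y
        if y == {}:
            return x
        raise UnificationFailure(f"{x!r} clashes with {y!r}")
    if isinstance(y, str):          # case (ii), vice versa
        if x == {}:
            return y
        raise UnificationFailure(f"{y!r} clashes with {x!r}")
    # case (iii): neither x nor y is atomic
    z = {}
    for feature in x.keys() | y.keys():
        if feature in x and feature in y:
            # feature present on both sides: unify the sub-DAGs
            z[feature] = unify(x[feature], y[feature])
        else:
            # feature present on one side only: copy it
            z[feature] = x.get(feature, y.get(feature))
    return z


# Usage with feature names borrowed from the example rule.
left = {"syntax": {"verbform": "active"}}
right = {"syntax": {"pers": "3"}}
merged = unify(left, right)
```

Here `merged` carries both `verbform` and `pers` under `syntax`, whereas unifying two different atomic values for the same path raises `UnificationFailure`.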
In this section, two different definitions for TAGs with Unification (UTAGs) are presented in a common terminology to simplify their comparison.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 The Definition of the Grammar for TAGs with Unification </SectionTitle> <Paragraph position="0"> Specification rules are defined in the same way as for a context-free rule, where the relation between the rule and its specifications is represented by unique numbers referring to the different constituents in the rule. Here, for an initial or an auxiliary tree, these specifications are defined between father and sons or between brothers via node numbers that are unique across all trees. The trees α, β1 and β2 in Figure 1 (now interpreting the unique node numbers of the elementary trees), together with the corresponding specifications in Figure 3, describe an example TAG with Unification, which propagates some syntactic and semantic information from the lexical items to the root node of the different subsentences. Note that the unique node numbers are also helpful for identifying the individual adjoined trees in the derivation tree. To prevent the ambiguity resulting from adjoining the same tree more than once, the node number of the eliminated node is taken as a prefix for all new nodes.</Paragraph> <Paragraph position="1"> ((synt_role i_d_obj) Marie)((sem_role recipient) Marie)))) ... 
same information for Jan and Piet ...</Paragraph> <Paragraph position="2"> (zwemmen ((V ((syntax verbform) inf) ((synt_role verb) zwemmen) ((synt_role subject)) ((synt_role d_obj) NONE) ((synt_role i_d_obj) NONE) ((sem_role action) zwemmen)))) (laten ((V ((syntax verbform) inf) ((synt_role verb) laten) ((synt_role d_obj) NONE) ((synt_role i_d_obj) NONE) ((sem_role action) laten)))) (zag ((V ((syntax verbform) fin) ((syntax pers) 3) ((syntax num) sing) ((synt_role verb) zagen) ((synt_role d_obj) NONE) ((synt_role i_d_obj) NONE) ((sem_role action) zagen)))) To get an impression of what that grammar does, one can read all relations between father and sons in all initial and auxiliary trees as context-free rules annotated with the corresponding specification rules.</Paragraph> <Paragraph position="3"> To realize this partial interpretation of specification rules, the following sets are defined for each node x: ↑x := the set of all specification rules in which x is the father or a brother of the other node mentioned in that rule, ↓x := the set of all specification rules in which x is a son of the other node mentioned in that rule, and ◊x := the set of all specification rules in which a value of x is defined.</Paragraph> <Paragraph position="4"> It is easy to see that these sets can be computed automatically for each node. E.g., for the node 01 in α, ↑01 := {((00 fset)(01 fset)), ((01 fset)(02 fset))}, ↓01 := {((01 fset)(012 fset))}, ◊01 := ∅. Vijay-Shanker and Joshi prefer to write the grammar in the ↑ (TOP, or briefly t) and ↓ (BOTTOM, b) terminology, which allows slight differences in expressiveness, e.g., a non-empty TOP set of the root node of an auxiliary tree can be specified.</Paragraph> <Paragraph position="5"> The Unification Grammar, consisting of all rules built by interpreting each father and all its sons in all elementary trees as context-free rules, together with all TOP sets, can be interpreted as described in section 2.2. 
But it is important to note that the two resulting grammars are not equivalent, because the context-free grammar does not require the derivation of the whole tree if one context-free rule out of it is in use.</Paragraph> <Paragraph position="6"> Considering this simple and intuitive interpretation of a UTAG, the problem of defining adjoining and unification directly for such a grammar becomes obvious. Only the relations between father and sons or brothers are interpreted directly, although there are links defined over a whole tree. This is exactly the point where the two definitions differ.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 The Definition of Vijay-Shanker and Joshi </SectionTitle> <Paragraph position="0"> The definition of Vijay-Shanker and Joshi separates this local information (for a description of their approach see, e.g., [Vijay-Shanker 87]). It is reminiscent of the interpretation of attributes after computing the context-free derivation tree for an Attribute Grammar (see, e.g., [Aho et al. 86]).</Paragraph> <Paragraph position="1"> In their approach, the ↑ and ↓ sets remain isolated until all adjoinings are made. With this strategy it is clear that unification cannot be used during derivation to discard structure trees that are unificationally ill-formed.</Paragraph> <Paragraph position="2"> More formally, the adjoining is defined as described in Figure 4 (in order to refer uniquely to the ↑ and ↓ sets at each node X, t and b with different bar levels are used). After all adjoinings have been made, all ↑ and ↓ sets at each node are unified.</Paragraph> <Paragraph position="3"> The disadvantage of this sequential interpretation of adjoining and unification can be demonstrated by the example grammar. In the lexicon, all names, &quot;Marie&quot;, &quot;Piet&quot; and &quot;Jan&quot;, have three different cases (nominative, dative and accusative). 
Therefore, three structurally equivalent trees are produced for each of the trees α, β1 and β2.</Paragraph> <Paragraph position="4"> These are structurally combined by adjoining. Out of this collection, unification selects one correct reading by means of the specification rule that demands a subject. But this is checked after building nine different</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Vijay-Shanker and Joshi </SectionTitle> <Paragraph position="0"> One way to handle that problem is to use unification with disjunction to reduce the set of structurally equivalent trees. But this does not tackle the problem fundamentally, because it cannot reduce ambiguities that can only be eliminated by interpreting the specification rules at once. This integration is realized in our new approach. The two approaches are abbreviated as SUTAG for the more sequential approach of Vijay-Shanker and Joshi, and IUTAG for the more integrated approach, which is presented now.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 An Integrated Definition for TAG and Unification </SectionTitle> <Paragraph position="0"> The basis for our definition is a UTAG given in the notation described in section 3.1. For each elementary tree, a set of specification rules is defined, which can be interpreted as unified DAGs over the whole tree. E.g., the path &quot;fset sem_role action&quot; of node 01 has the value &quot;zwemmen&quot;.</Paragraph> <Paragraph position="1"> If an adjoining is to happen at a node X, the sets ↑X, ↓X and ◊X are computed for this node, because this node, and with it all links from its DAG to other DAGs, is replaced by an auxiliary tree. Structurally, the adjoining looks like the original one for TAGs (and the same as for SUTAGs). 
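The purely structural part of adjoining, which plain TAGs, SUTAGs and IUTAGs share, can be sketched as follows. This is a minimal illustration under assumptions not taken from the paper: the `Node` class, the `is_foot` flag and the function names are hypothetical, the toy trees are not those of Figure 1, the auxiliary tree is consumed rather than copied, and the treatment of the ↑, ↓ and ◊ sets is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    is_foot: bool = False  # marks the foot node of an auxiliary tree


def find_foot(node):
    """Return the foot node below (or at) node, or None if there is none."""
    if node.is_foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(target, aux_root):
    """Adjoin the auxiliary tree rooted at aux_root at node target, in place."""
    foot = find_foot(aux_root)
    if foot is None or foot.label != aux_root.label or aux_root.label != target.label:
        raise ValueError("root, foot and adjoining node must share one label")
    # The subtree formerly below the target now hangs below the foot node.
    foot.children = target.children
    foot.is_foot = False
    # The target node takes over the structure of the auxiliary root.
    target.children = aux_root.children


def leaves(node):
    """Return the labels of the leaves from left to right."""
    if not node.children:
        return [node.label]
    return [leaf for child in node.children for leaf in leaves(child)]


# Toy example: initial tree S(a), auxiliary tree S(b, S*, c).
initial = Node("S", [Node("a")])
auxiliary = Node("S", [Node("b"), Node("S", is_foot=True), Node("c")])
adjoin(initial, auxiliary)
print(leaves(initial))  # ['b', 'a', 'c']
```

The material of the auxiliary tree ends up on both sides of the original subtree, which is the structural effect that the definitions below combine with unification.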
In the case of an adjoining, a node in a (possibly modified) initial tree is to be replaced by a whole auxiliary tree β, and all links that node had have to be modified by the information of β.</Paragraph> <Paragraph position="2"> One can imagine the new linked DAGs all over the adjoined tree as a filter for the formerly propagated information. E.g., information passing a node X where an adjoining will take place need not be supported by the path from the root to the foot node of the adjoined auxiliary tree. In that case the propagation is stopped somewhere in the tree.</Paragraph> <Paragraph position="3"> More formally, the definition of adjoining can be given as described in Figure 5. The DAG of the node at which the adjoining will take place is represented by its ↑ and ↓ sets.</Paragraph> <Paragraph position="4"> Here, r and f stand for the whole DAG of the root and foot node of the auxiliary tree that will be adjoined. But it is obvious that the ↑ set of r is empty, as is the ↓ set of f. This is clear because there exists no father of the root and no son of the foot node where these links can end.</Paragraph> <Paragraph position="5"> Using the same terminology for the auxiliary tree as in Figure 4 (r is separated into t' and b', and f into t'' and b''), one can equally write &quot;UNIFY(t,b')&quot; instead of &quot;UNIFY(t,r)&quot; and &quot;UNIFY(b,t'')&quot; instead of &quot;UNIFY(b,f)&quot;. What becomes obvious is that in the resulting tree each node has a DAG, which is connected with the DAGs of its neighbors in the tree. 
This was the aim of the new definition: always to produce linked DAGs all over the derivation tree to test for failure of ... of the value definition of the eliminated node.</Paragraph> <Paragraph position="6"> Beginning with the elimination of links, the separation into ↑ and ↓ at a node X where an adjoining will take place means that all information propagated along the node X is eliminated (e.g., one can imagine a reason maintenance system keeping track of that task, so that unification is no longer commutative and independent of the time at which information is introduced). E.g., information from a leaf is propagated to the root of the whole tree. This propagation can be interrupted, modified or left untouched if an adjoining takes place. Intuitively, the adjoined tree can be imagined as a filter for the propagation. Some extra computation must be done for the reintroduction of the value information (◊X), in the following way. Since the node in which a value is defined will be eliminated in the case of an adjoining, the question arises what should happen to the value definition. Here, it was decided to find a point for the reintroduction of that information. The first idea would be to add it, by definition, to the root node. But then a fail can be produced in the case that the adjoined tree adds parts of the path to reach that value (e.g., if the value definition in node X with the node number 01 is ((01 fset) val) and the specification rule ((10 fset next) (11 fset)) exists in the root node (10) of the adjoined tree, a fail is produced because fset has a feature and a value as successor at the same time). 
To allow this property to be interpreted without failure, which is desirable for the idea of defining a filter via adjoining, such parts of paths in the adjoined tree are computed to find the maximal extension of the path behind which the value can be added without producing a fail.</Paragraph> <Paragraph position="7"> In this process, called computation of the inheritance history, all maximal prefixes of paths in the adjoined auxiliary tree are computed. Out of this set, those candidates are chosen which have the value definition as prefix p. Behind these maximal paths the value is reintroduced. Because this selection does not always result in a unique path at exactly one node (e.g., if root and foot node add two different feature names behind p, but no propagation occurs between the two), the value is by definition reintroduced at the end of both paths. It is clear that the computation of the inheritance history can be done once for all paths in all auxiliary trees, so that this part of the definition does not extend the execution time very much. The most elaborate work has to be done in reconstructing the correct links all over the derivation tree after an adjoining. In the worst case, changes of propagation all over the derivation tree are required.</Paragraph> <Paragraph position="8"> To give an idea of how this definition works, the changes during the adjoining of β1 and β2 in α are represented in Figure 6. Here only the nominative reading is in use, because the lexical readings with dative and accusative case produce a fail in unification with the valency description of the verbs. 
Concentrating on the feature path &quot;fset sem_role&quot; (note that we do not claim that this is a serious semantics of the sentence!), first the meaning of &quot;Marie zwemmen&quot; is produced (action is &quot;zwemmen&quot; and actor is &quot;Marie&quot;), which is modified during the adjoining by &quot;Piet laten&quot; and &quot;Jan zag&quot;.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.4 The Comparison of the Properties of </SectionTitle> <Paragraph position="0"/> </Section> </Section> </Paper>