<?xml version="1.0" standalone="yes"?> <Paper uid="C86-1046"> <Title>DEPENDENCY UNIFICATION GRAMMAR</Title> <Section position="8" start_page="196" end_page="198" type="concl"> <SectionTitle> 6. Slots </SectionTitle>
<Paragraph position="0"> The notion of dependency is closely related to the idea of intrinsic combination capabilities of the lexical elements. This capability is traditionally referred to as valency, although this view has often been restricted to verbs. DUG generalizes this lexicalistic approach with respect to all syntagmatic relationships. Syntax is completely integrated in the lexicon. The natural way to state valencies is by assigning slots to possibly dominating terms. A slot is a template of the list that would be an appropriate complement. As a rule, only the head of this list has to be described, because the head-feature convention (as known from GPSG) is a general principle in dependency representation. The following is a description of the valency of LIKES:

(7) (*: like: verb fin<1> num<1> per<3> (SUBJECT: _ : noun num<1> per<3> adj<1>) (OBJECT: _ : noun seq<2>));

Slots are the places where roles are introduced into the formalism. As a matter of fact, it is the task of roles to differentiate complements. The lexematic character of the complements is usually unrestricted and, therefore, represented by a variable. Morpho-syntactic categories express the formal requirements, including positional attributes, that the filler must meet.</Paragraph>
<Paragraph position="1"> A direct assignment of slots to a specific lexical item is good policy only in the case of idiosyncratic complements. Complements such as subject and object that are shared by many other verbs should be described in a more general way. The solution is to draw up completion patterns once and to refer to those patterns from the various individual lexemes. A separate pattern should be set up for each syntagmatic relationship. For example:

(8) (*: +subject: verb fin<1> (SUBJECT: _ : noun num<C> per<C> adj<1>));
(9) (*: +object (OBJECT: _ : noun seq<2>));

The following entries in the valency lexicon illustrate references to these patterns:

(10) (: -> (*: squeak) (: +subject));
(11) (: -> (*: like) (& (: +subject) (: +object)));

In the case of LIKES the effect of (11) is identical to (7).</Paragraph>
<Paragraph position="2"> Certain provisions allow for a maximal generality of patterns. The symbol &quot;C&quot; as subcategory value in (8) indicates that the respective values of a potential filler and the head of the list must match, whatever these values may be in the concrete case. Hence, pattern (8) covers subjects with any number and person features and, at the same time, controls their agreement with the dominating predicate. Morphological features in the head term restrict the applicability of the pattern. In the case of (8) the dominating verb must be finite (fin<1>), because it cannot have a subject as complement in the infinitive. The object pattern, on the contrary, is applicable without restrictions.</Paragraph>
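To make the completion-pattern mechanism of (7)-(11) concrete, here is a minimal sketch in Python (not the original DRL notation or the PLAIN system; the data layout and function names are invented for illustration). Patterns and valency references are plain data, and a lexeme's references are expanded into its slot list:

```python
# Completion patterns in the spirit of (8) and (9): head requirements
# plus the slots they introduce. "C" marks a value that must agree with
# the dominating head; "_" leaves the filler's lexeme unrestricted.
PATTERNS = {
    "+subject": {
        "head": {"cat": "verb", "fin": "1"},
        "slots": [{"role": "SUBJECT", "lexeme": "_",
                   "cat": "noun", "num": "C", "per": "C", "adj": "1"}],
    },
    "+object": {
        "head": {},
        "slots": [{"role": "OBJECT", "lexeme": "_",
                   "cat": "noun", "seq": "2"}],
    },
}

# Valency references in the spirit of (10) and (11): each lexeme simply
# points at the patterns it shares with other lexemes.
VALENCY = {
    "squeak": ["+subject"],
    "like": ["+subject", "+object"],
}

def slots_for(lexeme, head_features):
    """Expand a lexeme's pattern references into its slot list, skipping
    patterns whose head requirements the given word form does not meet."""
    slots = []
    for name in VALENCY.get(lexeme, []):
        pattern = PATTERNS[name]
        if all(head_features.get(k) == v for k, v in pattern["head"].items()):
            slots.extend(pattern["slots"])
    return slots

# For a finite form of "like" this yields the same two slots as (7);
# for an infinitive (fin is not 1) the +subject pattern is not applicable.
print(slots_for("like", {"cat": "verb", "fin": "1", "num": "1", "per": "3"}))
```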
<Paragraph position="3"> An analogy to feature disjunction on the paradigmatic level is slot disjunction on the syntagmatic level. It is the means to formalize syntactic alternatives. The following improved patterns for subjects and objects include slots for relative pronouns in their appropriate leftmost position:

(12) (*: +subject: verb fin<1> per<3> (, (SUBJECT: _ : pron rel<1,C> lim<1>) (SUBJECT: _ : noun num<C> per<C> adj<1>)));
(13) (*: +object (, (OBJECT: _ : pron rel<1,C> lim<1>) (OBJECT: _ : noun seq<2>)));

(12) provides for &quot;that likes fish&quot; and (13) for &quot;that the cat chased&quot; in Pereira's example. The feature &quot;rel<1>&quot;, which is intrinsic to the relative pronoun, is to be passed on to the dominating verb, as is indicated by &quot;C&quot;. This is the prerequisite to identifying the verb as the head of a relative clause. The pattern for the relative clause could look like this:

(14) (*: +relative clause: noun (ATTRIBUTE: _ : verb rel<1> fin<1> adj<2>));

The following patterns and references complete the small grammar that is needed for Pereira's sentence.

(15) (*: +determiner: noun (DETERMINER: _ : dete seq<1>));
(16) (: -> (*: mouse) (& (: +determiner) (: +relative clause)));
(17) (: -> (*: cat) (& (: +determiner) (: +relative clause)));
(18) (ILLOCUTION: assertion: clse typ<1> (PREDICATE: _ : verb fin<1> adj<1>));

Completion patterns capture the same syntactic regularities as rules in other formalisms. The peculiarity of DUG is that it breaks down the complex syntax of a language into many atomic syntactic relationships. This has several advantages. Valency descriptions are relatively easy to draw up. They are to a great extent independent of each other, so that changes and additions normally have no side effects. Although the grammar is wholly integrated in the lexicon, the structure of lexical entries is rather simple. Any new combination of complements which may be encountered is simply a matter of lexical reference, while in rule-based grammars a new rule has to be created whose application subsequently has to be controlled.

7. Parsing by Unification

In logic, unification is defined as a coherent replacement of symbols within two formulas so that both formulas become identical. The same principle can be applied advantageously in grammar. The basis of the mechanism is the notion of subsumption. There are two occurrences of subsumption in DRL. Firstly, attribute symbols subsume all of the appertaining values. For example, a role variable covers any role, and a morpho-syntactic subcategory covers any element of the defined set of features. Secondly, structure descriptions subsume structures. DRL comprises variables which refer to various substructures of trees. In the present context we consider only direct subordination of slots.</Paragraph>
<Paragraph position="4"> It must be the strategy of the grammar writer to keep any single description as abstract as possible so that it covers a maximum number of cases. In the course of the analysis, the unification of expressions leads to the replacement of the more general by the more specific. As opposed to simple pattern matching techniques, replacements of the symbols of two expressions occur in both directions. Continued unification in the syntagmatic framework leads to an incremental precision of the attributes of all of the constituents.</Paragraph>
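The replacement of the more general by the more specific can be illustrated with a small unification routine over flat attribute-value terms. This is a deliberate simplification assumed for the example: real DRL terms also carry value disjunctions such as rel<1,C>, the agreement marker C and positional features.

```python
def unify(term_a, term_b):
    """Unify two flat attribute-value terms. An attribute left unspecified
    by one term subsumes any value supplied by the other; conflicting
    values make unification fail; otherwise the result is the more
    specific combination of both terms."""
    result = dict(term_a)
    for attr, value in term_b.items():
        if attr not in result:
            result[attr] = value        # term_a subsumed any value here
        elif result[attr] != value:
            return None                 # incompatible specifications
    return result

# A slot description unified with a concrete candidate filler:
slot = {"cat": "noun", "adj": "1"}                    # general
filler = {"cat": "noun", "lex": "mouse", "num": "1"}  # specific
print(unify(slot, filler))            # both terms become more specific
print(unify(slot, {"cat": "verb"}))   # None: the filler is not a noun
```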
<Paragraph position="5"> A prerequisite for a unification-based parser is the control of the expressions which are to be unified. The control structure depends on the grammar theory which is at the basis. The PLAIN parser runs through three phases: (i) the consultation of the base lexicon, yielding a lexeme and a morpho-syntactic characterization for each basic segment in the utterance, (ii) the consultation of the valency lexicon, yielding a description of the combination capabilities of the basic terms, (iii) a reconstruction of the syntactic relationships in the utterance by a bottom-up slot-filling mechanism. Throughout the whole process previous expressions are unified with subsequent ones.</Paragraph>
<Paragraph position="6"> Let us first consider the lexicon phases. The word forms in the utterance are taken as the starting points. According to the base lexicon, they are replaced by terms which show the identity and the divergence of their attributes. With respect to identity, this is a step similar to unification. Compare, for example, the terms associated with the word forms of &quot;to like&quot; in (6), which share the role, the lexeme and the word class properties. With respect to divergence, the base lexicon contains just as many features as can be attributed to a word form out of the syntactic context. The valency lexicon, on the other hand, abstracts just from those features which are not distinctive for a specific syntactic relationship. The parser combines the information from both lexica by means of unification. At first, the terms derived from the base lexicon are unified with the left-hand side of the valency references. The resulting specification is transferred to all terms on the right-hand side of the reference. Each of these terms, in turn, is unified with the heads of the completion patterns. The specifications produced in the course of these operations are brought into agreement with the original terms and, eventually, the appropriate slots are subordinated to these terms. Once the initial lists are produced, comprising the combined information from both lexica, the detection of the syntactic structure of the utterance is a fairly simple process. Each of the lists, starting with the leftmost one, tries to find a slot in another list. If a searching list can be unified with the attributes in a slot, a new list is formed which comprises both lists as well as the result of their mutual specifications. The new list is stored at the end of the line and, when it is its turn, looks for a slot itself. This process continues until no more lists are produced and no slots are untried. Those lists that comprise exactly one term for each input segment are the final parsing results.</Paragraph>
<Paragraph position="7"> I would like to stress a few properties that this parsing algorithm owes to DUG. Similar to unification in theorem proving, the process relies completely on the unification of potential representations of parts of the utterance. No reference to external resources, such as rules, taints the mechanism. The control is thus extremely data-directed. On the other hand, the unification of DRL lists is an instrument with an immense combinatorial power.</Paragraph>
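The bottom-up slot-filling mechanism of phase (iii) might be sketched as follows. This is a schematic rendering only, not the PLAIN parser itself: a list is reduced here to the input positions it covers, a head category and its open slots, and the unification of a searching list with a slot description is reduced to a simple category test.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ParseList:
    head_cat: str            # category of the list's head term
    segments: frozenset      # input positions covered by this list
    open_slots: tuple = ()   # categories the still unfilled slots accept

def try_fill(seeker, host):
    """If the seeker's head matches one of the host's open slots and the
    two lists cover disjoint segments, return the combined list."""
    for i, wanted in enumerate(host.open_slots):
        if seeker.head_cat == wanted and not (seeker.segments & host.segments):
            remaining = host.open_slots[:i] + host.open_slots[i + 1:]
            return ParseList(host.head_cat,
                             seeker.segments | host.segments,
                             remaining)
    return None

def parse(initial_lists, n_segments):
    line = list(initial_lists)       # the "line" of candidate lists
    i = 0
    while i < len(line):             # each list, in its turn, seeks a slot
        seeker = line[i]
        for host in list(line):
            if host is seeker:
                continue
            combined = try_fill(seeker, host)
            if combined is not None and combined not in line:
                line.append(combined)   # new lists go to the end of the line
        i += 1
    # final results: lists comprising exactly one term per input segment
    return [l for l in line if len(l.segments) == n_segments]

# e.g. two initial lists for "mice squeak": the verb opens a subject-like slot
noun = ParseList("noun", frozenset({0}))
verb = ParseList("verb", frozenset({1}), ("noun",))
print(parse([noun, verb], 2))
```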
<Paragraph position="8"> Within any term the agreement of function, lexical selection and morpho-syntactic features is forced. In addition to this horizontal linkage, the attributes of the dominating term as well as the attributes of the dependent terms are also subject to unification. The attributes of dependent terms are delineated by the valency description. According to congruence conditions, heads and dependents continue to mutually specify each other. Feature unification and slot disjunction also restrict the co-occurrence of dependents. In addition, positional features are continuously made to tally with the corresponding sequence of segments in the input string. This network of relationships prevents the parser from producing inappropriate lists. At the same time it results in incremental specification, which facilitates the work of the lexicon writer. What may be theoretically the most interesting is the fact that functional, lexical, morphological and positional features can be processed smoothly in parallel.</Paragraph> </Section> </Paper>