<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1046">
  <Title>DEPENDENCY UNIFICATION GRAMMAR</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. Dependency Representation Language (DRL)
</SectionTitle>
    <Paragraph position="0"> Grammar formalisms and computer languages are usually developed independently. DRL is both at the same time. In the same spirit as PROLOG is tailor-made for the purposes of logic, DRL has been particularly adapted to represent linguistic structures. Whereas the interpreter for PROLOG includes a theorem prover, the interpreter for DRL is linked with a parser. (DRL also serves for the purpose of knowledge representation within the deduction component of PLAIN. This aspect will not be discussed here.) DRL consists of bracketed expressions which are lists in the sense of list processing. Conceptually, they represent tree diagrams with nodes and directed arcs. It is the characteristic feature of DRL that each node refers to a lexically defined atomic unit of an utterance and that the arcs represent direct relationships between these atomic units. According to the hierarchical structure of tree diagrams, one element in each relationship is dominant, the other one is dependent. Dependency grammar assumes that this asymmetry reflects the actual situation in natural language.</Paragraph>
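A minimal sketch (ours, not DRL syntax) of the idea above: a dependency tree as nested tuples, where each node is an atomic lexical unit and nesting encodes the dominant/dependent asymmetry of the directed arcs.

```python
# Illustrative only: each node is (lexical_unit, [dependent subtrees]);
# the outer element of each pair of nested nodes is the dominant one.

def yield_of(node):
    """Collect the lexical units governed by `node`, including the node itself."""
    unit, dependents = node
    units = [unit]
    for dep in dependents:
        units.extend(yield_of(dep))
    return units

# "The cat likes fish.": "likes" dominates "cat" and "fish";
# "the" depends on "cat".
tree = ("likes", [("cat", [("the", [])]), ("fish", [])])
```

Conceptually this is just the bracketed-list reading of a DRL expression: the list structure and the tree diagram are two views of the same directed relationships.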
    <Paragraph position="1"> Asymmetries between constituents are commonly conceded in modern grammar theory. It seems certain that only via the head-complement distinction can adequate constraints for the construction of natural language expressions be defined. Unfortunately, phrase structure, which prevails in most grammar formalisms, is at odds with the direct asymmetric relations between immediate constituents. A logical consequence would be to choose dependency as the primary principle of representing syntactic structure (see the arguments in Hudson 1984). Nevertheless, this proposal still encounters quite a bit of scepticism.</Paragraph>
    <Paragraph position="2"> The implementation of an efficient parser (see Hellwig 1980) has proven the practicability of the dependency approach. However, the formalism for dependency grammars has had to be substantially augmented.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="195" type="metho">
    <SectionTitle>
3. Factorization of Grammatical Information
</SectionTitle>
    <Paragraph position="0"> When designing a computer language that is to serve as a grammatical formalism, it is crucial to provide for a factorization of information that is at the same time convenient and adequate. I have stressed that DRL terms are in a one-to-one relationship with the basic elements of a natural language. Since the features of these elements are numerous and varied, every DRL term must be multi-labeled. As is common in unification grammars, each feature is coded as an attribute-value pair. The attribute states the feature type, the values represent the concrete features. The division into attributes and values allows for very general descriptions, since relationships can now be formulated on the level of the attributes, no matter which values apply in the individual cases.</Paragraph>
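The attribute-value mechanism described above can be sketched as a small unification routine, as is common in unification grammars (the flat-dict encoding is our simplification):

```python
# A hedged sketch of feature unification over attribute-value pairs.
# Generalization lives at the attribute level: the routine works the
# same way no matter which concrete values the attributes carry.

def unify(a, b):
    """Unify two flat feature structures (dicts); return None on a value clash."""
    result = dict(a)
    for attr, value in b.items():
        if attr in result and result[attr] != value:
            return None  # conflicting values for the same attribute
        result[attr] = value
    return result
```

For example, `unify({"num": 1}, {"per": 3})` succeeds by merging the two descriptions, while `unify({"per": 1}, {"per": 3})` fails because the same attribute carries incompatible values.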
    <Paragraph position="1"> A complex category consists of any number of attributes or attribute-value assignments.</Paragraph>
    <Paragraph position="2"> Faced with the unlimited expressiveness of complex categories, the key issue now is to carefully select and group the attributes in such a way that the linguistic phenomena are represented as adequately and transparently as possible. DUG assumes that a distinction must be made among three dimensions in which each element of an utterance participates: lexical meaning, syntagmatic function and outward form. Correspondingly, three types of attributes are grouped together in each DRL-term: a lexeme, a syntagmatic role and a complex morpho-syntactic category. To give an example:  (1) The cat likes fish.</Paragraph>
    <Paragraph position="3"> This sentence is represented in DRL as follows, disregarding positional attributes for the moment:
(2) (ILLOCUTION: assertion: clse typ&lt;1&gt;
     (PREDICATE: like: verb fin&lt;1&gt; num&lt;1&gt; per&lt;3&gt;
      (SUBJECT: cat: noun num&lt;1&gt; per&lt;3&gt;
       (DETERMINER: the: dete))
      (OBJECT: fish: noun)));
We cannot avoid going into a few notational details. Each term, printed on a separate line, corresponds to a word in (1). The first term is correlated to the period, which is also treated as a word. The parentheses depict the dependency structure. The first attribute in each term is the role, the second the lexeme. Both are identified by position, i.e. their values are simply written at the first and second position in the term. Roles and lexemes constitute the semantic representation. They are more or less equivalent to f-structures in LFG. The third part of each term contains a description of the surface properties of the corresponding segments in the utterance. It consists of a main category, generally a word class such as verb, noun, determiner, followed by a sequence of attribute-value subcategories which represent grammatical features such as finiteness, number, person. The format of subcategories is standardized in order to facilitate processing. Attributes are symbolized by three-character key words, values are coded as numbers in angled brackets.</Paragraph>
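The three-dimensional structure of a DRL term (role, lexeme, morpho-syntactic category) can be rendered as a small data type; the Python encoding and field names are ours, chosen for illustration:

```python
from dataclasses import dataclass, field

# Illustrative only: one term per word. Role and lexeme are identified
# by position in DRL; here they are the first two fields. The category
# is a main class (word class) plus attribute-value subcategories.

@dataclass
class Term:
    role: str                 # syntagmatic function, e.g. "SUBJECT"
    lexeme: str               # lexical meaning, e.g. "cat"
    category: str             # main category, e.g. "noun"
    features: dict = field(default_factory=dict)   # e.g. {"num": 1, "per": 3}
    dependents: list = field(default_factory=list)

# The SUBJECT subtree of example (2):
subject = Term("SUBJECT", "cat", "noun", {"num": 1, "per": 3},
               [Term("DETERMINER", "the", "dete")])
```

Because all three dimensions coincide in a single term, functional and surface information stay synchronized node by node, which is the point the paragraph above makes about f-structure and c-structure.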
    <Paragraph position="4"> The salient point of this formalism is that the functional, the lexematic and the morpho-syntactic properties coincide in every term, as they do in the elements of natural language. To put it in the terminology of LFG: f-structure and c-structure are totally synchronized. Since this cannot be achieved in a phrase structure representation, it is often assumed that there is a fundamental divergence between form and function in natural language. Admittedly, one prerequisite for a uniform function-form correspondence still has to be mentioned. Since non-terminal constituents are not basic, they are usually not represented by terms in DRL. However, there must be something to denote the suprasegmental meaning that a clause conveys in addition to the semantics of its constituents. As a necessary extension of dependency grammar, the yield of a clause is, so to speak, lexicalized in DUG and represented by a term that dominates the corresponding list. Compare the first term in (2). Punctuation in written language can be interpreted as a similar lexicalization of clausal semantics.</Paragraph>
  </Section>
  <Section position="6" start_page="195" end_page="196" type="metho">
    <SectionTitle>
4. Positional Features
</SectionTitle>
    <Paragraph position="0"> An important augmentation of dependency grammar is the decision to treat positional phenomena in DUG as morpho-syntactic features and, as a consequence, to represent them by subcategories in the same way as number, person and gender. The mechanism of unification can be applied to word order attributes just as advantageously as to other categories. The only difference is that the values appertaining to the elements of an utterance are not taken from the lexicon, but are drawn from the situation in the input string.</Paragraph>
    <Paragraph position="1"> One has to visualize this as follows.</Paragraph>
    <Paragraph position="2"> Each term in a dependency representation corresponds to a segment of the input string. Each subtree also corresponds to a segment which is composed of the segments corresponding to the terms which form the tree. Breaking down a dependency tree into subtrees thus imposes an implicit constituent structure on the input string. Incidentally, the constituent corresponding to a dependency tree does not need to be continuous. The positions of the constituents relative to each other can be determined and included as the values of positional attributes in the terms of the dependency trees. It is stipulated that a positional attribute refers to the implicit constituent corresponding to the subtree in whose dominating term the feature is specified. The attribute expresses a sequential relationship between this constituent and the segment which corresponds to the superordinated term.</Paragraph>
    <Paragraph position="3"> Any sequential order of constituents which can be defined can be included in the set of attributes.</Paragraph>
    <Paragraph position="4"> Suppose, for example, that C is a string corresponding to a subtree and H is the string that corresponds to the term superordinated to that subtree. Let us define the attribute &amp;quot;sequence&amp;quot; (seq) as having the values 1: C precedes H, and 2: C follows H. Let us establish &amp;quot;adjacency&amp;quot; (adj) with the values 1: C immediately precedes H, and 2: C immediately follows H. Finally, let us introduce &amp;quot;delimitation&amp;quot; (lim) with the values 1: C is the leftmost of all of the strings corresponding to dependents of H, and 2: C is the rightmost of all of the dependents of H. For the sake of comparison, let us consider the following example which Pereira 1981 uses in order to illustrate  Extraposition Grammar: (3) The mouse that the cat that likes fish chased  squeaks. The following DRL tree depicts the dependencies and the word order of this sentence by means of the attributes just defined:</Paragraph>
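The three positional attributes can be sketched as checks over token indices; the encoding as Python functions is ours, with C given as the list of token positions of the dependent's segment and H as the position of the superordinated word:

```python
# Illustrative sketch of the positional attributes seq, adj and lim.
# c: sorted token indices of the dependent constituent C (may be
# discontinuous); h: token index of the superordinated segment H.

def seq(c, h):
    """1 if C precedes H, 2 if C follows H."""
    return 1 if max(c) < h else 2

def adj(c, h):
    """1 if C immediately precedes H, 2 if it immediately follows H."""
    if max(c) == h - 1:
        return 1
    if min(c) == h + 1:
        return 2
    return None

def lim(c, all_dependents):
    """1 if C is leftmost, 2 if rightmost among H's dependent segments."""
    if min(c) == min(min(d) for d in all_dependents):
        return 1
    if max(c) == max(max(d) for d in all_dependents):
        return 2
    return None
```

For "the cat likes fish" with H = "likes" at index 2, the subject segment [0, 1] gets seq 1, adj 1 and lim 1, while the object segment [3] gets seq 2, adj 2 and lim 2, in line with the definitions above.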
    <Paragraph position="6"> There is exactly one sequence of words that is in agreement with all of the attribute-values in the tree. It is likely that appropriate attributes can also be defined for more difficult cases of extraposition. Since the dislocated elements continue to be subordinated to their heads in their original role, no &amp;quot;gaps&amp;quot;, &amp;quot;holes&amp;quot; or &amp;quot;traces&amp;quot; are part of the DRL formalism. The possibility of doing without such entities is attractive. It arises from the fact that the ratio of constituency and dependency is reversed in DUG. It seems to be easier to augment dependency trees by constituency information than to process dependency features within phrase markers.</Paragraph>
  </Section>
  <Section position="7" start_page="196" end_page="196" type="metho">
    <SectionTitle>
5. Morpho-syntactic Description
</SectionTitle>
    <Paragraph position="0"> Within DRL terms, the following means exist for generalization. There are variables for roles, lexemes and morpho-syntactic main categories. Subcategories allow a disjunction of values as their specification.</Paragraph>
    <Paragraph position="1"> The ANY-value is assumed whenever a subcategory attribute is left out completely. These means are applied in the so-called base lexicon. The base lexicon creates the link between the segments of the input language and the terms of DRL. A few results of this assignment are to be given just to illustrate the principle:
(*: cat: noun num&lt;1&gt; per&lt;3&gt;);
(*: cat: noun num&lt;2&gt; per&lt;3&gt;);
(*: like: verb per&lt;1,2&gt;);
(*: like: verb num&lt;1&gt; per&lt;3&gt;);
(*: like: verb num&lt;2&gt; per&lt;3&gt;);
(*: fish: noun per&lt;3&gt;);
The roles of all lexical items are left open. Their values are a matter of the syntactic frames in which the items occur in an utterance. The same lexeme applies to all inflectional forms of a word. The values of person and number of CAT and CATS are indicated because they are specific. FISH, on the other hand, can be both singular and plural. Hence the number attribute is omitted altogether. The features first and second person of LIKE are combined by disjunction. The choice between both values as well as between the ANY-values of number is left to the context. In the case of the third-person items, being more specific cannot be avoided.</Paragraph>
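The base-lexicon conventions described above (an omitted attribute means ANY; a value may be a disjunction) can be sketched as a lookup routine; the dict-and-set encoding is our own, not DRL notation:

```python
# Illustrative sketch of base-lexicon matching. Each entry maps a surface
# form to a lexeme and word class; subcategory values are sets, so a
# disjunction like per<1,2> is simply {1, 2}. An absent attribute is ANY.

LEXICON = [
    {"form": "cat",   "lexeme": "cat",  "cat": "noun", "num": {1}, "per": {3}},
    {"form": "cats",  "lexeme": "cat",  "cat": "noun", "num": {2}, "per": {3}},
    {"form": "like",  "lexeme": "like", "cat": "verb", "per": {1, 2}},
    {"form": "likes", "lexeme": "like", "cat": "verb", "num": {1}, "per": {3}},
    {"form": "fish",  "lexeme": "fish", "cat": "noun", "per": {3}},
]

def matches(entry, required):
    """True if `entry` is compatible with the required feature demands.
    An attribute missing from the entry counts as ANY and always matches."""
    for attr, wanted in required.items():
        if attr in entry and not (entry[attr] & wanted):
            return False
    return True

def lookup(form, required):
    """All lexicon entries for `form` compatible with the demanded features."""
    return [e for e in LEXICON if e["form"] == form and matches(e, required)]
```

For example, FISH matches both a singular and a plural demand because its number attribute is omitted, whereas LIKES, being specific, is rejected when plural number is required.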
  </Section>
</Paper>