<?xml version="1.0" standalone="yes"?>
<Paper uid="J01-1004">
  <Title>D-Tree Substitution Grammars</Title>
  <Section position="2" start_page="0" end_page="91" type="intro">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> There is considerable interest among computational linguists in lexicalized grammatical frameworks. From a theoretical perspective, this interest is motivated by the widely held assumption that grammatical structure is projected from the lexicon. From a practical perspective, the interest stems from the growing importance of word-based corpora in natural language processing. Schabes (1990) defines a lexicalized grammar as a grammar in which every elementary structure (rules, trees, etc.) is associated with a lexical item and every lexical item is associated with a finite set of elementary structures of the grammar. Lexicalized tree adjoining grammar (LTAG) (Joshi and Schabes 1991) is a widely studied example of a lexicalized grammatical formalism (see footnote 1). In LTAG, the elementary structures of the grammar are phrase structure trees.</Paragraph>
    <Paragraph position="1"> Because of the extended domain of locality of a tree (as compared to a context-free string rewriting rule), the elementary trees of an LTAG can provide possible syntactic contexts for the lexical item or items that anchor the tree, i.e., from which the syntactic structure in the tree is projected. LTAG provides two operations for combining trees: substitution and adjunction. The substitution operation appends one tree at a frontier node of another tree. The adjunction operation is more powerful: it can be used to insert one tree within another. This property of adjoining has been widely used in the LTAG literature to provide an account of long-distance dependencies, as the following example shows. [Author affiliations: * AT&amp;T Labs-Research, B233, 180 Park Ave, PO Box 971, Florham Park, NJ 07932-0971, USA. E-mail: rambow@research.att.com. † Department of Computer and Information Science, University of Delaware, Newark, Delaware 19716. E-mail: vijay@udel.edu. ‡ School of Cognitive and Computing Sciences, University of Sussex, Brighton, BN1 6QH, E. Sussex, UK. E-mail: david.weir@cogs.susx.ac.uk.] [Footnote 1: Other examples of lexicalized grammar formalisms include different varieties of categorial grammars and dependency grammars. Neither HPSG nor LFG is lexicalized in the sense of Schabes (1990).]</Paragraph>
    <Paragraph position="2"> Figure 1: Example of adjunction.</Paragraph>
    <Paragraph position="3"> Figure 1 shows a typical analysis of topicalization (see footnote 2). The related nodes for the filler and the gap in the elementary tree α are moved further apart when the tree γ is obtained by adjoining the auxiliary tree β within α. This shows that adjunction changes the structural relationship between some of the nodes in the tree into which adjunction occurs.</Paragraph>
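    <Paragraph> The effect just described can be sketched in code. The following is a minimal Python illustration, not the formal definition used in the LTAG literature: the class Node, the helpers find_foot, adjoin and depth, and the two small trees standing in for α and β are all hypothetical and are not taken from Figure 1. Under these assumptions, adjoining at the inner S node pushes the gap further away from the root while the filler stays in place.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A phrase structure tree node (hypothetical encoding for illustration)."""
    label: str
    children: list = field(default_factory=list)
    foot: bool = False  # True only for the foot node of an auxiliary tree


def find_foot(tree):
    """Return the foot node of an auxiliary tree, or None if there is none."""
    if tree.foot:
        return tree
    for child in tree.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(site, auxiliary):
    """Insert `auxiliary` at `site`; the old subtree of `site` moves under the foot."""
    foot = find_foot(auxiliary)
    if foot is None:
        raise ValueError("auxiliary tree has no foot node")
    if not (site.label == auxiliary.label == foot.label):
        raise ValueError("site, root and foot must carry the same label")
    foot.children, foot.foot = site.children, False
    site.children = auxiliary.children


def depth(tree, target, level=0):
    """Depth of `target` (compared by identity) below the root of `tree`."""
    if tree is target:
        return level
    for child in tree.children:
        found = depth(child, target, level + 1)
        if found is not None:
            return found
    return None


# Assumed stand-in for alpha: a topicalized clause with a filler and a gap.
filler, gap = Node("NP"), Node("NP")
inner_s = Node("S", [Node("NP"), Node("VP", [Node("V"), gap])])
alpha = Node("S", [filler, inner_s])

# Assumed stand-in for beta: a clause whose verb takes an S complement (the foot).
beta = Node("S", [Node("NP"), Node("VP", [Node("V"), Node("S", foot=True)])])

before = depth(alpha, gap)
adjoin(inner_s, beta)
after = depth(alpha, gap)
print(before, after, depth(alpha, filler))
```

In this toy encoding the gap sits at depth 3 before adjunction and at depth 5 afterwards, while the filler remains at depth 1, so the path between filler and gap has grown by the height of the inserted material.</Paragraph>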
    <Paragraph position="4"> In LTAG, the lexicalized elementary objects are defined in such a way that the structural relationships between the anchor and each of its dependents change during the course of a derivation through the operation of adjunction, as just illustrated. This approach is not the only possibility. An alternative would be to define the relationships between the nodes of the elementary objects in such a way that these relationships hold throughout the derivation, regardless of how the derivation proceeds.</Paragraph>
    <Paragraph position="5"> This perspective on the LTAG formalism was explored in Vijay-Shanker (1992) where, following the principles of d-theory parsing (Marcus, Hindle, and Fleck 1983), LTAG was seen as a system manipulating descriptions of trees rather than as a tree rewriting formalism. [Footnote 2: The same analysis holds for wh-movement, but we use topicalization as an example in order to avoid the superficial complication of the auxiliary needed in English questions. Sometimes, topicalized sentences sound somewhat less natural than the corresponding wh-questions, which are always structurally equivalent.]</Paragraph>
    <Paragraph position="6"> Elementary objects are descriptions of possible syntactic contexts for the anchor, formalized in a logic for describing nodes and the relationships (dominance, immediate dominance, linear precedence) that hold between them.</Paragraph>
    <Paragraph position="7"> From this perspective, instead of positing the elementary tree α in Figure 1, we can describe the projection of syntactic structure from the transitive verb. This description is presented pictorially as α′ in Figure 2. The solid lines indicate immediate domination, whereas the dashed lines indicate a domination of arbitrary length. The description α′ not only partially describes the tree α (by taking the dominations to be those of length 0) but also any tree (such as γ) that can be derived by using the operations of adjunction and substitution starting from α. In fact, α′ describes exactly what is common among these trees.</Paragraph>
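    <Paragraph> To make the relation between a description and the trees it describes concrete, here is a small Python sketch. It is an assumption-laden illustration rather than the logic used in the paper: the functions dominates and satisfies, the node names, and the simplified description (two components joined by one dashed edge) are hypothetical, and linear precedence is ignored. Solid edges are checked as immediate dominance; dashed edges are checked as dominance over a path of length 0 or more.

```python
def dominates(parent, upper, lower):
    """True if `upper` equals `lower` or is one of its ancestors (length 0 or more)."""
    node = lower
    while node is not None:
        if node == upper:
            return True
        node = parent.get(node)
    return False


def satisfies(parent, immediate, dominance, image):
    """Check a tree (child -> parent map) against a description.

    `immediate` holds solid edges, `dominance` holds dashed edges, and `image`
    maps each description node to the tree node it denotes.
    """
    solid_ok = all(parent.get(image[c]) == image[p] for p, c in immediate)
    dashed_ok = all(dominates(parent, image[a], image[d]) for a, d in dominance)
    return solid_ok and dashed_ok


# Assumed simplification of the description alpha': two components joined by
# one dashed edge between the two S nodes.
immediate = {("S_top", "NP_filler"), ("S_top", "S_upper"),
             ("S_lower", "NP_subj"), ("S_lower", "VP"),
             ("VP", "V"), ("VP", "NP_gap")}
dominance = {("S_upper", "S_lower")}

# Tree alpha: the dashed edge has length 0, so both S nodes denote node "s1".
alpha_parent = {"np_f": "s0", "s1": "s0", "np_s": "s1", "vp": "s1",
                "v": "vp", "np_g": "vp"}
alpha_image = {"S_top": "s0", "NP_filler": "np_f", "S_upper": "s1",
               "S_lower": "s1", "NP_subj": "np_s", "VP": "vp",
               "V": "v", "NP_gap": "np_g"}

# Tree gamma: material from an adjoined tree now separates "s1" from "s2".
gamma_parent = {"np_f": "s0", "s1": "s0", "np_b": "s1", "vp_b": "s1",
                "v_b": "vp_b", "s2": "vp_b", "np_s": "s2", "vp": "s2",
                "v": "vp", "np_g": "vp"}
gamma_image = dict(alpha_image, S_lower="s2")

print(satisfies(alpha_parent, immediate, dominance, alpha_image))
print(satisfies(gamma_parent, immediate, dominance, gamma_image))
```

Both checks succeed: the same description holds of the smaller tree, where the dashed edge collapses to a single node, and of the larger tree, where it spans the inserted material. Only the mapping of the lower S node differs between the two cases.</Paragraph>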
    <Paragraph position="8"> By expressing elementary objects in terms of tree descriptions, we can describe syntactic structure projected from a lexical item in a way that is independent of the derivations in which it is used. This is achieved by employing composition operations that produce descriptions that are compatible with the descriptions being combined.</Paragraph>
    <Paragraph position="9"> For instance, adjoining, seen from this perspective, serves to further specify the underspecified dominations. In Figure 2, the description γ′ is obtained by additionally stating that the domination between the two nodes labeled S in α′ is now given by the domination relation between the two nodes labeled S in β′.</Paragraph>
    <Paragraph position="10"> Figure 3: A problem for LTAG.</Paragraph>
    <Paragraph position="11"> As we will explore in this paper, changing the way the lexicalized formalism is viewed, from tree rewriting to tree descriptions, raises questions as to the desirability of certain aspects of the formalism. Specifically, we claim that the following two aspects of LTAG appear unnecessarily restrictive from the perspective of tree description:</Paragraph>
    <Paragraph position="13"> 1. In LTAG, the root and foot of auxiliary trees must be labeled by the same nonterminal symbol. This is not a minor issue since it derives from one of the most fundamental principles of LTAG, factoring of recursion. This principle states that auxiliary trees express factored out recursion, which can be reintroduced via the adjunction operation. It has had a profound influence on the way that the formalism has been applied linguistically (see footnote 3). An example of how this can create problems is shown in Figure 3. In this case, the "adjoined" tree has a root labeled S and a foot labeled VP, something that is not permissible in LTAG. Note that without this constraint, the combination would appear to be exactly like adjoining.</Paragraph>
    <Paragraph position="14"> We consider this aspect in more detail in Section 4.1.</Paragraph>
    <Paragraph position="15"> 2. The adjunction operation embeds all of the adjoined tree within that part of the tree at which adjunction occurs. This is illustrated in γ′ (Figure 2) where both parts (separated by domination) of β′ appear within one underspecified domination relationship in α′.</Paragraph>
    <Paragraph position="16"> The foot node of tree β in Figure 1 corresponds to a required argument of the lexical anchor, thought. The adjunction operation accomplishes the role of expanding this argument node. Unlike the substitution operation, where an entire tree is inserted below the argument node, with adjunction, only a subtree of α appears below the argument node; the remainder appears in its entirety above the root node of β. However, if we view the trees as descriptions, as in Figure 2, and if we take the expansion of the foot node as the main goal served by adjunction, it is not clear why the composition should have anything to say about the domination relationship between the other parts of the two objects being combined. In the description approach, in order to obtain γ′, we (in a sense to be made precise later) substitute the second component of α′ (rooted in S) at the foot node of β′. [Footnote 3: Note that in feature-based LTAG there is no restriction that the two feature structures be the same, or even that they be compatible.]</Paragraph>
    <Paragraph position="17"> This operation does not itself entail any further domination constraints between the components of α′ and β′ that are not directly involved in the substitution, specifically, the top components of α′ and β′. In the trees described it is possible for either one to dominate the other. However, adjunction further stipulates that the rest of α′ will appear above all of β′. This additional constraint makes certain analyses unavailable for the LTAG formalism (as is well known). For instance, given the two lexical projections in Figure 4, the subtrees must be interleaved in a fashion not available with adjoining to produce the desired result. This aspect of adjoining is the focus of the discussion in Section 4.2.</Paragraph>
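    <Paragraph> The first of the two restrictions above, that the root and foot of an auxiliary tree must carry the same label, can be sketched as a simple check. The Python below is a deliberately reduced illustration under stated assumptions: trees are flattened to a single root-to-leaf path of labels, and the function splice with its parameter same_label_required is hypothetical. It is not a reconstruction of Figure 3.

```python
def splice(host_path, site, spine, same_label_required=True):
    """Splice an auxiliary spine into a root-to-leaf label path of a host tree.

    `spine` lists the labels from the root of the inserted tree down to its
    foot; the foot is identified with the host node at index `site`.
    """
    root, foot = spine[0], spine[-1]
    if foot != host_path[site]:
        raise ValueError("foot label must match the label at the site")
    if same_label_required and root != foot:
        raise ValueError("LTAG requires root and foot to share a label")
    return host_path[:site] + spine + host_path[site + 1:]


host = ["S", "VP", "V"]

# Ordinary adjunction: root and foot are both VP.
print(splice(host, 1, ["VP", "VP"]))

# Root S, foot VP: rejected under the LTAG constraint ...
try:
    splice(host, 1, ["S", "VP"])
except ValueError as error:
    print("rejected:", error)

# ... but without the constraint the combination looks just like adjoining.
print(splice(host, 1, ["S", "VP"], same_label_required=False))
```

With the flag switched off, the S-rooted spine is inserted above the VP node in exactly the way an ordinary auxiliary tree would be, which is the sense in which the combination resembles adjoining once the labeling constraint is dropped.</Paragraph>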
    <Paragraph position="18"> In this paper, we describe a formalism based on tree descriptions called d-tree substitution grammars (DSG). The elementary tree descriptions in DSG can be used to describe lexical items and the grammatical structure they project. Each elementary tree description can be seen as describing two aspects of the tree structure: one part of the description specifies phrase structure rules for lexical projections, and a second part of the description states domination relationships between pairs of nodes. DSG inherits from LTAG the extended domain of locality of its elementary structures, and, in DSG as in LTAG, this extended domain of locality allows us to develop a lexicalized grammar in which lexical items project grammatical structure, including positions for arguments. But DSG departs from LTAG in that it does not include factoring of recursion as a constraint on the makeup of the grammatical projections. Furthermore, in DSG, arguments are added to their head by a single operation that we call generalized substitution, whereas in LTAG two operations are used: adjunction and substitution. DSG is intended to be a simple framework with which it is possible to provide analyses for those cases described with LTAG as well as for various cases in which extensions of LTAG have been needed, such as different versions of multicomponent LTAG.</Paragraph>
    <Paragraph position="20"> Furthermore, because the elementary objects are expressed in terms of logical descriptions, it has been possible to investigate the characteristics of the underspecification that is used in these descriptions (Vijay-Shanker and Weir 1999). In Section 2, we give some formal definitions and in Section 3 discuss some of the formal properties of DSG. In Section 4, we present analyses in DSG for various linguistic constructions in several languages, and compare them to the corresponding LTAG analyses. In Section 5, we discuss the particular problem of modeling syntactic dependency. We conclude with a discussion of some related work and a summary. [Figure 5: A pair of tree descriptions (which are also d-trees).]</Paragraph>
  </Section>
</Paper>