<?xml version="1.0" standalone="yes"?> <Paper uid="P95-1021"> <Title>D-Tree Grammars</Title> <Section position="3" start_page="0" end_page="152" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> We define a new grammar formalism, called D-Tree Grammars (DTG), which arises from work on Tree-Adjoining Grammars (TAG) (Joshi et al., 1975). A salient feature of TAG is the extended domain of locality it provides. Each elementary structure can be associated with a lexical item (as in Lexicalized TAG (LTAG) (Joshi &amp; Schabes, 1991)). Properties related to the lexical item (such as subcategorization, agreement, and certain types of word order variation) can be expressed within the elementary structure (Kroch, 1987; Frank, 1992). In addition, TAG remains tractable, yet its generative capacity is sufficient to account for certain syntactic phenomena that, it has been argued, lie beyond Context-Free Grammars (CFG) (Shieber, 1985). TAG, however, has two limitations which provide the motivation for this work. The first problem (discussed in Section 1.1) is that the TAG operations of substitution and adjunction do not map cleanly onto the relations of complementation and modification. The second problem (discussed in Section 1.2) has to do with the inability of TAG to provide analyses for certain syntactic phenomena. In developing DTG we have tried to overcome these problems while remaining faithful to what we see as the key advantages of TAG (in particular, its enlarged domain of locality). In Section 1.3 we introduce some of the key features of DTG and explain how they are intended to address the problems that we have identified with TAG.</Paragraph> <Section position="1" start_page="0" end_page="151" type="sub_section"> <SectionTitle> 1.1 Derivations and Dependencies </SectionTitle> <Paragraph position="0"> In LTAG, the operations of substitution and adjunction relate two lexical items. 
It is therefore natural to interpret these operations as establishing a direct linguistic relation between the two lexical items, namely a relation of complementation (predicate-argument relation) or of modification. In purely CFG-based approaches, these relations are only implicit. However, they represent important linguistic intuition, they provide a uniform interface to semantics, and they are, as Schabes &amp; Shieber (1994) argue, important in order to support statistical parameters in stochastic frameworks and appropriate adjunction constraints in TAG. In many frameworks, complementation and modification are in fact made explicit: LFG (Bresnan &amp; Kaplan, 1982) provides a separate functional (f-) structure, and dependency grammars (see e.g. Mel'čuk (1988)) use these notions as the principal basis for syntactic representation. We will follow the dependency literature in referring to complementation and modification as syntactic dependency. As observed by Rambow and Joshi (1992), for TAG, the importance of the dependency structure means that not only the derived phrase-structure tree is of interest, but also the operations by which we obtained it from elementary structures. This information is encoded in the derivation tree (Vijay-Shanker, 1987).</Paragraph> <Paragraph position="1"> However, as Vijay-Shanker (1992) observes, the TAG composition operations are not used uniformly: while substitution is used only to add a (nominal) complement, adjunction is used both for modification and (clausal) complementation. Clausal complementation could not be handled uniformly by substitution because of the existence of syntactic phenomena such as long-distance wh-movement in English. Furthermore, there is an inconsistency in the directionality of the operations used for complementation in TAG: nominal complements are substituted into their governing verb's tree, while the governing verb's tree is adjoined into its own clausal complement. 
The fact that adjunction and substitution are used in a linguistically heterogeneous manner means that (standard) TAG derivation trees do not provide a good representation of the dependencies between the words of the sentence, i.e., of the predicate-argument and modification structure.</Paragraph> <Paragraph position="2"> derivation structure shown on the left in Figure 1.</Paragraph> <Paragraph position="3"> (1) Small spicy hotdogs he claims Mary seems to adore
When comparing this derivation structure to the dependency structure in Figure 2, the following problems become apparent. First, both adjectives depend on hotdog, while in the derivation structure small is a daughter of spicy. In addition, seem depends on claim (as does its nominal argument, he), and adore depends on seem. In the derivation structure, seem is a daughter of adore (the direction does not express the actual dependency), and claim is also a daughter of adore (though neither is an argument of the other).</Paragraph> <Paragraph position="4"> practice and annotate nodes with lexemes and arcs with grammatical function: by distinguishing between the adjunction of modifiers and of clausal complements. This gives us the derivation structure shown on the right in Figure 1. While this might provide a satisfactory treatment of modification at the derivation level, there are now three types of operations (two adjunctions and substitution) for two types of dependencies (arguments and modifiers), and the directionality problem for embedded clauses remains unsolved.</Paragraph> <Paragraph position="5"> In defining DTG we have attempted to resolve these problems with the use of a single operation (that we call subsertion) for handling all complementation and a second operation (called sister-adjunction) for modification. 
Before discussing these operations further we consider a second problem with TAG that has implications for the design of these new composition operations (in particular, subsertion).</Paragraph> </Section> <Section position="2" start_page="151" end_page="152" type="sub_section"> <SectionTitle> 1.2 Problematic Constructions for TAG </SectionTitle> <Paragraph position="0"> TAG cannot be used to provide suitable analyses for certain syntactic phenomena, including long-distance scrambling in German (Becker et al., 1991), Romance clitics (Bleam, 1994), wh-extraction out of complex picture-NPs (Kroch, 1987), and Kashmiri wh-extraction (presented here). The problem in describing these phenomena with TAG arises from the fact (observed by Vijay-Shanker (1992)) that adjoining is an overly restricted way of combining structures. We illustrate the problem by considering Kashmiri wh-extraction, drawing on Bhatt (1994). Wh-extraction in Kashmiri proceeds as in English, except that the wh-word ends up in sentence-second position, with a topic from the matrix clause in sentence-initial position. This is illustrated in (2a) for a simple clause and in (2b) for a complex clause.</Paragraph> <Paragraph position="1"> (2) a. rameshan kyaa dyutnay tse
Ramesh-ERG what-NOM gave you-DAT
'What did you give Ramesh?'
b. rameshan kyaa_i chu baasaan [ki me kor t_i]
Ramesh-ERG what is believe-NPerf that I-ERG do
'What does Ramesh believe that I did?'
Since the moved element does not appear in sentence-initial position, the TAG analysis of English wh-extraction of Kroch (1987; 1989) (in which the matrix clause is adjoined into the embedded clause) cannot be transferred, and in fact no linguistically plausible TAG analysis appears to be available.</Paragraph> <Paragraph position="2"> In the past, variants of TAG have been developed to extend the range of possible analyses. In Multi-Component TAG (MCTAG) (Joshi, 1987), trees are grouped into sets which must be adjoined together (multicomponent adjunction). 
However, MCTAG lack expressive power since, while syntactic relations are invariably subject to c-command or dominance constraints, there is no way to state that two trees from a set must be in a dominance relation in the derived tree. MCTAG with Domination Links (MCTAG-DL) (Becker et al., 1991) are multi-component systems that allow for the expression of dominance constraints. However, MCTAG-DL share a further problem with MCTAG: the derivation structures cannot be given a linguistically meaningful interpretation. Thus, they fail to address the first problem we discussed (in Section 1.1).</Paragraph> </Section> <Section position="3" start_page="152" end_page="152" type="sub_section"> <SectionTitle> 1.3 The DTG Approach </SectionTitle> <Paragraph position="0"> Vijay-Shanker (1992) points out that the use of adjunction for clausal complementation in TAG corresponds, at the level of dependency structure, to substitution at the foot node (footnote 2) of the adjoined tree. However, adjunction (rather than substitution) is used since, in general, the structure that is substituted may only form part of the clausal complement: the remaining substructure of the clausal complement appears above the root of the adjoined tree. Unfortunately, as seen in the examples given in Section 1.2, there are cases where satisfactory analyses cannot be obtained with adjunction. In particular, using adjunction in this way cannot handle cases in which parts of the clausal complement are required to be placed within the structure of the adjoined tree.</Paragraph> <Paragraph position="1"> The DTG operation of subsertion is designed to overcome this limitation. Subsertion can be viewed as a generalization of adjunction in which components of the clausal complement (the subserted structure) which are not substituted can be interspersed within the structure that is the site of the subsertion. 
Following earlier work (Becker et al., 1991; Vijay-Shanker, 1992), DTG provide a mechanism involving the use of domination links (d-edges) that ensure that parts of the subserted structure that are not substituted dominate those parts that are. Furthermore, there is a need to constrain the way in which the non-substituted components can be interspersed (footnote 3). This is done either by using appropriate feature constraints at nodes or by means of subsertion-insertion constraints (see Section 2).</Paragraph> <Paragraph position="2"> We end this section by briefly commenting on the other DTG operation of sister-adjunction. In TAG, modification is performed with adjunction of modifier trees that have a highly constrained form. In particular, the foot nodes of these trees are always daughters of the root and either the leftmost or rightmost frontier nodes.</Paragraph> <Paragraph position="3"> Footnote 2: In these cases the foot node is an argument node of the lexical anchor. Footnote 3: This was also observed by Rambow (1994a), where an integrity constraint (first defined for an ID/LP version of TAG (Becker et al., 1991)) is defined for an MCTAG-DL version called V-TAG. However, this was found to be insufficient for treating both long-distance scrambling and long-distance topicalization in German. V-TAG retains adjoining (to handle topicalization) for this reason.</Paragraph> <Paragraph position="4"> The effect of adjoining a tree of this form corresponds (almost) exactly to the addition of a new (leftmost or rightmost) subtree below the node that was the site of the adjunction.</Paragraph> <Paragraph position="5"> For this reason, we have equipped DTG with an operation (sister-adjunction) that does exactly this and nothing more. 
From the definition of DTG in Section 2 it can be seen that the essential aspects of Schabes &amp; Shieber's (1994) treatment of modification, including multiple modifications of a phrase, can be captured by using this operation.</Paragraph> <Paragraph position="6"> After defining DTG in Section 2, we discuss, in Section 3, DTG analyses for the English and Kashmiri data presented in this section. Section 4 briefly discusses DTG recognition algorithms.</Paragraph> </Section> </Section> </Paper>