<?xml version="1.0" standalone="yes"?> <Paper uid="C90-3045"> <Title>Synchronous Tree-Adjoining Grammars</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Synchronous TAGs--An </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Informal Description </SectionTitle> <Paragraph position="0"> Language interpretation tasks can be thought of as associating a syntactic analysis of a sentence with some other structure (a logical form representation or an analysis of a target language sentence, perhaps). Synchronous TAGs are defined so as to make such associations explicit. The original language and its associated structures are both defined by grammars stated in a TAG formalism; the two TAGs are synchronous in the sense that adjunction and substitution operations are applied simultaneously to related nodes in pairs of trees, one for each language.</Paragraph> <Paragraph position="1"> For convenience, we will call the two languages source and target languages, although the formalism is not inherently directional.</Paragraph> <Paragraph position="2"> As an example, consider the task of relating a fragment of English with a simple representation of its predicate-argument structure. A synchronous TAG for this purpose is given in Figure 1.</Paragraph> <Paragraph position="4"> Each element of the synchronous TAG is a pair consisting of two elementary trees, one from the source language (English) and one from the target (logical form [LF]). Nodes, one from each tree, may be linked; such links are depicted graphically as thick lines. If we project the pairs onto their first or second components (ignoring the cross links), the projections are TAGs for an English fragment and an LF fragment, respectively. These grammars are themselves written in a particular variant of TAGs; the choice of this base formalism, as we will call it, is free. 
In the case at hand, we have chosen single-component lexicalized TAGs with adjunction and substitution (Schabes et al., 1988). Later examples are built on other bases.</Paragraph> <Paragraph position="5"> The elementary operation in a synchronous TAG is supervenient on the elementary operations in the base formalism. A derivation step from a pair of trees (α₁, α₂) proceeds as follows:</Paragraph> <Paragraph position="7"> 1. Nondeterministically choose a link in the pair connecting two nodes (say, n₁ in α₁ and n₂ in α₂).</Paragraph> <Paragraph position="8"> 2. Nondeterministically choose a pair of trees (β₁, β₂) in the grammar.</Paragraph> <Paragraph position="9"> 3. Form the resultant pair ⟨β₁(α₁, n₁), β₂(α₂, n₂)⟩, where β(α, n) is the result of performing a primitive operation in the base formalism on α at node n using β (e.g., adjoining or substituting β into α at n). [3] Synchronous TAG derivation then proceeds by choosing a pair of initial trees (α₁, α₂) that is an element of the grammar, and repeatedly applying derivation steps as above.</Paragraph> <Paragraph position="10"> As an example, suppose we start with the tree pair α in Figure 1. [4] We choose the link from the subject NP to T and the tree pair β to apply to its nodes. The resultant, by synchronous substitution, is the tree pair:</Paragraph> <Paragraph position="12"> Note that the links from α are preserved in the resultant pair α₁ except for the chosen link, which has no counterpart in the result.</Paragraph> <Paragraph position="13"> Using tree pair γ on the remaining link from NP to T in α₁ yields</Paragraph> <Paragraph position="15"> This pairing manifests the correspondence between the sentence &quot;George hates broccoli&quot; and its logical form hates'(george', broccoli') (as written in a more traditional notation). Here we see that the links in the operator trees (those in γ) are preserved in the resultant pair, accounting for the sole remaining link. 
The trees in γ are linked in this way so that other tree pairs can modify the N.</Paragraph> <Paragraph position="16"> We can continue the derivation, using δ and ε to generate the pair given in Figure 2, thereby associating the meaning violently'(hates'(george', cooked'(broccoli'))) with the sentence &quot;George hates cooked broccoli violently.&quot; A subtle issue arises with respect to link updating in the resultant pair if two links impinge on the same node. When one of the links is chosen and an adjunction performed at the node, the other link must appear in the resultant. The question as to whether that link should now end at the root or foot of the adjoined tree can be resolved in several ways. Although the choice of method does not affect any of the examples in this paper, we mention our current resolution of this problem here. [Footnote 3, beginning missing: ... and second trees to differ, one being a substitution and the other an adjunction, for example.] [Footnote 4: We use standard TAG notation, marking foot nodes in auxiliary trees with '*' and nodes where substitution is to occur with '↓'. The nonterminal names in the logical form grammar are mnemonic for Formula, Relation (or function) symbol, Term, and Quantifier.]</Paragraph> <Paragraph position="17"> If the remaining link is connected initially to the top of the node serving as the adjunction site, it will connect to the top of the root node of the adjoined auxiliary tree after the adjunction has been performed; conversely, if it is connected initially to the bottom of the node, it will connect to the bottom of the foot node of the auxiliary tree. In all of the examples in this paper, the links may be thought of as connecting to the tops of nodes. The issue has important ramifications. 
For instance, the link updating process allows different derivations of a single derived structure in the source language to correspond to derivations of different derived structures in the target language; that is, derivation order in synchronous TAGs is in this respect crucial, unlike in the base TAG formalisms. We rely on this property in the analysis of quantifier scope in Section 4.2.</Paragraph> </Section> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Why Synchronous TAGs? </SectionTitle> <Paragraph position="0"> We turn to the question of why, in augmenting TAGs for the purposes of encoding semantic information, it is preferable to use the synchronous TAG method over more conventional methods, such as semantic rules involving logical operations (as in Montague grammar or generalized phrase-structure grammar) or complex-feature-structure encodings (as in unification-based or logic grammar formalisms). First, the arguments for factoring recursion and dependencies as TAGs do for the syntax of natural language have their counterparts in the semantics. The structure of TAGs allows syntactic dependencies (agreement, subcategorization, and so forth) to be localized in the primitives of a grammar, the elementary trees. This is most dramatically evident in the case of long-distance dependencies, such as that between a wh-phrase and its associated gap. Similarly, using TAGs to construct logical forms allows the localization of semantic dependencies in the logical forms of natural language expressions, dependencies such as the signature requirements (argument type and arity) of function and relation symbols, and even the long-distance dependencies between a wh-quantifier and its associated bound variable. 
With other methods of semantics, these dependencies cannot be localized; the semantic aspects of filler-gap dependencies must be passed among the features of various nodes in a parse tree or otherwise distributed over the entire derivation. Second, the use of the synchronous TAG augmentation allows an even more radical reduction in the role of features in a TAG grammar. Because of the extended domain of locality that TAGs possess, the role of features and unification is already reduced relative to their role in context-free based systems. Only finite-valued features are needed, with the possible exception of a feature whose value encodes an expression's logical form. In removing the construction of logical forms from the duties delegated to features, we can maintain a strictly finite-valued (and therefore formally dispensable) feature system for TAGs.</Paragraph> <Paragraph position="1"> As a side note, we mention a ramification of the synchronous TAG analysis concerning the claim of Kaplan and Zaenen (1989) that the paths over which long-distance dependencies operate (in the f-structure of lexical-functional grammatical theory) form a regular language. Vijay-Shanker and Joshi (1989) provide an argument that this claim follows from several assumptions concerning how a feature system for TAGs might be constrained. Vijay-Shanker (personal communication) has noted that by placing a simple assumption on the elementary trees in the logical form component of a synchronous TAG, the proof of this claim becomes immediate. Any TAG in which all foot nodes are immediate children of their associated root generates a tree path language that is regular. 
Thus, a synchronous TAG (like the grammar presented in Figure 1) whose semantic component forms a TAG with this property necessarily obeys the regular language constraint on long-distance semantic dependencies.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Applications </SectionTitle> <Paragraph position="0"> To exemplify the formalism's utility, we briefly and informally describe its application to the semantics of idioms and quantifiers. A companion paper (Abeillé et al., 1990) uses a mapping between two TAGs for automatic translation between natural languages, and constitutes a further application of the synchronous TAG concept.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4.1 Idioms </SectionTitle> <Paragraph position="0"> Abeillé and Schabes (1989) note that lexicalized TAGs are an appropriate representation language for idiomatic constructions, as their expanded domain of locality can account for many syntactic properties of idioms. It seems natural to generalize beyond syntax, as they do, to the claim that lexicalized TAGs allow one to deal with semantic noncompositionality. Their argument for this claim is based on an intuition that semantics depends on the TAG derivation structure, an intuition that synchronous TAGs make precise. For example, the idiomatic construction &quot;kick the bucket&quot; cashes out as the following tree pair, under its idiomatic interpretation:</Paragraph> <Paragraph position="2"> whereas the literal usage of &quot;kick&quot; is associated with a tree pair similar to that of &quot;hates&quot; in Figure 1. Two derivations of the sentence &quot;George kicked the bucket&quot; are possible, each using a different one of these two elementary tree pairs, but both yielding identical derived constituency trees for the English. 
They will be associated, of course, with two different readings, corresponding to the idiomatic (die'(george')) and literal (kick'(george', bucket')) interpretations, respectively.</Paragraph> <Paragraph position="3"> All of the arguments for the TAG analysis of idioms and light verb constructions can then be maintained in a formalism that allows for semantics for them as well.</Paragraph> <Paragraph position="4"> In particular, * Discontinuous syntactic constituents can be semantically localized.</Paragraph> <Paragraph position="5"> * Nonstandard long-distance dependencies are statable without resort to reanalysis.</Paragraph> <Paragraph position="6"> * Both frozen and flexible idioms can be easily characterized.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Quantifiers </SectionTitle> <Paragraph position="0"> In order to characterize quantifier scoping possibilities, we use a synchronous TAG whose base formalism is multi-component TAGs (Joshi, 1987), in which the primitive operation is incorporation (by multiple substitutions and adjunctions) of a set of elementary trees at once. In synchronous multi-component TAGs, the links between trees connect, in general, a set of nodes in one tree with a set in another. In particular, an NP will be linked both to a formula in the semantics (the quantifier's scope) and a term (the position bound by the quantifier). We will begin a derivation with just such a pair of elementary trees, depicted as α₁ in Figure 3.</Paragraph> <Paragraph position="1"> To distinguish two separate links from a single link among several nodes, we use a coindexing rather than graphical notation for links. Thus, the subject NP node on the left is linked with both the F and first T node on the right, as indicated by the boxed index 1. 
The interpretation of such &quot;hyper-links&quot; is that when a pair is chosen to operate at the link, it must have sets of the correct sizes as its left and right components (1 and 2 in the case at hand) and the sets are simultaneously used at the various nodes as in a multi-component TAG. For instance, a quantifiable noun will be paired with a set of two trees:</Paragraph> <Paragraph position="3"> Applying the latter multi-component tree pair β₁ to the initial tree pair α₁, we derive the next stage in the derivation, α₂. We have highlighted the link being operated on at this and later steps by using thick lines for the index boxes of the selected link.</Paragraph> <Paragraph position="4"> The determiner can be introduced with the simple pair leading to the derivation step α₃. Completing the derivation using analogous elementary tree pairs, we might generate the final tree pair α₄ of Figure 3. This final pairing associates the meaning ∃y : vegetable'(y). ∀x : politician'(x). hates'(x, y) with the sentence &quot;Every politician hates some vegetable.&quot; It should be clear that in a structure such as this with multiple NPs, the order of substitution of NPs determines the relative scope of the quantifiers, although it has no effect whatsoever on the syntactic structure. Developing this line of reasoning has led to several detailed predictions about quantifier scope, a full discussion of which is beyond this paper's purview. 
In summary, however, the analysis is slightly more restrictive than that of Hobbs and Shieber (1987), making predictions regarding the scope of topicalized or wh-moved constituents, the relative scope of embedded quantifiers, and possibly even the syntactic structure of complex NPs.</Paragraph> </Section> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Using Synchronous TAGs </SectionTitle> <Paragraph position="0"> The synchronous TAG formalism is inherently nondirectional [...] a target expression from a source or vice versa. Thus, it can be used to characterize both of these mappings. Furthermore, the existence of a parsing algorithm for the base formalism of a synchronous TAG is a sufficient condition for interpreting a synchronous TAG grammar.</Paragraph> <Paragraph position="1"> Schabes and Joshi (1988) and Vijay-Shanker and Joshi (1985) provide parsing algorithms for TAGs that could serve to parse the base formalism of a synchronous TAG. Given such an algorithm, semantic interpretation can be performed by parsing the sentence according to the source grammar; the pairings then determine a derivation in the target language for the logical form. Generation from a logical form proceeds by the converse process of parsing the logical form expression, thereby determining the derivation for the natural language sentence. Machine translation proceeds along similar lines by mapping two TAGs directly (Abeillé et al., 1990). In previous work, one of us noted that generation according to an augmented context-free grammar can be made more efficient by requiring the grammar to be semantically monotonic (Shieber, 1988); the derived semantics for an expression must include, in an appropriate sense, the semantic material of all its subconstituents. It is interesting to note that synchronous TAGs are inherently semantically monotonic. 
Furthermore, it is reasonable to require that the semantic component of a synchronous TAG be lexicalized (in the sense of Schabes et al. (1988)), allowing for more efficient parsing according to the semantic grammar and, consequently, more efficient generation. In the case of augmented context-free grammars, the semantic monotonicity requirement precludes &quot;lexicalization&quot; of the semantics. It is not possible to require nontrivial semantics to be associated with each lexical item. In summary, just as lexicalization of the syntactic grammar aids parsing (Schabes and Joshi, 1990), so lexicalization of the semantic grammar aids generation.</Paragraph> <Paragraph position="2"> The description of parsing and generation above may seem to imply that these processes cannot be performed incrementally, that is, that an entire source derivation must be recovered before the corresponding target derivation can be computed. The issue deserves clarification.</Paragraph> <Paragraph position="3"> In the case where the synchronous TAG is order-independent (that is, the order of derivation in one TAG does not affect the result in the other, as when no two links share an endpoint), there is a one-to-one mapping between the source and target derivations. When partial source derivations are recognized by the parser, the corresponding partial target derivation (for example, a semantic interpretation) can be incrementally computed: as the input is read from left to right, interpretations of the partial target derivations corresponding to partial source derivations can be combined in one step to build a larger partial target derivation.</Paragraph> <Paragraph position="4"> When the synchronous TAG is order-sensitive, however, there may be a many-to-many correspondence between source derivations and target derivations. This is the case, for instance, in a grammar in which alternative quantifier scopings may be generated for a single sentence. 
In this case, it is unclear what should even be meant by incremental computation. For instance, midway in parsing a sentence, at a point at which a single quantified NP has been analyzed, the incremental interpretation could not possibly represent all possible scopings that that quantifier might end up taking, as it is not known what the quantifier might be required to scope with respect to. At the point in the parse where the scoping decision can be made, it is not clear whether an incrementality requirement would mean that the variant scopings must all be explicitly generated at that point, or only implicitly generable.</Paragraph> <Paragraph position="5"> With respect to synchronous TAGs, these considerations are reflected in the choice of parsing algorithm. Efficient parsing requires that only one canonical derivation (say, leftmost or rightmost) be computed; all other derivations yield the same object. Standard parsing algorithms for both TAGs and CFGs rely on this optimization. If incrementality requires that we generate explicit representations of all possible interpretations (i.e., target derivations) of the string seen so far, then this optimization cannot be used, and parsing will be highly inefficient. If the representation can be left implicit, the optimization can be maintained, but retrieval of explicit representations will be combinatorially more complex.</Paragraph> </Section></Paper>