Synchronous Tree-Adjoining Grammars 
Stuart M. Shieber 
Computer Science Department 
Hmward University 
Cambridge, MA USA 
Yves Schabes 
Computer and Information Sciences Department 
University of Pennsylvania 
Philadelphia, PA USA 
Abstract 
The unique properties of lree-adjoining grammars (TAG) 
present a challenge for the application of 'FAGs beyond 
the limited confines of syntax, for instance, to the task of 
semantic interpretation or automatic translation of nat- 
ural h'mguage. We present a variant of "FAGs, called 
synchronous TAGs, which chmacterize correspondences 
between languages. "lq\]e formalism's intended usage is 
to relate expressions of natural languages to their associ- 
ated semantics represented in a logical tbrm language, or 
to their translates in another natural language; in sum- 
mary, we intend it to allow TAGs to be used beyond 
their role in syntax proper. We discuss the application 
of synchronous TAGs to concrete examples, mention- 
ing primarily in passing some computational issues that 
tu:ise in its interpretation. 
1 Introduction 
Tree-adjoining grammars (TAG) constitute a grammat- 
ical formalism with attractive properties for the strong 
characterization of the syntax of natural langtmges, that 
is, characterization of the analysis trees of the expres- 
sions in the language (Kroch and Joshi, 1985; Kroch, 
1989)) Among these properties are that 
o The domain of locality in TAGs is larger than 
lot formalisms lhat augment context-free grammars 
(such as lexical-functkmal, or generalized or head- 
driven phrase-structure grammar), and 
® The statements of dependencies and recursion pos- 
sibilities in a tree are factored, the former following 
from primitive dependencies in elementary trees, 
the latter a consequence of an operatkm of adjunc- 
tion of trees. 
These unique properties of TAGs present a challenge 
tot the application of TAGs beyond the limited confines 
of syntax, for instance, to the task of semantic interpre- 
tation or automatic tr~mslation of natural language. The 
slandm'd methods of moving beyond syntax to interpre- 
tation make use in one way or another of the compo- 
sitional structure of the analysis tree that is manifested 
in the tree's derivation. Any version of compositional 
1We assume familiarity throughout the paper with previous work 
on TAGs. See, for instance, the introduction by Joshi (1987). 
semantics, or syntax.directed translation relies on such 
a methodology to some extent. However, in the case of 
TAGs, the compositional structure of the tree is not miro 
rored by its derivational structure, so that a method for 
constructing semantics based on the compositional syn- 
tactic structure will be inherently nonderivational, that 
is, construction of the semantics will be independent of 
the derivation of the tree, and therefore subsequent. 
On the other hand, a method mirroring the deriva- 
tional structure will not necessarily be compositional 
with respect to tile derived structures of expressions. AI+ 
tl~ough such a method would be quite different from ttle 
primarily compositional methods previously postulated, 
it may have advantages, given that certain aspects of 
language seem to be noncompositional. (See Section 4.) 
In this paper, we present a varim~t of TAGs, called 
synchronous TAGs, which characterize correstxmdences 
between languages. The formalism's intended usage is 
to relate expressions of natural languages to their asso- 
ciated semantics represented in a logical form language, 
or to their translations in another natural language; in 
summary, we intend the formalism to allow TAGs to be 
used beyond their role in syntax proper. We also discuss 
its application to concrete examples, and mention some 
computational issues that arise in its interpretation. 
2 Synchronous TAGs--An Infor- 
mal Description 
Language interpretation tasks can be thought of as asso- 
ciating a syntactic analysis of a sentence with some other 
stmcture,---a logical form representation or an analysis of 
a target language sentence, perhaps. Synchronous TAGs 
are defined so as to make such associations explicit. The 
original language and its associated structures are both 
defined by grammars stated in a TAG formalism; the 
two TAGs are synchronous in the sense that adjunction 
and substitution operations are applied simultaneously 
to related nodes in pairs of trees, one for each language. 
For convenience, we will call the two languages source 
and target languages, although the formalism is not in- 
herently directional. 
As an example, consider the task of relating a frag- 
ment of English with a simple representation of its 
predicate-argument structure. A synchronous TAG for 
this purpose is given in Figure 1. Each element of the 
1 253 
NP V~ R ~ T T 
\ V NP$ hates' // 
I I George george' 
N Jbroccoli) \ 
br~coli \[ 
vP F \ 
p, A vp F,\] 
violently violently' I 
cooked cooked' I 
Figure 1: A sample synchronous TAG. 
synchronous TAG is a pair consisting of two elemen- 
tar2,' trees, one from tlie source language (English) and 
one from the target (logical form \[LF\]). Nodes, one from 
each tree, may be linked; ~ such links are depicted graph- 
ically as thick lines. If we project the pairs onto their 
first or second components (ignoring the cross links), the 
projections are TAGs for an English fragment and an LF 
fragment, respectively, qhese grammars are themselves 
written in a particular variant of TAGs; the choice of this 
base formalism, as we will call it, is free. In the case 
at hand, we have chosen single-component lexicalized 
TAGs with adjunction and substitution (Schabes et el., 
1988). Later examples are built on other bases. 
The elementary operation in a synchronous TAG is su- 
pervenient on the elementary operations in the base for- 
malism. A derivation step from a pair of trees (cq, a2) 
proceeds as follows: 
1. 
. 
Nondeterministically choose a link in the pair con- 
necting two nodes (say, nl in cq and no in c~2). 
Nondeterministically choose a pair of trees (3~, 32) 
in the grammar. 
. Form the resultant pair </3t(oq, nl), ;32(~2, n2)) 
where 3(c~, n) is the result of performing a primi- 
tive operation in the base formalism on a at node 
n using 3 (e.g., adjoining or substituting 3 into 
at n). 3 
2We will generalize the links later to allow sets of nodes from one 
tree to be linked to sets from the other. 
3The definition allows for the operations performed on the first 
Synchronous TAG derivation then proceods by choos~ 
ing a pair of initial trees (cq, o~2) that is an element of 
the grammar, and repeatedly applying derivation steps 
as above. 
As an example, suppose we start with the tree pair 
c~ in Figure 1. 4 We choose the link from the subject 
NP to T and the tree pair fl to apply to its nodes. The 
resultant, by synchronous substitution, is the tree pair: 
i Ny T T,\ 
I I I \George V "/ ~P, hates' georgeJ / 
Note that the links from a are preserved in the resul- 
tant pair cq except for the chosen link, which has no 
counterpart in the result. 
Using tree pair 7 on the remaining link from NP to T 
in oq yields 
o~ 2 \] NP VP~..~ ~ R T .~. T 
\ George y ~P hare'george')broccoli' 
\ hates 
broccoli 
This pairing manifests the correspondence between the 
sentence "George hates broccoli" and its logical form 
hates' (george' , broccoli') (as written in a more tradi- 
tional notation). Here we see that the links in the opera° 
tor trees (those in 7) are preserved in the resultant pair, 
accounting for the sole remaining link. Tile trees in 7 
are linked in this way so that other tree pairs can modify 
the N. 
We can continue the derivation, using 5 and ~ to gen- 
erate the pair given in Figure 2 thereby associating the 
meaning 
violently' ( hates' (george', cooked'( broccol i') ) ) ) 
with the sentence "George hates cooked broccoli vio- 
lently." 
A subtle issue mises with respect to link updating in 
the resultant pair if two links impinge on the same node. 
When one of the links is chosen and an adjunction per- 
formed at the node, the other link must appear in the 
resultant. The question as to whether that link should 
now end at the root or foot of the adjoined tree can be re- 
solved in several ways. Although the choice of method 
does not affect any of the examples in this paper, we 
mention our current resolution of this problem here. If 
the remaining link is connected initially to the top of 
and second trees to differ, one being a substitution and the other an 
adjunetion, for example. 
aWe uge standard TAG notation, marking foot nodes in auxiliary 
trees with '*' and nodes where substitution is m occur with '1/. The 
nonterminal names in the logical form grammar are mnemonic for 
Formula, Relation (or function) symbol, Term, and Quantifier. 
254 2 
F 
George VP ADVP violently' T ~ ~ 
hates N,,....._ /cooked" broccoli' 
I 
cooked broccoli 
Figure 2: Derived tree pair for "George hates cooked broccoli violently." 
the node serving as the adjunction site, it will connect 
to the top of the root node of the adjoined auxiliary nee 
after the adjunction has been performed; conversely, if 
it is connected initially to the bottom of the node, it will 
connect to the bottom of the foot node of the auxiliary 
tree. In all of the examples in this paper, the links may 
be thought of as connecting to the tops of nodes. The 
issue has important ramifications. For instance, the link 
updating process allows for different derivations of a 
single derivation in the source language to correspond 
to derivations of different derivations in the "target lan~ 
guage; that is, derivation order in synchronous TAGs 
is in this respect crucial, unlike in the base TAG for- 
malisms. We rely on this property in the analysis of 
quantifier scope in Section 4.2. 
3 Why Synchronous TAGs? 
We turn to the question of why, in augmenting TAGs 
for the purposes of encoding semantic information, it 
is preferable to use the synchronous TAG method over 
more conventional methods, such as semantic rules in- 
volving logical operations (as in Montague grammar 
or generalized phrase-structure grammar) or complex- 
feature-structure encodings (as in unification-based or 
logic grammar formalisms), 
First, the arguments for factoring recursion and depen- 
dencies as TAGs do for the syntax of natural language 
have their counterparts in the semantics. The structure of 
TAGs allows syntactic dependencies--agreement, sub- 
categorization, and so forth--to be localized in the prim- 
itives of a grammar, the elementary trees. This is most 
dramatically evident in the case of long-distance depen- 
dencies, such as that between a wh-phrase and its as- 
sociated gap. Similarly, using TAGs to construct logi- 
cal forms allows the localization of semantic dependen- 
cies in the logical forms of natural language expressions, 
dependencies such as the signature requirements (argu- 
ment type and arity) of function and relation symbols, 
and even the long-distance dependencies between a wh- 
quantifier and its associated bound variable. With other 
methods of semantics, these dependencies cannot be lo- 
calized; the semantic aspects of filler-gap dependencies 
must be passed among the features of various nodes in a 
parse tree or otherwise distributed over the entire deriva- 
tion. 
Second, the use of the synchronous TAG augmenta- 
tion allows ,an even more radical reduction in the role 
of features in a TAG grammar. Because of the extended 
domain of locality that TAGs possess, the role of features 
and unification is reduced from its role in context-free 
based systems. Only finite-valued features are needed, 
with the possible exception of a feature whose value 
encodes an expression's logical form. In removing the 
conslz'uction of logical forms from the duties delegateA 
to features, we can maintain a strictly finiteovalued-- 
and therefore formally dispensable---feature system Ibr 
TAGs. 
As a side note, we mention a ramification of the syn- 
chronous TAG analysis concerning the claim of Ka- 
plan and Zaenen (1989) that the paths over which 
long-distance dependencies operate (in the f-structure 
of lexieal-functional grammatical theory) form a regu- 
lar language. Vijay-Shanker and Joshi (1989) provide 
an argument that this claim follows from several as- 
sumptions concerning how a feature system for TAGs 
might be constrained. Vijay-Shanker (personal commu- 
nication) has noted that by placing a simple assumption 
on the elementary trees in the logical form component 
of a synchronous TAG, the proof of this claim becomes 
immediate. Any TAG in which all foot nodes are im- 
mediate children of their associated root generates a tree 
path language that is regular. ~ Thus, a synchronous TAG 
(like the grammar presented in Figure 1) whose semantic 
component forms a TAG with this property necessarily 
obeys the regular language constraint on long-distance 
semantic dependencies. 
4 Applications 
To exemplify the formalism's utility, we briefly and in- 
formally describe its application to the semantics of id- 
ioms and quantifiers. A companion paper (Abeill6 et al., 
1990) uses a mapping between two TAGs for automatic 
translation between natural languages, and constitutes 
a further application of the synchronous TAG concept. 
5This is a folk theorem whose straighlforward proof is left as an 
exercise for the reader, 
3 255 
More expansive descriptions of these analyses will be 
forthcoming in joint work with Anne Abeilld (idioms 
and translation) and Anthony Kroch (quantifiers). 
4,1 Idioms 
Abeill6 and Schabes (1989) note that lexicalized TAGs 
are an appropriate representation language for idiomatic 
constructions, as their expanded domain of locality can 
account for many syntactic properties of idioms. It 
seems natural to generalize beyond syntax, as they do, 
to the claim that lexicalized 'FAGs allow one to deal 
with semantic noncompositionality. Their argument to 
this claim is based on an intuition that semantics de- 
pends on the TAG derivation structure, an intuition that 
synchronous TAGs makes precise. For example, the id- 
iomatic construction "kick the bucket" cashes out as the 
following tree pair, under its idiomatic interpretation: 
a3 d}e' $ 
whereas the literal usage of "kick" is associated with 
a tree pair similar to that of "hates" in Figure 1. Two 
derivations of the sentence "George kicked the bucket" 
are possible, each using a different one of these two 
elementary tree pairs, but both yielding identical de- 
rived constituency trees for the English. They will be 
associated, of course, with two different readings, cor- 
responding to the idiomatic (die'(yeorge')) and literal 
(kick'(george ~, bucket')) interpretations, respectively. 
All of the arguments for the TAG analysis of idioms 
and light verb constructions can then be maintained in 
a formalism that allows for semantics for them as well. 
In particular, 
• Discontinuous syntactic constituents can be seman- 
tic'ally localized. 
• Nonstandard long-distance dependencies are stat- 
able without resort to reanalysis. 
• Both frozen and flexible idioms can be easily char- 
acterized. 
4.2 Quantifiers 
In order to characterize quantifier scoping possibilities, 
we use a synchronous TAG whose base formalism is 
multi-component TAGs (Joshi, 1987), in which the prim- 
itive operation is incorporation (by multiple substitutions 
and adjunctions) of a set of elementary trees at once. In 
synchronous multi-component TAGs, the links between 
trees connect, in general, a set of nodes in one tree with 
a set in another. In particular, an NP will be linked both 
to a formula in the semantics (the quantifier's scope) and 
a term (the position bound by the quantifier). We will 
begin a derivation with just such a pair of elementat3, 
trees, depicted as at in Figure 3. 
To distinguish two separate links from a single link 
among several nodes, we use a coindexing--rather than 
graphical~-notation for links. Thus, the subject NP node 
on the left is linked with both the F and first T node 
on the right, as indicated by the boxed index 1. The 
inteqgretation of such "hyper-links" is that when a pair 
is chosen to operate at the link, it must have sets of the 
correct sizes as its left and right component (1 and 2 in 
the case at hmad) and the sets are simultaneously used 
at the various nodes as in a multi-component "lAG. For 
instance, a quantifiable noun will be paired with a set of 
two trees: 6 
politician R T x 
politician 
Applying the latter multi-component tree pair fll to the 
initial tree pair al, we derive the next stage in the deriva- 
tion o~2. We have highlighted the link being operated on 
at this and later steps by using thick lines for the index 
boxes of the selected link. 
The determiner can be introduced with the simple pair 
leading to the derivation step a3. Completing the deriva- 
tion using analogous elementary tree pairs, we might 
generate the final tree pair a4 of Figure 3. This final 
pairing associates the meaning 
By : vegetablc' (y).Vx : politician' ( z).hates' ( z, y) 
with the sentence "Every politician hates some veg- 
etable." It should be clear that in a structure such as this 
with multiple NPs, the order of substitution of NPs de- 
termines the relative scope of the quantifiers, although it 
has no effect whatsoever on the syntactic structure. De- 
veloping this line of reasoning has led to several detailed 
predictions of this analysis of quantifier scope, which is 
beyond this paper's purview. In summary, however, the 
analysis is slightly more restrictive than that of Hobbs 
and Shieber (1987), making predictions regarding the 
scope of topicalized or wh-moved constituents, relative 
scope of embedded quantifiers, and possibly even syn- 
tactic structure of complex NPs. 
5 Using Synchronous TAGs 
The synchronous TAG formalism is inherently nondirec- 
tional. Derivation is not defined in terms of constructing 
6The subscript x on certain nodes is the value of a feature on 
the nodes corresponding to the variable bound by the quantifier. The 
technique of using metavariables to encode object variables is familiar 
from the logic and unification-based grammar literatures, Variable 
renaming with respect to these variables proceeds as usual. 
256 4 
% 
I S 
V NP~\] 1 
hates 
NP 
V NP~ I 
politician hates 
mm F \ 
j--T'~.... 
NINF / 
IIiq~, ~x F 
R T x R T x NT, I" I 1 
politician' hates' 
% 
% 
( 
f 
S 
NP 
D N V NPDI I I t 
every politician hates 
S .--ti'--"----. 
NP 
D N V NP 
every politician hates D N 
I I a vegetable 
F F 
VR T x R T x rylT/ I I '~/ 
politician" hates' / 
\ 
9Y F F 
vegetable V ,R T x R T x T, / I I Y/ 
politician' hates" / 
Figure 3: Sample synchronous TAG derivation steps for "Every politician hates a vegetable." 
a tin'get expression from a source or vice versa. Thus, 
it can be used to characterize both of these mappings. 
Furthermore, the existence of a parsing algorithm for 
the base formalism of a synchronous TAG is a sufficient 
condition for interpreting a synchronous TAG grammar. 
Schabes and Joshi (1988) and Vijay-Shanker and Joshi 
(1985) provide parsing algorithms for TAGs that could 
serw:: to parse the base formalism of a synchronous TAG. 
Given such an algorithm, semantic interpretation can 
be performed by parsing the sentence according to the 
source grammar; the pairings then determine a deriva- 
tion in the target language for tile logical form. Gen- 
eration from a logical form proceeds by the converse 
process of parsing the logical form expression thereby 
determining the derivation for the natural language sen- 
fence. Machine translation proceeds akmg similar lines 
by mapping two 'FAGs directly (Abeill6 et al., 1990), 
In previous work, one of us noted that generation ac- 
cording to an augmented context-free grammar can be 
made more efficient by requiring the grammar to be se- mantically monotonic 
(Shieber, 1988); the derived se- 
mantics for an expression must include, in an appropri- 
ate sense, the semantic material of all its subconstituents. 
It is interesting to note that synchronous "FAGs are in- 
herently semantically monotonic. Furthermore, it is rea- 
sonable to require that the semantic component of a syn- 
chronous TAG t~ lexicalized (in the sense of Schabes et 
al. (1988)), allowing for more efficient parsing accord- 
ing to the semantic grammar and, consequenlly, more 
efficient generation. In the case of augmented context- 
free grammars, the semantic monotonicity requirement 
precludes "lexicalization" of the semantics. It is not 
possible to require nontrivial semantics to be associated 
with each lexical item. In summary, just as lexicaliza- 
lion of the syntactic grammar aids parsing (Schabes and 
Joshi, 1990), so lexicalization of the semantic gra.,nmz:r 
aids generation. 
Tile description of parsing and germration above rnay 
seem to imply that these processes cannot be pcrlormcd 
incrementally, that is, an entire source derivation must 
be recovered before the corresponding target derivation 
can be computed. The issue deserves clarification. 
In the case wltere the synchronous TAG is order- 
independent (that is, the order of derivation in one TAG 
does not effect the result in the other, as when no two 
links share an endpoint) there is a one-to-one mapping 
between the source and target derivation. When par- 
tial source derivations are recognized by the parser, the 
corresponding partial target derivation (for example se- 
mantic inteq)retation) can be incrementally compuled: 
as the input is read from left to right, interpretations 
of the partial target derivations corresponding to partial 
source derivations can be combined in one step to buikl 
a larger partial target derivation. 
5 257 
When the synchronous TAG is order-sensitive, how- 
ever, there may be a many-to-many correspondence be- 
tween source derivations and target derivations. This is 
the case, for instance, in a grammar in which alterna- 
tive quantifier scopings may be generated for a single 
sentence. In this case, it is unclear what should even be 
meant by incremental computation. For instance, mid- 
way in parsing a sentence, at a point at which a single 
quantified NP has been analyzed, the incremental inter- 
pretation could not possibly represent all possible scop- 
ings that that quantifier might end up taking, as it is not 
known what the quantifier might be required to scope 
with respect to. At the point in the parse where the 
scoping decision can be made, it is not clear whether an 
inerementality requirement would mean that the variant 
scopings must all be explicitly generated at that point, 
or only implicitly generable. 
With respect to synchronous TAGs, these considera- 
tions are reflected in choice of parsing algorithm. Ef- 
ficiency of parsing necessitates that only one canonical 
derivation (say leftmost or rightmost) need to be com- 
puted; all other derivations yield the same object. Stan- 
dard parsing algorithms for both TAGs and CFGs rely 
on this optimization. If incrementality requires that we 
generate explicit representations of all possible interpre- 
tations (i.e., target derivations) of the string seen so far, 
then this optimization cannot be used, and parsing will 
be highly inefficient. If the representation can be left im- 
plicit, the optimization can be maintained, but retrieval 
of explicit representations will be combinatorially more 
complex. 
6 Conclusion 
The use of tree-adjoining grammars for natural- 
language-processing tasks requh'es the ability to move 
beyond a characterization of syntactic structure, Syn- 
chronous TAGs provide a simple mechanism that can 
be used to graft such an ability onto a base TAG for- 
realism. 
Acknowledgements 
This research was partially funded by ARt grant 
DAAG29-84-K-0061, DARPA grant N00014-85-K0018, 
and NSF grant MCS-82-19196 at the University of Penn- 
sylvania. We are indebted to Aravind Joshi for his sup- 
port of this research and to Anne AbeiU6 and Anthony 
Kroch for their collaboration in the genesis of these 
ideas and their comments on earlier versions. K. Vijay- 
Shanker and Marilyn Walker also provided valuable 
comments. All remaining errors are the authors' re- 
sponsibility alone. 

Bibliography 

Anne Abeill6 and Yves Schabes. 1989. Parsing idioms 
in tree adjoining grammars. In Proceedings of the 
Fourth Conference of the European Chapter of the As- 
sociation for Computational Linguistics, Manchester, 
England. 

Anne Abeill6, Yves Schabes, and Aravind K. Joshi. 
1990. Using lexicalized tree adjoining grammm's 
for machine Ixanslation. To appear in the 13 ~h In- 
ternational Conference on Computational Linguistics 
(COLING'90). 

Jerry Hobbs and Stuart M. Shieber. 1987. An algo- 
rithm for generating quantifier scopings, Computa- 
tional Linguistics, 13 (1-2):47-63. 

Aravind K. Joshi and K. Vijay-Shanker. 1989. Un- 
bounded dependencies in tags and lfg: functional un- 
certainty is a corolary in tags. In Proceedings of 
the 27 th Meeting of the Association for Computational 
Linguistics, Vancouver. 

Aravind K. Joshi. 1987. An introduction to tree adjoin- 
ing grammars. In A. Manaster-Ramer, editor, Mathe- 
matics of Language. John Benjamins, Amsterdam. 

Ron Kaplan and Annie Zaenen. 1989. Long-distance 
dependencies as a case of functional uncertainty. In 
M. Baltin and A. Kroch, editors, Alternative Concep- 
tions of Phrase Structure. University of Chicago Press. 

Anthony Kroch and Aravind K. Joshi. 1985. The lin- 
guistic relevance of tree adjoining grammars. Techni- 
cal Report MS-CIS-85-18, Department of Computer 
and Information Science, University of Pennsylvania, 
April. 

Anthony Kroch. 1989. Asymmetries in long dis- 
tance extraction in a tag grammar. In M. Baltin and 
A. Kroch, editors, Alternative Conceptions of Phrase 
Structure, pages 66-98. University of Chicago Press. 

Yves Schabes and Aravind K. Joshi. 1988. An Earley- 
type parsing algorithm for tree adjoining grammars. 
In Proceedings of the 26 ~h Meeting of the Association 
for Computational Linguistics, Buffalo, June. 

Yves Schabes and Aravind K. Joshi. 1990. Parsing 
with lexicalized tree adjoining grammar. In Masaru 
Tomita, editor, Current Issues in Parsing Technolo- 
gies. Kluwer Accademic Publishers. 

Yves Schabes, Anne Abeill6, and Aravind K. Joshi. 
1988. Parsing strategies with 'lexiealized' grammars: 
Application to tree adjoining grammars. In Proceed- 
ings of the 12 th International Conference on Compu- 
tational Linguistics, Budapest, Hungary, August. 

Stuart M. Shieber. 1988. A uniform architecture for 
parsing and generation. In Proceedings of the 12 ~h 
International Conference on Computational Linguis- 
tics, Budapest, August. 

K. Vijay-Shanker and Aravind K. Joshi. 1985. Some 
computational properties of tree adjoining grammars. 
In 23 ~a Meeting of the Association for Computational 
Linguistics, pages 82-93, Chicago, Illinois, July. 
