D-Tree Grammars 
Owen Rambow 
CoGenTex, Inc. 
840 Hanshaw Road 
Ithaca, NY 14850 
owen@cogentex.com
K. Vijay-Shanker 
Department of Computer &
Information Science 
University of Delaware 
Newark, DE 19716 
vijay@udel.edu
David Weir 
School of Cognitive & 
Computing Sciences 
University of Sussex 
Brighton, BN1 9HQ, UK. 
david.weir@cogs.susx.ac.uk
Abstract 
DTG are designed to share some of the 
advantages of TAG while overcoming some 
of its limitations. DTG involve two com- 
position operations called subsertion and 
sister-adjunction. The most distinctive fea- 
ture of DTG is that, unlike TAG, there is 
complete uniformity in the way that the 
two DTG operations relate lexical items: 
subsertion always corresponds to comple- 
mentation and sister-adjunction to modi- 
fication. Furthermore, DTG, unlike TAG, 
can provide a uniform analysis for wh- 
movement in English and Kashmiri, des- 
pite the fact that the wh element in Kash- 
miri appears in sentence-second position, 
and not sentence-initial position as in Eng- 
lish. 
1 Introduction 
We define a new grammar formalism, called D-Tree 
Grammars (DTG), which arises from work on Tree- 
Adjoining Grammars (TAG) (Joshi et al., 1975). A 
salient feature of TAG is the extended domain of lo- 
cality it provides. Each elementary structure can 
be associated with a lexical item (as in Lexicalized 
TAG (LTAG) (Joshi & Schabes, 1991)). Properties
related to the lexical item (such as subcategoriza- 
tion, agreement, certain types of word order varia- 
tion) can be expressed within the elementary struc- 
ture (Kroch, 1987; Frank, 1992). In addition, TAG 
remain tractable, yet their generative capacity is suf- 
ficient to account for certain syntactic phenomena 
that, it has been argued, lie beyond Context-Free 
Grammars (CFG) (Shieber, 1985). TAG, however, has 
two limitations which provide the motivation for this 
work. The first problem (discussed in Section 1.1) 
is that the TAG operations of substitution and ad- 
junction do not map cleanly onto the relations of 
complementation and modification. A second pro- 
blem (discussed in Section 1.2) has to do with the 
inability of TAG to provide analyses for certain syn- 
tactic phenomena. In developing DTG we have tried 
to overcome these problems while remaining faith- 
ful to what we see as the key advantages of TAG (in 
particular, its enlarged domain of locality). In Sec- 
tion 1.3 we introduce some of the key features of 
DTG and explain how they are intended to address 
the problems that we have identified with TAG. 
1.1 Derivations and Dependencies
In LTAG, the operations of substitution and adjunc- 
tion relate two lexical items. It is therefore natural 
to interpret these operations as establishing a di- 
rect linguistic relation between the two lexical items, 
namely a relation of complementation (predicate- 
argument relation) or of modification. In purely 
CFG-based approaches, these relations are only im- 
plicit. However, they represent important linguistic 
intuition, they provide a uniform interface to semantics,
and they are, as Schabes & Shieber (1994)
argue, important in order to support statistical pa- 
rameters in stochastic frameworks and appropriate 
adjunction constraints in TAG. In many frameworks, 
complementation and modification are in fact made 
explicit: LFG (Bresnan & Kaplan, 1982) provides a 
separate functional (f-) structure, and dependency 
grammars (see e.g. Mel'čuk (1988)) use these notions
as the principal basis for syntactic representation.
We will follow the dependency literature
in referring to complementation and modification 
as syntactic dependency. As observed by Rambow 
and Joshi (1992), for TAG, the importance of the 
dependency structure means that not only the deri- 
ved phrase-structure tree is of interest, but also the 
operations by which we obtained it from elementary 
structures. This information is encoded in the deri- 
vation tree (Vijay-Shanker, 1987). 
However, as Vijay-Shanker (1992) observes, the 
TAG composition operations are not used uniformly: 
while substitution is used only to add a (nominal) 
complement, adjunction is used both for modifica- 
tion and (clausal) complementation. Clausal com- 
plementation could not be handled uniformly by 
substitution because of the existence of syntactic 
phenomena such as long-distance wh-movement in 
English. Furthermore, there is an inconsistency in 
the directionality of the operations used for complementation
in TAG: nominal complements are substituted into their
governing verb's tree, while the governing verb's tree is
adjoined into its own clausal complement. The fact that
adjunction and substitution are used in a linguistically
heterogeneous manner means that (standard) TAG derivation trees do
not provide a good representation of the dependen- 
cies between the words of the sentence, i.e., of the 
predicate-argument and modification structure. 
Figure 1: Derivation trees for (1): original definition (left); Schabes & Shieber definition (right)
For instance, English sentence (1) gets the derivation
structure shown on the left in Figure 1 (footnote 1).
(1) Small spicy hotdogs he claims Mary seems to adore 
When comparing this derivation structure to the 
dependency structure in Figure 2, the following pro- 
blems become apparent. First, both adjectives de- 
pend on hotdog, while in the derivation structure 
small is a daughter of spicy. In addition, seem de- 
pends on claim (as does its nominal argument, he), 
and adore depends on seem. In the derivation struc- 
ture, seem is a daughter of adore (the direction does 
not express the actual dependency), and claim is also 
a daughter of adore (though neither is an argument 
of the other). 
Figure 2: Dependency tree for (1)
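The mismatch between the two structures can be made concrete with a small sketch. It encodes each tree as a map from a lexeme to its parent. The parent links follow the text and our reading of Figures 1 (left) and 2; the names `derivation_parent`, `dependency_parent` and `mismatches` are illustrative and not part of any formalism.

```python
# Parent maps (child -> parent); None marks the root.

# Derivation tree for (1) under the original TAG definition (Figure 1, left).
# Assumption: links not spelled out in the text are read off the figure.
derivation_parent = {
    "adore": None,
    "Mary": "adore",
    "hotdog": "adore",
    "seem": "adore",
    "claim": "adore",
    "spicy": "hotdog",
    "small": "spicy",
    "he": "claim",
}

# Dependency tree for (1) (Figure 2).
dependency_parent = {
    "claim": None,
    "he": "claim",
    "seem": "claim",
    "adore": "seem",
    "Mary": "adore",
    "hotdog": "adore",
    "spicy": "hotdog",
    "small": "hotdog",
}


def mismatches(derivation, dependency):
    """Return the lexemes whose parent differs between the two trees."""
    return sorted(w for w in dependency if derivation.get(w) != dependency[w])


if __name__ == "__main__":
    for word in mismatches(derivation_parent, dependency_parent):
        print(word, derivation_parent[word], dependency_parent[word])
```

The lexemes reported are exactly those involved in the modification problem (small) and the directionality problem (adore, seem, claim).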
1 For clarity, we depart from standard TAG notational practice and annotate nodes with lexemes and arcs with
grammatical function.
Schabes & Shieber (1994) solve the first problem by distinguishing between the adjunction of modifiers
and of clausal complements. This gives us the
derivation structure shown on the right in Figure 1. 
While this might provide a satisfactory treatment of 
modification at the derivation level, there are now 
three types of operations (two adjunctions and sub- 
stitution) for two types of dependencies (arguments 
and modifiers), and the directionality problem for 
embedded clauses remains unsolved. 
In defining DTG we have attempted to resolve 
these problems with the use of a single operation 
(that we call subsertion) for handling all complementation
and a second operation (called sister-adjunction)
for modification. Before discussing
these operations further we consider a second pro- 
blem with TAG that has implications for the design 
of these new composition operations (in particular, 
subsertion). 
1.2 Problematic Constructions for TAG 
TAG cannot be used to provide suitable analyses 
for certain syntactic phenomena, including long-distance
scrambling in German (Becker et al., 1991),
Romance clitics (Bleam, 1994), wh-extraction out of
complex picture-NPs (Kroch, 1987), and Kashmiri 
wh-extraction (presented here). The problem in de- 
scribing these phenomena with TAG arises from the 
fact (observed by Vijay-Shanker (1992)) that adjoi- 
ning is an overly restricted way of combining structu- 
res. We illustrate the problem by considering Kash- 
miri wh-extraction, drawing on Bhatt (1994). Wh- 
extraction in Kashmiri proceeds as in English, ex- 
cept that the wh-word ends up in sentence-second 
position, with a topic from the matrix clause in 
sentence-initial position. This is illustrated in (2a) 
for a simple clause and in (2b) for a complex clause. 
(2) a. rameshan   kyaa     dyutnay tse
       Ramesh-ERG what-NOM gave    you-DAT
       What did you give Ramesh?
    b. rameshan   kyaa_i chu baasaan       [ ki   me    kor t_i ]
       Ramesh-ERG what   is  believe-NPerf   that I-ERG do
       What does Ramesh believe that I did?
Since the moved element does not appear in 
sentence-initial position, the TAG analysis of English 
wh-extraction of Kroch (1987; 1989) (in which the 
matrix clause is adjoined into the embedded clause) 
cannot be transferred, and in fact no linguistically 
plausible TAG analysis appears to be available. 
In the past, variants of TAG have been develo- 
ped to extend the range of possible analyses. In 
Multi-Component TAG (MCTAG) (Joshi, 1987), trees 
are grouped into sets which must be adjoined to- 
gether (multicomponent adjunction). However, MC- 
TAG lack expressive power since, while syntactic re- 
lations are invariably subject to c-command or do- 
minance constraints, there is no way to state that 
two trees from a set must be in a dominance rela- 
tion in the derived tree. MCTAG with Domination 
Links (MCTAG-DL) (Becker et al., 1991) are multi- 
component systems that allow for the expression of 
dominance constraints. However, MCTAG-DL share a 
further problem with MCTAG: the derivation struc- 
tures cannot be given a linguistically meaningful in- 
terpretation. Thus, they fail to address the first pro- 
blem we discussed (in Section 1.1). 
1.3 The DTG Approach 
Vijay-Shanker (1992) points out that the use of adjunction
for clausal complementation in TAG corresponds,
at the level of dependency structure, to substitution
at the foot node (footnote 2) of the adjoined tree.
However, adjunction (rather than substitution) is used
since, in general, the structure that is substituted 
may only form part of the clausal complement: the 
remaining substructure of the clausal complement 
appears above the root of the adjoined tree. Un- 
fortunately, as seen in the examples given in Sec- 
tion 1.2, there are cases where satisfactory analyses 
cannot be obtained with adjunction. In particular, 
using adjunction in this way cannot handle cases in 
which parts of the clausal complement are required 
to be placed within the structure of the adjoined 
tree. 
The DTG operation of subsertion is designed to 
overcome this limitation. Subsertion can be viewed 
as a generalization of adjunction in which com- 
ponents of the clausal complement (the subserted 
structure) which are not substituted can be inters- 
persed within the structure that is the site of the 
subsertion. Following earlier work (Becker et al.,
1991; Vijay-Shanker, 1992), DTG provide a mecha- 
nism involving the use of domination links (d-edges) 
that ensure that parts of the subserted structure 
that are not substituted dominate those parts that 
are. Furthermore, there is a need to constrain the 
way in which the non-substituted components can 
be interspersed (footnote 3). This is done either by using
appropriate feature constraints at nodes or by means
of subsertion-insertion constraints (see Section 2). 
2 In these cases the foot node is an argument node of the lexical anchor.
3 This was also observed by Rambow (1994a), where an integrity constraint (first defined for an ID/LP version
of TAG (Becker et al., 1991)) is defined for a MCTAG-DL version called V-TAG. However, this was found to be
insufficient for treating both long-distance scrambling and long-distance topicalization in German. V-TAG retains
adjoining (to handle topicalization) for this reason.
We end this section by briefly commenting on the
other DTG operation of sister-adjunction. In TAG,
modification is performed with adjunction of modifier
trees that have a highly constrained form. In
particular, the foot nodes of these trees are always
daughters of the root and either the leftmost or
rightmost frontier nodes. The effect of adjoining a
tree of this form corresponds (almost) exactly to the
addition of a new (leftmost or rightmost) subtree 
below the node that was the site of the adjunction. 
For this reason, we have equipped DTG with an ope- 
ration (sister-adjunction) that does exactly this and 
nothing more. From the definition of DTG in Sec- 
tion 2 it can be seen that the essential aspects of 
Schabes & Shieber's (1994) treatment of modification,
including multiple modifications of a phrase,
can be captured by using this operation (footnote 4).
After defining DTG in Section 2, we discuss, in 
Section 3, DTG analyses for the English and Kash- 
miri data presented in this section. Section 4 briefly 
discusses DTG recognition algorithms. 
2 Definition of D-Tree Grammars 
A d-tree is a tree with two types of edges: domi- 
nation edges (d-edges) and immediate domination 
edges (i-edges). D-edges and i-edges express domi- 
nation and immediate domination relations between 
nodes. These relations are never rescinded when d- 
trees are composed. Thus, nodes separated by an 
i-edge will remain in a mother-daughter relationship 
throughout the derivation, whereas nodes separated 
by a d-edge can be equated or have a path of any
length inserted between them during a derivation. 
D-edges and i-edges are not distributed arbitrarily 
in d-trees. For each internal node, either all of its 
daughters are linked by i-edges or it has a single 
daughter that is linked to it by a d-edge. Each node 
is labelled with a terminal symbol, a nonterminal 
symbol or the empty string. A d-tree containing n 
d-edges can be decomposed into n + 1 components 
containing only i-edges. 
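As an illustration of these conditions, the following sketch (not part of the formal definition) stores the edge type on the mother node, which is possible because all edges leaving one node are of the same type. The class `Node` and the functions `well_formed` and `count_components` are names we introduce here, and the example tree is a toy d-tree loosely modelled on the claims d-tree of Section 3.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    # "i": immediate-domination edges to all daughters;
    # "d": a single domination edge to one daughter; None for a leaf.
    edge_type: Optional[str] = None
    daughters: List["Node"] = field(default_factory=list)


def well_formed(node: Node) -> bool:
    """Check the edge-distribution condition at every internal node."""
    if not node.daughters:
        return True
    if node.edge_type not in ("i", "d"):
        return False
    if node.edge_type == "d" and len(node.daughters) != 1:
        return False
    return all(well_formed(d) for d in node.daughters)


def count_d_edges(node: Node) -> int:
    own = 1 if node.daughters and node.edge_type == "d" else 0
    return own + sum(count_d_edges(d) for d in node.daughters)


def count_components(root: Node) -> int:
    """A d-tree with n d-edges decomposes into n + 1 components."""
    return count_d_edges(root) + 1


# Toy d-tree with two d-edges (S over S, VP over VP).
claims = Node("S'", "i", [
    Node("NP"),
    Node("S", "d", [
        Node("S", "i", [
            Node("NP"),
            Node("VP", "d", [
                Node("VP", "i", [Node("V", "i", [Node("claims")]), Node("S")]),
            ]),
        ]),
    ]),
])

if __name__ == "__main__":
    print(well_formed(claims), count_components(claims))
```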
4 Santorini and Mahootian (1995) provide additional evidence against the standard TAG approach to modification
from code switching data, which can be accounted for by using sister-adjunction.
D-trees can be composed using two operations: subsertion and sister-adjunction. When a d-tree
$\alpha$ is subserted into another d-tree $\beta$, a component of $\alpha$ is substituted at a frontier
nonterminal node (a substitution node) of $\beta$ and all components of $\alpha$ that are above the
substituted component are inserted into d-edges above the substituted node or placed above the root
node. For example, consider the d-trees $\alpha$ and $\beta$ shown in Figure 3. Note that components
are shown as triangles. In the composed d-tree $\gamma$ the component $\alpha(5)$ is substituted at
a substitution node in $\beta$. The components $\alpha(1)$, $\alpha(2)$, and $\alpha(4)$ of $\alpha$
above $\alpha(5)$ drift up the path in $\beta$ which runs from the substitution node. These
components are then inserted into d-edges in $\beta$ or above the root of $\beta$. In general, when a
component $\alpha(i)$ of some d-tree $\alpha$ is inserted into a d-edge between nodes $\eta_1$ and
$\eta_2$ two new d-edges are created, the first of which relates $\eta_1$ and the root node of
$\alpha(i)$, and the second of which relates the frontier node of $\alpha(i)$ that dominates the
substituted component to $\eta_2$. It is possible for components above the substituted node to drift
arbitrarily far up the d-tree and distribute themselves within domination edges, or above the root,
in any way that is compatible with the domination relationships present in the substituted d-tree.
DTG provide a mechanism called subsertion-insertion constraints to control what
can appear within d-edges (see below).
Figure 3: Subsertion
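The effect of inserting a single component into a d-edge can be sketched as follows. We represent d-edges simply as pairs of node names and assume that the original d-edge is replaced by the two new ones; the function `insert_component` and all node names are hypothetical.

```python
def insert_component(d_edges, edge, comp_root, comp_frontier):
    """Insert a component into the d-edge `edge` = (eta1, eta2).

    The d-edge is replaced by two new d-edges: one from eta1 to the root of
    the component, and one from the component's frontier node (the one that
    dominates the substituted component) to eta2.
    """
    if edge not in d_edges:
        raise ValueError(f"{edge} is not a d-edge")
    eta1, eta2 = edge
    result = [e for e in d_edges if e != edge]
    result.append((eta1, comp_root))
    result.append((comp_frontier, eta2))
    return result


if __name__ == "__main__":
    edges = [("eta1", "eta2")]
    print(insert_component(edges, ("eta1", "eta2"), "root_of_comp", "frontier_of_comp"))
```

Each insertion therefore increases the number of d-edges by one, so further components can later be inserted above or below the one just placed.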
The second composition operation involving d-trees is called sister-adjunction. When a d-tree
$\alpha$ is sister-adjoined at a node $\eta$ in a d-tree $\beta$ the composed d-tree $\gamma$ results
from the addition to $\beta$ of $\alpha$ as a new leftmost or rightmost sub-d-tree below $\eta$.
Note that sister-adjunction involves the addition of exactly one new immediate domination edge and
that several sister-adjunctions can occur at the same node. Sister-adjoining constraints specify where
d-trees can be sister-adjoined and whether they will
be right- or left-sister-adjoined (see below).
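A minimal sketch of this operation, ignoring constraints, treats the daughters of the adjunction site as an ordered list. The function name `sister_adjoin` and the string encoding of subtrees are assumptions made for illustration.

```python
def sister_adjoin(daughters, subtree, direction):
    """Return a new daughter list with `subtree` added as the leftmost
    or rightmost daughter of the adjunction site."""
    if direction == "left":
        return [subtree] + list(daughters)
    if direction == "right":
        return list(daughters) + [subtree]
    raise ValueError("direction must be 'left' or 'right'")


if __name__ == "__main__":
    # Two left sister-adjunctions at the same N' node.
    n_bar = ["N(hotdogs)"]
    n_bar = sister_adjoin(n_bar, "AdjP(spicy)", "left")
    n_bar = sister_adjoin(n_bar, "AdjP(small)", "left")
    print(n_bar)
```

Because each call adds exactly one daughter, both adjectives end up as sisters below the same node rather than one being nested under the other.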
A DTG is a four tuple $G = (V_N, V_T, S, D)$ where $V_N$ and $V_T$ are the usual nonterminal and
terminal alphabets, $S \in V_N$ is a distinguished nonterminal and $D$ is a finite set of elementary
d-trees. A DTG is said to be lexicalized if each d-tree in the grammar has at least one terminal node.
The elementary d-trees of a grammar $G$ have two additional annotations: subsertion-insertion
constraints and sister-adjoining constraints. These will be described below, but first we define
simultaneously DTG derivations and subsertion-adjoining trees (SA-trees), which are partial derivation
structures that can be interpreted as representing dependency information, the importance of which
was stressed in the introduction (footnote 5).
Consider a DTG $G = (V_N, V_T, S, D)$. In defining SA-trees, we assume some naming convention for
the elementary d-trees in $D$ and some consistent ordering on the components and nodes of elementary
d-trees in $D$. For each $i$, we define the set of d-trees $T_i(G)$ whose derivations are captured by
SA-trees of height $i$ or less. Let $T_0(G)$ be the set $D$ of elementary d-trees of $G$. Mark all of
the components of each d-tree in $T_0(G)$ as being substitutable (footnote 6). Only components marked
as substitutable can be substituted in a subsertion operation. The SA-tree for $\alpha \in T_0(G)$
consists of a single node labelled by the elementary d-tree name for $\alpha$.
5 Due to space limitations, in the following definitions we are forced to be somewhat imprecise when we
identify a node in a derived d-tree with the node in the elementary d-trees (elementary nodes) from which it
was derived. This is often done in TAG literature, and hopefully it will be clear what is intended.
6 We will discuss the notion of substitutability further in the next section. It is used to ensure the SA-tree
is a tree. That is, an elementary structure cannot be subserted into more than one structure since this would
be counter to our motivations for using subsertion for complementation.
For $i > 0$ let $T_i(G)$ be the union of the set $T_{i-1}(G)$ with the set of all d-trees $\gamma$ that
can be produced as follows. Let $\alpha \in D$ and let $\gamma$ be the result of subserting or
sister-adjoining the d-trees $\gamma_1, \ldots, \gamma_k$ into $\alpha$ where $\gamma_1, \ldots, \gamma_k$
are all in $T_{i-1}(G)$, with the subsertions taking place at different substitution nodes in $\alpha$.
Only substitutable components of $\gamma_1, \ldots, \gamma_k$ can be substituted in these subsertions.
Only the new components of $\gamma$ that came from $\alpha$ are marked as substitutable in $\gamma$.
Let $\tau_1, \ldots, \tau_k$ be the SA-trees for $\gamma_1, \ldots, \gamma_k$, respectively. The SA-tree
$\tau$ for $\gamma$ has root labelled by the name for $\alpha$ and $k$ subtrees $\tau_1, \ldots, \tau_k$.
The edge from the root of $\tau$ to the root of the subtree $\tau_i$ is labelled by $l_i$
($1 \le i \le k$) defined as follows. Suppose that $\gamma_i$ was subserted into $\alpha$ and the root
of $\tau_i$ is labelled by the name of some $\alpha' \in D$. Only components of $\alpha'$ will have been
marked as substitutable in $\gamma_i$. Thus, in this subsertion some component $\alpha'(j)$ will have
been substituted at a node in $\alpha$ with address $n$. In this case, the label $l_i$ is the pair
$(j, n)$. Alternatively, $\gamma_i$ will have been $d$-sister-adjoined at some node with address $n$
in $\alpha$, in which case $l_i$ will be the pair $(d, n)$ where
$d \in \{\text{left}, \text{right}\}$.
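The shape of an SA-tree can be sketched as a small data structure whose edges carry either a pair $(j, n)$ for a subsertion or a pair $(d, n)$ for a sister-adjunction. The class `SATree`, the string addresses and the component indices below are our own illustrative choices and are not fixed by the definition.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# (j, n): component j of the daughter's d-tree was substituted at address n.
# (d, n): the daughter was d-sister-adjoined at address n, d in {"left", "right"}.
Label = Tuple[Union[int, str], str]


@dataclass
class SATree:
    name: str  # name of an elementary d-tree
    children: List[Tuple[Label, "SATree"]] = field(default_factory=list)

    def attach(self, label: Label, child: "SATree") -> "SATree":
        self.children.append((label, child))
        return self

    def height(self) -> int:
        if not self.children:
            return 0
        return 1 + max(child.height() for _, child in self.children)


def is_subsertion(label: Label) -> bool:
    """Subsertion labels start with a component index, not a direction."""
    return isinstance(label[0], int)


# Hypothetical addresses ("N'", "NP-subj", "NP-obj") and component indices.
hotdogs = SATree("hotdogs")
hotdogs.attach(("left", "N'"), SATree("spicy"))
hotdogs.attach(("left", "N'"), SATree("small"))
adore = SATree("to_adore")
adore.attach((1, "NP-subj"), SATree("Mary"))
adore.attach((1, "NP-obj"), hotdogs)

if __name__ == "__main__":
    print(adore.height())
```

Read from mother to daughter, subsertion edges correspond to complementation and sister-adjunction edges to modification.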
The tree set $T(G)$ generated by $G$ is defined as the set of trees $\gamma$ such that:
$\gamma' \in T_i(G)$ for some $i \ge 0$; $\gamma'$ is rooted with the nonterminal $S$; the frontier of
$\gamma'$ is a string in $V_T^*$; and $\gamma$ results from the removal of all d-edges from $\gamma'$.
A d-edge is removed by merging the nodes at either end of the edge as long as they are labelled by the
same symbol. The string language $L(G)$ associated with $G$ is the set of terminal strings
appearing on the frontier of trees in $T(G)$.
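The removal of d-edges can be sketched as follows. As a simplifying assumption, labels are plain strings compared for equality (feature structures are ignored), and the edge type is stored on the mother node; `Node`, `remove_d_edges`, `has_d_edge` and `frontier` are names introduced only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    edge_type: Optional[str] = None  # "i", "d", or None for a leaf
    daughters: List["Node"] = field(default_factory=list)


def remove_d_edges(node: Node) -> Node:
    """Merge the two ends of every d-edge; fail if their labels differ."""
    while node.daughters and node.edge_type == "d":
        lower = node.daughters[0]
        if lower.label != node.label:
            raise ValueError(f"cannot merge {node.label} with {lower.label}")
        # The merged node takes over the lower node's daughters and edge type.
        node = Node(node.label, lower.edge_type, lower.daughters)
    return Node(node.label, node.edge_type,
                [remove_d_edges(d) for d in node.daughters])


def has_d_edge(node: Node) -> bool:
    here = bool(node.daughters) and node.edge_type == "d"
    return here or any(has_d_edge(d) for d in node.daughters)


def frontier(node: Node) -> List[str]:
    """Left-to-right list of leaf labels."""
    if not node.daughters:
        return [node.label]
    return [x for d in node.daughters for x in frontier(d)]


# Toy derived d-tree with two remaining d-edges.
derived = Node("S", "d", [
    Node("S", "i", [
        Node("NP", "i", [Node("he")]),
        Node("VP", "d", [Node("VP", "i", [Node("V", "i", [Node("sleeps")])])]),
    ]),
])

if __name__ == "__main__":
    tree = remove_d_edges(derived)
    print(has_d_edge(tree), frontier(tree))
```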
We have given a reasonably precise definition of SA-trees since they play such an important role in
the motivation for this work. We now describe informally a structure that can be used to encode a DTG
derivation. A derivation graph for $\gamma \in T(G)$ results from the addition of insertion edges to a
SA-tree $\tau$ for $\gamma$. The location in $\gamma$ of an inserted elementary component $\alpha(i)$
can be unambiguously determined by identifying the source of the node (say the node with address $n$
in the elementary d-tree $\alpha'$) with which the root of this occurrence of $\alpha(i)$ is merged
when d-edges are removed. The insertion edge will relate the two (not necessarily distinct) nodes
corresponding to appropriate occurrences of $\alpha$ and $\alpha'$ and will be labelled by the pair
$(i, n)$.
Each d-edge in elementary d-trees has an associated subsertion-insertion constraint (SIC). A SIC is a
finite set of elementary node addresses (ENAs). An ENA $\eta$ specifies some elementary d-tree
$\alpha \in D$, a component of $\alpha$ and the address of a node within that component of $\alpha$.
If an ENA $\eta$ is in the SIC associated with a d-edge between $\eta_1$ and $\eta_2$ in an elementary
d-tree $\alpha$ then $\eta$ cannot appear properly within the path that appears from $\eta_1$ to
$\eta_2$ in the derived tree $\gamma \in T(G)$.
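Checking a SIC against a path in the derived tree can be sketched as below. We assume that "properly within" excludes the two end nodes of the path, and we encode an ENA as a triple; the function `violates_sic` and the concrete ENAs are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# An ENA: (elementary d-tree name, component index, node address in the component).
ENA = Tuple[str, int, str]


def violates_sic(path: List[ENA], sic: FrozenSet[ENA]) -> bool:
    """`path` lists the ENAs of the nodes from eta1 down to eta2.

    Only nodes strictly between the two end points count as lying
    properly within the path.
    """
    return any(ena in sic for ena in path[1:-1])


if __name__ == "__main__":
    sic = frozenset({("claims", 1, "S'")})
    stretched = [("seems", 1, "S"), ("claims", 1, "S'"), ("seems", 2, "VP")]
    print(violates_sic(stretched, sic))
```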
Each node of elementary d-trees has an associated sister-adjunction constraint (SAC). A SAC is a
finite set of pairs, each pair identifying a direction (left or right) and an elementary d-tree. A SAC
gives a complete specification of what can be sister-adjoined at a node. If a node $\eta$ is associated
with a SAC containing a pair $(d, \alpha)$ then the d-tree $\alpha$ can be $d$-sister-adjoined at
$\eta$. By definition of sister-adjunction, all substitution nodes and all nodes at the top of d-edges
can be assumed to have SACs that are the empty set. This prevents sister-adjunction at
these nodes.
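Since a SAC is a complete specification, a lookup suffices to decide whether a sister-adjunction is licensed. In the sketch below the table `sac`, the node addresses and the function `can_sister_adjoin` are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# For each node address: the set of (direction, elementary d-tree name) pairs allowed.
SAC = Dict[str, Set[Tuple[str, str]]]


def can_sister_adjoin(sac: SAC, node: str, direction: str, tree: str) -> bool:
    """A missing or empty SAC forbids every sister-adjunction at the node."""
    return (direction, tree) in sac.get(node, set())


sac: SAC = {
    "N'": {("left", "small"), ("left", "spicy")},
    "NP-subst": set(),  # substitution node: empty SAC
}

if __name__ == "__main__":
    print(can_sister_adjoin(sac, "N'", "left", "spicy"))
    print(can_sister_adjoin(sac, "NP-subst", "left", "spicy"))
```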
In this section we have defined "raw" DTG. In a more refined version of the formalism we would
associate (a single) finite-valued feature structure with each node (footnote 7). It is a matter of
further research to determine to what extent SICs and SACs can be stated globally for a grammar,
rather than being attached to d-edges/nodes (footnote 8). See the next section for a brief
discussion of linguistic principles from which a grammar's SICs could be derived.
7 Trees used in Section 3 make use of such feature structures.
3 Linguistic Examples 
In this section, we show how an account for the data 
introduced in Section 1 can be given with DTG. 
3.1 Getting Dependencies Right: English 
Figure 4: D-trees for (1)
In Figure 4, we give a DTG that generates sent- 
ence (1). Every d-tree is a projection from a lexical 
anchor. The label of the maximal projection is, we 
assume, determined by the morphology of the an- 
chor. For example, if the anchor is a finite verb, it 
will project to S, indicating that an overt syntactic 
("surface") subject is required for agreement with 
it (and perhaps case-assignment). Furthermore, a 
finite verb may optionally also project to S' (as in 
the d-tree shown for claims), indicating that a wh- 
moved or topicalized element is required. The fi- 
nite verb seems also projects to S, even though it 
does not itself provide a functional subject. In the 
case of the to adore tree, the situation is the inverse:
the functional subject requires a finite verb
to agree with, which is signaled by the fact that its
component's root and frontier nodes are labelled S
and VP, respectively, but the verb itself is not finite
and therefore only projects to VP[-fin]. Therefore,
the subject will have to raise out of its clause for
agreement and case assignment. The direct object
of to adore has wh-moved out of the projection of
the verb (we include a trace for the sake of clarity).
8 In this context, it might be beneficial to consider the expression of a feature-based lexicalist theory such
as HPSG in DTG, similar to the compilation of HPSG to TAG (Kasper et al., 1995).
Figure 5: Derived tree for (1)
We add SICs to ensure that the projections are
respected by components of other d-trees that may
be inserted during a derivation. A SIC is associated
with the d-edge between the VP and S nodes in the
seems d-tree to ensure that no node labelled S' can
be inserted within it, i.e., it cannot be filled
with a wh-moved element. In contrast, since both
the subject and the object of to adore have been
moved out of the projection of the verb, the paths to
these arguments do not carry any SIC at all (footnote 9).
9 We enforce island effects for wh-movement by using a [+extract] feature on substitution nodes. This
corresponds roughly to the analysis in TAG, where islandhood is (to a large extent) enforced by designating a
particular node as the foot node (Kroch & Joshi, 1986).
We now discuss a possible derivation. We start
out with the most deeply embedded clause, the adores
clause. Before subserting its nominal arguments,
we sister-adjoin the two adjectival trees to
the tree for hotdogs. This is handled by a SAC associated
with the N' node that allows all trees rooted
in AdjP to be left sister-adjoined. We then subsert
this structure and the subject into the to adore
d-tree. We subsert the resulting structure into the
seems clause by substituting its maximal projection
node, labelled VP[fin: -], at the VP[fin: -] frontier
node of seems, and by inserting the subject into the
d-edge of the seems tree. Now, only the S node of
the seems tree (which is its maximal projection) is
substitutable. Finally, we subsert this derived structure
into the claims d-tree by substituting the S node
of seems at the S complement node of claims, and
by inserting the object of adores (which has not yet
been used in the derivation) in the d-edge of the
claims d-tree above its S node. The derived tree is
shown in Figure 5. The SA-tree for this derivation
corresponds to the dependency tree given previously
in Figure 2.
Note that this is the only possible derivation invol- 
ving these three d-trees, modulo order of operations. 
To see this, consider the following putative alternate 
derivation. We first subsert the to adore d-tree into 
the seems tree as above, by substituting the anchor 
component at the substitution node of seems. We 
insert the subject component of to adore above the
anchor component of seems. We then subsert this 
derived structure into the claims tree by substitu- 
ting the root of the subject component of to adore 
at the S node of claims and by inserting the S node 
of the seems d-tree as well as the object component 
of the to adore d-tree in the S'/S d-edge of the claims 
d-tree. This last operation is shown in Figure 6. The 
resulting phrase structure tree would be the same as 
in the previously discussed derivation, but the deri- 
vation structure is linguistically meaningless, since 
to adore would have been subserted into both seems
and claims. However, this derivation is ruled out by 
the restriction that only substitutable components 
can be substituted: the subject component of the 
adore d-tree is not substitutable after subsertion into 
the seems d-tree, and therefore it cannot be substi- 
tuted into the claims d-tree. 
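The bookkeeping behind this restriction can be sketched in a few lines. The sketch tracks only which components are marked substitutable and ignores tree structure, insertion and constraints; the class `Derived`, the functions `elementary` and `subsert`, and the component numbering are hypothetical.

```python
from dataclasses import dataclass
from typing import Set, Tuple

Component = Tuple[str, int]  # (elementary d-tree name, component index)


@dataclass
class Derived:
    components: Set[Component]
    substitutable: Set[Component]


def elementary(name: str, n_components: int) -> Derived:
    """All components of an elementary d-tree start out substitutable."""
    comps = {(name, i) for i in range(1, n_components + 1)}
    return Derived(set(comps), set(comps))


def subsert(arg: Derived, substituted: Component, host: Derived) -> Derived:
    """Subsert `arg` into the elementary d-tree `host` via `substituted`."""
    if substituted not in arg.substitutable:
        raise ValueError(f"{substituted} is not substitutable")
    # Only the components contributed by the host remain substitutable.
    return Derived(arg.components | host.components, set(host.substitutable))


if __name__ == "__main__":
    # Assumed numbering: to_adore has object (1), subject (2), anchor (3).
    step1 = subsert(elementary("to_adore", 3), ("to_adore", 3), elementary("seems", 2))
    try:
        subsert(step1, ("to_adore", 2), elementary("claims", 2))
    except ValueError as err:
        print("ruled out:", err)
```

Substituting a component of seems into claims remains possible after the first step, which is the well-formed derivation discussed above.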
Figure 6: An ill-formed derivation
In the above discussion, substitutability played a 
central role in ruling out the derivation. We observe 
in passing that the SIC associated with the d-edge in
the seems d-tree also rules out this derivation. The
derivation requires that the S node of seems be inserted
into the S'/S d-edge of claims. However, we
would have to stretch the edge over two components
which are both ruled out by the SIC, since they violate
the projection from seems to its S node. Thus,
the derivation is excluded by the independently motivated
SICs, which enforce the notion of projection.
This raises the possibility that, in grammars that ex- 
press certain linguistic principles, substitutability is 
not needed for ruling out derivations of this nature. 
We intend to examine this issue in future work. 
3.2 Getting Word Order Right: Kashmiri 
Figure 7: D-trees for (2b)
Figure 7 shows the matrix and embedded clauses 
for sentence (2b). We use the node label VP throug- 
hout and use features such as top (for topic) to diffe- 
rentiate different levels of projection. Observe that 
in both trees an argument has been fronted. Again, 
we will use the SICs to enforce the projection from a
lexical anchor to its maximal projection. Since the 
direct object of kor has wh-moved out of its clause, 
the d-edge connecting it to the maximal projection 
of its verb has no SIC. The d-edge connecting the 
maximal projection of baasaan to the Aux component,
however, has a SIC that allows only VP[wh: +, top: -]
nodes to be inserted.
Figure 8: Derived d-tree for (2b)
The derivation proceeds as follows. We first sub- 
sert the embedded clause tree into the matrix clause 
tree. After that, we subsert the nominal arguments 
and function words. The derived structure is shown 
in Figure 8. The associated SA-tree is the desired, 
semantically motivated, dependency structure: the 
embedded clause depends on the matrix clause. 
In this section, we have discussed examples where 
the elementary objects have been obtained by pro- 
jecting from lexical items. In these cases, we over- 
come both the problems with TAG considered in 
Section 1. The SICs considered here enforce the
same notion of projection that was used in obtaining
the elementary structures. This method of arriving
at SICs not only generalizes for the English
and Kashmiri examples but also appears to apply to 
the case of long-distance scrambling and topicaliza- 
tion in German. 
4 Recognition 
It is straightforward to adapt the polynomial-time
CKY-style recognition algorithm for a lexicalized
UVG-DL of Rambow (1994b) for DTG. The entries
in this array recording derivations of substrings of
input contain a set of elementary nodes along with a
multi-set of components that must be inserted above
during bottom-up recognition. These components
are added or removed at substitution and insertion.
The algorithm simulates traversal of a derived tree;
checking for SICs and SACs can be done easily. Because
of lexicalization, the size of these multi-sets is
polynomially bounded, from which the polynomial
time and space complexity of the algorithm follows.
For practical purposes, especially for lexicalized 
grammars, it is preferable to incorporate some ele- 
ment of prediction. We are developing a polynomial- 
time Earley style parsing algorithm. The parser re- 
turns a parse forest encoding all parses for an input 
string. The performance of this parser is sensitive to 
the grammar and input. Indeed it appears that for 
grammars that lexicalize CFG and for English grammar
(where the structures are similar to the LTAG
developed at University of Pennsylvania (XTAG Re- 
search Group, 1995)) we obtain cubic-time comple- 
xity. 
5 Conclusion 
DTG, like other formalisms in the TAG family, is lexi- 
calizable, but in addition, its derivations are them- 
selves linguistically meaningful. In future work we 
intend to examine additional linguistic data, refining 
aspects of our definition as needed. We will also 
study the formal properties of DTG, and complete 
the design of the Earley style parser. 
Acknowledgements 
We would like to thank Rakesh Bhatt for help with 
the Kashmiri data. We are also grateful to Tilman 
Becker, Gerald Gazdar, Aravind Joshi, Bob Kasper, 
Bill Keller, Tony Kroch, Klaus Netter and the ACL-95
referees. Rambow was supported by the North
Atlantic Treaty Organization under a Grant awarded
in 1993, while at TALANA, Université Paris 7.

References 
T. Becker, A. Joshi, & O. Rambow. 1991. Long distance scrambling and tree adjoining grammars. In EACL-91, 21-26.
R. Bhatt. 1994. Word order and case in Kashmiri. Ph.D. thesis, Univ. Illinois.
T. Bleam. 1994. Clitic climbing in Spanish: a GB perspective. In TAG+ Workshop, Tech. Rep. TALANA-RT-94-01, Université Paris 7, 16-19.
J. Bresnan & R. Kaplan. 1982. Lexical-functional grammar: A formal system for grammatical representation. In J. Bresnan, ed., The Mental Representation of Grammatical Relations. MIT Press.
R. Frank. 1992. Syntactic Locality and Tree Adjoining Grammar: Grammatical, Acquisition and Processing Perspectives. Ph.D. thesis, Dept. Comp. & Inf. Sc., Univ. Pennsylvania.
A. Joshi. 1987. An introduction to tree adjoining grammars. In A. Manaster-Ramer, ed., Mathematics of Language, 87-114.
A. Joshi, L. Levy, & M. Takahashi. 1975. Tree adjunct grammars. J. Comput. Syst. Sci., 10(1):136-163.
A. Joshi & Y. Schabes. 1991. Tree-adjoining grammars and lexicalized grammars. In M. Nivat & A. Podelski, eds., Definability and Recognizability of Sets of Trees.
R. Kasper, E. Kiefer, K. Netter, & K. Vijay-Shanker. 1995. Compilation of HPSG to TAG. In ACL-95.
A. Kroch. 1987. Subjacency in a tree adjoining grammar. In A. Manaster-Ramer, ed., Mathematics of Language, 143-172.
A. Kroch. 1989. Asymmetries in long distance extraction in a Tree Adjoining Grammar. In Mark Baltin & Anthony Kroch, editors, Alternative Conceptions of Phrase Structure, 66-98.
A. Kroch & A. Joshi. 1986. Analyzing extraposition in a tree adjoining grammar. In G. Huck & A. Ojeda, eds., Syntax & Semantics: Discontinuous Constituents, 107-149.
I. Mel'čuk. 1988. Dependency Syntax: Theory and Practice.
O. Rambow. 1994a. Formal and Computational Aspects of Natural Language Syntax. Ph.D. thesis, Dept. Comput. & Inf. Sc., Univ. Pennsylvania.
O. Rambow. 1994b. Multiset-Valued Linear Index Grammars. In ACL-94, 263-270.
O. Rambow & A. Joshi. 1992. A formal look at dependency grammars and phrase-structure grammars, with special consideration of word-order phenomena. In Intern. Workshop on The Meaning-Text Theory, Darmstadt. Arbeitspapiere der GMD 671, 47-66.
B. Santorini & S. Mahootian. 1995. Codeswitching and the syntactic status of adnominal adjectives. Lingua, 95.
Y. Schabes & S. Shieber. 1994. An alternative conception of tree-adjoining derivation. Comput. Ling., 20(1):91-124.
S. Shieber. 1985. Evidence against the context-freeness of natural language. Ling. & Phil., 8:333-343.
K. Vijay-Shanker. 1987. A Study of Tree Adjoining Grammars. Ph.D. thesis, Dept. Comput. & Inf. Sc., Univ. Pennsylvania.
K. Vijay-Shanker. 1992. Using descriptions of trees in a tree adjoining grammar. Comput. Ling., 18(4):481-517.
The XTAG Research Group. 1995. A lexicalized tree adjoining grammar for English. Tech. Rep. IRCS Report 95-03, Univ. Pennsylvania.
