SOME COMPUTATIONAL PROPERTIES
OF TREE ADJOINING GRAMMARS*
K. Vijay-Shankar and Aravind K. Joshi
Department of Computer and Information Science
Room 288 Moore School/D2
University of Pennsylvania
Philadelphia, PA 19104
ABSTRACT 
Tree Adjoining Grammar (TAG) is a formalism for natural
language grammars. Some of the basic notions of TAG's were
introduced in [Joshi, Levy, and Takahashi, 1975] and by [Joshi, 1983].
A detailed investigation of the linguistic relevance of TAG's has been
carried out in [Kroch and Joshi, 1985]. In this paper, we will describe
some new results for TAG's, especially in the following areas: (1)
parsing complexity of TAG's, (2) some closure results for TAG's, and
(3) the relationship to Head grammars.
1. INTRODUCTION 
Investigation of constrained grammatical systems from the
point of view of their linguistic adequacy and their computational
tractability has been a major concern of computational linguists for
the last several years. Generalized Phrase Structure grammars
(GPSG), Lexical Functional grammars (LFG), Phrase Linking
grammars (PLG), and Tree Adjoining grammars (TAG) are some
key examples of grammatical systems that have been and still
continue to be investigated along these lines.
Some of the basic notions of TAG's were introduced in [Joshi,
Levy, and Takahashi, 1975] and [Joshi, 1983]. Some preliminary
investigations of the linguistic relevance and some computational
properties were also carried out in [Joshi, 1983]. More recently, a
detailed investigation of the linguistic relevance of TAG's was
carried out by [Kroch and Joshi, 1985].
In this paper, we will describe some new results for TAG's,
especially in the following areas: (1) parsing complexity of TAG's, (2)
some closure results for TAG's, and (3) the relationship to Head
grammars. These topics will be covered in Sections 3, 4, and 5
respectively. In Section 2, we will give an introduction to TAG's. In
Section 6, we will state some properties not discussed here. A detailed
exposition of these results is given in [Vijay-Shankar and Joshi, 1985].
*This work was partially supported by NSF Grants MCS-8219116-CER,
MCS-82-07294. We want to thank Carl Pollard, Kelly Roach, David Searls and
David Weir. We have benefited enormously by valuable discussions with them.
2. TREE ADJOINING GRAMMARS--TAG's 
We now introduce tree adjoining grammars (TAG's). TAG's
are more powerful than CFG's, both weakly and strongly.(1) TAG's
were first introduced in [Joshi, Levy, and Takahashi, 1975] and
[Joshi, 1983]. We include their description in this section to make the
paper self-contained.
We can define a tree adjoining grammar as follows. A tree
adjoining grammar G is a pair (I,A) where I is a set of initial trees,
and A is a set of auxiliary trees.
A tree α is an initial tree if it is of the form

    α =       S
             / \
            /   \
           /     \
          /       \
         -----------
              w             w ∈ Σ*

That is, the root node of α is labelled S and the frontier nodes
are all terminal symbols. The internal nodes are all non-terminals.
A tree β is an auxiliary tree if it is of the form

    β =       X
             / \
            /   \
           /     \
          /       \
         -----X-----
          w1     w2         w1, w2 ∈ Σ*

That is, the root node of β is labelled with a non-terminal X
and the frontier nodes are all labelled with terminal symbols except
one which is labelled X. The node labelled by X on the frontier will
be called the foot node of β. The frontiers of initial trees belong to
Σ*, whereas the frontiers of the auxiliary trees belong to Σ* N Σ+ ∪
Σ+ N Σ*.
We will now define a composition operation called adjoining
(or adjunction), which composes an auxiliary tree β with a tree γ.
Let γ be a tree with a node n labelled X and let β be an auxiliary
tree with the root labelled with the same symbol X. (Note that β
must have, by definition, a node (and only one) labelled X on the
frontier.)
(1) Grammars G1 and G2 are weakly equivalent if the string language of G1,
L(G1), equals the string language of G2, L(G2). G1 and G2 are strongly equivalent
if they are weakly equivalent and for each w in L(G1) = L(G2), both G1 and G2
assign the same structural description to w. A grammar G is weakly adequate
for a (string) language L if L(G) = L. G is strongly adequate for L if L(G) = L
and for each w in L, G assigns an "appropriate" structural description to w. The
notion of strong adequacy is undoubtedly not precise because it depends on the
notion of appropriate structural descriptions.
Adjoining can now be defined as follows. If β is adjoined to γ
at the node n then the resulting tree γ' is as shown in Fig. 2.1
below.

    γ =                            β =
            S                              X
           / \                            / \
          /   \                          /   \
         /  X  \    <- node n           /     \
        /  / \  \                      ---X---
       /  / t \  \

    γ' =
            S
           / \
          /   \             γ without t
         /  X  \
        /  / \  \
       /  / β \  \
          --X--
           / \
          / t \

                 Figure 2.1
The tree t dominated by X in γ is excised, β is inserted at the
node n in γ and the tree t is attached to the foot node (labelled X) of
β, i.e., β is inserted or adjoined to the node n in γ pushing t
downwards. Note that adjoining is not a substitution operation.
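The excise-and-reattach step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the `is_foot` flag, and the helpers `adjoin` and `frontier` are names introduced here, not part of the paper, and the auxiliary tree used in the demonstration is a hypothetical one with frontier a S b.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    is_foot: bool = False  # True only for the foot node of an auxiliary tree


def copy_tree(node: Node) -> Node:
    """Return a fresh copy so an auxiliary tree can be adjoined repeatedly."""
    return Node(node.label, [copy_tree(c) for c in node.children], node.is_foot)


def find_foot(node: Node) -> Optional[Node]:
    """Depth-first search for the node flagged as the foot."""
    if node.is_foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(target: Node, aux_root: Node) -> None:
    """Adjoin a copy of the auxiliary tree at `target`, modifying it in place."""
    if target.label != aux_root.label:
        raise ValueError("root label of the auxiliary tree must match the node label")
    aux = copy_tree(aux_root)
    foot = find_foot(aux)
    if foot is None or foot.label != target.label:
        raise ValueError("auxiliary tree needs a foot node with the same label")
    foot.children = target.children   # the excised subtree t hangs below the foot
    foot.is_foot = target.is_foot     # stays a foot only if target itself was one
    target.children = aux.children    # the auxiliary tree takes the place of t
    target.is_foot = False


def frontier(node: Node) -> List[str]:
    """Left-to-right list of leaf labels."""
    if not node.children:
        return [node.label]
    return [sym for child in node.children for sym in frontier(child)]


# Hypothetical grammar fragment: initial tree S -> e, auxiliary tree S -> a S b.
beta = Node("S", [Node("a"), Node("S", is_foot=True), Node("b")])
gamma = Node("S", [Node("e")])

adjoin(gamma, beta)                 # adjoin at the root
first = "".join(frontier(gamma))
adjoin(gamma.children[1], beta)     # adjoin again at the inner S node
second = "".join(frontier(gamma))
print(first, second)
```

The second call shows why adjoining is not substitution: the subtree below the chosen node is pushed downwards under the foot rather than replaced.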
We will now define
T(G): The set of all trees derived in G starting from initial
trees in I. This set will be called the tree set of G.
L(G): The set of all terminal strings which appear in the
frontier of the trees in T(G). This set will be called the string
language (or language) of G. If L is the string language of a TAG G
then we say that L is a Tree-Adjoining Language (TAL). The
relationship between TAG's, context-free grammars, and the
corresponding string languages can be summarized as follows ([Joshi,
Levy, and Takahashi, 1975], [Joshi, 1983]).
Theorem 2.1: For every context-free grammar, G', there is an
equivalent TAG, G, both weakly and strongly.
Theorem 2.2: For every TAG, G, we have one of the following
situations:
a. L(G) is context-free and there is a context-free grammar
G' that is strongly (and therefore weakly) equivalent to
G.
b. L(G) is context-free and there is no context-free grammar
G' that is strongly equivalent to G. Of course, there must be a
context-free grammar that is weakly equivalent to G.
c. L(G) is strictly context-sensitive. Obviously in this case,
there is no context-free grammar that is weakly
equivalent to G.
Parts (a) and (c) of Theorem 2.2 appear in [Joshi, Levy, and
Takahashi, 1975]. Part (b) is implicit in that paper, but it is
important to state it explicitly as we have done here because of its
linguistic significance. Example 2.1 illustrates part (a). We will now
illustrate parts (b) and (c).
Example 2.2: Let G = (I,A) where

    I:  α1 =   S
               |
               e

    A:  β1 =   S              β2 =   T
              / \                   / \
             a   T                 a   S
                / \                   / \
               S   b                 T   b

Let us look at some derivations in G.

    γ0 = α1 =  S         γ1 =   S           γ2 =   S
               |               / \                / \
               e              a   T              a   T
                                 / \                / \
                                S   b              a   S
                                |                     / \
                                e                    T   b
                                                    / \
                                                   S   b
                                                   |
                                                   e

    γ1 = γ0 with β1 adjoined at the S node of γ0.
    γ2 = γ1 with β2 adjoined at the T node of γ1.
Clearly, L(G), the string language of G is
L = { a^n e b^n / n ≥ 0 }
which is a context-free language. Thus, there must exist a context-free
grammar, G', which is at least weakly equivalent to G. It can be
shown however that there is no context-free grammar G' which is
strongly equivalent to G, i.e., T(G) = T(G'). This follows from the
fact that the set T(G) (the tree set of G) is non-recognizable, i.e.,
there is no finite state bottom-up tree automaton that can recognize
precisely T(G). Thus a TAG may generate a context-free language,
yet assign structural descriptions to the strings that cannot be
assigned by any context-free grammar.
Example 2.3: Let G = (I,A) where

    I:  α1 =   S
               |
               e

    A:  β1 =   S               β2 =   T
              / \                    / \
             a   T                  a   S
                /|\                    /|\
               b S c                  b T c
The precise definition of L(G) is as follows:
L(G) = L1 = { w e c^n / n ≥ 0, w is a string of a's and b's such that
(1) the number of a's = the number of b's = n, and
(2) for any initial substring of w, the number
of a's ≥ the number of b's. }
L1 is a strictly context-sensitive language (i.e., a context-sensitive
language that is not context-free). This can be shown as
follows. Intersecting L1 with the regular language a* b* e c* results in
the language
L2 = { a^n b^n e c^n / n ≥ 0 } = L1 ∩ a* b* e c*
L2 is a well-known strictly context-sensitive language. The result
of intersecting a context-free language with a regular language is
always a context-free language; hence, L1 is not a context-free
language. It is thus a strictly context-sensitive language. Example
2.3 thus illustrates part (c) of Theorem 2.2.
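The two conditions defining L1, and the intersection with a* b* e c* used in the argument, can be checked mechanically. The following is a minimal sketch; the function names `in_L1` and `in_L2` are hypothetical, and condition (2) is read as "at least as many a's as b's in every prefix of w".

```python
import re


def in_L1(s: str) -> bool:
    """Membership test for L1 = { w e c^n } as defined above."""
    if s.count("e") != 1:
        return False
    w, tail = s.split("e")
    if set(w) - {"a", "b"} or set(tail) - {"c"}:
        return False
    n = len(tail)
    # Condition (1): the numbers of a's and b's in w both equal n.
    if w.count("a") != n or w.count("b") != n:
        return False
    # Condition (2): every prefix of w has at least as many a's as b's.
    balance = 0
    for ch in w:
        balance += 1 if ch == "a" else -1
        if balance < 0:
            return False
    return True


REGULAR = re.compile(r"a*b*ec*")  # the regular language a* b* e c*


def in_L2(s: str) -> bool:
    """Membership in L1 intersected with a* b* e c*, i.e. { a^n b^n e c^n }."""
    return REGULAR.fullmatch(s) is not None and in_L1(s)


print(in_L1("ababecc"), in_L2("ababecc"), in_L2("aabbecc"))
```

The string ababecc is in L1 because its a's and b's may be interleaved, but it falls outside the intersection, which keeps only strings of the form a^n b^n e c^n.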
TAG's have more power than CFG's. However, the extra
power is quite limited. The language L1 has equal number of a's, b's
and c's; however, the a's and b's are mixed in a certain way. The
language L2 is similar to L1, except that a's come before all b's.
TAG's as defined so far are not powerful enough to generate L2.
This can be seen as follows. Clearly, for any TAG for L2, each
initial tree must contain equal number of a's, b's and c's (including
zero), and each auxiliary tree must also contain equal number of a's,
b's and c's. Further in each case the a's must precede the b's. Then
it is easy to see from the grammar of Example 2.3, that it will not be
possible to avoid getting the a's and b's mixed. However, L2 can be
generated by a TAG with local constraints (see Section 2.1). The
so-called copy language
L = { w e w / w ∈ {a,b}* }
also cannot be generated by a TAG; however, again, it can be generated
by a TAG with local constraints. It is thus clear that TAG's can generate
more than context-free languages. It can be shown that TAG's cannot
generate all context-sensitive languages [Joshi, 1984].
Although TAG's are more powerful than CFG's, this extra
power is highly constrained and apparently it is just the right kind
for characterizing certain structural descriptions. TAG's share almost
all the formal properties of CFG's (more precisely, the corresponding
classes of languages), as we shall see in Section 4 of this paper and
[Vijay-Shankar and Joshi, 1985]. In addition, the string languages of
TAG's can also be parsed in polynomial time, in particular in O(n^6).
The parsing algorithm is described in detail in Section 3.
2.1. TAG's with Local Constraints on Adjoining
The adjoining operation as defined in Section 2 is "context-free".
An auxiliary tree, say,

              X
             / \
            /   \
           /     \
          ---X---

is adjoinable to a tree t at a node, say, n, if the label of that
node is X. Adjoining does not depend on the context (tree context)
around the node n. In this sense, adjoining is context-free.
In [Joshi, 1983], local constraints on adjoining similar to those
investigated by [Joshi and Levy, 1977] were considered. These are a
generalization of the context-sensitive constraints studied by [Peters
and Ritchie, 1969]. It was soon recognized, however, that the full
power of these constraints was never fully utilized, both in the
linguistic context as well as in the "formal languages" of TAG's.
The so-called proper analysis contexts and domination contexts (as
defined in [Joshi and Levy, 1977]) as used in [Joshi, 1983] always
turned out to be such that the context elements were always in a
specific elementary tree, i.e., they were further localized by being in
the same elementary tree. Based on this observation and a
suggestion in [Joshi, Levy and Takahashi, 1975], we will describe a
new way of introducing local constraints. This approach not only
captures the insight stated above, but it is truly in the spirit of
TAG's. The earlier approach was not so, although it was certainly
adequate for the investigation in [Joshi, 1983]. A precise
characterization of that approach still remains an open problem.
G = (I,A) is a TAG with local constraints if for each
elementary tree t ∈ I ∪ A, and for each node, n, in t, we specify the
set C of auxiliary trees that can be adjoined at the node n. Note
that if there is no constraint then all auxiliary trees are adjoinable at
n (of course, only those whose root has the same label as the label of
the node n). Thus, in general, C is a subset of the set of all the
auxiliary trees adjoinable at n.
We will adopt the following conventions.
1. Since, by definition, no auxiliary trees are adjoinable to a
node labelled by a terminal symbol, no constraint has to
be stated for a node labelled by a terminal.
2. If there is no constraint, i.e., all auxiliary trees (with the
appropriate root label) are adjoinable at a node, say, n,
then we will not state this explicitly.
3. If no auxiliary trees are adjoinable at a node n, then we
will write the constraint as (φ), where φ denotes the null
set.
4. We will also allow for the possibility that for a node at
least one adjoining is obligatory, of course, from the set
of all possible auxiliary trees adjoinable at that node.
Hence, a TAG with local constraints is defined as follows. G =
(I, A) is a TAG with local constraints if for each node, n, in each tree
t, we specify one (and only one) of the following constraints.
1. Selective Adjoining (SA): Only a specified subset of the
set of all auxiliary trees are adjoinable at n. SA is
written as (C), where C is a subset of the set of all
auxiliary trees adjoinable at n.
If C equals the set of all auxiliary trees adjoinable at n,
then we do not explicitly state this at the node n.
2. Null Adjoining (NA): No auxiliary tree is adjoinable at
the node n. NA will be written as (φ).
3. Obligatory Adjoining (OA): At least one (out of all the
auxiliary trees adjoinable at n) must be adjoined at n.
OA is written as (OA), or as O(C) where C is a subset of
the set of all auxiliary trees adjoinable at n.
Example 2.4: Let G = (I,A) be a TAG with local constraints where

    I:  α1 =        S (φ)
                   / \
           (β1) S     S (β2)
                |     |
                a     b

    A:  β1 =   S (β1)            β2 =        S (β2)
              / \                           / \
             a   S (φ)               (φ) S     b
In α1 no auxiliary trees can be adjoined to the root node. Only
β1 is adjoinable to the left S node at depth 1 and only β2 is
adjoinable to the right S node at depth 1. In β1 only β1 is adjoinable
at the root node and no auxiliary trees are adjoinable at the foot
node. Similarly for β2.
We must now modify our definition of adjoining to take care of
the local constraints. Given a tree γ with a node, say, n, labelled A
and given an auxiliary tree, say, β, with the root node labelled A, we
define adjoining as follows. β is adjoinable to γ at the node n if β ∈
C, where C is the constraint associated with the node n in γ. The
result of adjoining β to γ will be as defined earlier, except that the
constraint C associated with n will be replaced by C', the constraint
associated with the root node of β, and by C'', the constraint
associated with the foot node of β. Thus, given

    γ =                              β =
            S                               A (C')
           / \                             / \
          /   \                           /   \
         /  A (C)   <- node n            /     \
        /  / \  \                       ---A---
       /  /   \  \                        (C'')

The resultant tree γ' is

    γ' =
            S
           / \
          /   \
         /  A (C')
        /  / \  \
       /  /   \  \
          --A--
          (C'')
           / \
          /   \
We also adopt the convention that any derived tree with a node
which has an OA constraint associated with it will not be included in
the tree set associated with a TAG, G. The string language L of G is
then defined as the set of all terminal strings of all trees derived in G
(starting with initial trees) which have no OA constraints left in
them.
Example 2.5: Let G = (I,A) be a TAG with local constraints
where

    I:  α1 =   S
               |
               e

    A:  β =    S (φ)
              / \
             a   S
                /|\
               b | c
                 S (φ)

There are no constraints in α1. In β no auxiliary trees are adjoinable
at the root node and the foot node and for the center S node there
are no constraints.
Starting with α1 and adjoining β to α1 at the root node we
obtain

    γ =        S (φ)
              / \
             a   S
                /|\
               b | c
                 S (φ)
                 |
                 e

Adjoining β to the center S node (the only node at which
adjunction can be made) we have

    γ' =       S (φ)
              / \
             a   S (φ)         <- root of β
                / \
               a   S
                  /|\
                 b | c
                   S (φ)       <- foot of β
                  /|\
                 b | c
                   S (φ)
                   |
                   e
It is easy to see that G generates the string language
L = { a^n b^n e c^n / n ≥ 0 }
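The derivation in Example 2.5 can be replayed with a small sketch of adjoining under local constraints. The representation is an assumption made for illustration: `allowed` holds the set C of adjoinable tree names (`None` meaning no constraint, the empty set meaning NA), `obligatory` marks an OA node, and the names `adjoin`, `derive`, and `has_pending_oa` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional

NA: FrozenSet[str] = frozenset()  # null adjoining: no tree is adjoinable


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    is_foot: bool = False
    allowed: Optional[FrozenSet[str]] = None  # None = no constraint stated
    obligatory: bool = False                  # True = OA constraint


def copy_tree(n: Node) -> Node:
    return Node(n.label, [copy_tree(c) for c in n.children],
                n.is_foot, n.allowed, n.obligatory)


def find_foot(n: Node) -> Optional[Node]:
    if n.is_foot:
        return n
    for c in n.children:
        found = find_foot(c)
        if found is not None:
            return found
    return None


def adjoin(target: Node, name: str, aux_root: Node) -> None:
    """Adjoin the auxiliary tree called `name` at `target`, honouring C."""
    if target.label != aux_root.label:
        raise ValueError("label mismatch")
    if target.allowed is not None and name not in target.allowed:
        raise ValueError(f"{name} is not adjoinable at this node")
    aux = copy_tree(aux_root)
    foot = find_foot(aux)
    if foot is None:
        raise ValueError("auxiliary tree has no foot node")
    # The foot keeps its own constraint C'' and receives the excised subtree.
    foot.children, foot.is_foot = target.children, target.is_foot
    # The node now carries the root constraint C' of the auxiliary tree.
    target.children = aux.children
    target.allowed, target.obligatory = aux.allowed, aux.obligatory
    target.is_foot = False


def frontier(n: Node) -> List[str]:
    if not n.children:
        return [n.label]
    return [s for c in n.children for s in frontier(c)]


def has_pending_oa(n: Node) -> bool:
    """Trees with an OA constraint left are excluded from the tree set."""
    return n.obligatory or any(has_pending_oa(c) for c in n.children)


# The auxiliary tree of Example 2.5: NA on the root and on the foot.
beta = Node("S", [Node("a"),
                  Node("S", [Node("b"),
                             Node("S", is_foot=True, allowed=NA),
                             Node("c")])],
            allowed=NA)


def derive(n: int) -> str:
    gamma = Node("S", [Node("e")])    # the initial tree of Example 2.5
    target = gamma
    for _ in range(n):
        adjoin(target, "beta", beta)
        target = target.children[1]   # the unconstrained center S node
    return "".join(frontier(gamma))


print([derive(n) for n in range(3)])
```

Because the root and foot carry NA, each adjunction can only happen at the center S node, which is what forces all a's to precede all b's.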
Other languages such as L' = { a^(n^2) / n ≥ 1 }, L'' = { a^(2^n) / n ≥ 1 }
also cannot be generated by TAG's. This is because the strings of a
TAL grow linearly (for a detailed definition of the property called
"constant growth" property, see [Joshi, 1983]).
For those familiar with [Joshi, 1983], it is worth pointing out
that the SA constraint is only abbreviating, i.e., it does not affect the
power of TAG's. The NA and OA constraints however do affect the
power of TAG's. This way of looking at local constraints has not only
greatly simplified their statement, but it has also allowed us to
capture the insight that the 'locality' of the constraint is statable in
terms of the elementary trees themselves!
2.2. Simple Linguistic Examples
We now give a couple of linguistic examples. Readers may refer
to [Kroch and Joshi, 1985] for details.
1. Starting with γ1 = α1 which is an initial tree and then adjoining
β1 (with appropriate lexical insertions) at the indicated node in α1,
we obtain γ2.
[Figure: γ1 = α1, the initial tree for "The girl is a senior"; β1, an
auxiliary tree with root and foot labelled NP whose S subtree yields
"who met Bill"; γ2, the derived tree for "The girl who met Bill is a
senior".]
2. Starting with the initial tree γ1 = α2 and adjoining β2 at
the indicated node in α2, we obtain γ2.
[Figure: γ1 = α2, the initial tree for "PRO to invite Mary", whose root S
carries the constraint O(β2); β2, the auxiliary tree for "John persuaded
Bill S", whose foot node S carries the constraint (φ); γ2, the derived
tree for "John persuaded Bill PRO to invite Mary".]
Note that the initial tree α2 is not a matrix sentence. In order
for it to become a matrix sentence, it must undergo an adjunction at
its root node, for example, by the auxiliary tree β2 as shown above.
Thus, for α2 we will specify a local constraint O(β2) for the root
node, indicating that α2 requires for it to undergo an adjunction at
the root node by an auxiliary tree β2. In a fuller grammar there will
be, of course, some alternatives in the scope of O().
3. PARSING TREE-ADJOINING 
LANGUAGES 
3.1. Definitions
We will give a few additional definitions. These are not
necessary for defining derivations in a TAG as defined in Section 2.
However, they are introduced to help explain the parsing algorithm
and the proofs for some of the closure properties of TAL's.
DEFINITION 3.1 Let γ, γ' be two trees. We say γ |-- γ' if in γ we
adjoin an auxiliary tree to obtain γ'.
|--* is the reflexive, transitive closure of |--.
DEFINITION 3.2 γ' is called a derived tree if γ |--* γ' for some
elementary tree γ.
We then say γ' ∈ D(γ).
The frontier of any derived tree γ belongs to either Σ* N Σ+ ∪
Σ+ N Σ* if γ ∈ D(β) for some auxiliary tree β, or to Σ* if γ ∈ D(α)
for some initial tree α. Note if γ ∈ D(α) for some initial tree α, then
γ is also a sentential tree.
If β is an auxiliary tree, γ ∈ D(β) and the frontier of γ is w1 X
w2 (X is a nonterminal, w1, w2 ∈ Σ*) then the leaf node having this
non-terminal symbol X at the frontier is called the foot of γ.
Sometimes we will be loosely using the phrase "adjoining with
a derived tree" γ ∈ D(β) for some auxiliary tree β. What we mean is
that suppose we adjoin β at some node and then adjoin within β and
so on, we can derive the desired derived tree ∈ D(β) which uses the
same adjoining sequence and use this resulting tree to "adjoin" at
the original node.
3.2. The Parsing Algorithm
The algorithm we present here to parse Tree-Adjoining
Languages (TALs) is a modification of the CYK algorithm (which is
described in detail in [Aho and Ullman, 1973]), which uses a dynamic
programming technique to parse CFL's. For the sake of making our
description of the parsing algorithm simpler, we shall present the
algorithm for parsing without considering local constraints. We will
later show how to handle local constraints.
We shall assume that any node in the elementary trees in the
grammar has at most two children. This assumption can be made
without any loss of generality, because it can be easily shown that
for any TAG G there is an equivalent TAG G1 such that any node in
any elementary tree in G1 has at most two children. A similar
assumption is made in the CYK algorithm. We use the terms ancestor
and descendant throughout the paper as a transitive and reflexive
relation; for example, the foot node may be called the ancestor of the
foot node.
The algorithm works as follows. Let a1...an be the input to be
parsed. We use a four-dimensional array A; each element of the
array contains a subset of the nodes of derived trees. We say a node
X of a derived tree γ belongs to A[i,j,k,l] if X dominates a sub-tree of
γ whose frontier is given by either a(i+1)...a(j) Y a(k+1)...a(l) (where
the foot node of γ is labelled by Y) or a(i+1)...a(l) (i.e., j = k. This
corresponds to the case when γ is a sentential tree). The indices
(i,j,k,l) refer to the positions between the input symbols and range
over 0 through n. If i = 5 say, then it refers to the gap between a5
and a6.
Initially, we fill A[i,i+1,i+1,i+1] with those nodes in the
frontier of the elementary trees whose label is the same as the input
a(i+1) for 0 ≤ i ≤ n-1. The foot nodes of auxiliary trees will belong to
all A[i,i,j,j], such that i ≤ j.
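The index convention and the initialization just described can be illustrated with a short sketch. The chart is represented as a dictionary keyed by (i,j,k,l); node addresses are plain strings; the names `init_chart` and `covered`, and the sample addresses, are assumptions made for illustration. The sketch follows the prose and places foot nodes in every A[i,i,j,j] with 0 ≤ i ≤ j ≤ n.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int, int, int]


def init_chart(tokens: List[str],
               terminal_nodes: List[Tuple[str, str]],
               foot_nodes: List[str]) -> Dict[Cell, Set[str]]:
    """Fill the chart A with frontier terminals and foot nodes."""
    n = len(tokens)
    A: Dict[Cell, Set[str]] = defaultdict(set)
    # A[i,i+1,i+1,i+1] gets the frontier nodes labelled with input symbol a(i+1).
    for i in range(n):
        for address, label in terminal_nodes:
            if label == tokens[i]:
                A[(i, i + 1, i + 1, i + 1)].add(address)
    # A foot node spans nothing itself; the gap between i and j is its "hole".
    for i in range(n + 1):
        for j in range(i, n + 1):
            A[(i, i, j, j)].update(foot_nodes)
    return A


def covered(tokens: List[str], cell: Cell) -> Tuple[List[str], List[str]]:
    """Input symbols to the left and to the right of the foot for a cell."""
    i, j, k, l = cell
    return tokens[i:j], tokens[k:l]


tokens = list("aeb")
# Hypothetical node addresses of the form "tree.node".
terminals = [("beta1.a", "a"), ("alpha1.e", "e"), ("beta1.b", "b")]
A = init_chart(tokens, terminals, ["beta1.foot"])
print(sorted(A[(0, 1, 1, 1)]), covered(tokens, (0, 1, 2, 3)))
```

For the cell (0,1,2,3) the covered material is a on the left and b on the right, with the gap between positions 1 and 2 reserved for whatever is dominated by the foot node.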
We are now in a position to fill in all the elements of the array
A. There are five cases to be considered.
Case 1. We know that if a node X in a derived tree is the
ancestor of the foot node, and node Y is its right sibling, such that X
∈ A[i,j,k,l] and Y ∈ A[l,m,m,n], then their parent, say, Z should
belong to A[i,j,k,n], see Fig 3.1a.
Case 2. If the right sibling Y is the ancestor of the foot node
such that it belongs to A[l,m,n,p] and its left sibling X belongs to
A[i,j,j,l], then we know that the parent Z of X and Y belongs to
A[i,m,n,p], see Fig 3.1b.
Case 3. If neither X nor its right sibling Y are the ancestors of
the foot node (or there is no foot node) then if X ∈ A[i,j,j,l] and Y ∈
A[l,m,m,n] then their parent Z belongs to A[i,j,j,n].
Case 4. If a node Z has only one child X, and if X ∈ A[i,j,k,l],
then obviously Z ∈ A[i,j,k,l].
Case 5. If a node X ∈ A[i,j,k,l], and the root Y of a derived
tree γ having the same label as that of X, belongs to A[m,i,l,n], then
adjoining γ at X makes the resulting node to be in A[m,j,k,n], see Fig
3.1c.
[Figure 3.1: (a) Case 1, the left sibling X is the ancestor of the foot
node; (b) Case 2, the right sibling Y is the ancestor of the foot node;
(c) Case 5, a derived tree with root Y spanning positions m, i, l, n is
adjoined at a node X spanning positions i, j, k, l.]
Although we have stated that the elements of the array
contain a subset of the nodes of derived trees, what really goes in
there are the addresses of nodes in the elementary trees. Thus the
size of any set is bounded by a constant, determined by the
grammar. It is hoped that the presentation of the algorithm below
will make it clear why we do so.
3.3. The algorithm
The complete algorithm is given below.
Step 1   For i = 0 to n-1 step 1 do
Step 2     put all nodes in the frontier of elementary
           trees whose label is a(i+1) in A[i,i+1,i+1,i+1].
Step 3   For i = 0 to n-1 step 1 do
Step 4     For j = i to n-1 step 1 do
Step 5       put foot nodes of all auxiliary trees in
             A[i,i,j,j]
Step 6   For l = 0 to n step 1 do
Step 7     For i = l to 0 step -1 do
Step 8       For j = i to l step 1 do
Step 9         For k = l to j step -1 do
Step 10          do Case 1
Step 11          do Case 2
Step 12          do Case 3
Step 13          do Case 5
Step 14          do Case 4
Step 15  Accept if root of some initial tree ∈ A[0,j,j,n], 0 ≤ j ≤ n
where,
(a) Case 1 corresponds to the situation where the left sibling is the
ancestor of the foot node. The parent is put in A[i,j,k,l] if the left
sibling is in A[i,j,k,m] and the right sibling is in A[m,p,p,l], where k
≤ m < l, m ≤ p, p ≤ l. Therefore Case 1 is written as
    For m = k to l-1 step 1 do
      For p = m to l step 1 do
        if there is a left sibling in A[i,j,k,m] and the
        right sibling in A[m,p,p,l] satisfying appropriate
        restrictions then put their parent
        in A[i,j,k,l].
(b) Case 2 corresponds to the case where the right sibling is the
ancestor of the foot node. If the left sibling is in A[i,m,m,p] and the
right sibling is in A[p,j,k,l], i ≤ m < p and p ≤ j, then we put their
parent in A[i,j,k,l]. This may be written as
    For m = i to j-1 step 1 do
      For p = m+1 to j step 1 do
        for all left siblings in A[i,m,m,p] and right siblings
        in A[p,j,k,l] satisfying appropriate restrictions put
        their parents in A[i,j,k,l].
(c) Case 3 corresponds to the case where neither children are
ancestors of the foot node. If the left sibling ∈ A[i,j,j,m] and the right
sibling ∈ A[m,p,p,l] then we can put the parent in A[i,j,j,l] if it is the
case that (i < j ≤ m or i ≤ j < m) and (m < p ≤ l or m ≤ p < l).
This may be written as
    For m = j to l-1 step 1 do
      For p = j to l step 1 do
        for all left siblings in A[i,j,j,m] and
        right siblings in A[m,p,p,l] satisfying the appropriate
        restrictions put their parent in A[i,j,j,l].
(d) Case 5 corresponds to adjoining. If X is a node in A[m,j,k,p] and
Y is the root of an auxiliary tree with the same symbol as that of X, such
that Y is in A[i,m,p,l] ((i ≤ m ≤ p < l or i < m ≤ p ≤ l) and (m
< j ≤ k ≤ p or m ≤ j ≤ k < p)), then we put X in A[i,j,k,l]. This may
be written as
    For m = i to j step 1 do
      For p = m to l step 1 do
        if a node X ∈ A[m,j,k,p] and the root of an
        auxiliary tree is in A[i,m,p,l] then put X in A[i,j,k,l]
(e) Case 4 corresponds to the case where a node Y has only one child X.
If X ∈ A[i,j,k,l] then put Y in A[i,j,k,l]. Repeat Case 4 again if Y has
no siblings.
3.4. Complexity of the Algorithm
It is obvious that steps 10 through 15 (cases a-e) are completed
in O(n^2), because the different cases have at most two nested for
loop statements, the iterating variables taking values in the range 0
through n. They are repeated at most O(n^4) times, because of the
four loop statements in steps 6 through 9. The initialization phase
(steps 1 through 5) has a time complexity of O(n + n^2) = O(n^2).
Step 15 is completed in O(n). Therefore, the time complexity of the
parsing algorithm is O(n^6).
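The O(n^4) factor can be made concrete by counting how many index tuples the four loops of steps 6 through 9 visit. The sketch below reproduces only the loop bounds; the function name `count_cells` is hypothetical. Since the loops enumerate exactly the tuples with 0 ≤ i ≤ j ≤ k ≤ l ≤ n, the count equals the binomial coefficient C(n+4, 4), which grows as n^4/24.

```python
from math import comb


def count_cells(n: int) -> int:
    """Number of (i, j, k, l) tuples visited by steps 6 through 9."""
    count = 0
    for l in range(0, n + 1):              # step 6: l = 0 to n
        for i in range(l, -1, -1):         # step 7: i = l down to 0
            for j in range(i, l + 1):      # step 8: j = i to l
                for k in range(l, j - 1, -1):  # step 9: k = l down to j
                    count += 1
    return count


for n in (1, 5, 10):
    print(n, count_cells(n), comb(n + 4, 4))
```

Each visited tuple triggers the case bodies, which contain at most two further loops over 0 through n, giving the O(n^6) bound stated above.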
3.5. Correctness of the Algorithm
The main issue in proving the algorithm correct is to show
that while computing the contents of an element of the array A, we
must have already determined the contents of other elements of the
array needed to correctly complete this entry. We can show this
inductively by considering each case individually. We give an
informal argument below.
Case 1: We need to know the contents of A[i,j,k,m], A[m,p,p,l]
where m < l, i < m, when we are trying to compute the contents of
A[i,j,k,l]. Since l is the variable iterated in the outermost loop (step
6), we can assume (by induction hypothesis) that for all m < l and
for all p,q,r, the contents of A[p,q,r,m] are already computed. Hence,
the contents of A[i,j,k,m] are known. Similarly, for all m > i, and
for all p,q, and r ≤ l, A[m,p,q,r] would have been computed. Thus,
A[m,p,p,l] would also have been computed.
Case 2: By a similar reasoning, the contents of A[i,m,m,p] and
A[p,j,k,l] are known since p < l and p > i.
Case 3: When we are trying to compute the contents of some
A[i,j,j,l], we need to know the nodes in A[i,j,j,p] and A[p,q,q,l]. Note j
> i or j < l. Hence, we know that the contents of A[i,j,j,p] and
A[p,q,q,l] would have been computed already.
Case 5: The contents of A[i,m,p,l] and A[m,j,k,p] must be
known in order to compute A[i,j,k,l], where (i ≤ m ≤ p < l or i <
m ≤ p ≤ l) and (m ≤ j ≤ k < p or m < j ≤ k ≤ p). Since
either m > i or p < l, the contents of A[m,j,k,p] will be known.
Similarly, since either m < j or k < p, the contents of A[i,m,p,l]
would have been computed.
3.6. Parsing with Local Constraints
So far, we have assumed that the given grammar has no local
constraints. If the grammar has local constraints, it is easy to modify
the above algorithm to take care of them. Note that in Case 5, if an
adjunction occurs at a node X, we add X again to the element of the
array we are computing. This seems to be in contrast with our
definition of how to associate local constraints with the nodes in a
sentential tree. We should have added the root of the auxiliary tree
instead to the element of the array being computed, since so far as
the local constraints are concerned, this node decides the local
constraints at this node in the derived tree. However, this scheme
cannot be adopted in our algorithm for obvious reasons. We let pairs
of the form (X,C) belong to elements of the array, where X is as
before and C represents the local constraints to be associated with
this node.
We then alter the algorithm as follows. If (X,C1) refers to a
node at which we attempt to adjoin with an auxiliary tree (whose
root is denoted by (Y,C2)), then adjunction would be determined by C1.
If adjunction is allowed, then we can add (X,C2) in the corresponding
element of the array. In cases 1 through 4, we do not attempt to add
a new element if any one of the children has an obligatory
constraint.
Once it has been determined that the given string belongs to
the language, we can find the parse in a way similar to the scheme
adopted in the CYK algorithm. To make this process simpler and more
efficient, we can use pointers from the new element added to the
elements which caused it to be put there. For example, consider
Case 1 of the algorithm (step 10). If we add a node Z to A[i,j,k,l],
because of the presence of its children X and Y in A[i,j,k,m] and
A[m,p,p,l] respectively, then we add pointers from this node Z in
A[i,j,k,l] to the nodes X, Y in A[i,j,k,m] and A[m,p,p,l]. Once this has
been done, the parse can be found by traversing the tree formed by
these pointers.
A parser based on the techniques described above is currently
being implemented and will be reported at time of presentation.
4. CLOSURE PROPERTIES OF TAG's 
In this section, we present some closure results for TALs. We
now informally sketch the proofs for the closure properties.
Interested readers may refer to [Vijay-Shankar and Joshi, 1985] for
the complete proofs.
4.1. Closure under Union
Let G1 and G2 be two TAGs generating L1 and L2 respectively.
We can construct a TAG G such that L(G) = L1 ∪ L2.
Let G1 = (I1, A1, N1, S), and G2 = (I2, A2, N2, S).
Without loss of generality, we may assume that N1 ∩ N2 = φ.
Let G = (I1 ∪ I2, A1 ∪ A2, N1 ∪ N2, S). We claim that L(G) = L1
∪ L2.
Let x ∈ L1 ∪ L2. Then x ∈ L1 or x ∈ L2. If x ∈ L1, then it
must be possible to generate the string x in G, since I1, A1 are in
G. Hence x ∈ L(G). Similarly if x ∈ L2, we can show that x ∈ L(G).
Hence L1 ∪ L2 ⊆ L(G). If x ∈ L(G), then x is derived using either
only I1, A1 or only I2, A2 since N1 ∩ N2 = φ. Hence, x ∈ L1 or x ∈
L2. Thus, L(G) ⊆ L1 ∪ L2. Therefore, L(G) = L1 ∪ L2.
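The union construction amounts to taking componentwise unions of the two grammars. The sketch below is illustrative only: the `TAG` record, the `union` function, and the tree names are hypothetical, trees are represented just by their names, and the disjointness assumption on the non-terminals is read as applying to all symbols other than the shared start symbol S.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class TAG:
    initial: FrozenSet[str]       # names of initial trees (I)
    auxiliary: FrozenSet[str]     # names of auxiliary trees (A)
    nonterminals: FrozenSet[str]  # N
    start: str                    # S


def union(g1: TAG, g2: TAG) -> TAG:
    """Build G = (I1 ∪ I2, A1 ∪ A2, N1 ∪ N2, S)."""
    if g1.start != g2.start:
        raise ValueError("both grammars must use the same start symbol")
    shared = (g1.nonterminals & g2.nonterminals) - {g1.start}
    if shared:
        # Assumption: rename non-terminals beforehand so that this is empty.
        raise ValueError(f"non-terminals must be disjoint, found {sorted(shared)}")
    return TAG(g1.initial | g2.initial,
               g1.auxiliary | g2.auxiliary,
               g1.nonterminals | g2.nonterminals,
               g1.start)


g1 = TAG(frozenset({"alpha1"}), frozenset({"beta1"}), frozenset({"S", "T"}), "S")
g2 = TAG(frozenset({"alpha2"}), frozenset({"beta2"}), frozenset({"S", "U"}), "S")
g = union(g1, g2)
print(sorted(g.initial), sorted(g.auxiliary), sorted(g.nonterminals))
```

The disjointness check mirrors the step in the proof that a derivation in G uses trees from only one of the two grammars.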
4.2. Closure under Concatenation
Let G1 = (I1,A1,N1,S1), G2 = (I2,A2,N2,S2) be two TAGs
generating L1, L2 respectively, such that N1 ∩ N2 = φ. We can
construct a TAG G = (I, A, N, S) such that L(G) = L1 . L2. We
choose S such that S is not in N1 ∪ N2. We let N = N1 ∪ N2 ∪
{S}, A = A1 ∪ A2. For all t1 ∈ I1, t2 ∈ I2, we add t12 to I, as shown
in Fig 4.2.1. Therefore, I = { t12 / t1 ∈ I1, t2 ∈ I2 }, where the nodes
in the subtrees t1 and t2 of the tree t12 have the same constraints
associated with them as in the original grammars G1 and G2. It is
easy to show that L(G) = L1 . L2, once we note that there are no
auxiliary trees in G rooted with the symbol S, and that N1 ∩ N2 =
φ.
Figure 4.2.1: The initial tree t12 of G. Its root S has two children,
the roots S1 and S2 of the initial trees t1 and t2.
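A minimal Python sketch of how the new initial trees are formed, assuming trees are encoded as nested tuples `(label, children)`. The encoding and the names `frontier` and `concat_initial_trees` are hypothetical, and node constraints are left out.

```python
from itertools import product


def frontier(tree):
    """Concatenate the leaf labels of a (label, children) tree, left to right."""
    label, children = tree
    if not children:
        return label
    return "".join(frontier(child) for child in children)


def concat_initial_trees(i1, i2, start="S"):
    """Return I = { t12 : t1 in I1, t2 in I2 }, each t12 rooted at a fresh S."""
    return {(start, (t1, t2)) for t1, t2 in product(i1, i2)}


# Two toy sets of initial trees, rooted at S1 and S2 respectively.
I1 = {("S1", (("a", ()),))}
I2 = {("S2", (("b", ()),)), ("S2", (("c", ()),))}
I = concat_initial_trees(I1, I2)
```

Every tree in `I` has a frontier made of a frontier from `I1` followed by one from `I2`, which is the property the proof relies on.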
4.3. Closure under Kleene Star
Let G1 = (I1,A1,N1,S1) be a TAG generating L1. We can show
that we can construct a TAG G such that L(G) = L1*. Let S be a
symbol not in N1, and let N = N1 ∪ {S}. We let the set I of initial
trees of G be {t_e}, where t_e is the tree shown in Fig 4.3a. The set of
auxiliary trees A is defined as
A = { t1A / t1 ∈ I1 } ∪ A1.
The tree t1A is as shown in Fig 4.3b, with the constraints on
the root of each t1A being the null adjoining constraint, no
constraints on the foot, and the constraints on the nodes of the
subtrees t1 of the trees t1A being the same as those for the
corresponding nodes in the initial tree t1 of G1.
To see why L(G) = L1*, consider x ∈ L(G). Obviously, the tree
derived (whose frontier is given by x) must be of the form shown in
Fig 4.3c, where each t_i' is a sentential tree in G1 such that
t_i' ∈ D(t_i), for an initial tree t_i in G1. Thus, L(G) ⊆ L1*.
On the other hand, if x ∈ L1*, then x = w1...wn, w_i ∈ L1 for
1 ≤ i ≤ n. Let each w_i then be the frontier of the sentential tree
t_i' of G1 such that t_i' ∈ D(t_i), t_i ∈ I1. Obviously, we can derive
the tree T, using the initial tree t_e, and have a sequence of adjoining
operations using the auxiliary trees t_iA for 1 ≤ i ≤ n. From T we can
obviously obtain the tree T' the same as given by Fig 4.3c, using only
the auxiliary trees in A1. The frontier of T' is obviously w1...wn.
Hence, x ∈ L(G). Therefore, L1* ⊆ L(G). Thus L(G) = L1*.
Figure 4.3: (a) the initial tree t_e, with root S and the single child e;
(b) the auxiliary tree t1A, whose root S has as children the foot node S
and the root S1 of t1; (c) the derived tree T'.
4.4. Closure under Intersection with Regular Languages
Let L_T be a TAL and L_R be a regular language. Let G be a
TAG generating L_T and M = (Q, Σ, δ, q0, QF) be a finite state
automaton recognizing L_R. We can construct a grammar G1 and will
show that L(G1) = L_T ∩ L_R.
Let α be an elementary tree in G. We shall associate with each
node a quadruple (q1,q2,q3,q4) where q1,q2,q3,q4 ∈ Q. Let (q1,q2,q3,q4)
be associated with a node X in α. Let us assume that α is an
auxiliary tree, and that X is an ancestor of the foot node of α, and
hence, the ancestor of the foot node of any derived tree γ in D(α).
Let Y be the label of the root and foot nodes of α. Let the frontier of
γ (γ in D(α)) be w1 w2 Y w3 w4, and let the frontier of the subtree of
γ rooted at Z, which corresponds to the node X in α, be w2 Y w3. The
idea of associating (q1,q2,q3,q4) with X is that it must be the case
that δ*(q1, w2) = q2, and δ*(q3, w3) = q4. When γ becomes a part of
the sentential tree γ' whose frontier is given by u w1 w2 v w3 w4 w,
then it must be the case that δ*(q2, v) = q3. Following this
reasoning, we must make q2 = q3, if Z is not the ancestor of the foot
node of γ, or if γ is in D(α) for some initial tree α in G.
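The role of the quadruple can be illustrated with a small Python sketch of the extended transition function δ* and of the three conditions a quadruple must satisfy. The dictionary encoding of δ, the toy automaton, and the function names are assumptions made for illustration only.

```python
def delta_star(delta, state, word):
    """Extended transition function: run the automaton on `word` from `state`."""
    for symbol in word:
        state = delta[(state, symbol)]
    return state


def quadruple_consistent(delta, quad, left, under_foot, right):
    """Check delta*(q1,left)=q2, delta*(q2,under_foot)=q3, delta*(q3,right)=q4."""
    q1, q2, q3, q4 = quad
    return (
        delta_star(delta, q1, left) == q2
        and delta_star(delta, q2, under_foot) == q3
        and delta_star(delta, q3, right) == q4
    )


# Toy automaton (hypothetical): tracks the parity of the number of a's.
DELTA = {
    ("even", "a"): "odd", ("odd", "a"): "even",
    ("even", "b"): "even", ("odd", "b"): "odd",
}
```

Here `left` and `right` play the part of the strings to the left and right of the foot node below X, and `under_foot` is the string that ends up below the foot node in a sentential tree.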
We have assumed here, as in the case of the parsing algorithm
presented earlier, that any node in any elementary tree has at most
two children.
From G we can obtain G1 as follows. For each initial tree α,
associate with the root the quadruple (q0, q, q, qf) where q0 is the
initial state of the finite state automaton M, and qf ∈ QF. For each
auxiliary tree β of G, associate with the root the quadruple
(q1,q2,q3,q4), where q,q1,q2,q3,q4 are some variables which will later
be given values from Q. Let X be some node in some elementary tree
α. Let (q1,q2,q3,q4) be associated with X. Then, we have to consider
the following cases:
Case 1: X has two children Y and Z. The left child Y is the
ancestor of the foot node of α. Then associate with Y the quadruple
(p, q2, q3, q), and (q, r, r, s) with Z, and associate with X the
constraint that only those trees whose root has the quadruple (q1, p,
s, q4), among those which were allowed in the original grammar,
may be adjoined at this node. If q1 ≠ p, or q4 ≠ s, then the
constraint associated with X must be made obligatory. If in the
original grammar X had an obligatory constraint associated with it
then we retain the obligatory constraint regardless of the relationship
between q1 and p, and q4 and s. If the constraint associated with X
is a null adjoining constraint, we associate (q1, q2, q3, q), and (q, r,
r, q4) with Y and Z respectively, and associate the null adjoining
constraint with X. If the label of Z is a, where a ∈ Σ, then we choose
s and q such that δ(q, a) = s. In the null adjoining constraint case,
q is chosen such that δ(q, a) = q4.
Case 2: This corresponds to the case where a node X has two
children Y and Z, with (q1,q2,q3,q4) associated at X. Let Z (the right
child) be the ancestor of the foot node of the tree α. Then we shall
associate (p,q,q,r), (r,q2,q3,s) with Y and Z. The associated constraint
with X shall be that only those trees among those which were
allowed in the original grammar may be adjoined provided their root
has the quadruple (q1,p,s,q4) associated with it. If q1 ≠ p or q4 ≠ s
then we make the constraint obligatory. If the original grammar had
an obligatory constraint we will retain the obligatory constraint. A null
constraint in the original grammar will force us to use a null constraint
and not consider the cases where it is not the case that q1 = p and
q4 = s. If the label of Y is a terminal 'a' then we choose r such that
δ*(p,a) = r. If the constraint at X is a null adjoining constraint, then
δ*(q1,a) = r.
Case 3: This corresponds to the case where neither the left
child Y nor the right child Z of the node X is the ancestor of the foot
node of α or if α is an initial tree. Then q2 = q3 = q. We will
associate with Y and Z the quadruples (p,r,r,q) and (q,u,u,t) resp. The
constraints are assigned as before; in this case it is dictated by the
quadruple (q1,p,t,q4). If it is not the case that q1 = p and q4 = t,
then it becomes an OA constraint. The OA and NA constraints at X
are treated similar to the previous cases, and so is the case if either
Y or Z is labelled by a terminal symbol.
Case 4: If (q1,q2,q3,q4) is associated with a node X, which has
only one child Y, then we can deal with the various cases as follows.
We will associate with Y the quadruple (p,q2,q3,t) and the constraint
that the root of the tree which can be adjoined at X should have the
quadruple (q1,p,t,q4) associated with it among the trees which were
allowed in the original grammar, if it is to be adjoined at X. The
cases where the original grammar had a null or obligatory constraint
associated with this node or Y is labelled with a terminal symbol, are
treated similar to how we dealt with them in the previous cases.
Once this has been done, let q1,...,qm be the independent
variables for this elementary tree α, then we produce as many copies
of α so that q1,...,qm take all possible values from Q. The only
difference among the various copies of α so produced will be
constraints associated with the nodes in the trees. Repeat the process
for all the elementary trees in G1. Once this has been done and each
tree given a unique name we can write the constraints in terms of
these names. We will now show why L(G1) = L_T ∩ L_R.
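The copying step amounts to enumerating every assignment of states to the independent variables of an elementary tree. A minimal Python sketch, assuming the variables are given as a list of names and the states as a set (both hypothetical encodings):

```python
from itertools import product


def instantiate(variables, states):
    """Yield one variable-to-state assignment per copy of an elementary tree."""
    ordered_states = sorted(states)  # fixed order so the enumeration is repeatable
    for values in product(ordered_states, repeat=len(variables)):
        yield dict(zip(variables, values))


# A tree with two independent variables over a two-state automaton.
copies = list(instantiate(["q1", "q2"], {"s0", "s1"}))
```

With m independent variables and |Q| states this yields |Q|^m copies of the tree, so the constructed grammar can be considerably larger than G while remaining finite.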
Let w ∈ L(G1). Then there is a sequence of adjoining
operations starting with an initial tree α' to derive w. Obviously, w ∈
L_T also, since corresponding to each tree used in deriving w, there is
a corresponding tree in G, which differs only in the constraints
associated with its nodes. Note, however, that the constraints
associated with the nodes in trees in G1 are just a restriction of the
corresponding ones in G, or an obligatory constraint where there was
none in G. Now, if we can assume (by induction hypothesis) that if
after n adjoining operations we can derive γ' ∈ D(α'), then there is a
corresponding tree γ ∈ D(α) in G, which has the same tree structure
as γ' but differs only in the constraints associated with the
corresponding nodes, then if we adjoin at some node in γ' to obtain
γ1', we can adjoin in γ to obtain γ1 (corresponding to γ1').
Therefore, if w can be derived in G1, then it can definitely be derived
in G.
If we can also show that L(G1) ⊆ L_R, then we can conclude
that L(G1) ⊆ L_T ∩ L_R. We can use induction to prove this. The
induction hypothesis is that if all derived trees obtained after k ≤ n
adjoining operations have the property P then so will the derived
trees after n + 1 adjoinings, where P is defined as,
Property P: If any node X in a derived tree γ has the foot node of
the tree β to which X belongs, labelled Y, as a descendant such that
w1 Y w2 is the frontier of the subtree of β rooted at X, then if
(q1,q2,q3,q4) had been associated with X, δ*(q1,w1) = q2 and
δ*(q3,w2) = q4, and if w is the frontier of the subtree under the foot
node of β in γ then δ*(q2,w) = q3. If X is not the ancestor of the
foot node of β then the frontier of the subtree of β below X is of the
form w1 w2. Suppose X has associated with it (q1,q,q,q2); then
δ*(q1,w1) = q, δ*(q,w2) = q2.
Actually what we mean by an adjoining operation is not
necessarily just one adjoining operation but the minimum number so
that no obligatory constraints are associated with any nodes in the
derived trees. Similarly, the base case need not consider only
elementary trees, but the smallest (in terms of the number of
adjoining operations) tree starting with elementary trees which has
no obligatory constraint associated with any of its nodes. The base
case can be seen easily considering the way the grammar was built
(it can be shown formally by induction on the height of the tree). The
inductive step is obvious. Note that the derived tree we are going to
use for adjoining will have the property P, and so will the tree at
which we adjoin; the former because of the way we designed the
grammar and assigned constraints, and the latter because of the
induction hypothesis. Thus so will the new derived tree. Once we
have proved this, all we have to do to show that L(G1) ⊆ L_R is to
consider those derived trees which are sentential trees and observe
that the roots of these trees obey property P.
Now, if a string x ∈ L_T ∩ L_R, we can show that x ∈ L(G1). To
do that, we make use of the following claim.
Let β be an auxiliary tree in G with root labelled Y and let γ ∈
D(β). We claim that there is a β' in G1 with the same structure as β,
such that there is a γ' in D(β') where γ' has the same structure
as γ, such that there is no OA constraint in γ'. Let X be a node in
β1 which was used in deriving γ. Then there is a node X' in γ' such
that X' belongs to the auxiliary tree β1' (with the same structure as
β1). There are several cases to consider -
Case 1: X is the ancestor of the foot node of β1, such that the
frontier of the subtree of β1 rooted at X is w3 Y w4 and the frontier of
the subtree of γ rooted at X is w3 w1 Z w2 w4. Let δ*(q1,w3) = q,
δ*(q,w1) = q2, δ*(q3,w2) = r, and δ*(r,w4) = q4. Then X' will have
(q1,q,r,q4) associated with it, and there will be no OA constraint in
γ'.
Case 2: X is the ancestor of the foot node of β1 and the frontier of
the subtree of β1 rooted at X is w3 Y w4. Let the frontier of the
subtree of γ rooted at X be w3 w1 w2 w4. Then we claim that X' in γ'
will have associated with it the quadruple (q1,q,r,q4), if δ*(q1,w3) =
q, δ*(q,w1) = p, δ*(p,w2) = r, and δ*(r,w4) = q4.
Case 3: Let the frontier of the subtree of β1 (and also γ) rooted at X
be w1 w2. Let δ*(q,w1) = p, δ*(p,w2) = r. Then X' will have
associated with it the quadruple (q,p,p,r).
We shall prove our claim by induction on the number of
adjoining operations used to derive γ. The base case (where γ = β) is
obvious from the way the grammar G1 was built. We shall now
assume that all derived trees γ which have been derived from β
using k or fewer adjoining operations have the property required in
our claim. Let γ be a derived tree in D(β) after k adjunctions. By our
inductive hypothesis we may assume the existence of the
corresponding derived tree γ' ∈ D(β') derived in G1. Let X be a node
in γ as shown in Fig. 4.4.1. Then the node X' in γ' corresponding to
X will have associated with it the quadruple (q1',q2',q3',q4'). Note we
are assuming here that the left child Y' of X' is the ancestor of the
foot node of β'. The quadruples (q1',q2',q3',p) and (p,p1,p1,q4') will
be associated with Y' and Z' (by the induction hypothesis). Let γ1 be
derived from γ by adjoining β1 at X as in Fig. 4.4.2. We have to
show the existence of β1' in G1 such that the root of this auxiliary
tree has associated with it the quadruple (q,q1',q4',r). The existence
of the tree follows from the induction hypothesis (k = 0). We have also
got to show that there exists γ1' with the same structure as γ but
one that allows β1' to be adjoined at the required node. But this
should be so, since from the way we obtained the trees in G1, there
will exist γ1'' such that X1' has the quadruple (q,q2',q3',r) and the
constraints at X1' are dictated by the quadruple (q,q1',q4',r), but
such that the two children of X1' will have the same quadruples as in
γ'. We can now adjoin β1' in γ1'' to obtain γ1'. It can be shown that
γ1' has the required property to establish our claim.
Figure 4.4.1: The derived tree γ, showing the node X and its two
children, with δ*(q1',w1') = q2', δ*(p,v1') = p1, δ*(q3',w2') = p, and
δ*(p1,v2') = q4'.
Figure 4.4.2: The auxiliary tree β1 adjoined at X, with δ*(q,x) = q1'
and δ*(q4',y) = r.
Finally, any node below the foot of β1' in γ1' will satisfy our
requirements as they are the same as the corresponding nodes in γ1''.
Since β1' satisfies the requirement, it is simple to observe that the
nodes in β1' will, even after the adjunction of β1' in γ1''. However,
because the quadruples associated with X1' are different, the
quadruples of the nodes above X1' must reflect this change. It is easy
to check the existence of an auxiliary tree such that the nodes above
X1' satisfy the requirements as stated above. It can also be argued on
the basis of the design of grammar G1, that there exist trees which
allow this new auxiliary tree to be adjoined at the appropriate place.
This then allows us to conclude that there exists a derived tree for
each derived tree belonging to D(β) as in our claim. The next step is
to extend our claim to take into account all derived trees (i.e.,
including the sentential trees). This can be done in a manner similar
to our treatment of derived trees belonging to D(β) for some
auxiliary tree β as above. Of course, we have to consider only the
case where the finite state automaton starts from the initial state q0,
and reaches some final state qf on the input which is the frontier of
some sentential tree in G. This then allows us to conclude that
L_T ∩ L_R ⊆ L(G1). Hence, L(G1) = L_T ∩ L_R.
5. HEAD GRAMMARS AND TAG's
In this section, we attempt to show that Head Grammars (HG)
are remarkably similar to Tree Adjoining Grammars. It appears that
the basic intuition behind the two systems is more or less the same.
Head Grammars were introduced in [Pollard,1984], but we follow the
notations used in [Roach,1984]. It has been observed that TAG's and
HG's share a lot of common formal properties such as almost
identical closure results and similar pumping lemmas.
Consider the basic operation in Head Grammars - the Head
Wrapping operation. A derivation from a non-terminal produces a
pair (i,a1...ai...an) (a more convenient representation for this pair is
a1...ai↑ai+1...an). The arrow denotes the head of the string, which in
turn determines where the string is split up when the wrapping
operation takes place. For example, consider X -> LL2(A,B), and let
A =>* wh↑x and B =>* ug↑v. Then we say, X =>* whug↑vx.
We shall define some functions used in the HG formalism,
which we need here. If A derives in 0 or more steps the headed string
wh↑x and B derives ug↑v, then
1) if X -> LL1(A,B) is a rule in the grammar then
X derives wh↑ugvx
2) if X -> LL2(A,B) is a rule in the grammar then
X derives whug↑vx
3) if X -> LC1(A,B) is a rule in the grammar then
X derives wh↑xugv
4) if X -> LC2(A,B) is a rule in the grammar then
X derives whxug↑v
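The four operations can be sketched in Python on headed strings represented as a string plus the index of its head symbol. The `Headed` type and the functions `ll` and `lc` are hypothetical names. The sketch assumes, in line with the LL2 example above, that the subscript selects whether the result inherits the head of the first or of the second argument; only the two-argument forms are covered.

```python
from typing import NamedTuple


class Headed(NamedTuple):
    symbols: str  # each character stands for one terminal symbol
    head: int     # index of the head symbol within `symbols`


def ll(a: Headed, b: Headed, which: int) -> Headed:
    """Head wrapping: split `a` just after its head and insert `b` there."""
    cut = a.head + 1
    symbols = a.symbols[:cut] + b.symbols + a.symbols[cut:]
    head = a.head if which == 1 else cut + b.head
    return Headed(symbols, head)


def lc(a: Headed, b: Headed, which: int) -> Headed:
    """Concatenation: `a` followed by `b`."""
    head = a.head if which == 1 else len(a.symbols) + b.head
    return Headed(a.symbols + b.symbols, head)


A = Headed("whx", 1)  # head is h
B = Headed("ugv", 1)  # head is g
```

For instance, `ll(A, B, 2)` produces the symbols `whugvx` with the head on `g`, matching the wrapping example given for LL2.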
Now consider how a derivation in TAGs proceeds -
Let β be an auxiliary tree and let α be a sentential tree as in
Fig 5.1. Adjoining β at the root of the sub-tree γ gives us the
sentential tree in Fig 5.1. We can now see how the string whx has
"wrapped around" the sub-tree, i.e., the string ugv. This seems to
suggest that there is something similar in the role played by the foot
in an auxiliary tree and the head in a Head Grammar, and in how the
adjoining operations and head-wrapping operations operate on
strings. We could say that if X is the root of an auxiliary tree β and
a1...ai X ai+1...an is the frontier of a derived tree γ ∈ D(β), then the
derivation of γ would correspond to a derivation from a non-terminal
X to the string a1...ai↑ai+1...an in HG and the use of γ in some
sentential tree would correspond to how the strings a1...ai and
ai+1...an are used in deriving a string in HL.
Figure 5.1: The sentential tree α containing the sub-tree γ rooted at X
with frontier ugv, the auxiliary tree β with root X, and the sentential
tree obtained by adjoining β at the root of γ.
Based on this observation, we attempt to show the close
relationship of TAL's and HL's. It is more convenient for us to think
of the headed string (i,a1...an) as the string a1...an with the head
pointing in between the symbols ai and ai+1 rather than at the
symbol ai. The definition of the derivation operators can be extended
in a straightforward manner to take this into account. However, we
can achieve the same effect by considering the definitions of the
Slimier• LLJ~C,ete. Pollard suggests tha* cases such as IJ~,~) be 
IcR u"dermed. We shrift -'~,ume thai if ~" --,.by then LI~.~) ~, 
• ,h~, LC~) -- ~, LC~,~) -- ~, '-C,~L;) -- ~, ~C,(;,X) -- ~, 
=.~ Lc,(~,;) = ~. 
~'e, the~ say that if G is n He~d Grammar, then w I -= w bx belongs 
¢4) L(G) if and only if S derives the headed string wbx'ror whXx. 
With this new definition, we shall show, without giving the proof,
that the class of TAL's is contained in the class of HL's, by
systematically converting any TAG G to a HG G'. We shall assume,
without loss of generality, that the constraints expressed at the nodes
of elementary trees of G are -
1) Nothing can be adjoined at a node (NA),
2) Any appropriate tree (symbols at the node and root of the
auxiliary tree must match) can be adjoined (AA), or
3) Adjoining at the node is obligatory (OA).
It is easy to show that these constraints are enough, and that
selective adjoining can be expressed in terms of these and additional
non-terminals. We now give a procedural description of obtaining
an equivalent Head Grammar from a Tree Adjoining Grammar. The
procedure works as follows. It is a recursive procedure
(Convert_to_HG) which takes in two parameters, the first
representing the node on which it is being applied and the second the
label appearing on the left-hand side of the HG productions for this
node. If X is a nonterminal, for each auxiliary tree β whose root has
the label X, we obtain a sequence of productions such that the first
one has X on the left-hand side. Using these productions, we can
derive the string w1↑w2 where a derived tree in D(β) has a frontier
w1 Y w2. If Y is a node with label X in some tree where
adjoining is allowed, we introduce the productions
Y' -> LL2(X,Y) {so that a derived tree with root
label X may wrap around the string derived from the subtree
below this node}
Y -> LCi(A1,...,Aj) {assuming that there
are j children of this node and the i-th child is the
ancestor of the foot node. By calling the procedure
recursively for all the j children of Y with Ak, k
ranging from 1 through j, we can derive from Y the
frontier of the subtree below Y}
Y' -> Y {this is to handle the case where no
adjunction takes place at Y}
If G is a TAG then we do the following -
Repeat for every initial tree
    Convert_to_HG(root,S') {S' will be the start symbol of
    the new Head Grammar}.
Repeat for each auxiliary tree
    Convert_to_HG(root,root symbol)
where Convert_to_HG(node,Sym) is defined as follows
if node is an internal node then
    Case 1: The constraint at the node is AA.
        add productions Sym -> LL2(node symbol,N'),
            N' -> LCi(A1',...,Ai',...,Aj'),
            Sym -> LCi(A1',...,Ai',...,Aj')
        where N',A1',...,Aj' are new non-terminal
        symbols, A1',...,Aj' correspond to the j children
        of the node, and i=1 if the foot node is not a descendant
        of node, else i is such that the i-th child of node is the
        ancestor of the foot node, j = number of children of node
        for k=1 to j step 1 do
            Convert_to_HG(k-th child of node,Ak').
    Case 2: The constraint at the node is NA.
        Same as Case 1 except don't add the productions
            Sym -> LL2(node symbol,N'),
            N' -> LCi(A1',...,Aj').
    Case 3: The constraint at the node is OA.
        Same as Case 1 except that we don't add
            Sym -> LCi(A1',...,Aj')
else if the node has a terminal symbol a,
    then add the production Sym -> a,
else {it is a foot node}
    if the constraint at the foot node is AA then
        add the productions
            Sym -> LL1(node symbol,λ) / λ
    if the constraint is OA then add only the
        production
            Sym -> LL1(node symbol,λ)
    if the constraint is NA add the production
            Sym -> λ
We shall now give an example of converting a TAG G to a
HG. G contains a single initial tree α, and a single auxiliary tree β
as in Fig. 5.2.
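Figure 5.2: The initial tree α, rooted at S, and the auxiliary tree β,
whose root S has the children a and S, the latter having the children
b, the foot node S, and c.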
Obviously, L(G) = {a^n b^n c^n / n ≥ 0}
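The way adjoining builds up such strings can be sketched in Python on trees encoded as nested tuples `(label, children)`. The encoding, the `S*` foot marker, the use of `e` for the empty string, and the function names are all hypothetical. The sketch further assumes that adjoining takes place at the root of α for the first step and thereafter only at the inner S node of the most recently adjoined copy of β; this models the effect of adjoining constraints rather than the constraints themselves.

```python
FOOT = "S*"  # hypothetical marker for the foot node of the auxiliary tree


def leaf(label):
    return (label, ())


ALPHA = ("S", (leaf("e"),))
BETA = ("S", (leaf("a"), ("S", (leaf("b"), leaf(FOOT), leaf("c")))))


def plug_foot(aux, subtree):
    """Copy `aux`, replacing its foot node by `subtree`."""
    label, children = aux
    if label == FOOT:
        return subtree
    return (label, tuple(plug_foot(child, subtree) for child in children))


def adjoin(tree, path, aux):
    """Adjoin `aux` at the node of `tree` reached by following child indices."""
    label, children = tree
    if not path:
        assert label == aux[0], "node label must match the root of aux"
        return plug_foot(aux, tree)
    i = path[0]
    new_child = adjoin(children[i], path[1:], aux)
    return (label, children[:i] + (new_child,) + children[i + 1:])


def frontier(tree):
    """Read off the leaves left to right, treating e as the empty string."""
    label, children = tree
    if not children:
        return "" if label == "e" else label
    return "".join(frontier(child) for child in children)


def derive(n):
    """Adjoin BETA n times, each time at the inner S of the previous copy."""
    tree = ALPHA
    for k in range(n):
        tree = adjoin(tree, (1,) * k, BETA)
    return frontier(tree)
```

Under these assumptions `derive(n)` yields n a's followed by n b's and n c's; adjoining at other S nodes would produce strings outside this pattern, which is why the choice of adjoining site matters.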
Applying the procedure Convert_to_HG to this grammar we
obtain the HG whose productions are given by -
s'-~ LL~(S,A) 
A -> 
s -> L%(B.¢) 
B -> I\[ c 
-> LL~(S,D)/O 
0 -> I.Ct(Z.F.G) ->'~ 
F -> "~ -) -~- 
which can be rewritten as
s' -> s/~ S-> 
LCt(a,X') 
~' -> LL~(S,~c) or ~' ->u~(s,~c) 
It can be verified that this grammar generates exactly
L(G).
It is worth emphasising that the main point of this exercise was
to show the similarities between Head Grammars and Tree Adjoining
Grammars. We have shown how a HG G' (using our extended
definitions) can be obtained in a systematic fashion from a TAG
G. It is our belief that the extension of the definition may not be
necessary. Yet, this conversion process should help us understand the
similarities between the two formalisms.
6. OTHER MATHEMATICAL PROPERTIES 
OF TAG's 
Additional formal properties of TAG's have been discussed in
[Vijay-Shankar and Joshi,1985]. Some of them are listed below
1) Pumping lemma for TAG's
2) TAL's are closed under substitution and homomorphisms
3) TAL's are not closed under the following operations
a) intersection with TAL's
b) intersection with CFL's
c) complementation
Some other properties that have been considered in [Vijay-
Shankar and Joshi,1985] are as follows
1) closure under the following properties
a) inverse homomorphisms
b) gsm mappings
2) semilinearity and Parikh-boundedness.
References
1. Aho, A.V., and Ullman, J.D., 1973 "Theory of Parsing, Translation,
and Compiling, Volume 1: Parsing", Prentice-Hall, Englewood Cliffs,
N.J., 1973.
2. Joshi, A.K., 1983 "How much context-sensitivity is necessary for
characterising structural descriptions - tree adjoining grammars" in
Natural Language Processing - Theoretical, Computational, and
Psychological Perspectives (ed. D. Dowty, L. Karttunen, A. Zwicky),
Cambridge University Press, New York, (originally presented in
1983) to appear in 1985.
3. Joshi, A.K., and Levy, L.S., 1977 "Constraints on Structural
Descriptions: Local Transformations", SIAM Journal of Computing,
June 1977.
4. Joshi, A.K., Levy, L.S., and Takahashi, M., 1975 "Tree adjoining
grammars", Journal of Computer and Systems Sciences, March 1975.
5. Kroch, T., and Joshi, A.K., 1985 "Linguistic relevance of tree
adjoining grammars", Technical Report MS-CIS-85-18, Dept. of
Computer and Information Science, University of Pennsylvania, April
1985.
6. Pollard, C., 1984 "Generalized Phrase Structure Grammars, Head
Grammars, and Natural Languages", Ph.D dissertation, Stanford
University, August 1984.
7. Roach, K., 1984 "Formal Properties of Head Grammars",
unpublished manuscript, Stanford University, also presented at the
Mathematics of Languages workshop at the University of Michigan,
Ann Arbor, Oct. 1984.
8. Vijay-Shankar, K., Joshi, A.K., 1985 "Formal Properties of Tree
Adjoining Grammars", Technical Report, Dept. of Computer and
Information Science, University of Pennsylvania, July 1985.
