AN EARLEY-TYPE PARSING ALGORITHM
FOR TREE ADJOINING GRAMMARS *
Yves Schabes and Aravind K. Joshi 
Department of Computer and Information Science 
University of Pennsylvania 
Philadelphia PA 19104-6389 USA 
schabes@linc.cis.upenn.edu joshi@cis.upenn.edu
ABSTRACT
We will describe an Earley-type parser for Tree 
Adjoining Grammars (TAGs). Although a CKY- 
type parser for TAGs has been developed earlier 
(Vijay-Shanker and Joshi, 1985), this is the first
practical parser for TAGs because as is well known 
for CFGs, the average behavior of Earley-type 
parsers is superior to that of CKY-type parsers. 
The core of the algorithm is described. Then we 
discuss modifications of the parsing algorithm that 
can parse extensions of TAGs such as constraints 
on adjunction, substitution, and feature structures 
for TAGs. We show how with the use of substi- 
tution in TAGs the system is able to parse di- 
rectly CFGs and TAGs. The system parses unifi- 
cation formalisms that have a CFG skeleton and 
also those with a TAG skeleton. Thus it also al- 
lows us to embed the essential aspects of PATR-II. 
1 Introduction 
Although formal properties of Tree Adjoining 
Grammars (TAGs) have been investigated (Vijay-
Shanker, 1987)--for example, there is an O(n^6)-
time CKY-like algorithm for TAGs (Vijay-Shanker
and Joshi, 1985)--so far there has been no at- 
tempt to develop an Earley-type parser for TAGs. 
This paper presents an Earley parser for TAGs 
and discusses modifications to the parsing algo- 
rithm that make it possible to handle extensions 
of TAGs such as constraints on adjunction, substitution,
and feature structure representation for TAGs.
*This work is partially supported by ARO grant
DAA29-84-9-007, DARPA grant N0014-85-K0018, NSF
grants MCS-82-191169 and DCR-84-10413. The authors
would like to express their gratitude to Vijay-Shanker for
his helpful comments relating to the core of the algorithm,
Richard Billington and Andrew Chalnick for their graphical
TAG editor which we integrated in our system and for
their programming advice. Thanks are also due to Anne
Abeillé and Ellen Hays.
TAGs were first introduced by Joshi, Levy and 
Takahashi (1975) and Joshi (1983). We describe 
very briefly the Tree Adjoining Grammar formal- 
ism. For more details we refer the reader to Joshi 
(1983), Kroch and Joshi (1985) or Vijay-Shanker (1987). 
Definition 1 (Tree Adjoining Grammar) : 
A TAG is a 5-tuple G = (V_N, V_T, S, I, A) where
V_N is a finite set of non-terminal symbols, V_T is
a finite set of terminals, S is a distinguished non-terminal,
I is a finite set of trees called initial
trees and A is a finite set of trees called auxiliary
trees. The trees in I ∪ A are called elementary
trees. 
Initial trees (see left tree in Figure 1) are char- 
acterized as follows: internal nodes are labeled by 
non-terminals; leaf nodes are labeled by either ter- 
minal symbols or the empty string. 
Figure 1: Schematic initial and auxiliary trees
Auxiliary trees (see right tree in Figure 1) 
are characterized as follows: internal nodes are la- 
beled by non-terminals; leaf nodes are labeled by 
a terminal or by the empty string except for ex- 
actly one node (called the foot node) labeled by 
a non-terminal; furthermore the label of the foot 
node is the same as the label of the root node. 
We now define a composition operation called 
adjoining or adjunction which builds a new tree 
from an auxiliary tree β and a tree α (α is any tree,
initial, auxiliary or tree derived by adjunction). 
The resulting tree is called a derived tree. Let 
α be a tree containing a node n labeled by X and
let β be an auxiliary tree whose root node is also
labeled by X. Then the adjunction of β to α at
node n will be the tree γ shown in Figure 2. The
resulting tree, γ, is built as follows:
• The sub-tree of α dominated by n, call it t, is
excised, leaving a copy of n behind.
• The auxiliary tree β is attached at n and its root
node is identified with n.
• The sub-tree t is attached to the foot node of β
and the root node n of t is identified with the foot
node of β.
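The three steps above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the authors' implementation: trees are nested `(label, children)` pairs, the foot node of the auxiliary tree is marked by a trailing `*` on its label, nodes are addressed by a path of 0-based child indices, and the names `adjoin`, `plug_foot`, `replace_at`, `subtree_at` and `tree_yield` are hypothetical helpers introduced here.

```python
def subtree_at(tree, path):
    """Return the sub-tree reached by following 0-based child indices."""
    for i in path:
        tree = tree[1][i]
    return tree


def replace_at(tree, path, new_subtree):
    """Return a copy of tree in which the node at path is replaced."""
    if not path:
        return new_subtree
    label, children = tree
    i = path[0]
    replaced = replace_at(children[i], path[1:], new_subtree)
    return (label, children[:i] + [replaced] + children[i + 1:])


def plug_foot(beta, t):
    """Return a copy of auxiliary tree beta with t in place of its foot node."""
    label, children = beta
    if label.endswith("*") and not children:
        return t
    return (label, [plug_foot(c, t) for c in children])


def adjoin(alpha, path, beta):
    """Adjoin the auxiliary tree beta at the node of alpha given by path."""
    t = subtree_at(alpha, path)          # the sub-tree dominated by n is excised
    if t[0] != beta[0]:
        raise ValueError("root of the auxiliary tree must carry the label of n")
    # t goes under the foot node of beta; beta then takes the place of n
    return replace_at(alpha, path, plug_foot(beta, t))


def tree_yield(tree):
    """Left-to-right list of non-empty leaf labels."""
    label, children = tree
    if not children:
        return [label] if label else []
    return [w for c in children for w in tree_yield(c)]


# Toy trees (hypothetical, chosen only for illustration).
alpha = ("S", [("e", [])])
beta = ("S", [("a", []), ("S", [("b", []), ("S*", []), ("c", [])]), ("d", [])])
once = adjoin(alpha, (), beta)       # adjoin at the root of alpha
twice = adjoin(once, (1,), beta)     # adjoin again at the inner S node
print("".join(tree_yield(twice)))    # aabbeccdd
```

The functions build new trees rather than mutating their arguments, so the same elementary tree can be adjoined several times.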
Figure 2: The mechanism of adjunction
Then define the tree set of a TAG G, T(G) to 
be the set of all derived trees starting from initial 
trees in I. Furthermore, the string language 
generated by a TAG, L(G), is defined to be the 
set of all terminal strings of the trees in T(G). 
TAGs factor recursion and dependencies by ex- 
tending the domain of locality. They offer novel 
ways to encode the syntax of natural language 
grammars as discussed in Kroch and Joshi (1985) 
and Abeillé (1988).
In 1985, Vijay-Shanker and Joshi introduced a 
CKY-like algorithm for TAGs. They therefore established
O(n^6) time as an upper bound for parsing
TAGs. The algorithm was implemented, but
in our opinion the result was more theoretical than 
practical for several reasons. First the algorithm 
assumes that elementary trees are binary branch- 
ing and that there are no empty categories on the 
frontiers of the elementary trees. Second, since it 
works on nodes that have been isolated from the 
tree they belong to, it isolates them from their 
domain of locality. However all important linguis- 
tic and computational properties of TAGs follow 
from this extended domain of locality. And most 
importantly, although it runs in O(n^6) worst time,
it also runs in O(n^6) best time. As a consequence,
the CKY algorithm is in practice very slow. 
Since the average time complexity of Earley's
parser depends on the grammar and is in practice
much better than its worst-case time complexity,
we decided to adapt Earley's parser
for CFGs to TAGs. Earley's algorithm for CFGs
(Earley, 1970; Aho and Ullman, 1973) is a bottom-up
parser which uses top-down information. It
manipulates states of the form A → α • β [i] while
using three processors: the predictor, the completor
and the scanner. The algorithm for CFGs runs
in O(|G|^2 n^3) time and in O(|G| n^2) space in all
cases, and parses unambiguous grammars in O(n^2)
time (n being the length of the input, |G| the size
of the grammar).
Given a context-free grammar in any form and
an input string a_1 ... a_n, Earley's parser for CFGs
maintains the following invariant:
The state A → α • β [i] is in states set S_k iff
S ⇒* δAγ, δ ⇒* a_1 ... a_i and α ⇒* a_{i+1} ... a_k
The correctness of the algorithm is a corollary of 
this invariant. 
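For readers who want the CFG case in concrete form, the following is a minimal sketch of an Earley recognizer with the three processors named above. It is an illustration under assumptions, not code from this paper: the grammar is a dictionary from non-terminals to lists of right-hand-side tuples, any symbol that is not a key of the dictionary is treated as a terminal, and a state is the tuple `(lhs, rhs, dot, origin)`, where `origin` plays the role of the index i in the state notation. The name `earley_recognize` is hypothetical.

```python
def earley_recognize(grammar, start, tokens):
    """Return True iff tokens is derivable from start in the given CFG."""
    n = len(tokens)
    chart = [set() for _ in range(n + 1)]
    chart[0] = {(start, rhs, 0, 0) for rhs in grammar[start]}

    for k in range(n + 1):
        agenda = list(chart[k])

        def add(state):
            # Add a state to the current states set exactly once.
            if state not in chart[k]:
                chart[k].add(state)
                agenda.append(state)

        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot == len(rhs):
                # Completor: advance every state that was waiting for lhs.
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add((l2, r2, d2 + 1, o2))
            elif rhs[dot] in grammar:
                # Predictor: expand the non-terminal after the dot.
                sym = rhs[dot]
                for r in grammar[sym]:
                    add((sym, r, 0, k))
                # If sym already derives the empty string at k, step over it.
                if any(s[0] == sym and s[2] == len(s[1]) and s[3] == k
                       for s in chart[k]):
                    add((lhs, rhs, dot + 1, origin))
            elif k < n and tokens[k] == rhs[dot]:
                # Scanner: the terminal after the dot matches the next token.
                chart[k + 1].add((lhs, rhs, dot + 1, origin))

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[n])


# Hypothetical toy grammar: S -> a S b | (empty)
toy = {"S": [("a", "S", "b"), ()]}
print(earley_recognize(toy, "S", "aabb"))  # True
print(earley_recognize(toy, "S", "aab"))   # False
```

The extra check in the predictor is one common way to handle empty right-hand sides; it is a design choice of this sketch, not something the text above prescribes.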
Finding an Earley-type parser for TAGs was a
difficult task because it was not clear how to 
parse TAGs bottom up using top-down informa- 
tion while scanning the input string from left to 
right. In order to construct an Earley-type parser 
for TAGs, we will extend the notions of dotted 
rules and states to trees. Anticipating the proof 
of correctness and soundness of our algorithm, we 
will state an invariant similar to Earley's original 
invariant. Then we present the algorithm and its 
main extensions. 
2 Dotted symbols, dotted 
trees, tree traversal 
The full algorithm is explained in the next section. 
This section introduces preliminary concepts that 
will be used by the algorithm. We first show how 
dotted rules can be extended to trees. Then we 
introduce a tree traversal that the algorithm will 
mimic in order to scan the input from left to right. 
We define a dotted symbol as a symbol asso- 
ciated with a dot above or below and either to the 
left or to the right of it. The four positions of the 
dot are annotated by la, lb, ra, rb (resp. left above,
left below, right above, right below).
Then we define a dotted tree as a tree with 
exactly one dotted symbol. 
Given a dotted tree with the dot above and to 
the left of the root, we define a tree traversal of a 
dotted tree as follows (see Figure 3): 
Figure 3: Example of a tree traversal
• if the dot is at position la of an internal node, 
we move the dot down to position lb, 
• if the dot is at position lb of an internal node, 
we move to position la of its leftmost child, 
• if the dot is at position la of a leaf, we move the 
dot to the right to position ra of the leaf, 
• if the dot is at position rb of a node, we move 
the dot up to position ra of the same node, 
• if the dot is at position ra of a node, there are 
two cases: 
- if the node has a right sibling, then move the 
dot to the right sibling at position la. 
- if the node does not have a right sibling, then 
move the dot to its parent at position rb. 
This traversal will enable us to scan the frontier 
of an elementary tree from left to right while try- 
ing to recognize possible adjunctions between the 
above and below positions of the dot. 
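The traversal rules listed above can be written as a small step function. The sketch below is illustrative only and rests on assumptions that are not in the text: trees are nested `(label, children)` pairs, a node address is a tuple of 1-based child indices (the root is the empty tuple, so the address written 2.1 in the figures becomes `(2, 1)`), and the names `node_at`, `step` and `traverse` are hypothetical.

```python
def node_at(tree, addr):
    """Return the node of tree at the given address (tuple of 1-based indices)."""
    for i in addr:
        tree = tree[1][i - 1]
    return tree


def step(tree, addr, pos):
    """Return the next (address, position) of the dot, or None at the end."""
    _, children = node_at(tree, addr)
    if pos == "la":
        # internal node: go down to lb; leaf: cross over to ra
        return (addr, "lb") if children else (addr, "ra")
    if pos == "lb":
        return (addr + (1,), "la")          # la of the leftmost child
    if pos == "rb":
        return (addr, "ra")                 # up to ra of the same node
    # pos == "ra"
    if not addr:
        return None                         # ra of the root: traversal is over
    parent = addr[:-1]
    siblings = node_at(tree, parent)[1]
    if addr[-1] < len(siblings):
        return (parent + (addr[-1] + 1,), "la")   # la of the right sibling
    return (parent, "rb")                          # rb of the parent


def traverse(tree):
    """Yield every (address, position) visited, starting at la of the root."""
    state = ((), "la")
    while state is not None:
        yield state
        state = step(tree, *state)


# Hypothetical tree: A dominates B and C; C dominates E and F.
example = ("A", [("B", []), ("C", [("E", []), ("F", [])])])
leaves = [addr for addr, pos in traverse(example)
          if pos == "la" and not node_at(example, addr)[1]]
print(leaves)  # [(1,), (2, 1), (2, 2)]
```

Collecting the leaves at position la gives the frontier in left-to-right order, which is the property the recognizer relies on.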
3 The algorithm 
We define an appropriate data structure for the 
algorithm. We explain how to interpret the struc- 
tures that the parser produces. Then we describe 
the algorithm itself. 
3.1 Data structures 
The algorithm uses two basic data structures: 
state and states set. 
A states set S is defined as a set of states. The
states sets will be indexed by an integer: S_i with
i ∈ N. The presence of any state in states set i
will mean that the input string a_1 ... a_i has been
recognized.
Any tree α will be considered as a function from
tree addresses to symbols of the grammar (terminal
and non-terminal symbols): if x is a valid address
in α, then α(x) is the symbol at address x
in the tree α.
Definition 2 A state s is defined as a 10-tuple,
[α, dot, side, pos, l, f_l, f_r, star, t_l*, b_l*] where:
• α: is the name of the dotted tree.
• dot: is the address of the dot in the tree α.
• side: is the side of the symbol the dot is on;
side ∈ {left, right}.
• pos: is the position of the dot;
pos ∈ {above, below}.
• star: is an address in α. The corresponding node
in α is called the starred node.
• l (left), f_l (foot left), f_r (foot right), t_l* (top left
of starred node), b_l* (bottom left of starred node)
are indices of positions in the input string ranging
over [0, n], n being the length of the input string.
They will be explained further below.
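As a concrete reading of Definition 2, the 10-tuple can be written as a record type. This is a sketch under assumptions, not the data structure of the actual implementation: unbound components are represented by `None`, addresses are tuples of integers with the empty tuple for the root, and the field names `f_l`, `f_r`, `t_star` and `b_star` as well as the helper `initial_states` are hypothetical. The helper builds the initial states set used by the recognizer of Section 3.3, in which the dot is left of and above the root of each initial tree.

```python
from typing import NamedTuple, Optional, Tuple

Address = Tuple[int, ...]


class State(NamedTuple):
    tree: str                        # name of the dotted tree
    dot: Address                     # address of the dot in the tree
    side: str                        # "left" or "right"
    pos: str                         # "above" or "below"
    l: int                           # where the derived tree begins
    f_l: Optional[int] = None        # input index just before the foot node
    f_r: Optional[int] = None        # input index just after the foot node
    star: Optional[Address] = None   # address of the starred node
    t_star: Optional[int] = None     # top left index of the starred node
    b_star: Optional[int] = None     # bottom left index of the starred node


def initial_states(initial_tree_names):
    """States set S_0: one state per initial tree, dot left of and above the root."""
    return {State(name, (), "left", "above", 0) for name in initial_tree_names}


s0 = initial_states(["alpha"])
```

Because `NamedTuple` instances are hashable and compare by value, a states set can be an ordinary Python `set`, which gives the "add only if not already present" behaviour for free.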
3.2 Invariant of the algorithm 
The states s in a states set Si have a common prop- 
erty. The following section describes this invariant 
in order to give an intuitive interpretation of what 
the algorithm does. This invariant is similar to 
Earley's invariant. 
Before explaining the main characterization of 
the algorithm, we need to define the set of nodes 
on which an adjunction is allowed for a given state. 
Definition 3 The set of nodes P(s) on which an
adjunction is possible for a given state
s = [α, dot, side, pos, l, f_l, f_r, star, t_l*, b_l*], is
defined as the union of the following sets of nodes
in α:
• the set of nodes that have been traversed on the 
left and right sides, i.e., the four positions of the 
dot have been traversed; 
• the set of nodes on the path from the root node 
to the starred node, root node and starred node 
included. Note that if there is no star this set is 
empty. 
Definition 4 (Left part of a dotted tree) 
The left part of a dotted tree is the union of the 
set of nodes in the tree that have been traversed 
on the left and right sides and the set of nodes 
that have been traversed on the left side only. 
We will first give an intuitive interpretation of 
the ten components of a state, and then give the 
necessary and sufficient conditions for membership 
of a state in a states set. 
We interpret informally a state
s = [α, dot, side, pos, l, f_l, f_r, star, t_l*, b_l*] in the
following way (see Figure 4):
Figure 4: Meaning of s ∈ S_i
• l is an index in the input string indicating where
the tree derived from α begins.
• f_l is an index in the input string corresponding
to the point just before the foot node (if any) in
the tree derived from α.
• f_r is an index in the input string corresponding
to the point just after the foot node (if any) in the
tree derived from α. The pair f_l and f_r will mean
that the foot node subsumes the string a_{f_l+1} ... a_{f_r}.
• star: is the address in α of the deepest node that
subsumes the dot on which an adjunction has been
partially recognized. If there is no adjunction in
the tree α along the path from the root to the dotted
node, star is unbound.
• t_l* is an index in the input string corresponding
to the point in the tree where the adjunction on
the starred node was made. If star is unbound,
then t_l* is also unbound.
• b_l* is an index in the input string corresponding
to the point in the tree just before the foot node of
the tree adjoined at the starred node. The pair t_l*
and b_l* will mean that the string as far as the foot
node of the auxiliary tree adjoined at the starred
node matches the substring a_{t_l*+1} ... a_{b_l*} of the
input string. If star is unbound, then b_l* is also
unbound.
• s ∈ S_i means that the recognized part of the dotted
tree α, which is the left part of it, is consistent
with the input string from a_1 to a_l and from a_l to
a_{f_l}, and from a_{f_r} to a_i, or from a_1 to a_l and from a_l
to a_i when the foot node is not in the recognized
part of the tree.
We are now ready to characterize the membership
of s in S_i:
Invariant 1
A state s = [α, dot, side, pos, l, f_l, f_r, star, t_l*, b_l*] is
in S_i if and only if there is a derived tree from an
initial tree such that (see Figure 4):
1. The tree α is part of the derivation.
2. The tree derived from α in the derivation tree,
ᾱ, has adjunctions only on nodes in P(s).
3. The part of the tree to the left of the dot in the
tree derived spans the string a_1 ... a_i.
4. The tree derived from α, ᾱ, has a yield that
starts just after a_l, ends at a_{f_l} before the foot node
(if f_l is defined), and starts after the foot node
just after a_{f_r} (if f_r is defined).
5. If there are adjunctions on the path from the
dotted node to the root of α, then star is the address
of the deepest adjunction on that path and
the auxiliary tree adjoined at that node star has
a yield that starts just after a_{t_l*} and stops at its
foot node at a_{b_l*}.
The proof of this invariant has as corollaries the 
soundness, completeness, and therefore the cor- 
rectness of the algorithm. 
3.3 The recognizer 
The Earley-type recognizer for TAGs follows: 
Let G be a TAG.
Let a_1 ... a_n be the input string.
program recognizer
begin
S_0 = { [α, 0, left, above, 0, -, -, -, -, -]
| α is an initial tree }
For i := 0 to n do
begin
Process the states of S_i, performing one of
the following seven operations on each state
s = [α, dot, side, pos, l, f_l, f_r, star, t_l*, b_l*]
until no more states can be added:
1. Scanner
2. Move dot down
3. Move dot up
4. Left Predictor
5. Left Completor
6. Right Predictor
7. Right Completor
If S_{i+1} is empty and i < n, return rejection.
end
If there is in S_n a state
s = [α, 0, right, above, 0, -, -, -, -, -]
such that α is an initial tree
then return acceptance.
end.
The algorithm is a general recognizer for TAGs. 
Unlike the CKY algorithm, it requires no condi- 
tion on the grammar: the trees can be binary or 
not, the elementary (initial or auxiliary) trees can 
have the empty string as frontier. It is an off-line 
algorithm: it needs to know the length n of the 
input string. However we will see later that it can 
very easily be modified to an on-line algorithm by 
the use of an end-marker in the input string. 
We now describe one by one the seven processes.
The current states set is presumed to be S_i and the
state to be processed is
s = [α, dot, side, pos, l, f_l, f_r, star, t_l*, b_l*].
Only one of the seven processes can be applied 
to a given state. The side, the position, and the 
address of the dot determine the unique process 
that can be applied to the given state. 
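The selection of the unique processor can be sketched as a lookup on the side and position of the dot plus a few properties of the dotted node. The function below is one reading of the applicability conditions stated in Sections 3.3.1 to 3.3.7, not code from the implementation; the name `select_processor` and its boolean parameters are hypothetical, and substitution nodes (introduced in Section 4) are not covered.

```python
def select_processor(side, pos, is_terminal, is_foot, is_root, tree_is_auxiliary):
    """Name the processor that may apply to a state, or None if there is none.

    is_terminal is True for terminal leaves and for the empty symbol.
    """
    if side == "left" and pos == "above":
        # dot left of and above a leaf symbol or a non-terminal
        return "scanner" if is_terminal else "left predictor"
    if side == "left" and pos == "below":
        # the foot node has no children, so the dot cannot move down from it
        return "left completor" if is_foot else "move dot down"
    if side == "right" and pos == "below":
        return "right predictor"
    # side == "right" and pos == "above"
    if not is_root:
        return "move dot up"
    # At the root: auxiliary trees are completed, initial trees are only
    # checked by the acceptance test at the end of the recognizer.
    return "right completor" if tree_is_auxiliary else None


print(select_processor("left", "below", False, True, False, True))  # left completor
```

Whether the selected processor actually adds a state still depends on its own side conditions, for example the scanner only succeeds when the terminal matches the next input token or is the empty symbol.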
Definition 5 (Adjunct(α, address)) Given
a TAG G, define Adjunct(α, address) as the set
of auxiliary trees that can be adjoined in the elementary
tree α at the node n which has the given
address. In a TAG without any constraints on
adjunction, if n is a non-terminal node, this set
consists of all auxiliary trees that are rooted by a
node with the same label as the label of n.
3.3.1 Scanner 
The scanner scans the input string. Suppose that 
the dot is to the left of and above a terminal sym- 
bol (see Figure 5). Then if the terminal symbol 
matches the next input token, the program should 
record that a new token has been recognized and 
try to recognize the rest of the tree. 
Therefore the scanner applies to
s = [α, dot, left, above, l, f_l, f_r, star, t_l*, b_l*]
such that α(dot) is a terminal symbol and
α(dot) = a_{i+1} or α(dot) is the empty symbol ε.
• Case 1: α(dot) = a_{i+1}
The scanner adds
[α, dot, right, above, l, f_l, f_r, star, t_l*, b_l*] to
S_{i+1}.
• Case 2: α(dot) = ε
The scanner adds
[α, dot, right, above, l, f_l, f_r, star, t_l*, b_l*] to
S_i.
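The two cases of the scanner translate directly into code. The following is a minimal sketch under assumptions that are not part of the paper: a state is a plain 10-tuple in the order of Definition 2, the empty symbol is represented by the empty string, the input is a 0-indexed sequence so that the token written a_{i+1} is `tokens[i]`, and the function name `scanner` and its return convention (index of the target states set, new state) are hypothetical.

```python
EMPTY = ""  # assumed representation of the empty symbol


def scanner(state, label, tokens, i):
    """Apply the scanner to a state of S_i whose dotted node carries label.

    Returns (target_set_index, new_state), or None if the scanner adds nothing.
    """
    tree, dot, side, pos, *rest = state
    if (side, pos) != ("left", "above"):
        return None
    moved = (tree, dot, "right", "above", *rest)  # only side changes
    if label == EMPTY:
        return i, moved              # Case 2: stay in the same states set
    if i < len(tokens) and tokens[i] == label:
        return i + 1, moved          # Case 1: one more input token recognized
    return None


s = ("alpha", (1,), "left", "above", 0, None, None, None, None, None)
result = scanner(s, "a", "aabbeccdd", 0)
```

Here `result` is a pair whose first component is 1, meaning the moved state belongs to the next states set.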
3.3.2 Move Dot Down
Move dot down (see Figure 6) moves the dot
down, from position lb of the dotted node to position
la of its leftmost child.
Figure 5: Scanner
Figure 6: Move dot down
It therefore applies to
s = [α, dot, left, below, l, f_l, f_r, star, t_l*, b_l*]
such that the node where the dot is has a
leftmost child at address u.
It adds [α, u, left, above, l, f_l, f_r, star, t_l*, b_l*] to
S_i.
3.3.3 Move Dot Up
Move dot up (see Figure 7) moves the dot "up",
from position ra of the dotted node to position la
of its right sibling if it has a right sibling, otherwise
to position rb of its parent.
It therefore applies to
s = [α, dot, right, above, l, f_l, f_r, star, t_l*, b_l*]
such that the node on which the dot is
has a parent node.
• Case 1: the node where the dot is
has a right sibling at address r.
It adds [α, r, left, above, l, f_l, f_r, star, t_l*, b_l*]
to S_i.
• Case 2: the node where the dot is is
the rightmost child of the parent
node p.
It adds
[α, p, right, below, l, f_l, f_r, star, t_l*, b_l*] to S_i.
Figure 7: Move dot up
3.3.4 Left Predictor 
Suppose that there is a dot to the left of and above
a non-terminal symbol A (see Figure 8). Then the
algorithm takes two paths in parallel: it makes a
prediction of adjunction on the node labeled by
A and tries to recognize the adjunction (step 1)
and it also considers the case where no adjunction
has been done (step 2). These operations are performed
by the Left Predictor.
It applies to
s = [α, dot, left, above, l, f_l, f_r, star, t_l*, b_l*]
such that α(dot) is a non-terminal.
• Step 1. It adds the states
{[β, 0, left, above, i, -, -, -, -, -]
| β ∈ Adjunct(α, dot)} to S_i.
• Step 2.
-- Case 1: the dot is not on the
foot node.
It adds the state
[α, dot, left, below, l, f_l, f_r, star, t_l*, b_l*]
to S_i.
-- Case 2: the dot is on the foot
node. Necessarily, since the
foot node has not been already
traversed, f_l and f_r are
unspecified.
It adds the state
[α, dot, left, below, l, i, -, star, t_l*, b_l*] to
S_i.
3.3.5 Left Completor
Suppose that the auxiliary tree that we left-predicted
has been recognized as far as its foot (see Figure 9).
Then the algorithm should try to recognize
what was pushed under the foot node. (A star in
the original tree will signal that an adjunction has
been made and half recognized.) This operation
is performed by the Left Completor.
Figure 8: Left Predictor
Figure 9: Left Completor
It applies to
s = [α, dot, left, below, l, i, -, star, t_l*, b_l*]
such that the dot is on the foot node.
For all
s' = [β, dot', left, above, l', f_l', f_r', star', t_l*', b_l*'] in
S_l such that α ∈ Adjunct(β, dot')
• Case 1: dot' is on the foot node of
β. Then necessarily, f_l' and f_r' are
unbound.
It adds the state
[β, dot', left, below, l', i, -, dot', l, i] to S_i.
• Case 2: dot' is not on the foot node
of β.
It adds the state
[β, dot', left, below, l', f_l', f_r', dot', l, i] to S_i.
Figure 10: Right Predictor
3.3.6 Right Predictor 
Suppose that there is a dot to the right of and below
a node A (see Figure 10). If there has been
an adjunction made on A (case 1), the program
should try to recognize the right part of the auxiliary
tree adjoined at A. However if there was no
adjunction on A (case 2), then the dot should be
moved up. Note that the star will tell us if an adjunction
has been made or not. These operations
are performed by the Right Predictor.
The right predictor applies to
s = [α, dot, right, below, l, f_l, f_r, star, t_l*, b_l*]
• Case 1: dot = star
For all states
s' = [β, dot', left, below, t_l*, b_l*, -, star', t_l*', b_l*']
in S_{b_l*} such that β ∈ Adjunct(α, dot),
it adds the state
[β, dot', right, below, t_l*, b_l*, i, star', t_l*', b_l*'] to
S_i.
• Case 2: dot ≠ star
It adds the state
[α, dot, right, above, l, f_l, f_r, star, t_l*, b_l*] to
S_i.
3.3.7 Right Completor 
Suppose that the dot is to the right of and above
the root of an auxiliary tree (see Figure 11). Then
the adjunction has been totally recognized and the
program should try to recognize the rest of the tree
in which the auxiliary tree has been adjoined. This
operation is performed by the Right Completor.
Figure 11: Right Completor
It applies to
s = [α, 0, right, above, l, f_l, f_r, -, -, -]
For all states
s' = [β, dot', left, above, l', f_l', f_r', star', t_l*', b_l*'] in S_l
and for all states
s'' = [β, dot', right, below, l', f̄_l, f̄_r, dot', l, f_l] in S_{f_r}
such that α ∈ Adjunct(β, dot')
It adds
[β, dot', right, above, l', f̄_l, f̄_r, star', t_l*', b_l*'] to
S_i.
Where f̄ = f' if f' is bound in state s',
and f̄ can have any value if f' is unbound
in state s'.
3.4 Handling constraints on adjunc- 
tion 
In a TAG, one can, for each node of an elementary 
tree, specify one of the following three constraints 
on adjunction (Joshi, 1987): 
• Null adjunction (NA): disallow any adjunc- 
tion on the given node. 
• Obligatory adjunction (OA): an auxiliary 
tree must be adjoined on the given node. 
• Selective adjunction (SA(T)): a set T of aux- 
iliary trees that can be adjoined on the given node 
is specified. 
The algorithm can be very easily modified to
handle those constraints. First, the function
Adjunct(α, address) must be modified as follows:
• Adjunct(α, address) = ∅, if there is NA on the
node.
• Adjunct(α, address) as previously defined, if
there is OA on the node.
• Adjunct(α, address) = T, if there is SA(T) on
the node.
Second, step 2 of the left predictor must be done
only if there is no obligatory adjunction on the
node at address dot in the tree α.
Figure 12: L = {a^n b^n e c^n d^n | n ≥ 0}
Figure 13: Use of end marker in TAG (annotation: make sure that no
adjunction is possible on the root of an initial tree)
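The two modifications can be sketched as follows. This is an illustration under assumptions, not the implementation: a constraint is represented as `None` (no constraint), the strings `"NA"` or `"OA"`, or a pair `("SA", T)`; the auxiliary trees of the grammar are summarized by a dictionary from tree name to root label; and the names `adjunct` and `may_skip_adjunction` are hypothetical.

```python
def adjunct(node_label, is_nonterminal, constraint, aux_root_labels):
    """Set of auxiliary tree names that can be adjoined at a node.

    constraint is None, "NA", "OA", or ("SA", set_of_tree_names).
    aux_root_labels maps each auxiliary tree name to the label of its root.
    """
    if not is_nonterminal or constraint == "NA":
        return set()                               # no adjunction allowed
    if isinstance(constraint, tuple) and constraint[0] == "SA":
        return set(constraint[1])                  # exactly the listed trees
    # No constraint, or OA: every auxiliary tree rooted by the same label.
    return {name for name, root in aux_root_labels.items()
            if root == node_label}


def may_skip_adjunction(constraint):
    """Step 2 of the left predictor is allowed only without OA on the node."""
    return constraint != "OA"


aux = {"beta1": "S", "beta2": "VP", "beta3": "S"}
print(sorted(adjunct("S", True, None, aux)))                 # ['beta1', 'beta3']
print(sorted(adjunct("S", True, ("SA", {"beta3"}), aux)))    # ['beta3']
```

Keeping the obligatory-adjunction check in a separate predicate mirrors the text: OA does not change the set returned by Adjunct, it only removes the "no adjunction" path of the left predictor.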
3.5 An example 
We give one example that illustrates how the recognizer
works. The grammar used for the example
generates the language L = {a^n b^n e c^n d^n | n ≥ 0}.
The input string given to the recognizer
is: aabbeccdd. The grammar is shown in Figure 12.
The states sets are shown in Figure 14.
Next to each state we have printed in parentheses
the name of the processor that was applied
to the state. The input is recognized since
[α, 0, right, above, 0, -, -, -, -, -] is in states set
S_9.
3.6 Remarks 
Use of move dot up and move dot down 
Move dot down and move dot up can be eliminated 
in the algorithm by merging the original dot and 
the position it is moved to. However for explana- 
tory purposes we chose to use these two processors 
in this paper. 
Off-line vs on-line
The algorithm given is an off-line recognizer. It 
can be very easily modified to work on line by 
adding an end marker to all initial trees in the 
grammar (see Figure 13). 
Extracting a parse 
The algorithm that we describe in section 3.3 is a 
recognizer. However, if we include pointers from 
a state to the other states which caused it to be
placed in the states set, the recognizer can be mod- 
ified to produce all parses of the input string. 
3.7 Correctness 
The correctness of the parser has been proven and
is fully reported in Schabes and Joshi (1988). It
consists of the proof of the invariant given in section 3.2.
Our proof is similar in its concept to the
proof of the correctness of Earley's parser given in
Aho and Ullman (1973). The "only if" part of the
invariant is proved by induction on the number of
states that have been added so far to all states sets.
The "if" part is proved by induction on a defined
rank of a state. The soundness (the algorithm recognizes
only valid strings) and the completeness (if
a string is valid, then the algorithm will recognize
it) are corollaries of this invariant.
3.8 Implementation 
The parser has been implemented on Symbolics
Lisp machines in Flavors. More details of the
actual implementation can be found in Schabes
and Joshi (1988). The current implementation
has an O(|G|^2 n^9) worst case time complexity and
O(|G| n^6) worst case space complexity. We have
not as yet been able to reduce the worst case time
complexity to O(|G|^2 n^6). We are currently attempting
to reduce this bound. However, the main
purpose of constructing an Earley-type parser is to
improve the average complexity, which is crucial in
practice.
4 Extensions 
We describe how substitution is defined in a TAG. 
We discuss the consequences of introducing substi- 
tution in TAGs. Then we show how substitution 
can be parsed. We extend the parser to deal with 
feature structures for TAGs. Finally the relation- 
ship with PATR-II is discussed. 
4.1 Introducing substitution in 
TAGs 
TAGs use adjunction as their basic composition 
operation. It is well known that Tree Adjoining 
Languages (TALs) are mildly context-sensitive. 
TALs properly contain context-free languages. It 
is also possible to encode a context-free grammar 
with auxiliary trees using adjunction only. However,
although the languages correspond, the possible
encoding does not reflect directly the original
context free grammar since this encoding uses adjunction.
Figure 14: States sets for the input aabbeccdd (for each states set
S_0 to S_9, the states it contains and, next to each state, the
processor that was applied to it)
Figure 15: Mechanism of substitution
Substitution is the basic operation used in CFG. 
A CFG can be viewed as a tree rewriting system. 
It uses substitution as basic operation and it con- 
sists of a set of one-level trees. Substitution is a 
less powerful operation than adjunction. 
However, recent linguistic work in TAG grammar
development (Abeillé, 1988) showed the need
for substitution in TAGs as an additional opera- 
tion for obtaining appropriate structural descrip- 
tions in certain cases such as verbs taking two sen- 
tential arguments (e.g. "John equates solving this 
problem with doing the impossible") or compound 
categories. It has also been shown to be useful 
for lexical insertion (Schabes, Abeillé and Joshi,
1988). It should be emphasized that the intro- 
duction of substitution in TAGs does not increase 
their generative capacity. Neither is it a step back 
from the original idea of TAGs. 
Definition 6 (Substitution in TAG) We define
substitution in TAGs to take place on specified
nodes on the frontiers of elementary trees. When
a node is marked to be substituted, no adjunction
can take place on that node. Furthermore, substitution
is always mandatory. Only trees derived
from initial trees rooted by a node of the same label
can be substituted on a substitution node. The
resulting tree is obtained by replacing the node by
the tree derived from the initial tree. Substitution
is illustrated in Figure 15.
We conventionally mark substitution nodes by
a down arrow (↓).
Figure 16: Writing a CFG in TAG
As a consequence, we can now encode directly 
a CFG in a TAG with substitution. The resulting 
TAG has only one-level initial trees and uses only 
substitution. An example is shown in Figure 16. 
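The direct encoding described above can be sketched as a one-line transformation per rule. The code is illustrative only and rests on assumptions not made in the paper: a CFG rule is a pair `(lhs, rhs)` with `rhs` a tuple of symbols, a one-level initial tree is a pair `(root_label, children)`, each child is a pair `(label, is_substitution_node)`, and the function name `cfg_to_tag` is hypothetical.

```python
def cfg_to_tag(rules, nonterminals):
    """Turn each CFG rule into a one-level initial tree.

    Every non-terminal on the right-hand side becomes a frontier node marked
    for substitution; terminals are ordinary leaves. An empty right-hand side
    yields a single leaf labeled by the empty string.
    """
    trees = []
    for lhs, rhs in rules:
        if rhs:
            children = [(sym, sym in nonterminals) for sym in rhs]
        else:
            children = [("", False)]
        trees.append((lhs, children))
    return trees


# Hypothetical toy grammar.
rules = [("S", ("NP", "VP")), ("VP", ("V", "NP")), ("NP", ("John",)), ("V", ("saw",))]
for tree in cfg_to_tag(rules, {"S", "NP", "VP", "V"}):
    print(tree)
```

Since no auxiliary trees are produced, a recognizer for the resulting grammar only ever needs the substitution operation, which is the point made in the paragraph above.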
4.2 Parsing substitution 
The parser can be extended very easily to handle 
substitution. We use Earley's original predictor 
and completor to handle substitution. 
Figure 17: Substitution Predictor
The left predictor is restricted to apply to nodes 
to which adjunction can be applied. 
A flag subst? is added to the states. When set,
it indicates that the (initial) tree has been predicted
for substitution. We use the index l (as
in Earley's original parser) to record where it has
been predicted for substitution. When the initial
tree that has been predicted for substitution has
been completely recognized, we complete the state as
Earley's original parser does.
A state s is now an 11-tuple
$[\alpha, \mathit{dot}, \mathit{side}, \mathit{pos}, l, f_l, f_r, \mathit{star}, t_l^*, b_l^*, \mathit{subst?}]$,
where subst? is a boolean that indicates whether
the tree has been predicted for substitution. The
other components have not been changed.
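One possible way to represent such a state is sketched below. The field names and types are assumptions made for this illustration (for instance, addresses are plain integers and unset indices are `None`); the meaning of the first ten components is the one given in the definition of states earlier in the paper.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class State:
    tree: str                   # name of the elementary tree (alpha)
    dot: int                    # address of the dotted node (integer address assumed)
    side: str                   # "left" or "right"
    pos: str                    # "above" or "below"
    l: int                      # input index at which the tree was predicted
    f_l: Optional[int] = None   # unset indices are represented by None
    f_r: Optional[int] = None
    star: Optional[int] = None
    t_l: Optional[int] = None
    b_l: Optional[int] = None
    subst: bool = False         # True if the tree was predicted for substitution


s = State("alpha", 0, "left", "above", 0)
print(len(fields(State)), s.subst)
```

Making the class frozen keeps states hashable, so a state set can be stored as an ordinary Python set without duplicates.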
We add two more processors to the parser. 
Substitution Predictor 
Suppose that there is a dot to the left of and above
a non-terminal symbol A on the frontier that is
marked for substitution (see Figure 17). Then the
algorithm predicts for substitution all initial trees 
rooted by A and tries to recognize the initial tree. 
This operation is performed by the substitution 
predictor. 
It applies to
$s = [\alpha, \mathit{dot}, \mathit{left}, \mathit{above}, l, f_l, f_r, \mathit{star}, t_l^*, b_l^*, \mathit{subst?}]$
such that $\alpha(\mathit{dot})$ is a non-terminal on the
frontier of $\alpha$ which is marked for
substitution.
It adds the states
$\{[\beta, 0, \mathit{left}, \mathit{above}, i, -, -, -, -, -, \mathit{true}] \mid \beta \text{ is an initial tree s.t. } \beta(0) = \alpha(\mathit{dot})\}$
to $S_i$.
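The substitution predictor can be sketched as follows. This is a minimal illustration, not the implementation of the system: the grammar tables `LABEL`, `SUBST` and `INITIAL`, the use of integer node addresses with 0 for the root, and the representation of the state sets as a list of Python sets are all assumptions of the sketch.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class State(NamedTuple):
    tree: str
    dot: int
    side: str
    pos: str
    l: int
    f_l: Optional[int]
    f_r: Optional[int]
    star: Optional[int]
    t_l: Optional[int]
    b_l: Optional[int]
    subst: bool


# Hypothetical toy grammar: node labels by address, address 0 is the root.
LABEL: Dict[str, Dict[int, str]] = {
    "alpha": {0: "S", 1: "NP", 2: "VP"},
    "beta": {0: "NP", 1: "John"},
}
SUBST: Dict[str, Set[int]] = {"alpha": {1, 2}, "beta": set()}  # substitution nodes
INITIAL: List[str] = ["alpha", "beta"]


def substitution_predictor(s: State, i: int, chart: List[Set[State]]) -> None:
    """Add to chart[i] every initial tree rooted by the label of the dotted node."""
    if (s.side, s.pos) != ("left", "above") or s.dot not in SUBST[s.tree]:
        return
    for beta in INITIAL:
        if LABEL[beta][0] == LABEL[s.tree][s.dot]:
            chart[i].add(
                State(beta, 0, "left", "above", i, None, None, None, None, None, True)
            )


chart: List[Set[State]] = [set()]
s = State("alpha", 1, "left", "above", 0, None, None, None, None, None, False)
substitution_predictor(s, 0, chart)
print(chart[0])
```

The predicted state records the current position i in its l component, which is what the substitution completor later uses to find the states waiting for this tree.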
Substitution Completor 
Suppose that the initial tree that we predicted for 
substitution has been recognized (see Figure 18). 
Then the algorithm should try to recognize the 
rest of the tree in which we predicted a substitu- 
tion. This operation is performed by the substi- 
tution completor. 
[Figure 18: Substitution completor]
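It applies to
$s = [\alpha, 0, \mathit{right}, \mathit{above}, l, -, -, -, -, -, \mathit{true}]$.
For all states
$s' = [\beta, \mathit{dot}', \mathit{left}, \mathit{above}, l', f_l', f_r', \mathit{star}', t_l^{*\prime}, b_l^{*\prime}, \mathit{subst?}']$
in $S_l$ s.t. $\beta(\mathit{dot}')$ is marked for
substitution and $\beta(\mathit{dot}') = \alpha(0)$,
it adds the following state to $S_i$:
$[\beta, \mathit{dot}', \mathit{right}, \mathit{above}, l', f_l', f_r', \mathit{star}', t_l^{*\prime}, b_l^{*\prime}, \mathit{subst?}']$.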
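A corresponding sketch of the substitution completor is given below, again only as an illustration. The tables `LABEL` and `SUBST`, the integer node addresses and the list-of-sets chart are assumptions of this sketch rather than details of the system.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class State(NamedTuple):
    tree: str
    dot: int
    side: str
    pos: str
    l: int
    f_l: Optional[int]
    f_r: Optional[int]
    star: Optional[int]
    t_l: Optional[int]
    b_l: Optional[int]
    subst: bool


# Hypothetical toy grammar: node labels by address, address 0 is the root.
LABEL: Dict[str, Dict[int, str]] = {
    "alpha": {0: "S", 1: "NP", 2: "VP"},
    "beta": {0: "NP", 1: "John"},
}
SUBST: Dict[str, Set[int]] = {"alpha": {1, 2}, "beta": set()}  # substitution nodes


def substitution_completor(s: State, i: int, chart: List[Set[State]]) -> None:
    """Move the dot to the right of every substitution node waiting for tree s."""
    if not (s.dot == 0 and s.side == "right" and s.pos == "above" and s.subst):
        return
    # Copy the set: chart[s.l] and chart[i] are the same set when s.l == i.
    for t in list(chart[s.l]):
        if (
            (t.side, t.pos) == ("left", "above")
            and t.dot in SUBST[t.tree]
            and LABEL[t.tree][t.dot] == LABEL[s.tree][0]
        ):
            chart[i].add(t._replace(side="right"))


waiting = State("alpha", 1, "left", "above", 0, None, None, None, None, None, False)
done = State("beta", 0, "right", "above", 0, None, None, None, None, None, True)
chart: List[Set[State]] = [{waiting}, set()]
substitution_completor(done, 1, chart)
print(chart[1])
```

Only the side component of the waiting state changes; its other components, including its own subst? flag, are copied unchanged.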
Complexity 
The introduction of the substitution predictor and 
the substitution completor does not increase the 
complexity of the overall TAG parser. 
If we encode a CFG with substitution in TAG,
the parser behaves in $O(|G|^2 n^3)$ worst case time
and $O(|G| n^2)$ worst case space like Earley's original
parser. This comes from the fact that when
there are no auxiliary trees and when only substitution
is used, the indices $f_l, f_r, t_l^*, b_l^*$ of a state
will never be set. The algorithm will use only the
substitution predictor and the substitution completor.
Thus, it behaves exactly like Earley's original
parser on CFGs.
4.3 Parsing feature structures for TAGs
The definition of feature structures for TAGs and
their semantics were proposed by Vijay-Shanker
(1987) and Vijay-Shanker and Joshi (1988). We
first explain briefly how they work in TAGs and
show how we have implemented them. We introduce
in a TAG framework a language similar
to PATR-II, which was investigated by Shieber
(1984, 1986). We then show how one
can embed the essential aspects of PATR-II in this
system.
[Figure 19: Updating of features]
[Figure 20, tree ($\alpha$): "PRO to go to the movies", with the unification equations
S.top::<tensed> = +
S.bottom::<tensed> = V.bottom::<tensed>
V.bottom::<tensed> = -]
Feature structures in TAGs 
As defined by Vijay-Shanker (1987) and Vijay-
Shanker and Joshi (1988), to each adjunction node
in an elementary tree two feature structures are at- 
tached: a top and a bottom feature structure. The 
top feature corresponds to a top view in the tree 
from the node. The bottom feature corresponds 
to the bottom view. When the derivation is com- 
pleted, the top and bottom features of all nodes 
are unified. If the top and bottom features of a 
node do not unify, then a tree must be adjoined 
at that node. 
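The final check on top and bottom features can be sketched as follows. This is a simplification made for illustration: feature structures are modelled as flat dictionaries with atomic values, whereas the DAGs used in the system can be nested and can share structure. The names `unify` and `adjunction_is_obligatory` are hypothetical.

```python
from typing import Dict, Optional

Features = Dict[str, str]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Unify two flat feature structures; return None if two values clash."""
    result = dict(a)
    for name, value in b.items():
        if name in result and result[name] != value:
            return None
        result[name] = value
    return result


def adjunction_is_obligatory(top: Features, bottom: Features) -> bool:
    """A node whose top and bottom features do not unify must receive an adjunction."""
    return unify(top, bottom) is None


print(adjunction_is_obligatory({"tensed": "+"}, {"tensed": "-"}))
print(adjunction_is_obligatory({"tensed": "+"}, {"num": "sg"}))
```

The first call reports a clash on the tensed feature, so a tree must be adjoined at such a node; the second call involves two compatible structures, so no adjunction is forced.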
This definition can be trivially extended to sub- 
stitution nodes. To each substitution node we at- 
tach two identical feature structures (top and bot- 
tom). 
The updating of features in case of adjunction 
is shown in Figure 19. 
Unification equations 
As in PATR-II, we express with unification equa- 
tions dependencies between DAGs in an elemen- 
tary tree. The system therefore consists of a TAG 
and a set of unification equations on the DAGs 
associated with nodes in elementary trees. 
An example of the use of unification equations
in TAGs is given in Figure 20. Note that the top
and bottom features of node S in $\alpha$ cannot be unified.
This forces an adjunction to be performed on
S. Thus, the following sentence is not accepted:
*to go to the movies.
The auxiliary tree $\beta_1$ can be adjoined at S in $\alpha$:
John wants to go to the movies.
But since the bottom feature of S has tensed value
- in $\alpha$ and since the bottom feature of the foot node $S_1$ has
tensed value + in $\beta_2$, $\beta_2$ cannot be adjoined at
node S in $\alpha$:
*Bob thinks to go to the movies.
But $\beta_2$ can be adjoined in $\beta_1$, which itself can be
adjoined in $\alpha$:
Bob thinks John wants to go to the movies.
[Figure 20: Example of unification equations.
Tree ($\beta_1$), "John wants S_1":
S.top::<tensed> = +
S.bottom::<tensed> = V.bottom::<tensed>
S_1.bottom::<tensed> = V.bottom::<tensed-S1>
V.bottom::<tensed-S1> = -
V.bottom::<tensed> = +
Tree ($\beta_2$), "Bob thinks S_1":
S.top::<tensed> = +
S.bottom::<tensed> = V.bottom::<tensed>
S_1.bottom::<tensed> = V.bottom::<tensed-S1>
V.bottom::<tensed-S1> = +
V.bottom::<tensed> = +]
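The three judgments of this example can be reproduced with a small sketch. It rests on several assumptions: feature structures are flat dictionaries; the update on adjunction follows the usual formulation of Vijay-Shanker and Joshi (1988) as we understand it (the top feature of the adjunction node is unified with the top feature of the root of the auxiliary tree, and its bottom feature with the bottom feature of the foot); the tensed values are read off the equations of Figure 20; and the top features of the foot nodes, which the equations do not constrain, are left empty. The names `unify` and `adjoin` are hypothetical.

```python
from typing import Dict, Optional, Tuple

Features = Dict[str, str]
NodeFeatures = Tuple[Features, Features]  # (top, bottom)


def unify(a: Features, b: Features) -> Optional[Features]:
    """Unify two flat feature structures; return None if two values clash."""
    result = dict(a)
    for name, value in b.items():
        if name in result and result[name] != value:
            return None
        result[name] = value
    return result


def adjoin(node: NodeFeatures, aux: Dict[str, NodeFeatures]) -> Optional[Dict[str, NodeFeatures]]:
    """Return the updated root and foot features, or None if adjunction is blocked."""
    root_top = unify(node[0], aux["root"][0])
    foot_bottom = unify(node[1], aux["foot"][1])
    if root_top is None or foot_bottom is None:
        return None
    return {"root": (root_top, aux["root"][1]), "foot": (aux["foot"][0], foot_bottom)}


PLUS, MINUS = {"tensed": "+"}, {"tensed": "-"}
alpha_S: NodeFeatures = (PLUS, MINUS)               # node S of alpha
beta1 = {"root": (PLUS, PLUS), "foot": ({}, MINUS)}  # "John wants S_1"
beta2 = {"root": (PLUS, PLUS), "foot": ({}, PLUS)}   # "Bob thinks S_1"

print(adjoin(alpha_S, beta1) is not None)        # beta1 at S in alpha
print(adjoin(alpha_S, beta2) is not None)        # beta2 at S in alpha
print(adjoin(beta1["root"], beta2) is not None)  # beta2 at the root S of beta1
```

Under these assumptions the second adjunction is the only one that fails, because the bottom feature of the foot of β2 clashes with the bottom feature of S in α.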
We refer the reader to Abeillé (1988) and to
Schabes, Abeillé and Joshi (1988) for further explanation
of the use of unification equations and
substitution in TAGs.
Parsing and the relationship with PATR-II
By adding to each state the set of DAGs cor- 
responding to the top and bottom features of 
each node, and by making sure that the unifica- 
tion equations are satisfied, we have extended the 
parser to parse TAGs with feature structures. 
Since we introduced substitution and since we
are able to encode a CFG directly, the system
has the main functionalities of PATR-II. The system
parses unification formalisms that have a CFG
skeleton and also those that have a TAG skeleton.
5 Conclusion 
We described an Earley-type parser for TAGs. We 
extended it to deal with substitution and feature 
structures for TAGs. By doing this, we have built 
a system that parses unification formalisms that 
have a CFG skeleton and also those that have a 
TAG skeleton. The system is being used for Tree
Adjoining Grammar development (Abeillé, 1988).
This work has led us to a new general parsing
strategy (Schabes, Abeillé and Joshi, 1988) which
allows us to construct a two-stage parser. In the 
first stage a subset of the elementary trees is ex- 
tracted and in the second stage the sentence is 
parsed with respect to this subset. This strategy 
significantly improves performance, especially as 
the grammar size increases. 
References 
Abeillé, Anne, 1988. A Computational Grammar for
French in TAG. In Proceedings of the 12th International
Conference on Computational Linguistics.
Aho, A. V. and Ullman, J. D., 1973. Theory of
Parsing, Translation and Compiling. Vol I: Parsing.
Prentice-Hall, Englewood Cliffs, NJ.
Earley, J., 1970. An Efficient Context-Free Parsing
Algorithm. Commun. ACM 13(2):94-102.
Joshi, Aravind K., 1985. How Much Context-Sensitivity
is Necessary for Characterizing Structural
Descriptions -- Tree Adjoining Grammars. In Dowty,
D.; Karttunen, L.; and Zwicky, A. (editors), Natural
Language Processing -- Theoretical, Computational
and Psychological Perspectives. Cambridge University
Press, New York. Originally presented in 1983.
Joshi, Aravind K., 1987. An Introduction to Tree Adjoining
Grammars. In Manaster-Ramer, A. (editor),
Mathematics of Language. John Benjamins, Amsterdam.
Joshi, A. K.; Levy, L. S.; and Takahashi, M., 1975.
Tree Adjunct Grammars. J. Comput. Syst. Sci. 10(1).
Kroch, A. and Joshi, A. K., 1985. Linguistic Relevance
of Tree Adjoining Grammars. Technical Report MS-
CIS-85-18, Department of Computer and Information
Science, University of Pennsylvania.
Schabes, Yves and Joshi, Aravind K., 1988. An
Earley-type Parser for Tree Adjoining Grammars.
Technical Report, Department of Computer and Information
Science, University of Pennsylvania.
Schabes, Yves; Abeillé, Anne; and Joshi, Aravind K.,
1988. New Parsing Strategies for Tree Adjoining
Grammars. In Proceedings of the 12th International
Conference on Computational Linguistics.
Shieber, Stuart M., 1984. The Design of a Computer
Language for Linguistic Information. In 22nd Meeting
of the Association for Computational Linguistics,
pages 362-366.
Shieber, Stuart M., 1986. An Introduction to Unification-Based
Approaches to Grammar. Center for the
Study of Language and Information, Stanford, CA.
Vijay-Shanker, K., 1987. A Study of Tree Adjoining
Grammars. PhD thesis, Department of Computer and
Information Science, University of Pennsylvania.
Vijay-Shanker, K. and Joshi, A. K., 1985. Some Computational
Properties of Tree Adjoining Grammars.
In 23rd Meeting of the Association for Computational
Linguistics, pages 82-93.
Vijay-Shanker, K. and Joshi, A. K., 1988. Feature
Structure Based Tree Adjoining Grammars. In Proceedings
of the 12th International Conference on Computational
Linguistics.
