Proceedings of EACL '99 
Tabular Algorithms for TAG Parsing 
Miguel A. Alonso 
Departamento de Computación
Universidad de La Coruña
Campus de Elviña s/n
15071 La Coruña
SPAIN 
alonso@dc.fi.udc.es 
David Cabrero 
Departamento de Computación
Universidad de La Coruña
Campus de Elviña s/n
15071 La Coruña
SPAIN
cabrero@dc.fi.udc.es
Eric de la Clergerie 
INRIA 
Domaine de Voluceau 
Rocquencourt, B.P. 105 
78153 Le Chesnay Cedex 
FRANCE 
Eric.De_La_Clergerie@inria.fr 
Manuel Vilares 
Departamento de Computación
Universidad de La Coruña
Campus de Elviña s/n
15071 La Coruña
SPAIN 
vilares@dc.fi.udc.es 
Abstract 
We describe several tabular algorithms 
for Tree Adjoining Grammar parsing, 
creating a continuum from simple pure 
bottom-up algorithms to complex pre- 
dictive algorithms and showing what 
transformations must be applied to each 
one in order to obtain the next one in the 
continuum. 
1 Introduction 
Tree Adjoining Grammars (TAG) are an extension of
context-free grammars (CFG) introduced by Joshi (1987)
that use trees instead of productions as the primary
representing structure. Several parsing algorithms
have been proposed for this formalism, most of
them based on tabular techniques, ranging from
simple bottom-up algorithms (Vijay-Shanker and
Joshi, 1985) to sophisticated extensions of
Earley's algorithm (Schabes and Joshi, 1988;
Schabes, 1994; Nederhof, 1997). However, it is
difficult to inter-relate different parsing algorithms. In
this paper we study several tabular algorithms for 
TAG parsing, showing their common characteris- 
tics and how one algorithm can be derived from 
another in turn, creating a continuum from simple 
pure bottom-up to complex predictive algorithms. 
Formally, a TAG is a 5-tuple $G = (V_N, V_T, S, I, A)$,
where $V_N$ is a finite set of non-terminal symbols,
$V_T$ a finite set of terminal symbols, $S$ the axiom
of the grammar, $I$ a finite set of initial trees and
$A$ a finite set of auxiliary trees. $I \cup A$ is the
set of elementary trees. Internal nodes are labeled by
non-terminals and leaf nodes by terminals or
$\varepsilon$, except for just one leaf per auxiliary
tree (the foot) which is labeled by the same
non-terminal used as the label of its root node. The
path in an elementary tree from the root node to the
foot node is called the spine of the tree.
New trees are derived by adjoining: let $\gamma$ be a
tree containing a node $N^\gamma$ labeled by $A$ and let
$\beta$ be an auxiliary tree whose root and foot nodes
are also labeled by $A$. Then, the adjoining of $\beta$
at the adjunction node $N^\gamma$ is obtained by excising
the subtree of $\gamma$ with root $N^\gamma$, attaching
$\beta$ to $N^\gamma$ and attaching the excised subtree to
the foot of $\beta$. We use $\beta \in adj(N^\gamma)$ to
denote that a tree $\beta$ may be adjoined at node
$N^\gamma$ of the elementary tree $\gamma$.
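The adjoining operation can be illustrated with a minimal sketch. The `Node` class and the helpers `copy_tree`, `find_foot`, `adjoin` and `frontier` are hypothetical names chosen for this illustration (they are not part of the paper), and the sketch assumes that terminals are represented as childless nodes and that the root of the auxiliary tree is merged with the adjunction node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    foot: bool = False  # True only for the foot node of an auxiliary tree


def copy_tree(n: Node) -> Node:
    return Node(n.label, [copy_tree(c) for c in n.children], n.foot)


def find_foot(n: Node) -> Optional[Node]:
    if n.foot:
        return n
    for c in n.children:
        found = find_foot(c)
        if found is not None:
            return found
    return None


def adjoin(target: Node, beta: Node) -> None:
    """Adjoin a copy of the auxiliary tree beta at node target, in place."""
    new = copy_tree(beta)
    foot = find_foot(new)
    if foot is None or not (target.label == new.label == foot.label):
        raise ValueError("root, foot and adjunction node must share a label")
    # The excised subtree (everything below target) goes below the foot ...
    foot.children, foot.foot = target.children, False
    # ... and the auxiliary tree takes the place of the excised subtree.
    target.children = new.children


def frontier(n: Node) -> List[str]:
    if not n.children:
        return [n.label]
    return [leaf for c in n.children for leaf in frontier(c)]


# gamma = S(A(b)), beta = A(a, A*): adjoining beta at A gives S(A(a, A(b))).
a_node = Node("A", [Node("b")])
gamma = Node("S", [a_node])
beta = Node("A", [Node("a"), Node("A", foot=True)])
adjoin(a_node, beta)
result_frontier = frontier(gamma)
```

Copying the auxiliary tree before attaching it keeps the elementary tree reusable for further adjunctions elsewhere.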
In order to describe the parsing algorithms for 
TAG, we must be able to represent the partial 
recognition of elementary trees. Parsing algo- 
rithms for context-free grammars usually denote 
partial recognition of productions by dotted pro- 
ductions. We can extend this approach to the case 
of TAG by considering each elementary tree $\gamma$ as
formed by a set of context-free productions
$\mathcal{P}(\gamma)$: a node $N^\gamma$ and its children
$N^\gamma_1 \ldots N^\gamma_g$ are represented by a
production $N^\gamma \to N^\gamma_1 \ldots N^\gamma_g$. Thus,
the position of the dot in the tree is indicated by
the position of the dot in a production in $\mathcal{P}(\gamma)$.
The elements of the productions are the nodes of 
the tree, except for the case of elements belonging 
to $V_T \cup \{\varepsilon\}$ in the right-hand side of a production.
Those elements may not have children and are not 
candidates to be adjunction nodes, so we identify 
such nodes labeled by a terminal with that termi- 
nal. 
To simplify the description of parsing algorithms
we consider an additional production $\top \to R^\alpha$
for each initial tree $\alpha$ and the two additional
productions $\top \to R^\beta$ and $F^\beta \to \bot$ for
each auxiliary tree $\beta$, where $R^\beta$ and $F^\beta$
correspond to the root node and the foot node of
$\beta$, respectively. After disabling $\top$ and $\bot$
as adjunction nodes the generative capability of the
grammars remains intact.
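As an illustration of how an elementary tree can be flattened into its set of productions, including the two additional ones, consider the following sketch. The tuple encoding of nodes, the names `TOP` and `BOTTOM`, and the function `productions` are assumptions made for this example only; node labels are omitted for brevity.

```python
from typing import List, Optional, Tuple

TOP, BOTTOM = "TOP", "BOTTOM"  # stand-ins for the two extra symbols

# Hypothetical encoding: a node is (name, children); each child is either
# another node or a plain string standing for a terminal symbol.
Node = Tuple[str, list]
Production = Tuple[str, Tuple[str, ...]]


def productions(root: Node, foot_name: Optional[str] = None) -> List[Production]:
    """Return the productions of one elementary tree, extra ones included."""
    prods: List[Production] = [(TOP, (root[0],))]  # TOP -> R
    stack = [root]
    while stack:
        name, children = stack.pop()
        if name == foot_name:
            prods.append((name, (BOTTOM,)))  # F -> BOTTOM
            continue
        # Terminals are identified with themselves; inner nodes by name.
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        prods.append((name, rhs))
        stack.extend(c for c in children if not isinstance(c, str))
    return prods


# Auxiliary tree with root R_beta, a terminal "a" and the foot F_beta.
beta = ("R_beta", ["a", ("F_beta", [])])
beta_productions = productions(beta, foot_name="F_beta")
```

For an initial tree, `foot_name` is left as `None`, so only the production from `TOP` to the root is added.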
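The relation $\Rightarrow$ of derivation on
$\mathcal{P}(\gamma)$ is defined by $\delta \Rightarrow \nu$
if there are $\delta', \delta'', M^\gamma, \upsilon$ such that
$\delta = \delta' M^\gamma \delta''$, $\nu = \delta' \upsilon \delta''$
and $M^\gamma \to \upsilon \in \mathcal{P}(\gamma)$ exists.
The reflexive and transitive closure of $\Rightarrow$ is
denoted $\Rightarrow^*$.
In an abuse of notation, we also use $\Rightarrow^*$ to
represent derivations involving an adjunction. So,
$\delta \Rightarrow^* \nu$ if there are
$\delta', \delta'', M^\gamma, \upsilon$ such that
$\delta = \delta' M^\gamma \delta''$,
$R^\beta \Rightarrow^* \upsilon_1 F^\beta \upsilon_3$,
$\beta \in adj(M^\gamma)$, $M^\gamma \to \upsilon_2$
and $\nu = \delta' \upsilon_1 \upsilon_2 \upsilon_3 \delta''$.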
Given two pairs $(p,q)$ and $(i,j)$ of integers,
$(p,q) \le (i,j)$ is satisfied if $i \le p$ and $q \le j$.
Given two integers $p$ and $q$ we define $p \cup q$ as $p$
if $q$ is undefined and as $q$ if $p$ is undefined, being
undefined in other case.
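The two auxiliary operations on indices can be sketched as follows. The function names `pair_leq` and `union` are hypothetical, `None` stands for an undefined index, and treating an undefined pair as trivially contained in any span is an assumption of this sketch rather than something stated in the text.

```python
from typing import Optional, Tuple

Index = Optional[int]  # None plays the role of the undefined index "-"


def pair_leq(inner: Tuple[Index, Index], outer: Tuple[int, int]) -> bool:
    """(p, q) <= (i, j): holds when i <= p and q <= j."""
    p, q = inner
    i, j = outer
    if p is None or q is None:
        # Assumption: the undefined pair (-, -) fits inside any span.
        return True
    return i <= p and q <= j


def union(p: Index, q: Index) -> Index:
    """p U q: whichever argument is defined; an error if both are defined."""
    if q is None:
        return p
    if p is None:
        return q
    raise ValueError("p U q is undefined when both p and q are defined")
```

Raising an exception when both arguments are defined lets a deduction step treat "is defined" as an applicability condition.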
1.1 Parsing Schemata 
We will describe parsing algorithms using Parsing 
Schemata, a framework for high-level description 
of parsing algorithms (Sikkel, 1997). An interest- 
ing application of this framework is the analysis of 
the relations between different parsing algorithms 
by studying the formal relations between their un- 
derlying parsing schemata. Originally, this frame- 
work was created for context-free grammars but 
we have extended it to deal with tree adjoining 
grammars. 
A parsing system for a grammar $G$ and string
$a_1 \ldots a_n$ is a triple $(\mathcal{I}, \mathcal{H}, \mathcal{D})$,
with $\mathcal{I}$ a set of items which represent
intermediate parse results, $\mathcal{H}$ an initial set
of items called hypotheses that encodes the sentence to
be parsed, and $\mathcal{D}$ a set of deduction steps that
allow new items to be derived from already known items.
Deduction steps are of the form
$\frac{\eta_1, \ldots, \eta_k}{\xi}\; cond$, meaning that if
all antecedents $\eta_i$ of a deduction step are present
and the conditions $cond$ are satisfied, then the
consequent $\xi$ should be generated by the parser. A set
$\mathcal{F} \subseteq \mathcal{I}$ of final items represents
the recognition of a sentence. A parsing schema is a
parsing system parameterized by a grammar and a sentence.
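A parsing system of this kind can be executed by a generic agenda-driven closure. The following is a minimal sketch, not the authors' implementation: `deduce`, `scan` and `concat` are hypothetical names, and the toy steps only combine adjacent spans in order to show how hypotheses, deduction steps and final items interact.

```python
from typing import Callable, Hashable, Iterable, Optional, Set

Item = Hashable
# A step receives a newly derived item (or None, to fire steps that have no
# antecedents) plus the chart, and returns the consequents it can derive.
Step = Callable[[Optional[Item], Set[Item]], Iterable[Item]]


def deduce(hypotheses: Iterable[Item], steps: Iterable[Step]) -> Set[Item]:
    """Close the hypotheses under the deduction steps."""
    steps = list(steps)
    chart: Set[Item] = set()
    agenda = list(hypotheses)
    for step in steps:
        agenda.extend(step(None, chart))
    while agenda:
        item = agenda.pop()
        if item in chart:
            continue
        chart.add(item)
        for step in steps:
            agenda.extend(step(item, chart))
    return chart


def scan(trigger, chart):
    # ("tok", a, i-1, i)  |-  ("span", i-1, i)
    if trigger is not None and trigger[0] == "tok":
        _, _, i, j = trigger
        yield ("span", i, j)


def concat(trigger, chart):
    # ("span", i, k), ("span", k, j)  |-  ("span", i, j)
    if trigger is None or trigger[0] != "span":
        return
    _, i, k = trigger
    for other in list(chart):
        if other[0] == "span":
            if other[1] == k:
                yield ("span", i, other[2])
            if other[2] == i:
                yield ("span", other[1], k)


hypotheses = [("tok", "a", 0, 1), ("tok", "b", 1, 2), ("tok", "c", 2, 3)]
chart = deduce(hypotheses, [scan, concat])
recognized = ("span", 0, 3) in chart  # plays the role of a final item
```

Each step is tried with every new item as trigger, so a consequent is found as soon as the last of its antecedents enters the chart.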
Parsing schemata are closely related to gram- 
matical deduction systems (Shieber et al., 1995), 
where items are called formula schemata, deduction
steps are inference rules, hypotheses are axioms
and final items are goal formulas.
A parsing schema can be generalized from 
another one using the following transforma- 
tions (Sikkel, 1997): 
• Item refinement, breaking single items into
multiple items.
• Step refinement, decomposing a single deduction
step into a sequence of steps.
• Extension of a schema by considering a larger 
class of grammars. 
In order to decrease the number of items and 
deduction steps in a parsing schema, we can apply 
the following kinds of filtering: 
• Static filtering, in which redundant parts are 
simply discarded. 
• Dynamic filtering, using context information 
to determine the validity of items. 
• Step contraction, in which a sequence of de- 
duction steps is replaced by a single one. 
The set of items in a parsing system $\mathbb{P}_{Alg}$
corresponding to the parsing schema $\mathbf{Alg}$
describing a given parsing algorithm Alg is denoted
$\mathcal{I}_{Alg}$, the set of hypotheses $\mathcal{H}_{Alg}$,
the set of final items $\mathcal{F}_{Alg}$ and the set of
deduction steps $\mathcal{D}_{Alg}$.
2 A CYK-like Algorithm 
We have chosen the CYK-like algorithm for TAG 
described in (Vijay-Shanker and Joshi, 1985) as 
our starting point. Due to the intrinsic limitations 
of this pure bottom-up algorithm, the grammars 
it can deal with are restricted to those with nodes 
having at most two children. 
The tabular interpretation of this algorithm
works with items of the form

$$[N^\gamma, i, j \mid p, q \mid adj]$$

such that $N^\gamma \Rightarrow^* a_{i+1} \ldots a_p\, F^\gamma\, a_{q+1} \ldots a_j \Rightarrow^* a_{i+1} \ldots a_j$
if and only if $(p,q) \ne (-,-)$ and
$N^\gamma \Rightarrow^* a_{i+1} \ldots a_j$ if and only if
$(p,q) = (-,-)$, where $N^\gamma$ is a node of an
elementary tree with a label belonging to $V_N$.
The two indices with respect to the input string 
i and j indicate the portion of the input string that 
has been derived from $N^\gamma$. If $\gamma \in A$, $p$ and $q$ are
two indices with respect to the input string that 
indicate that part of the input string recognized 
by the foot node of $\gamma$. In other case $p = q = -$,
representing that they are undefined. The element $adj$
indicates whether adjunction has taken place on
node $N^\gamma$.
The introduction of the element $adj$ taking its
value from the set $\{true, false\}$ corrects the items
previously proposed for this kind of algorithm
in (Vijay-Shanker and Joshi, 1985) in order to
avoid several adjunctions on a node. A value of
$true$ indicates that an adjunction has taken place
in the node $N^\gamma$ and therefore further adjunctions
on the same node are forbidden. A value of $false$
indicates that no adjunction was performed on
that node. In this case, during future processing
this item can play the role of the item recognizing
the excised part of an elementary tree to be
attached to the foot node of an auxiliary tree. As a
consequence, only one adjunction can take place 
on an elementary node, as is prescribed by the 
tree adjoining grammar formalism (Schabes and 
Shieber, 1994). As an additional advantage, the 
algorithm does not need to require the restriction 
that every auxiliary tree must have at least one 
terminal symbol in its frontier (Vijay-Shanker and 
Joshi, 1985). 
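Schema 1 The parsing system $\mathbb{P}_{CYK}$ corresponding
to the CYK-like algorithm for a tree adjoining grammar $G$
and an input string $a_1 \ldots a_n$ is defined as follows:

$$\mathcal{I}_{CYK} = \{\, [N^\gamma, i, j \mid p, q \mid adj] \,\}$$

such that $N^\gamma \in \mathcal{P}(\gamma)$, $label(N^\gamma) \in V_N$,
$\gamma \in I \cup A$, $0 \le i \le j$, $(p,q) \le (i,j)$,
$adj \in \{true, false\}$

$$\mathcal{H}_{CYK} = \{\, [a, i-1, i] \mid a = a_i,\ 1 \le i \le n \,\}$$

$$\mathcal{D}^{Scan}_{CYK} = \frac{[a, i-1, i]}{[N^\gamma, i-1, i \mid -,- \mid false]} \quad N^\gamma \to a$$

$$\mathcal{D}^{\varepsilon}_{CYK} = \frac{}{[N^\gamma, i, i \mid -,- \mid false]} \quad N^\gamma \to \varepsilon$$

$$\mathcal{D}^{Foot}_{CYK} = \frac{}{[F^\gamma, i, j \mid i, j \mid false]}$$

$$\mathcal{D}^{LeftDom}_{CYK} = \frac{[M^\gamma, i, k \mid p, q \mid adj],\ [P^\gamma, k, j \mid -,- \mid adj]}{[N^\gamma, i, j \mid p, q \mid false]}$$

such that $N^\gamma \to M^\gamma P^\gamma \in \mathcal{P}(\gamma)$, $M^\gamma \in spine(\gamma)$

$$\mathcal{D}^{RightDom}_{CYK} = \frac{[M^\gamma, i, k \mid -,- \mid adj],\ [P^\gamma, k, j \mid p, q \mid adj]}{[N^\gamma, i, j \mid p, q \mid false]}$$

such that $N^\gamma \to M^\gamma P^\gamma \in \mathcal{P}(\gamma)$, $P^\gamma \in spine(\gamma)$

$$\mathcal{D}^{NoDom}_{CYK} = \frac{[M^\gamma, i, k \mid -,- \mid adj],\ [P^\gamma, k, j \mid -,- \mid adj]}{[N^\gamma, i, j \mid -,- \mid false]}$$

such that $N^\gamma \to M^\gamma P^\gamma \in \mathcal{P}(\gamma)$, $M^\gamma, P^\gamma \notin spine(\gamma)$

$$\mathcal{D}^{Unary}_{CYK} = \frac{[M^\gamma, i, j \mid p, q \mid adj]}{[N^\gamma, i, j \mid p, q \mid false]} \quad N^\gamma \to M^\gamma \in \mathcal{P}(\gamma)$$

$$\mathcal{D}^{Adj}_{CYK} = \frac{[R^\beta, i', j' \mid i, j \mid adj],\ [N^\gamma, i, j \mid p, q \mid false]}{[N^\gamma, i', j' \mid p, q \mid true]}$$

such that $\beta \in A$, $\beta \in adj(N^\gamma)$

$$\mathcal{D}_{CYK} = \mathcal{D}^{Scan}_{CYK} \cup \mathcal{D}^{\varepsilon}_{CYK} \cup \mathcal{D}^{Foot}_{CYK} \cup \mathcal{D}^{LeftDom}_{CYK} \cup \mathcal{D}^{RightDom}_{CYK} \cup \mathcal{D}^{NoDom}_{CYK} \cup \mathcal{D}^{Unary}_{CYK} \cup \mathcal{D}^{Adj}_{CYK}$$

$$\mathcal{F}_{CYK} = \{\, [R^\alpha, 0, n \mid -,- \mid adj] \mid \alpha \in I \,\}$$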
The hypotheses defined for this parsing system 
are the standard ones and therefore they will be 
omitted in the next parsing systems described in 
this paper. 
The key steps in the parsing system $\mathbb{P}_{CYK}$ are
$\mathcal{D}^{Foot}_{CYK}$ and $\mathcal{D}^{Adj}_{CYK}$, which are
in charge of the recognition of adjunctions. The other steps are in
charge of the bottom-up traversal of elementary 
trees and, in the case of auxiliary trees, the prop- 
agation of the information corresponding to the 
part of the input string recognized by the foot 
node. 
The set of deductive steps $\mathcal{D}^{Foot}_{CYK}$ makes
it possible to start the bottom-up traversal of each
auxiliary tree, as it predicts all possible parts of the
input string that can be recognized by the foot
nodes. Several parses can exist for an auxiliary
tree which only differ in the part of the input
string which was predicted for the foot node. Not
all of them need take part in a derivation, only
those with a predicted foot compatible with an
adjunction. The compatibility between the adjunction
node and the foot node of the adjoined tree is checked
by a derivation step $\mathcal{D}^{Adj}_{CYK}$: when
the root of an auxiliary tree $\beta$ has been reached,
it checks for the existence of a subtree of an
elementary tree rooted by a node $N^\gamma$ which satisfies
the following conditions:
1. $\beta$ can be adjoined on $N^\gamma$.
2. $N^\gamma$ derives the same part of the input string
derived from the foot node of $\beta$.
If the conditions are satisfied, further adjunctions
on $N^\gamma$ are forbidden and the parsing process
continues a bottom-up traversal of the rest of the
elementary tree $\gamma$ containing $N^\gamma$.
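The check performed by the Adj step of the CYK-like schema can be sketched as a function over two items. The names `CykItem`, `adj_step`, `aux_tree_of_root` and `can_adjoin` are hypothetical; the last two stand in for grammar lookups that the text assumes are available.

```python
from typing import Callable, NamedTuple, Optional


class CykItem(NamedTuple):
    node: str          # name of the node, e.g. "R_beta" or "N_gamma"
    i: int
    j: int
    p: Optional[int]   # foot span; None stands for "-"
    q: Optional[int]
    adj: bool          # True once an adjunction has taken place on the node


def adj_step(
    root_item: CykItem,
    node_item: CykItem,
    aux_tree_of_root: Callable[[str], Optional[str]],
    can_adjoin: Callable[[str, str], bool],
) -> Optional[CykItem]:
    """[R_beta, i', j' | i, j | adj], [N, i, j | p, q | false]
       |- [N, i', j' | p, q | true]"""
    beta = aux_tree_of_root(root_item.node)
    if beta is None:                      # antecedent is not a root of A
        return None
    if node_item.adj:                     # a second adjunction is forbidden
        return None
    if not can_adjoin(beta, node_item.node):          # condition 1
        return None
    if (root_item.p, root_item.q) != (node_item.i, node_item.j):
        return None                       # condition 2: foot span must match
    return CykItem(node_item.node, root_item.i, root_item.j,
                   node_item.p, node_item.q, True)


roots = {"R_beta": "beta"}
allowed = {("beta", "N_gamma")}
root = CykItem("R_beta", 0, 4, 1, 3, False)
node = CykItem("N_gamma", 1, 3, None, None, False)
adjoined = adj_step(root, node, roots.get, lambda b, n: (b, n) in allowed)
```

The consequent keeps the foot indices of the adjunction node and takes its outer span from the root of the auxiliary tree.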
3 A Bottom-up Earley-like 
Algorithm 
To overcome the limitation of binary branching in 
trees imposed by CYK-like algorithms, we define a 
bottom-up Earley-like parsing algorithm for TAG. 
As a first step we need to introduce the dotted
rules into items, which are of the form

$$[N^\gamma \to \delta \bullet \nu, i, j \mid p, q]$$

such that $\delta \Rightarrow^* a_{i+1} \ldots a_p\, F^\gamma\, a_{q+1} \ldots a_j \Rightarrow^* a_{i+1} \ldots a_j$
if and only if $(p,q) \ne (-,-)$ and
$\delta \Rightarrow^* a_{i+1} \ldots a_j$ if and only if
$(p,q) = (-,-)$.
The items of the new parsing schema, denoted
$\mathbf{buE}$, are obtained by refining the items of $\mathbf{CYK}$.
The dotted rules eliminate the need for the ele- 
ment adj indicating whether the node in the left- 
hand side of the production has been used as ad- 
junction node. 
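Schema 2 The parsing system $\mathbb{P}_{buE}$ corresponding
to the bottom-up Earley-like parsing algorithm,
given a tree adjoining grammar $G$ and an input
string $a_1 \ldots a_n$, is defined as follows:

$$\mathcal{I}_{buE} = \{\, [N^\gamma \to \delta \bullet \nu, i, j \mid p, q] \,\}$$

such that $N^\gamma \to \delta\nu \in \mathcal{P}(\gamma)$,
$\gamma \in I \cup A$, $0 \le i \le j$, $(p,q) \le (i,j)$

$$\mathcal{D}^{Init}_{buE} = \frac{}{[N^\gamma \to \bullet\delta, i, i \mid -,-]}$$

$$\mathcal{D}^{Foot}_{buE} = \frac{}{[F^\beta \to \bot\bullet, i, j \mid i, j]}$$

$$\mathcal{D}^{Scan}_{buE} = \frac{[N^\gamma \to \delta \bullet a\nu, i, j-1 \mid p, q],\ [a, j-1, j]}{[N^\gamma \to \delta a \bullet \nu, i, j \mid p, q]}$$

$$\mathcal{D}^{Comp}_{buE} = \frac{[N^\gamma \to \delta \bullet M^\gamma\nu, i, k \mid p, q],\ [M^\gamma \to \upsilon\bullet, k, j \mid p', q']}{[N^\gamma \to \delta M^\gamma \bullet \nu, i, j \mid p \cup p', q \cup q']}$$

$$\mathcal{D}^{AdjComp}_{buE} = \frac{[\top \to R^\beta\bullet, k, j \mid l, m],\ [M^\gamma \to \upsilon\bullet, l, m \mid p', q'],\ [N^\gamma \to \delta \bullet M^\gamma\nu, i, k \mid p, q]}{[N^\gamma \to \delta M^\gamma \bullet \nu, i, j \mid p \cup p', q \cup q']}$$

such that $\beta \in A$, $\beta \in adj(M^\gamma)$

$$\mathcal{D}_{buE} = \mathcal{D}^{Init}_{buE} \cup \mathcal{D}^{Foot}_{buE} \cup \mathcal{D}^{Scan}_{buE} \cup \mathcal{D}^{Comp}_{buE} \cup \mathcal{D}^{AdjComp}_{buE}$$

$$\mathcal{F}_{buE} = \{\, [\top \to R^\alpha\bullet, 0, n \mid -,-] \mid \alpha \in I \,\}$$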
The deduction steps of $\mathbb{P}_{buE}$ are obtained from
the steps in $\mathbb{P}_{CYK}$ applying the following
refinement:
• LeftDom, RightDom and NoDom deductive
steps have been split into steps Init and
Comp.
• Unary and $\varepsilon$ steps are no longer necessary,
due to the uniform treatment of all productions
independently of the length of the production.
The algorithm performs a bottom-up recognition
of the auxiliary trees applying the steps
$\mathcal{D}^{Comp}_{buE}$. During the traversal of auxiliary
trees, information about the part of the input string
recognized by the foot is propagated bottom-up. A
set of deductive steps $\mathcal{D}^{Init}_{buE}$ is in charge
of starting the recognition process, predicting all
possible start positions for each rule.
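The bottom-up propagation of the foot span through a Comp step can be sketched as follows. `DottedItem` and `comp` are hypothetical names for this illustration; `None` stands for an undefined index, and the step is rejected when both antecedents carry a defined foot span, since the union of the two is then undefined.

```python
from typing import NamedTuple, Optional, Tuple


class DottedItem(NamedTuple):
    lhs: str                 # node on the left-hand side of the production
    rhs: Tuple[str, ...]     # nodes or terminals on the right-hand side
    dot: int                 # position of the dot inside rhs
    i: int
    j: int
    p: Optional[int]         # foot span; None stands for "-"
    q: Optional[int]


def comp(parent: DottedItem, child: DottedItem) -> Optional[DottedItem]:
    """[N -> d . M v, i, k | p, q], [M -> u ., k, j | p', q']
       |- [N -> d M . v, i, j | p U p', q U q']"""
    if parent.dot >= len(parent.rhs) or parent.rhs[parent.dot] != child.lhs:
        return None          # the dot is not in front of the child's node
    if child.dot != len(child.rhs) or parent.j != child.i:
        return None          # child not complete, or spans not adjacent
    if parent.p is not None and child.p is not None:
        return None          # p U p' would be undefined
    p = parent.p if child.p is None else child.p
    q = parent.q if child.q is None else child.q
    return parent._replace(dot=parent.dot + 1, j=child.j, p=p, q=q)


start = DottedItem("N", ("M", "P"), 0, 0, 0, None, None)
spine_child = DottedItem("M", ("F",), 1, 0, 3, 1, 2)
other_child = DottedItem("P", ("c",), 1, 3, 4, None, None)
after_m = comp(start, spine_child)
after_p = comp(after_m, other_child)
```

Here the foot span (1, 2) found below `M` survives the second completion unchanged, which is the propagation described above.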
A filter has been applied to the parsing system 
$\mathbb{P}_{CYK}$, contracting the deductive steps Adj and
Comp in a single AdjComp, as the item gener- 
ated by a deductive step Adj can only be used to 
advance the dot in the rule which has been used 
to predict the left-hand side of its production. 
4 An Earley-like Algorithm 
An Earley-like parsing algorithm for TAG can be 
obtained by incorporating top-down prediction. 
To do so, two dynamic filters must be applied to 
$\mathbb{P}_{buE}$:
• The deductive steps in $\mathcal{D}^{Init}$ will only consider
productions having the root of an initial tree
as left-hand side.
• A new set $\mathcal{D}^{Pred}$ of predictive steps will be
in charge of controlling the generation of 
new items, considering only those new items 
which are potentially useful for the parsing 
process. 
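Schema 3 The parsing system $\mathbb{P}_{E}$ corresponding
to an Earley-like parsing algorithm for TAG without
the valid prefix property, given a tree adjoining
grammar $G$ and an input string $a_1 \ldots a_n$, is defined
as follows:

$$\mathcal{I}_{E} = \mathcal{I}_{buE}$$

$$\mathcal{D}^{Init}_{E} = \frac{}{[\top \to \bullet R^\alpha, 0, 0 \mid -,-]} \quad \alpha \in I$$

$$\mathcal{D}^{Pred}_{E} = \frac{[N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[M^\gamma \to \bullet\upsilon, j, j \mid -,-]}$$

$$\mathcal{D}^{AdjPred}_{E} = \frac{[N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[\top \to \bullet R^\beta, j, j \mid -,-]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{FootPred}_{E} = \frac{[F^\beta \to \bullet\bot, k, k \mid -,-],\ [N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[M^\gamma \to \bullet\upsilon, k, k \mid -,-]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{FootComp}_{E} = \frac{[M^\gamma \to \upsilon\bullet, k, l \mid p, q],\ [F^\beta \to \bullet\bot, k, k \mid -,-],\ [N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p', q']}{[F^\beta \to \bot\bullet, k, l \mid k, l]}$$

such that $\beta \in adj(M^\gamma)$, $p \cup p'$ and $q \cup q'$ are defined

$$\mathcal{D}^{AdjComp}_{E} = \frac{[\top \to R^\beta\bullet, j, m \mid k, l],\ [M^\gamma \to \upsilon\bullet, k, l \mid p, q],\ [N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p', q']}{[N^\gamma \to \delta M^\gamma \bullet \nu, i, m \mid p \cup p', q \cup q']}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}_{E} = \mathcal{D}^{Init}_{E} \cup \mathcal{D}^{Scan}_{buE} \cup \mathcal{D}^{Pred}_{E} \cup \mathcal{D}^{Comp}_{buE} \cup \mathcal{D}^{AdjPred}_{E} \cup \mathcal{D}^{FootPred}_{E} \cup \mathcal{D}^{FootComp}_{E} \cup \mathcal{D}^{AdjComp}_{E}$$

$$\mathcal{F}_{E} = \mathcal{F}_{buE}$$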
Parsing begins by creating the item corresponding
to a production having the root of an initial
tree as left-hand side and the dot in the leftmost
position of the right-hand side. Then, a set of
deductive steps Pred and Comp traverse each
elementary tree. A step in $\mathcal{D}^{AdjPred}_{E}$ predicts
the adjunction of an auxiliary tree $\beta$ in a node of an
elementary tree $\gamma$ and starts the traversal of $\beta$.
Once the foot of $\beta$ has been reached, the traversal of
$\beta$ is momentarily suspended by a step in
$\mathcal{D}^{FootPred}_{E}$, which re-takes the subtree of
$\gamma$ which must be attached to the foot of $\beta$. At
this moment, there is no information available about the
node in which the adjunction of $\beta$ has been performed,
so all possible nodes are predicted. When the traversal of a
predicted subtree has finished, a step in
$\mathcal{D}^{FootComp}_{E}$ re-takes the traversal of $\beta$
continuing at the foot node. When the traversal of $\beta$
is completely finished, a deduction step in
$\mathcal{D}^{AdjComp}_{E}$ checks if the subtree attached to
the foot of $\beta$ corresponds with the adjunction node.
With respect to steps in $\mathcal{D}^{AdjComp}_{E}$, $p$ and
$q$ are instantiated if and only if the adjunction node is
in the spine of $\gamma$.
5 The Valid Prefix Property 
Parsers satisfying the valid prefix property guaran- 
tee that, as they read the input string from left to 
right, the substrings read so far are valid prefixes
of the language defined by the grammar. More
formally, a parser satisfies the valid prefix property
if, for any substring $a_1 \ldots a_k$ read from the input
string $a_1 \ldots a_k a_{k+1} \ldots a_n$, it guarantees that
there is a string of tokens $b_1 \ldots b_m$, where $b_i$
need not be part of the input string, such that
$a_1 \ldots a_k b_1 \ldots b_m$ is a valid string of the language.
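The definition above can be restated compactly as a formula. The notation $L(G)$ for the language generated by the grammar $G$ and the predicate $read$ are introduced here only for this restatement and are assumptions, not notation used elsewhere in the paper.

```latex
\[
  \forall k \in [0, n]:\quad
  read(a_1 \ldots a_k)
  \;\Longrightarrow\;
  \exists\, m \ge 0,\ \exists\, b_1 \ldots b_m \in V_T^{*}
  \ \text{such that}\
  a_1 \ldots a_k\, b_1 \ldots b_m \in L(G)
\]
```

Here $read(a_1 \ldots a_k)$ holds when the parser has consumed the prefix $a_1 \ldots a_k$ without reporting an error.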
To maintain the valid prefix property, the parser 
must recognize all possible derived trees in prefix 
form. In order to do that, two different phases 
must work coordinately: a top-down phase that 
expands the children of each node visited and a 
bottom-up phase grouping the children nodes to 
indicate the recognition of the parent node (Sch- 
abes, 1991). 
During the recognition of a derived tree in prefix
form, node expansion can depend on adjunction
operations performed in the previously visited
part of the tree. Due to this kind of dependency
the path set is a context-free language
(Vijay-Shanker et al., 1987). A bottom-up algorithm
(e.g. CYK-like or bottom-up Earley-like) can
stack the dependencies shown by the context-free
language defining the path-set. This is sufficient
to get a correct parsing algorithm, but without 
the valid prefix property. To preserve this prop- 
erty the algorithm must have a top-down phase 
which also stacks the dependencies shown by the 
language defining the path-set. To transform an 
algorithm without the valid prefix property into 
another which preserves it is a difficult task be- 
cause stacking operations performed during top- 
down and bottom-up phases must be correlated 
some way and it is not clear how to do so with- 
out augmenting the time complexity (Nederhof, 
1997). 
CYK-like, bottom-up Earley-like and Earley-like
parsing algorithms described above do not
preserve the valid prefix property because
foot-prediction (a top-down operation) is not
restrictive enough to guarantee that the subtree attached
to the foot node really corresponds with an instance
of the tree involved in the adjunction.
To obtain an Earley-like parsing algorithm for
tree adjoining grammars preserving the valid
prefix property we need to refine the items by
including a new element to indicate the position of
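the input string corresponding to the left-most
extreme of the frontier of the tree to which the
dotted rule in the item belongs:

$$[h, N^\gamma \to \delta \bullet \nu, i, j \mid p, q]$$

such that $R^\gamma \Rightarrow^* a_{h+1} \ldots a_i\, \delta\, \nu\, \upsilon$
and $\delta \Rightarrow^* a_{i+1} \ldots a_p\, F^\gamma\, a_{q+1} \ldots a_j \Rightarrow^* a_{i+1} \ldots a_j$
if and only if $(p,q) \ne (-,-)$ and
$\delta \Rightarrow^* a_{i+1} \ldots a_j$ if and only if
$(p,q) = (-,-)$.
Thus, an item $[N^\gamma \to \delta \bullet \nu, i, j \mid p, q]$ of
$\mathbb{P}_{E}$ corresponds now with a subset of
$\{[h, N^\gamma \to \delta \bullet \nu, i, j \mid p, q]\}$ for all
$h \in [0, n]$.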
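Schema 4 The parsing system $\mathbb{P}_{Earley}$ corresponding
to an Earley-like parsing algorithm with
the valid prefix property, for a tree adjoining grammar
$G$ and an input string $a_1 \ldots a_n$, is defined as
follows:

$$\mathcal{I}_{Earley} = \{\, [h, N^\gamma \to \delta \bullet \nu, i, j \mid p, q] \,\}$$

such that $N^\gamma \to \delta\nu \in \mathcal{P}(\gamma)$,
$\gamma \in I \cup A$, $0 \le h \le i \le j$, $(p,q) \le (i,j)$

$$\mathcal{D}^{Init}_{Earley} = \frac{}{[0, \top \to \bullet R^\alpha, 0, 0 \mid -,-]} \quad \alpha \in I$$

$$\mathcal{D}^{Scan}_{Earley} = \frac{[h, N^\gamma \to \delta \bullet a\nu, i, j-1 \mid p, q],\ [a, j-1, j]}{[h, N^\gamma \to \delta a \bullet \nu, i, j \mid p, q]}$$

$$\mathcal{D}^{Pred}_{Earley} = \frac{[h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[h, M^\gamma \to \bullet\upsilon, j, j \mid -,-]}$$

$$\mathcal{D}^{Comp}_{Earley} = \frac{[h, N^\gamma \to \delta \bullet M^\gamma\nu, i, k \mid p, q],\ [h, M^\gamma \to \upsilon\bullet, k, j \mid p', q']}{[h, N^\gamma \to \delta M^\gamma \bullet \nu, i, j \mid p \cup p', q \cup q']}$$

$$\mathcal{D}^{AdjPred}_{Earley} = \frac{[h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[j, \top \to \bullet R^\beta, j, j \mid -,-]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{FootPred}_{Earley} = \frac{[j, F^\beta \to \bullet\bot, k, k \mid -,-],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[h, M^\gamma \to \bullet\upsilon, k, k \mid -,-]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{FootComp}_{Earley} = \frac{[h, M^\gamma \to \upsilon\bullet, k, l \mid p, q],\ [j, F^\beta \to \bullet\bot, k, k \mid -,-],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p', q']}{[j, F^\beta \to \bot\bullet, k, l \mid k, l]}$$

such that $\beta \in adj(M^\gamma)$, $p \cup p'$ and $q \cup q'$ are defined

$$\mathcal{D}^{AdjComp}_{Earley} = \frac{[j, \top \to R^\beta\bullet, j, m \mid k, l],\ [h, M^\gamma \to \upsilon\bullet, k, l \mid p, q],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p', q']}{[h, N^\gamma \to \delta M^\gamma \bullet \nu, i, m \mid p \cup p', q \cup q']}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}_{Earley} = \mathcal{D}^{Init}_{Earley} \cup \mathcal{D}^{Scan}_{Earley} \cup \mathcal{D}^{Pred}_{Earley} \cup \mathcal{D}^{Comp}_{Earley} \cup \mathcal{D}^{AdjPred}_{Earley} \cup \mathcal{D}^{FootPred}_{Earley} \cup \mathcal{D}^{FootComp}_{Earley} \cup \mathcal{D}^{AdjComp}_{Earley}$$

$$\mathcal{F}_{Earley} = \{\, [0, \top \to R^\alpha\bullet, 0, n \mid -,-] \mid \alpha \in I \,\}$$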
Time complexity of the Earley-like algorithm
with respect to the length $n$ of the input string is
$O(n^7)$, and it is given by steps
$\mathcal{D}^{AdjComp}_{Earley}$. Although 8 indices are
involved in a step $\mathcal{D}^{AdjComp}_{Earley}$,
partial application allows us to reduce the time
complexity to $O(n^7)$.
Algorithms without the valid prefix property
have a time complexity $O(n^6)$ with respect to the
length of the input string. The change in complexity
is due to the additional index in items of
$\mathbb{P}_{Earley}$. That index is needed to check the trees
involved in steps $\mathcal{D}^{FootPred}_{Earley}$ and
$\mathcal{D}^{FootComp}_{Earley}$. In the other steps, that
index is only propagated to the generated item. This
feature allows us to refine the steps in
$\mathcal{D}^{AdjComp}_{Earley}$, splitting them into several
steps generating intermediate items without that
index. To get a correct splitting, we must first
differentiate steps in $\mathcal{D}^{AdjComp}_{Earley}$ in which
$p$ and $q$ are instantiated from steps in
$\mathcal{D}^{AdjComp}_{Earley}$ in which $p'$ and $q'$ are
instantiated. So, we must define two new sets
$\mathcal{D}^{AdjComp^1}_{Earley}$ and
$\mathcal{D}^{AdjComp^2}_{Earley}$ of steps instead of the
single set $\mathcal{D}^{AdjComp}_{Earley}$. Additionally, in
steps in $\mathcal{D}^{AdjComp^1}_{Earley}$ we need to
introduce a new item (dynamic filtering) to guarantee the
correctness of the steps.
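$$\mathcal{D}^{AdjComp^1}_{Earley} = \frac{[j, \top \to R^\beta\bullet, j, m \mid k, l],\ [h, M^\gamma \to \upsilon\bullet, k, l \mid p, q],\ [h, F^\gamma \to \bot\bullet, p, q \mid p, q],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid -,-]}{[h, N^\gamma \to \delta M^\gamma \bullet \nu, i, m \mid p, q]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{AdjComp^2}_{Earley} = \frac{[j, \top \to R^\beta\bullet, j, m \mid k, l],\ [h, M^\gamma \to \upsilon\bullet, k, l \mid -,-],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p', q']}{[h, N^\gamma \to \delta M^\gamma \bullet \nu, i, m \mid p', q']}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}_{Earley} = \mathcal{D}^{Init}_{Earley} \cup \mathcal{D}^{Scan}_{Earley} \cup \mathcal{D}^{Pred}_{Earley} \cup \mathcal{D}^{Comp}_{Earley} \cup \mathcal{D}^{AdjPred}_{Earley} \cup \mathcal{D}^{FootPred}_{Earley} \cup \mathcal{D}^{FootComp}_{Earley} \cup \mathcal{D}^{AdjComp^1}_{Earley} \cup \mathcal{D}^{AdjComp^2}_{Earley}$$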
Now, we must refine steps in $\mathcal{D}^{AdjComp^1}_{Earley}$
into steps in $\mathcal{D}^{AdjComp^0}_{Earley}$ and
$\mathcal{D}^{AdjComp^{1'}}_{Earley}$, and refine steps in
$\mathcal{D}^{AdjComp^2}_{Earley}$ into steps in
$\mathcal{D}^{AdjComp^0}_{Earley}$ and
$\mathcal{D}^{AdjComp^{2'}}_{Earley}$. Correctness of these
splittings is guaranteed by the context-free property of
TAG (Vijay-Shanker and Weir, 1993) establishing
the independence of each adjunction with respect
to any other adjunction.
After step refinement, we get the Earley-like
parsing algorithm for TAG described in (Nederhof,
1997), which preserves the valid prefix property
having a time complexity $O(n^6)$ with respect
to the input string. In this schema we also need
to define a new kind of intermediate pseudo-items

$$[[N^\gamma \to \delta \bullet \nu, i, j \mid p, q]]$$

such that $\delta \Rightarrow^* a_{i+1} \ldots a_p\, F^\gamma\, a_{q+1} \ldots a_j \Rightarrow^* a_{i+1} \ldots a_j$
if and only if $(p,q) \ne (-,-)$ and
$\delta \Rightarrow^* a_{i+1} \ldots a_j$ if and only if
$(p,q) = (-,-)$.
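Schema 5 The parsing system $\mathbb{P}_{Earley}$ corresponding
to the final Earley-like parsing algorithm
with the valid prefix property having time
complexity $O(n^6)$, for a tree adjoining grammar $G$
and an input string $a_1 \ldots a_n$, is defined as follows:

$$\mathcal{I}^{(1)}_{Earley} = \{\, [h, N^\gamma \to \delta \bullet \nu, i, j \mid p, q] \,\}$$

such that $N^\gamma \to \delta\nu \in \mathcal{P}(\gamma)$,
$\gamma \in I \cup A$, $0 \le h \le i \le j$, $(p,q) \le (i,j)$

$$\mathcal{I}^{(2)}_{Earley} = \{\, [[N^\gamma \to \delta \bullet \nu, i, j \mid p, q]] \,\}$$

such that $N^\gamma \to \delta\nu \in \mathcal{P}(\gamma)$,
$\gamma \in I \cup A$, $0 \le i \le j$, $(p,q) \le (i,j)$

$$\mathcal{I}_{Earley} = \mathcal{I}^{(1)}_{Earley} \cup \mathcal{I}^{(2)}_{Earley}$$

$$\mathcal{D}^{Init}_{Earley} = \frac{}{[0, \top \to \bullet R^\alpha, 0, 0 \mid -,-]} \quad \alpha \in I$$

$$\mathcal{D}^{Scan}_{Earley} = \frac{[h, N^\gamma \to \delta \bullet a\nu, i, j-1 \mid p, q],\ [a, j-1, j]}{[h, N^\gamma \to \delta a \bullet \nu, i, j \mid p, q]}$$

$$\mathcal{D}^{Pred}_{Earley} = \frac{[h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[h, M^\gamma \to \bullet\upsilon, j, j \mid -,-]}$$

$$\mathcal{D}^{Comp}_{Earley} = \frac{[h, N^\gamma \to \delta \bullet M^\gamma\nu, i, k \mid p, q],\ [h, M^\gamma \to \upsilon\bullet, k, j \mid p', q']}{[h, N^\gamma \to \delta M^\gamma \bullet \nu, i, j \mid p \cup p', q \cup q']}$$

$$\mathcal{D}^{AdjPred}_{Earley} = \frac{[h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[j, \top \to \bullet R^\beta, j, j \mid -,-]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{FootPred}_{Earley} = \frac{[j, F^\beta \to \bullet\bot, k, k \mid -,-],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[h, M^\gamma \to \bullet\upsilon, k, k \mid -,-]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{FootComp}_{Earley} = \frac{[h, M^\gamma \to \upsilon\bullet, k, l \mid p, q],\ [j, F^\beta \to \bullet\bot, k, k \mid -,-],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p', q']}{[j, F^\beta \to \bot\bullet, k, l \mid k, l]}$$

such that $\beta \in adj(M^\gamma)$, $p \cup p'$ and $q \cup q'$ are defined

$$\mathcal{D}^{AdjComp^0}_{Earley} = \frac{[j, \top \to R^\beta\bullet, j, m \mid k, l],\ [h, M^\gamma \to \upsilon\bullet, k, l \mid p, q]}{[[M^\gamma \to \upsilon\bullet, j, m \mid p, q]]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{AdjComp^{1'}}_{Earley} = \frac{[[M^\gamma \to \upsilon\bullet, j, m \mid p, q]],\ [h, F^\gamma \to \bot\bullet, p, q \mid p, q],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid -,-]}{[h, N^\gamma \to \delta M^\gamma \bullet \nu, i, m \mid p, q]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}^{AdjComp^{2'}}_{Earley} = \frac{[[M^\gamma \to \upsilon\bullet, j, m \mid -,-]],\ [h, N^\gamma \to \delta \bullet M^\gamma\nu, i, j \mid p, q]}{[h, N^\gamma \to \delta M^\gamma \bullet \nu, i, m \mid p, q]}$$

such that $\beta \in adj(M^\gamma)$

$$\mathcal{D}_{Earley} = \mathcal{D}^{Init}_{Earley} \cup \mathcal{D}^{Scan}_{Earley} \cup \mathcal{D}^{Pred}_{Earley} \cup \mathcal{D}^{Comp}_{Earley} \cup \mathcal{D}^{AdjPred}_{Earley} \cup \mathcal{D}^{FootPred}_{Earley} \cup \mathcal{D}^{FootComp}_{Earley} \cup \mathcal{D}^{AdjComp^0}_{Earley} \cup \mathcal{D}^{AdjComp^{1'}}_{Earley} \cup \mathcal{D}^{AdjComp^{2'}}_{Earley}$$

$$\mathcal{F}_{Earley} = \{\, [0, \top \to R^\alpha\bullet, 0, n \mid -,-] \mid \alpha \in I \,\}$$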
6 Conclusion 
We have described a set of parsing algorithms 
for TAG creating a continuum which has the 
CYK-like parsing algorithm by (Vijay-Shanker 
and Joshi, 1985) as its starting point and the 
Earley-like parsing algorithm by (Nederhof, 1997) 
preserving the valid prefix property with time 
complexity $O(n^6)$ as its goal. As intermediate
algorithms, we have defined a bottom-up Earley-like
parsing algorithm and an Earley-like parsing
algorithm without the valid prefix property, which to
our knowledge has not been previously described
in the literature¹. We have also shown how to
transform one algorithm into the next using simple
transformations. Other algorithms could also have
been included in the continuum, but for reasons
of space we have chosen to show only the
algorithms we consider milestones in the development
of parsing algorithms for TAG.
An interesting project for the future will be to 
translate the algorithms presented here to sev- 
eral proposed automata models for TAG which 
have an associated tabulation technique: Strongly 
Driven 2-Stack Automata (de la Clergerie and 
Alonso, 1998), Bottom-up 2-Stack Automata (de 
la Clergerie et al., 1998) and Linear Indexed Au- 
tomata (Nederhof, 1998). 
7 Acknowledgments 
This work has been partially supported by
FEDER of European Union (1FD97-0047-C04-02)
and Xunta de Galicia (XUGA20402B97).
References 
Eric de la Clergerie and Miguel A. Alonso. 1998. 
A tabular interpretation of a class of 2-Stack 
Automata. In COLING-ACL '98, 36th Annual 
Meeting of the Association for Computational 
Linguistics and 17th International Conference 
on Computational Linguistics, Proceedings of 
the Conference, volume II, pages 1333-1339, 
Montreal, Quebec, Canada, August. ACL. 
Eric de la Clergerie, Miguel A. Alonso, and 
David Cabrero. 1998. A tabular interpreta- 
tion of bottom-up automata for TAG. In Proc. 
of Fourth International Workshop on Tree- 
Adjoining Grammars and Related Frameworks 
(TAG+4), pages 42-45, Philadelphia, PA, USA, 
August. 
Aravind K. Joshi. 1987. An introduction to 
tree adjoining grammars. In Alexis Manaster- 
Ramer, editor, Mathematics of Language, pages 
87-115. John Benjamins Publishing Co., Ams- 
terdam/Philadelphia. 
Mark-Jan Nederhof. 1997. Solving the correct-prefix
property for TAGs. In T. Becker and
H.-U. Krieger, editors, Proc. of the Fifth Meeting
on Mathematics of Language, pages 124-130,
Schloss Dagstuhl, Saarbruecken, Germany,
August.
¹ Other formulations of Earley-like parsing
algorithms for TAG have been previously proposed,
e.g. (Schabes, 1991).
Mark-Jan Nederhof. 1998. Linear indexed au- 
tomata and tabulation of TAG parsing. In Proc. 
of First Workshop on Tabulation in Parsing 
and Deduction (TAPD'98), pages 1-9, Paris, 
France, April. 
Yves Schabes and Aravind K. Joshi. 1988. An 
Earley-type parsing algorithm for tree adjoining 
grammars. In Proc. of 26th Annual Meeting of 
the Association for Computational Linguistics, 
pages 258-269, Buffalo, NY, USA, June. ACL. 
Yves Schabes and Stuart M. Shieber. 1994. An 
alternative conception of tree-adjoining deriva- 
tion. Computational Linguistics, 20(1):91-124. 
Yves Schabes. 1991. The valid prefix property
and left to right parsing of tree-adjoining grammar.
In Proc. of II International Workshop on
Parsing Technologies, IWPT'91, pages 21-30,
Cancún, Mexico.
Yves Schabes. 1994. Left to right parsing of 
lexicalized tree-adjoining grammars. Computa- 
tional Intelligence, 10(4):506-515. 
Stuart M. Shieber, Yves Schabes, and Fernando
C. N. Pereira. 1995. Principles and implementation
of deductive parsing. Journal of Logic
Programming, 24(1&2):3-36, July-August.
Klaas Sikkel. 1997. Parsing Schemata -- A 
Framework for Specification and Analysis of 
Parsing Algorithms. Texts in Theoretical Com- 
puter Science -- An EATCS Series. Springer- 
Verlag, Berlin/Heidelberg/New York. 
Krishnamurti Vijay-Shanker and Aravind K.
Joshi. 1985. Some computational properties of
tree adjoining grammars. In 23rd Annual Meeting
of the Association for Computational Linguistics,
pages 82-93, Chicago, IL, USA, July.
ACL.
Krishnamurti Vijay-Shanker and David J. Weir. 
1993. Parsing some constrained gram- 
mar formalisms. Computational Linguistics, 
19(4):591-636. 
Krishnamurti Vijay-Shanker, David J. Weir, and
Aravind K. Joshi. 1987. Characterizing structural
descriptions produced by various grammatical
formalisms. In Proc. of the 25th Annual
Meeting of the Association for Computational
Linguistics, pages 104-111, Buffalo, NY, USA,
June. ACL.
