BIDIRECTIONAL PARSING OF 
LEXICALIZED TREE ADJOINING GRAMMARS* 
Alberto Lavelli and Giorgio Satta 
Istituto per ia Ricerca Scientifica e Teenologica 
I - 38050 Povo TN, Italy 
e-mail: lavelli/satta@irst.it 
Abstract 
In this paper a bidirectional parser for Lexicalized 
Tree Adjoining Grammars will be presented. The 
algorithm takes advantage of a peculiar characteristic 
of Lexicalized TAGs, i.e. that each elementary tree is 
associated with a lexical item, called its anchor. The 
algorithm employs a mixed strategy: it works bot- 
tom-up from the lexical anchors and then expands 
(partial) analyses making top-down predictions. Even 
if such an algorithm does not improve tim worst-case 
time bounds of already known TAGs parsing meth- 
ods, it could be relevant from the perspective of 
linguistic information processing, because it em- 
ploys lexical information in a more direct way. 
1. Introduction 
Tree Adjoining Grammars (TAGs) are a formal- 
ism for expressing grammatical knowledge that ex- 
tends the domain of locality of context-free gram- 
mars (CFGs). TAGs are tree rewriting systems spec- 
ified by a finite set of elementary trees (for a detailed 
description of TAGs, see (Joshi, 1985)). TAGs can 
cope with various kinds of unbounded dependencies 
in a direct way because of their extended domain of 
locality; in fact, the elementary trees of TAGs are 
the appropriate domains for characterizing such de- 
pendencies. In (Kroch and Joshi, 1985) a detailed dis- 
cussion of the linguistic relevance of TAGs can be 
found. 
Lexicalized Tree Adjoining Grammars (Schabes et 
al., 1988) are a refinement of TAGs such that each 
elementary tree is associated with a lexieal item, 
called the anchor of the tree. Therefore, Lexicalized 
TAGs conform to a common tendency in modem 
theories of grammar, namely the attempt to embed 
grammatical information within lexical items. 
Notably, the association between elementary trees 
and anchors improves also parsing performance, as 
will be discussed below. 
Various parsing algorithms for TAGs have been 
proposed in the literature: the worst-case time com- 
plexity varies from O(n 4 log n) (Harbusch, 1990) to 
O(n 6) (Vijay-Shanker and Joshi, 1985, Lang, 1990, 
Schabes, 1990) and O(n 9) (Schabes and Joshi, 1988). 
*Part of this work was done while Giorgio Satta was 
completing his Doctoral Dissertation at the 
University of Padova (Italy). We would like to thank 
Yves Schabes for his valuable comments. We would 
also like to thank Anne Abeill6. All errors are of 
course our own. 
As for Lexicalized TAGs, in (Schabes et al., 1988) a 
two step algorithm has been presented: during the 
first step the trees corresponding to the input string 
are selected and in the second step the input string is 
parsed with respect to this set of trees. Another paper 
by Schabes and Joshi (1989) shows how parsing 
strategies can take advantage of lexicalization in 
order to improve parsers' performance. Two major 
advantages have been discussed in the cited work: 
grammar filtering (the parser can use only a subset 
of the entire grammar) and bottom-up information 
(further constraints are imposed on the way trees can 
be combined). Given these premises and starting 
from an already known method for bidirectional CF 
language recognition (Satta and Stock, 1989), it 
seems quite natural to propose an anchor-driven bidi- 
rectional parser for Lexicalized TAGs that tries to 
make more direct use of the information contained 
within the anchors. The algorithm employs a mixed 
strategy: it works bottom-up from the lexical an- 
chors and then expands (partial) analyses making 
top-down predictions. 
2. Overview of the Algorithm 
The algorithm that will be presented is a recog- 
nizer for Tree Adjoining Languages: a parser can be 
obtained from such a recognizer by additional pro- 
cessing (see final section). As an introduction to the 
next section, an informal description of the studied 
algorithm is here presented. We assume the follow- 
ing definition of TAGs. 
Definition 1 A Tree Adjoining Grammar (TAG) 
is a 5-tuple G=(VN, Vy, S, l, A), where VN is a 
finite set of non-terminal symbols, Vy is a finite set 
of terminal symbols, Se VN is the start symbol, 1 
and A are two finite sets of trees, called initial trees 
and auxiliary trees respectively. The trees in the set 
IuA are called elementary trees. 
We assume that the reader is familiar with the 
definitions of adjoining operation and foot node (see 
0oshi, 1985)). 
The proposed algorithm is a tabular method that 
accepts a TAG G and a string w as input, and decides 
whether we L(G). This is done by recovering 
(partial) analyses for substrings of w and by combin- 
ing them. More precisely, the algorithm factorizes 
analyses of derived trees by employing a specific 
structure called state. Each state retains a pointer to a 
node n in some tree ae luA, along with two addi- 
tional pointers (called Idol and rdot) to n itself or to 
- 27 - 
its children in a. Let an be a tree obtained from the 
maximal subtree of a with root n, by means of 
some adjoining operations. Informally speaking and 
with a little bit of simplification, the two following 
cases are possible. First, ff ldot, rdo~n, state s indi- 
cates that the part of an dominated by the nodes 
between ldot and rdot has already been analyzed by 
the algorithm. Second, if ldot=rdot=n, state s indi- 
cates that the whole of an has already been analyzed, 
including possible adjunctions to its root n. 
Each state s will be inserted into a recognition 
matrix T, which is a square matrix indexed from 0 to 
nw, where nw is the length of w. If state s belongs 
to the component tij of T, the partial analysis (the 
part of an) represented by s subsumes the substring 
of w that starts from position i and ends at position 
j, except for the items dominated by a possible foot 
node in an (this is explicitly indicated within s). 
The algorithm performs the analysis of w start- 
ing from the anchor node of every tree in G whose 
category is the same as an item in w. Then it tries to 
extend each partial analysis so obtained, by climbing 
each tree along the path that connects the anchor 
node to the root node; in doing this, the algorithm 
recognizes all possible adjunctions that are present in 
w. Most important, every subtree 7'of a tree derived 
from aEluA, such that 7'd0es not contain the an- 
chor node of a, is predicted and analyzed by the algo- 
rithm in a top-down fashion, from right to left (left 
to right) if it is located to the left (right) of the path 
that connects the anchor node to the root node in a. 
The combinations of partial analyses (states) and 
the introduction of top-down prediction states is car- 
ried out by means of the application of six proce- 
dures that will be defined below. Each procedure ap- 
plies to some states, trying to "move" outward one 
of the two additional pointers within each state. 
The algorithm stops when no state in T can be 
further expanded. If some state has been obtained that 
subsumes the input string and that represents a com- 
plete analysis for some tree with the root node of 
category S, the algorithm succeeds in the recogni- 
tion. 
3. The Algorithm 
In the following any (elementary or derived) tree 
will be denoted by a pair (N, E), where N is a finite 
set of nodes and E is a set of ordered pairs of nodes, 
called arcs. For every tree a=(N, E), we define five 
functions of N into Nu {_1_} ,l called father, leftmost- 
child, rightmost-child, left-sibling, and right-sibling 
(with the obvious meanings). For every tree a=(N, 
E) and every node n~N, a function domaina is de- 
fined such that domaindn)-'~, where/3 is the maxi- 
mal subtree in a whose root is n. 
IThe symbol "_1_" denotes here the undefined element. 
For any TAG G and for every node n in some 
tree in G, we will write cat(n)=X, X~ VNuVZ, 
whenever X is the symbol associated to n in G. For 
every node n in some tree in G, such that 
cat(n)~ VN, the set Adjoin(n) contains all root nodes 
of auxiliary trees that can be adjoined to n in G. 
Furthermore, a function x is defined such that, for 
every tree a~ luA, it holds that z(a)=n, where n 
indicates the anchor node of a. In the following we 
assume that the anchor nodes in G are not labelled 
by the null (syntactic) category symbol e. The set of 
all nodes that dominate the anchor node of some tree 
in IuA will be called Middle-nodes (anchor nodes 
included); for every tree a=(N, E), the nodes nEN in 
Middle-nodes divide a in two (possibly empty) left 
and right portions. The set Left-nodes (Right-nodes) 
is defined as the set of all nodes in the left (right) 
portion of some tree in IuA. Note that the three sets 
Middle-nodes, Left-nodes and Right-nodes constitute 
a partition of the set of all nodes of trees in IuA. 
The set of all foot nodes in the trees in A will be 
called Foot-nodes: 
Let w---a I ... anw, nw >1, be a symbol string; we 
will say that nw is the length of w. 
Definition 2 A state is defined to be any 8-tuple 
\[n, ldot, lpos, rdot, rpos, fl, fr, m\] such that: 
n, ldot, rdot are nodes in some tree ~ IuA; 
lpos, rpos~ {left, right}; 
fl, fr are either the symbol "-" or indices in the 
input string such thatfl<fr; 
mE {-, rm, Ira}. 
The first component in a state s indicates a node 
n in some tree a, such that s represents some partial 
analysis for the subtree domaina(n). The second 
component (ldot) may be n or one of its children in 
if lpos=left, domaina(ldot) is included in the par- 
tial analysis represented by s, otherwise it is not. 
The components rdot and rpos have a symmetrical 
interpretation. The pair fl, fr represents the part of 
the input string that is subsumed by the possible 
foot node in domaina(n). A binary operator indicated 
with the symbol • is defined to combine the com- 
ponents fl, fr in different states; such an operator is 
defined as follows: f~f equalsfiff= -, it equalsf if 
f= -, and it is undefined otherwise. Finally, the com- 
ponent m is a marker that will be used to block ex- 
pansion at one side for a state that has already been 
subsumed at the other one. This particular technique 
is called subsumption test and is discussed in (Satta 
and Stock, 1989). The subsumption test has the 
main purpose of blocking analysis proliferation due 
to the bidirectional behaviour of the method. 
Let IS be the set of all possible states; we will 
use a particular equivalence relation O.C- Isxls de- 
fined as follows. For any pair of states s, s', sO.s" 
holds if and only if every component in s but the 
last one (the m component) equals the corresponding 
- 28 - 
component in s'. 
The algorithm that will be presented employs the 
following function. 
Definition 3 A function F is defined as follows: 2 
F: V~, -.> ~(Is) 
F(a) = {s I s=\[father(n), n, left, n, right, -, -, -\], 
cat(n)=a and z(oO=n for some tree 
ot~ IuA } 
The details of the algorithm are as follows. 
Algorithm 1 
Let G=(VN, Vy, S, I, A) be a TAG and let w=al ... 
anw, nw >--1, be any string in V~*. Let Tbe a recogni- 
tion matrix of size (nw+l)x(nw+l) whose compo- 
nents tij are indexed from 0 to nw for both sides. 
Developmatrix T in the following way (a new slate 
s is added to some entry in T only if SOjq does not 
hold for any slate Sq already present in that entry). 
1. For every slate se F(ai), l<i<-nw, add s to ti-l,i. 
2. Process each slate s added to some entry in T by 
means of the following procedures (in any order): 
Left-expander(s), Right-expander(s), 
Move-dot-left(s), Move-dot-right(s), 
Completer(s), Adjoiner(s); 
until no state can be further added. 
3. if s=\[n, n, left, n, right,-, -, -\]e to,nw for some 
node n such that cat(n)=S and n is the root of a 
tree in I, then output(true)', else output(false). 
C3 
The six procedures mentioned above are defined 
in the following. 
Procedure 1 Left-expander 
Input A state s=\[n, ldot. lpos, rdot, rpos, fl, fr, m\] 
in ti,j. 
Precondition me-Ira, ldot~n and lpos=right. 
Description 
Case 1: ldot~ VN, ldot~ Foot-nodes. 
Step 1: For every state s'~\[ldot, ldot, left, ldot, 
right, fl", fr", -\] in ti',i, i'<_i, add slate s'=\[n, 
ldot, left, rdot, rpos, fl~fl '', frOfr '', -\] to ti,j; 
set m=rm in s if left-expansion is successful:, 
Step 2: Add state s'=\[ldot, ldot, right, ldot, right, 
-, -, -\] to ti, i. For every state s"=\[n", n", left, 
n", right, fl", fr", "\] in ti',i, i'<i, 
n" ~ Adjoin( ldot ), add state s'=\[ ldot, ldot, right, 
ldot, right, -, -, -\] to tfr"fr". 
Case 2: ldotE V~.. 3 
If ai=cat(ldot), add state s~\[n, ldot, left, rdot, 
rpos, fi, fr,-\] to ti-Ij (if eat(ldot)=e, i.e. the null 
category symbol, add state s' to tij); set m=rm 
in s if left-expansion is successful. 
Case 3: ldot~ Foot-nodes. 
Add state s~\[n, ldot, left, rdot, rpos, i', i, -\] to 
2Given a generic set ;1, the symbol P(.,q) denotes the 
set of all the subsets of .,~ (the power set of ,~). 
3We assume that a 0 is undefined. 
ti, J, for every i'<~, and set m=rm in s. Q 
Procedure 2 Right-expander 
Input A slate s=\[n, ldot, lpos, rdot, rpos, fl, fr, m\] 
in tij. 
Precondition m#m, rdotg-n and rpos=-left. 
Description 
Case 1: rdot~ VN, rdot~ Foot-nodes. 
Step 1: For every slate s"=\[rdot, rdot, left, rdot. 
• te to ° .~ ,t t rtght, fl , fr , "\] m tj,j,, j~_j , add state s =\[n, 
ldot, lpos, rdot, right, flOfl", fr~fr", "\] to 
ti "'; set m=lm in s if left-expansion is d 
successful; 
Step 2: Add state s~\[rdot, rdot, left, rdot, left, -o 
-, -\] to tjj. For every slate s"--\[n", n", left, 
n'~, right, fl", fr'. "\] in tj,j., j<j', 
n" ~ Adjoin(rdot), add state s'=\[rdot, rdot, left, 
rdot, left, -, -, -\] to tfr"f/'. 
Case 2: rdote V~. 4 
If aj+l=cat(rdot), add state s~\[n, ldot, lpos, rdot, 
rigl~t, fl,fr, "\] to ti,j+l (if cat(rdot)=e, i.e. the 
null category symbol, add state s' to tij); set 
m=Im in s if right-~xpansion is successful. 
Case 3: rdot¢ Foot-nodes. 
Add state s-in, ldot, lpos, rdot, right, j, j', -\] to 
tij', for every j<j', and set m=lm in s. t3 
Procedure 3 Move-dot-left 
Input A slate s=\[n, ldot, lpos, rdot, rpos, fl, fr, m\] 
in tij. 
Precondition m~lm, and ldot~n, lpos=left, or 
ldot=n, lpos=right. 
Description 
Case 1: lpos=right. ~ 
Add slate s~\[n, rightmost-child(n), right, rdot, 
rpos, fl, fr, -\] to tij; set m=rm in s; 
Case 2: lpos=left, left-sibling(n)~l. 
Add state s'=\[n, left-sibling(ldot), right, rdot, 
rpos, fl, fr, "\] to tij; set m=rm in s. 
Case 3: lpos=-left, left-sibling(ldot)=±. 
Add slate s'=\[n, n, left, rdot, rpos, fl, fr, -\] to tij 
and set m=rm in s. (3 
Procedure 4 Move-dot-right 
Input A slate s=\[n, ldot, lpos, rdot, rpos, fl,fr, m\] 
in tij. 
Precondition m#rm, and rdot~n, rpos=right, or 
rdot=n, rpos=-left. 
Description 
Case 1: rpos=left. 
Add slate s'=\[n, ldot, lpos, leftmost-child(n), left, 
fl, fr, -\] to tij; set m=lm in s; 
Case 2: rpos=right, right-sibling(n)~Z. 
Add state s~\[n, ldot, lpos, right-sibling(rdoO, 
left, fl, fr, "\] to ti4; set m=lm in s. 
Case 3: rpos=right, rtght-sibling(ldot)=±. 
Add state s'=\[n, ldot, lpos, n, right, fl,fr, -\] to 
tij and set m=lm in s. Q 
4See note 3. 
- 29 - 
Procedure 5 Completer 
Input A state s=\[n, n, left, n, right, fl, fr, m\] in tij. 
Precondition n is not the root of an auxiliary tree. 
Description 
Case 1: nE Middle-nodes. 
Add state s'=\[father(n), n, left, n, right, fl, fr, -\] 
to ti~ j. 
Case 2: n~Left-nodes. 
For every state s"=\[n", Idol", right, rdot, rpos, 
fl", fr", m"\] in t'f,j ,J'>J', such ,that ldot"=n and 
m"~lm, add state s =\[n , idol', left, rdot, rpos, 
fluff', fr@fr", "\] in tif; if left-expansion is 
successful for slate s', set m =rm in s. 
Case 3: nERight-nodes. 
For every state s"=\[n", Idol, lpos, rdot", left, ff', 
f,", m'q in ti',i, i'<i, such that rdot"=n and 
m"#rm, add state s- \[n", Idol, lpos, rdot", right, 
H pt • , • • * ffi~ft , f,~gf, , -\] m ti',j, ff nght-expansmn is 
successful for state s", set m"--lm in s". ~. 
Procedure 6 Adjoiner 
Input A state s=\[n, n, left, n, right, fl, fr, m\] in tij. 
Precondition Void. 
Description 
Case 1: apply always. 
For every state s"=\[n", n", left, n ", right, i, j, -\] 
• ~ .t~ • . • t¢ • • m ti'~, t _t,j~_j, n eAdjom(n), add state s'=\[n, 
n, lelt, n, right, fl, fr, "\] to ti'd'. 
Case 2: n is the root of an auxiliary tree. 
Step 1: For every state s"=\[n", n", left, n", 
~l_,fn such that right, ff', fr", "\] in ", n , left, n , 
n~ Adjoin(n"), add state "' 
right, ff', fr", -\] to ti~; , 
Step 2: For every state s =\[n', Idol", right, rdot, 
rpos, ft", fr", m"\] in tj.j,,,j'>j, such that 
ne Adjoin(Idol") and m ~lm, add state 
s'=\[ldot", Idol", right, Idol", right, -, -, -\] to 
Stepl~/:r'For every state s"=\[n", Idol, lpos, rdot", 
left, ft", fr", m'q in ti',i, i" <i, such that 
n~Adjoin(rdot") and m"~rm, add state 
s'=\[rdot", rdot", left, rdot", left, -, -, -\] to 
tftft. (:2 
4. Formal Results 
Some definitions will be introduced in the fol- 
lowing, in order to present some interesting proper- 
ties of Algorithm I. Formal proofs of the statements 
below can be found in (Satta, 1990). 
Let n be a node in some tree a~l~A. Each state 
s=\[n, Idol, lpos, rdot, rpos, fl, fr, m\] in I S identifies 
a tree forest ¢(s) composed of all maximal subtrees 
in a whose roots are "spanned" by the two positions 
Idol and rdot. If ldot~n, we assume that the maximal 
subtree in a whose root is Idol is included in ¢(s) if 
and only if lpos=left (the mirror case holds w.r.t. 
rdot). We define the subsumption relation < on I S as 
follows: s~_s' iff state s has the same first component 
as state s' and ¢(s) is included in ¢(s9. We also say 
that a forest ¢(s) derives a forest ~ (¢(s) =~ ~) 
whenever I//can be obtained from ~(s) by means of 
some adjoining operations. Finally, E denotes the 
immediate dominance relation on nodes of ae IuA, 
and ~(a) denotes the foot node of a (if a~ A). The 
following statement characterizes the set of all states 
inserted in T by Algorithm 1. 
Theorem 1 Let n be a node in a~ IuA and let n' 
be the lowest node in a such that n'~ Middle-nodes 
and (n, n°)EE*; let also s=\[n, Idol, lpos, rdot, rpos, 
fl, fr, m\] be a state in I S. Algorithm 1 inserts a state 
. ~ 0 • • s, s_s , m t i h ~j+h , hl,ha->O, if and only if one of 
. . " | . . the following condl~ons is met: 
(i) n~ Middle-nodes (n'=n) and ¢(s) =~ IV, where !// 
spans ai+l ... aj (with the exception of string 
af.t+ 1 ... aft if ~(a) is included in qJ(s)) (see Figure 1), 
(ii) n~ Left-nodes, s=s' , hl=h2=O and ¢(s) ~ V/' , 
where ~: spans ai+t ... aj (with the exception of 
string aA+ 1 ... af if ~(a) is included in ¢(s)). 
Moreover', n' is t~ root of a (maximal) subtree z 
in a such thai z ~ ~, IV strictly includes if and 
every tree/~ A that has been adjoined to some 
node in the path from n' to n spans a string that 
is included in al ... ai (see Figure 2); 
(iii) the symmetrical case of (ii). 
a i +1 "'" af t X af ,+1 "" ai 
Figure 1. 
n" 
y a i +l..aflXafr+l. aj - 
Figure 2. 
In order to present the computational complexity 
of Algorithm 1, some norms for TAGs are here in~ 
troduced. Let A be a set of nodes in some trees of a 
TAG G, we define 
IGIA, k = ~ Ichildren(n)l k • 
nE .91 
:The following result refers to the Random Access 
Machine model of computation. 
- 30 - 
Theorem 2 If some auxiliary structures (vector of 
lists) are used by Algorithm t for the bookkeeping 
of all states that correspond to completely analyzed 
auxiliary trees, a string can be recognized in 
O(nt.IAI.max{IGIN.M,I+IGIM,2}) time, where M 
=Middle-nodes and N denotes the set of all nodes in 
the trees of G. 
5. A Linguistic Example 
In order to gain a better understanding of 
Algorithm 1 and to emphasize the linguistic rele- 
vance of TAGs, we present a running example. In 
the following we assume the formal framework of 
X-bar Theory (Jackendoff, 1977). Given the sen- 
tence: 
(1) Gianni incontra Maria per caso 
lit. Gianni meets Maria by chance 
we will propose here the following analysis (see 
Figure 4): 
(2) \[ca \[c' lip \[NP Gianni\] \[r inc°ntrai \[vp* \[vP 
\[w e i \[~ Maria\]\]\] \[pp per caso\]\]\]\]\]\] 
Note that the Verb incontra has been moved to the 
Inflection position. Therefore, the PP adjunction 
stretches the dependency between the Verb incontra 
and its Direct Object Maria. These cases may raise 
some difficulties in a context-free framework, be- 
cause the lack of the head within its constituent 
makes the task of predicting the object(s) rather inef- 
ficient. 
Assume a TAG G=(VN, VZ, S, I, A), where 
VN={IP, r, vP, v', NP}, V~:={Gianni, Maria, 
incontra, PP},I={o~} andA={fl} (see Figure 3; each 
node has been paired with an integer which will be 
used as its address). In order to simplify the compu- 
tation, we have somewhat reduced the initial tree a 
and we have considered the constituent PP as a ter- 
minal symbol. In Figure 4 the whole analysis tree 
corresponding to (2) is reported. 
Let x(a)=5, z(fl)=13; from Definition 3 it fol- 
lows that: 
F(5)= {\[4, 5, left, 5, right, -, -, -\]}, 
F(13)={\[ll, 13, left, 11, right,-,-,-\]}. 
A run of Algorithm 1 on sentence (1) is simpli- 
fied in the following steps (only relevant steps are 
reported). 
First of all, the two anchors are recognized: 
1) s1=\[4, 5, left, 5, right, -, -, -\] is inserted in tl.2 
and s2=\[ll, 13, left, 13, right, -, -, -\] is 
inserted in t3,4, by line 1 of the algorithm. 
Then, auxiliary tree fl is recognized in the following 
steps: 
2) s3=\[ll, 12, right, 13, right, -, -, -\] is inserted 
in t3. 4 and m is set to rm in state s2, by Case 
2 of the move-dot-left procedure; 
3) s4=\[ll, 12, left, 13, right, 2, 3, -\] is inserted 
in t2.4 and m is set to rm in state s3, by Case 
3 of the left-expander procedure; 
4) ss=\[ll, 11, left, 13, right, 2, 3, -\] is inserted 
in t2,4 and m is set to rm in state s4, by Case 
3 of the move.dot-left procedure; 
5) st=\[11, 11, left, 11, right, 2, 3, -\] is inserted 
in h,4 and m is set to lm in state Ss, by Case 
3 of the move-dot-right procedure. 
Or: IP (I) 
(2) NP 
I 
O) Gianni 
r (4) 
incontra i (5) VP (6) I 
V' G) 
(8) e i NP I 
Mma 
(9) 
(1o) 
: VP 0;) 
(12) VP PP (13) 
per caso 
Figure 3. 
IP 
NP I' 
Giar~ mcomra i VP 
VP PP 
per caso 
V' 
¢i NP 
I Maria 
Figure 4. 
After the insertion of state s7--\[4, 5, left, 6, left, -, -, 
-1 in tl,2 by Case 2 of the move-dot-right procedure, 
the VP node (6) is hypothesized by Case 1 (Step 2, 
via state s6) of the right-expander procedure with the 
insertion of state ss-\[6, 6, left, 6, left, -, -, -1 in t2. 2. 
The whole recognition of node (6) takes place with 
the insertion of state s9;\[6, 6, left, 6, right, -, -, -1 
in: t2,3. Then we have the following step: 
6) s10=\[6, 6, left, 6, right, -, -, -\] is inserted in 
-31 
t2,4, by the adjoiner procedure. 
The analysis proceeds working on tree a and reach- 
ing a final configuration in which state s~t=\[1, 1, 
left, 1, right, -, -, -\] belongs to to,4. 
6, Discussion 
Within the perspective of Lexicalized TAGs, 
known methods for TAGs recognition/parsing pre- 
sent some limitations: these methods behave in a 
left-to-right fashion (Schabes and Joshi, 1988) or 
they are purely bottom-up (Vijay-Shanker and Joshi, 
1985, Harbusch, 1990), hence they cannot take ad- 
vantage of anchor information in a direct way. The 
presented algorithm directly exploits both the advan- 
tages of lexicalization mentioned in the paper by 
Schabes and Joshi (1989), i.e. grammar filtering and 
bottom-up information. In fact, such an algorithm 
starts partial analyses from the anchor elements, di- 
rectly selecting the relevant trees in the grammar, 
and then it proceeds in both directions, climbing to 
the roots of these trees and predicting the rest of the 
structures in a top-down fashion. These capabilities 
make the algorithm attractive from the perspective of 
linguistic information processing, even if it does not 
improve the worst-case time bounds of already 
known TAGs parsers. 
The studied algorithm recognizes auxiliary trees 
without considering the substring dominated by the 
foot node, as is the case of the CYK-like algorithm 
in Vijay-Shanker and Joshi (1985). More precisely, 
Case 3 in the procedure Left-expander nondeterminis- 
tically jumps over such a substring. Note that the 
alternative solution, which consists in waiting for 
possible analyses subsumed by the foot node, would 
prevent the algorithm from recognizing particular 
configurations, due to the bidirectional behaviour of 
the method (examples are left to the reader). On the 
contrary, Earley-like parsers for TAGs (Lang, 1990, 
Schabes, 1990) do care about substrings dominated 
by the foot node. However, these algorithms are 
forced to start at each foot node the recognition of all 
possible subtrees of the elementary trees whose roots 
can be the locus of an adjunction. 
In this work, we have discussed a theoretical 
schema for the parser, in order to study its formal 
properties. In practical cases, such an algorithm 
could be considerably improved. For example, the 
above mentioned guess in Case 3 of the procedure 
Left-expander could take advantage of look-ahead 
techniques. So far, we have not addressed topics such 
as substitution or on-line recognition. Our algorithm 
can be easily modified in these directions, adopting 
the same proposals advanced in (Schabes and Joshi, 
1988). 
Finally, a parser for Lexicalized TAGs can be 
obtained from Algorithm 1. To this purpose, it suf- 
fices to store elements in IS into the recognition 
matrix T along with a list of pointers to those en- 
tries that caused such elements to be placed in the 
matrix. Using this additional information, it is not 
difficult to exhibit an algorithm for the construction 
of the desired parser(s). 
References 
Harbusch, Karin, 1990. An Efficient Parsing 
Algorithm for TAGs. In Proceedings of the 28th 
Annual Meeting of the Association for Computational 
Linguistics. Pittsburgh, PA. 
Jackendoff, Ray, 1977. X.bar Syntax: A Study of 
Phrase Structure. The M1T Press, Cambridge, MA. 
Joshi, Aravind K., 1985. Tree Adjoining 
Grammars: How Much Context-Sensitivity Is Required 
to Provide Reasonable Structural Descriptions?. In: D. 
Dowty et al. (eds). Natural Language Parsing: 
Psychological, Computational and Theoretical 
Perspectives. Cambridge University Press, New York, 
NY. 
Kroch, Anthony S. and Joshi, Aravind K., 1985. 
Linguistic Relevance of Tree Adjoining Grammars. 
Technical Report MS-CIS-85-18, Department of 
Computer and Information Science, University of 
Pennsylvania. 
Lang, Bernard, 1990. The Systematic Construction 
of Earley Parsers: Application to the Production of 
O(n 6) Earley Parsers for Tree Adjoining Grammars. In 
Proceedings of the 1st International Workshop on 
Tree Adjoining Grammars. Dagstuhl Castle, F.R.G.. 
Satta, Giorgio, 1990. Aspetti computazionali della 
Teoria della Reggenza e del Legamento. Doctoral 
Dissertation, Univ'ersity of Padova, Italy. 
Satta, Giorgio and Stock, Oliviero, 1989. Head- 
Driven Bidirectional Parsing: A Tabular Method. In 
Proceedings of the 1st International Workshop on 
Parsing Technologies. Pittsburgh, PA. 
Schabes, Yves, 1990. Mathematical and 
Computational Aspects of Lexicalized Grammars. PhD 
Thesis, Department of Computer and Information 
Science, University of Pennsylvania. 
Schabes, Yves; Abeill6, Anne and Joshi, Aravind 
K., 1988. Parsing Strategies for 'Lexicalized' 
Grammars: Application to Tree Adjoining Grammars. 
In Proceedings of the 12th International Conference 
on Computational Linguistics. Budapest, Hungary. 
Schabes, Yves and Joshi, Aravind K., 1988. An 
Earley-Type Parsing Algorithm for Tree Adjoining 
Grammars. In Proceedings of the 26th Annual 
Meeting of the Association for Computational 
Linguistics. Buffalo, NY. 
Schabes, Yves and Joshi, Aravind K., 1989. The 
Relevance of Lexicalization to Parsing. In 
Proceedings of the 1st International Workshop on 
Parsing Technologies. Pittsburgh, PA. To also appear 
under the title: Parsing with Lexicalized Tree 
Adjoining Grammar. In: M. Tomita (ed.). Current 
Issues in Parsing Technologies. The MIT Press. 
Vijay-Shanker, K. and Joshi, Aravind K., 1985. 
Some Computational Properties of Tree Adjoining 
Grammars. In Proceedings of the 23rd Annual 
Meeting of the Association for Computational 
Linguistics. Chicago, IL. 
- 32 - 
