OPTIMIZING THE COMPUTATIONAL LEXICALIZATION OF 
LARGE GRAMMARS 
Christian JACQUEMIN 
Institut de Recherche en Informatique de Nantes (IRIN)
IUT de Nantes - 3, rue du Maréchal Joffre
F-44041 NANTES Cedex 01 - FRANCE
e-mail: jacquemin@irin.iut-nantes.univ-nantes.fr
Abstract 
The computational lexicalization of a 
grammar is the optimization of the links 
between lexicalized rules and lexical items in 
order to improve the quality of the bottom-up 
filtering during parsing. This problem is 
NP-complete and intractable on large
grammars. An approximation algorithm is 
presented. The quality of the suboptimal 
solution is evaluated on real-world grammars as 
well as on randomly generated ones. 
Introduction 
Lexicalized grammar formalisms and more 
specifically Lexicalized Tree Adjoining 
Grammars (LTAGs) give a lexical account of 
phenomena which cannot be considered as 
purely syntactic (Schabes et al., 1990). A
formalism is said to be lexicalized if it is 
composed of structures or rules associated with 
each lexical item and operations to derive new 
structures from these elementary ones. The 
choice of the lexical anchor of a rule is 
supposed to be determined on purely linguistic 
grounds. This is the linguistic side of
lexicalization, which links each lexical head to a
set of minimal and complete structures. But
lexicalization also has a computational aspect 
because parsing algorithms for lexicalized 
grammars can take advantage of lexical links 
through a two-step strategy (Schabes and Joshi, 
1990). The first step is the selection of the set 
of rules or elementary structures associated 
with the lexical items in the input sentence¹. In
the second step, the parser uses the rules 
filtered by the first step. 
The two kinds of anchors corresponding to 
these two aspects of lexicalization can be 
considered separately:
• The linguistic anchors are used to access the 
grammar, update the data, gather together 
items with similar structures, organize the 
grammar into a hierarchy... 
• The computational anchors are used to 
select the relevant rules during the first step 
of parsing and to improve computational 
and conceptual tractability of the parsing 
algorithm. 
Unlike linguistic lexicalization, computational 
anchoring concerns any of the lexical items 
found in a rule and is only motivated by the 
quality of the induced filtering. For example, 
the systematic linguistic anchoring of the rules
describing "N_metal alloy" to their head noun
"alloy" should be avoided and replaced by a
more distributed lexicalization. In this way, only a
few "N_metal alloy" rules will be activated when
encountering the word "alloy" in the input.
In this paper, we investigate the problem of 
the optimization of computational 
lexicalization. We study how to choose the 
computational anchors of a lexicalized 
grammar so that the distribution of the rules
onto the lexical items is as uniform as possible
with respect to rule weights.¹ Although
introduced with reference to LTAGs, this 
optimization concerns any portion of a 
grammar where rules include one or more 
potential lexical anchors such as Head Driven 
Phrase Structure Grammar (Pollard and Sag, 
1987) or Lexicalized Context-Free Grammar 
(Schabes and Waters, 1993).

¹ The computational anchor of a rule should not be
optional (viz. included in a disjunction), to make sure
that it will be encountered in any string derived from
this rule.
This algorithm is currently used to good 
effect in FASTR, a unification-based parser for
terminology extraction from large corpora 
(Jacquemin, 1994). In this framework, terms 
are represented by rules in a lexicalized 
constraint-based formalism. Due to the large 
size of the grammar, the quality of the 
lexicalization is a determining factor for the 
computational tractability of the application. 
FASTR is applied to automatic indexing on 
industrial data and places strong emphasis on
the handling of term variations (Jacquemin and
Royauté, 1994).
The remainder of this paper is organized as 
follows. In the following part, we prove that the 
problem of the Lexicalization of a Grammar is 
NP-complete and hence that there is no better 
algorithm known to solve it than an 
exponential exhaustive search. As this solution
is intractable on large data, an approximation
algorithm is presented whose running time is
cubic in the size of the grammar. In the last part,
an evaluation of this algorithm on real-world 
grammars of 6,622 and 71,623 rules as well as 
on randomly generated ones confirms its 
computational tractability and the quality of 
the lexicalization. 
The Problem of the 
Lexicalization of a Grammar
Given a lexicalized grammar, this part describes 
the problem of the optimization of the 
computational lexicalization. The solution to 
this problem is a lexicalization function 
(henceforth a lexicalization) which associates with
each grammar rule one of the lexical items it
includes (its lexical anchor). A lexicalization is
optimized in our sense if it induces an optimal
preprocessing of the grammar. Preprocessing is
intended to activate the rules whose lexical
anchors are in the input and to perform as much
filtering of these rules as possible before the
parsing proper. Mainly, preprocessing discards
those selected rules that include at least one
lexical item not found in the input.
The first step of the optimization of the 
lexicalization is to assign a weight to each rule. 
The weight is assumed to represent the cost of 
the corresponding rule during the 
preprocessing. For a given lexicalization, the 
weight of a lexical item is the sum of the 
weights of the rules linked to it. The weights 
are chosen so that a uniform distribution of the 
rules onto the lexical items ensures an optimal
preprocessing. Thus, the problem is to find an 
anchoring which achieves such a uniform 
distribution. 
The weights depend on the physical 
constraints of the system. For example, the 
weight is the number of nodes if the memory 
size is the critical point. In this case, a uniform 
distribution ensures that the rules linked to an 
item will not require more than a given 
memory space. The weight is the number of 
terminal or non-terminal nodes if the 
computational cost has to be minimized. 
Experimental measures can be performed on a 
test set of rules in order to determine the most 
accurate weight assignment. 
Two simplifying assumptions are made:
• The weight of a rule does not depend on the
lexical item to which it is anchored.
• The weight of a rule does not depend on the
other rules simultaneously activated.
The second assumption is essential for setting up
a tractable problem. The first assumption can
be avoided at the cost of a more complex 
representation. In this case, instead of having a 
unique weight, a rule must have as many 
weights as potential lexical anchors. Apart from 
this modification, the algorithm that will be 
presented in the next part remains much the 
same as in the case of a single weight. If the
first assumption is removed, data about the 
frequency of the items in corpora can be 
accounted for. Assigning smaller weights to 
rules when they are anchored to rare items will 
make the algorithm favor the anchoring to 
these items. Thus, due to their rareness, the 
corresponding rules will be rarely selected. 
Illustration Terms, compounds and more 
generally idioms require a lexicalized syntactic 
representation such as LTAGs to account for 
the syntax of these lexical entries (Abeillé and
Schabes, 1989). The grammars chosen to
illustrate the problem of the optimization of the
lexicalization and to evaluate the algorithm
consist of idiom rules such as the set I:
I = {from time to time, high time,
high grade, high grade steel}
Each rule is represented by a pair (w_i, A_i) where
w_i is the weight and A_i the set of potential
anchors. If we choose the total number of
words in an idiom as its weight and its non-
empty words as its potential anchors, I is
represented by the following grammar:
G_1 = {a = (4, {time}), b = (2, {high, time}),
c = (2, {grade, high}),
d = (3, {grade, high, steel})}
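For concreteness, this representation can be written down directly. The following Python sketch (our own encoding, not part of the formalism) states G_1 as weighted rules with potential-anchor sets and derives the vocabulary from it:

# Each rule is a pair (weight, set of potential anchors): the
# weight is the number of words in the idiom and the potential
# anchors are its non-empty words.
G1 = {
    "a": (4, {"time"}),                    # from time to time
    "b": (2, {"high", "time"}),            # high time
    "c": (2, {"grade", "high"}),           # high grade
    "d": (3, {"grade", "high", "steel"}),  # high grade steel
}

# The vocabulary V is the union of all the potential-anchor sets.
V = set().union(*(A for _, A in G1.values()))
assert V == {"grade", "high", "steel", "time"}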
We call the union V of all the sets of potential
anchors A_i the vocabulary. Here, V = {grade,
high, steel, time}. A lexicalization is a function λ
associating a lexical anchor with each rule.
Given a threshold θ, the membership
problem called the Lexicalization of a
Grammar (LG) is to find a lexicalization such that
the weight of any lexical item in V is less than
or equal to θ. If θ ≥ 4 in the preceding
example, LG has a solution λ:
λ(a) = time, λ(b) = λ(c) = high,
λ(d) = steel
If θ ≤ 3, LG has no solution.
Definition of the LG Problem 
G = {(w_i, A_i)} (w_i ∈ ℚ⁺, A_i finite sets)
V = {v_i} = ∪ A_i ; θ ∈ ℚ⁺
(1) LG ≡ {(V, G, θ, λ) | λ : G → V is a
total function anchoring the rules so that
(∀(w, A) ∈ G) λ((w, A)) ∈ A
and (∀v ∈ V) Σ_{λ((w,A)) = v} w ≤ θ}
The associated optimization problem is to 
determine the lowest value θ_opt of the threshold
θ such that there exists a solution (V, G, θ_opt, λ)
to LG. The solution of the optimization problem
for the preceding example is θ_opt = 4.
Lemma LG is in NP.
It is evident that checking whether a given
lexicalization is indeed a solution to LG can be
done in polynomial time. The relation R
defined by (2) is polynomially decidable:
(2) R(V, G, θ, λ) ≡ [if λ : G → V is total
and (∀v ∈ V) Σ_{λ((w,A)) = v} w ≤ θ
then true else false]
The weights of the items can be computed
through matrix products: a matrix for the
grammar and a matrix for the lexicalization.
The size of any lexicalization λ is linear in the
size of the grammar. As (V, G, θ, λ) ∈ LG if and
only if R(V, G, θ, λ) is true, LG is in NP. ∎
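As a concrete illustration of this polynomial-time check, here is a minimal Python sketch of relation (2), reusing the dictionary encoding of G_1 above (the function name and error handling are ours):

def is_solution(G, theta, anchor):
    """R(V, G, theta, anchor): 'anchor' must map every rule to one
    of its potential anchors without overloading any item."""
    load = {}
    for name, (w, anchors) in G.items():
        v = anchor.get(name)
        if v is None or v not in anchors:  # not a total anchoring
            return False
        load[v] = load.get(v, 0) + w       # weight carried by item v
    return all(weight <= theta for weight in load.values())

# The lexicalization given above for G1 meets the threshold 4,
# but no threshold below it.
anchor = {"a": "time", "b": "high", "c": "high", "d": "steel"}
assert is_solution(G1, 4, anchor)
assert not is_solution(G1, 3, anchor)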
Theorem LG is NP-complete.
Bin Packing (BP), which is NP-complete, is
polynomial-time Karp reducible to LG. BP
(Baase, 1978) is the problem defined by (3):
(3) BP ≡ {(R, {R_1, …, R_k}) | where
R = {r_1, …, r_n} is a set of n positive
rational numbers less than or equal to 1
and {R_1, …, R_k} is a partition of R (k bins
in which the r_j are packed) such that
(∀i ∈ {1, …, k}) Σ_{r ∈ R_i} r ≤ 1}
First, any instance of BP can be represented as
an instance of LG. Let (R, {R_1, …, R_k}) be an
instance of BP; it is transformed into the
instance (V, G, θ, λ) of LG as follows:
(4) V = {v_1, …, v_k} a set of k symbols, θ = 1,
G = {(r_1, V), …, (r_n, V)}
and (∀i ∈ {1, …, k}) (∀j ∈ {1, …, n})
λ((r_j, V)) = v_i ⇔ r_j ∈ R_i
For all i ∈ {1, …, k} and j ∈ {1, …, n}, we
consider the assignment of r_j to the bin R_i of
BP as the anchoring of the rule (r_j, V) to the
item v_i of LG. If (R, {R_1, …, R_k}) ∈ BP then:
(5) (∀i ∈ {1, …, k}) Σ_{r ∈ R_i} r ≤ 1
⇔ (∀i ∈ {1, …, k}) Σ_{λ((r,V)) = v_i} r ≤ 1
Thus (V, G, 1, λ) ∈ LG. Conversely, given a
solution (V, G, 1, λ) of LG, let R_i ≡ {r_j ∈ R |
λ((r_j, V)) = v_i} for all i ∈ {1, …, k}. Clearly
{R_1, …, R_k} is a partition of R because the
lexicalization is a total function, and the
preceding formula ensures that each bin is
correctly loaded. Thus (R, {R_1, …, R_k}) ∈ BP. It
is also simple to verify that the transformation
from BP to LG can be performed in
polynomial time. ∎
The optimization version of an NP-complete
problem is NP-complete (Sommerhalder and
van Westrhenen, 1988); hence the optimization
version of LG is also NP-complete.
An Approximation Algorithm
for LG
This part presents and evaluates an O(n³)-time
approximation algorithm for the LG problem
which yields a suboptimal solution close to the
optimal one. The first step is the 'easy'
anchoring of rules including at least one rare
lexical item to one of these items. The second
step handles the 'hard' lexicalization of the
remaining rules, which include only common
items found in several other rules and for which
the decision is not straightforward. The
discrimination between these two kinds of items
is made on the basis of their global weight GW
(6), which is the sum of the weights of the rules
which are not yet anchored and which have this
lemma as potential anchor. V_λ and G_λ are the
subsets of V and G which denote the items and
the rules not yet anchored. The w's and θ are
assumed to be integers, multiplying them by
their lowest common denominator if necessary.
(6) (∀v ∈ V_λ) GW(v) = Σ_{(w,A) ∈ G_λ, v ∈ A} w
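Definition (6) translates directly into code. The sketch below (same dictionary encoding as before, with G_lambda the set of names of the not-yet-anchored rules) computes GW for every item at once:

def global_weights(G, G_lambda):
    """GW(v): total weight of the not-yet-anchored rules having v
    among their potential anchors (definition (6))."""
    GW = {}
    for name in G_lambda:
        w, anchors = G[name]
        for v in anchors:
            GW[v] = GW.get(v, 0) + w
    return GW

# On G1 with no rule anchored yet:
# global_weights(G1, set(G1))
#   == {"time": 6, "high": 7, "grade": 5, "steel": 3}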
Step 1: 'Easy' Lexicalization of Rare Items
This first step of the optimization algorithm is
also the first step of the exhaustive search. The
value of the minimal threshold θ_min given by
(7) is computed by dividing the sum of the rule
weights by the number of lemmas (⌈x⌉ stands
for the smallest integer greater than or equal to
x and |V_λ| stands for the size of the set V_λ):
(7) θ_min = ⌈ Σ_{(w,A) ∈ G_λ} w / |V_λ| ⌉ where |V_λ| ≠ 0
All the rules which include a lemma with a
global weight less than or equal to θ_min are
anchored to this lemma. Once this linking has
been carried out (in a non-deterministic order),
θ_min is recomputed. The algorithm loops on this
lexicalization, starting it from scratch every
time, until θ_min remains unchanged or until all
the rules are anchored. The output value of θ_min
is a lower bound on the minimal threshold for
which LG has a solution, and is therefore less
than or equal to θ_opt. After Step 1, either each
rule is anchored or all the remaining items in V_λ
have a global weight strictly greater than θ_min.
The algorithm is shown in Figure 1.
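On the example grammar G_1, (7) gives θ_min = ⌈(4+2+2+3)/4⌉ = ⌈11/4⌉ = 3 before any anchoring. A small sketch of this computation (the helper name is ours; the double negation performs integer ceiling division without floating point):

def theta_min(rules, items):
    """Definition (7): total remaining weight divided by the number
    of remaining items, rounded up."""
    total = sum(w for w, _ in rules.values())
    return -(-total // len(items))        # integer ceiling division

# Before any anchoring, theta_min(G1, V) == -(-11 // 4) == 3.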
Step 2: 'Hard' Lexicalization of Common
Items During this step, the algorithm
repeatedly removes an item from the remaining
vocabulary and yields the anchoring of this
item. The item with the lowest global weight is
handled first because it allows the fewest
anchoring combinations and hence the
probability of making a wrong choice for the
lexicalization is low. Given an item, the
candidate rules with this item as potential
anchor are ranked as follows:
1 The highest priority is given to the rules
whose set of potential anchors includes
the current item as their only non-anchored item.
2 The remaining candidate rules taken first
are the ones whose potential anchors have
the highest global weights (items found in
several other non-anchored rules).
The algorithm is shown in Figure 2. The
output of Step 2 is the suboptimal
computational lexicalization λ of the whole
grammar and the associated threshold θ_subopt.
Both steps can be optimized. Useless
computation is avoided by watching the capital
of weight C defined by (8), with θ = θ_min during
Step 1 and θ = θ_subopt during Step 2:
(8) C = θ · |V_λ| − Σ_{(w,A) ∈ G_λ} w
C corresponds to the weight which can be lost
by giving an item a weight W(v) which is strictly
less than the current threshold θ. Every time the
anchoring of an item v is completed, C is
reduced by θ − W(v). If C becomes negative
in either of the two steps, the algorithm will fail
to achieve the lexicalization of the grammar and
must be started again from Step 1 with a higher
value for θ.
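A small sketch of this bookkeeping under the same conventions as above (whether to recompute C or maintain it incrementally is an implementation choice):

def capital(theta, V_lambda, G_lambda, G):
    """C of definition (8): the slack between what the remaining
    items may still carry and what remains to be anchored.
    A negative C means the threshold theta cannot be met."""
    remaining = sum(G[name][0] for name in G_lambda)
    return theta * len(V_lambda) - remaining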
Input:   V, G
Output:  θ_min, V_λ, G_λ, λ : (G − G_λ) → (V − V_λ)

Step 1
θ_min ← ⌈ Σ_{(w,A) ∈ G} w / |V| ⌉ ;
repeat
    G_λ ← G ; V_λ ← V ;
    for each v ∈ V such that GW(v) ≤ θ_min do
        for each (w, A) ∈ G such that v ∈ A
                and λ((w, A)) not yet defined do
            λ((w, A)) ← v ;
            G_λ ← G_λ − {(w, A)} ;
            update GW(v) ;
        end
        V_λ ← V_λ − {v} ;
    end
    θ′_min ← ⌈ Σ_{(w,A) ∈ G_λ} w / |V_λ| ⌉ ;
    if ( ( θ′_min ≤ θ_min
           and ( (∀v ∈ V_λ) GW(v) > θ_min ) )
         or G_λ = ∅ )
        then exit repeat ;
    θ_min ← θ′_min ;
until ( false ) ;
Figure 1: Step 1 of the approximation algorithm.
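For readers who prefer running code to pseudocode, here is one possible Python transcription of Figure 1, reusing global_weights() from the earlier sketch (vocabulary() is defined here). The inner worklist loop is our reading of the figure's non-deterministic item ordering:

def vocabulary(G):
    """Union of the potential-anchor sets of all rules."""
    return set().union(*(A for _, A in G.values()))

def step1(G):
    """'Easy' anchoring of the rules containing a rare item
    (a Python reading of Figure 1)."""
    total = sum(w for w, _ in G.values())
    theta_min = -(-total // len(vocabulary(G)))       # integer ceiling
    while True:
        anchor, G_lambda, V_lambda = {}, set(G), vocabulary(G)
        GW = global_weights(G, G_lambda)
        changed = True
        while changed:                  # anchor every item whose global
            changed = False             # weight falls at or below theta_min
            for v in list(V_lambda):
                if GW.get(v, 0) <= theta_min:
                    for name in [n for n in G_lambda if v in G[n][1]]:
                        anchor[name] = v
                        G_lambda.remove(name)
                        for u in G[name][1]:
                            GW[u] -= G[name][0]       # update global weights
                    V_lambda.remove(v)
                    changed = True
        if not G_lambda:                              # every rule anchored
            return theta_min, anchor, G_lambda, V_lambda
        remaining = sum(G[n][0] for n in G_lambda)
        new_theta = -(-remaining // len(V_lambda))    # recomputed threshold
        if new_theta <= theta_min:                    # theta_min stabilized
            return theta_min, anchor, G_lambda, V_lambda
        theta_min = new_theta

On G_1, this sketch returns θ_min = 4 with every rule anchored, matching the optimal threshold of the running example.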
Input:   θ_min, V, G, V_λ, G_λ,
         λ : (G − G_λ) → (V − V_λ)
Output:  θ_subopt, λ : G → V

Step 2
θ_subopt ← θ_min ;
repeat
    ;; anchoring the rules with only v̂ as
    ;; free potential anchor (v̂ ∈ V_λ with
    ;; the lowest global weight)
    v̂ ← argmin_{v ∈ V_λ} GW(v) ;
    G_{v̂,1} ← { (w, A) ∈ G_λ | A ∩ V_λ = {v̂} } ;
    if ( Σ_{(w,A) ∈ G_{v̂,1}} w > θ_subopt )
        then θ_min ← θ_min + 1 ; goto Step 1 ;
    for each (w, A) ∈ G_{v̂,1} do
        λ((w, A)) ← v̂ ;
        G_λ ← G_λ − {(w, A)} ;
    end
    G_{v̂,2} ← { (w, A) ∈ G_λ | A ∩ V_λ ⊃ {v̂} } ;
    W(v̂) ← Σ_{λ((w,A)) = v̂} w ;
    ;; ranking² G_{v̂,2} and anchoring
    for ( i ← 1 ; i ≤ |G_{v̂,2}| ; i ← i + 1 ) do
        (w, A) ← r⁻¹(i) ;  ;; i-th rule ranked by r
        if ( W(v̂) + w > θ_subopt )
            then exit for ;
        W(v̂) ← W(v̂) + w ;
        λ((w, A)) ← v̂ ;
        G_λ ← G_λ − {(w, A)} ;
    end
    V_λ ← V_λ − {v̂} ;
until ( G_λ = ∅ ) ;
Figure 2: Step 2 of the approximation algorithm.
² The ranking function r : G_{v̂,2} → {1, …, |G_{v̂,2}|} is
such that r((w, A)) < r((w', A'))
⇔ min_{v ∈ A ∩ V_λ − {v̂}} GW(v) ≥ min_{v' ∈ A' ∩ V_λ − {v̂}} GW(v')
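The following Python sketch mirrors Figure 2 under the same conventions as the earlier sketches. One simplification should be flagged: on failure, the figure restarts from Step 1 with an incremented threshold, whereas this sketch simply retries Step 2 with a higher threshold; the function names are ours:

def step2(G, theta, anchor, G_lambda, V_lambda):
    """'Hard' anchoring of the remaining rules, lowest-global-weight
    item first; returns the completed anchoring or None on failure."""
    while G_lambda:
        GW = global_weights(G, G_lambda)
        v = min(V_lambda, key=lambda u: GW.get(u, 0))
        # Rules whose only free potential anchor is v are forced.
        forced = [n for n in G_lambda if G[n][1] & V_lambda == {v}]
        load = sum(G[n][0] for n in forced)
        if load > theta:
            return None                 # threshold cannot be met
        for n in forced:
            anchor[n] = v
            G_lambda.remove(n)
        # Other candidates, ranked so that rules whose remaining free
        # anchors have the highest global weights are taken first.
        cands = [n for n in G_lambda if v in G[n][1]]
        cands.sort(key=lambda n: min(GW[u] for u in G[n][1] & V_lambda - {v}),
                   reverse=True)
        for n in cands:
            if load + G[n][0] > theta:
                break                   # ranked, so stop at the overflow
            load += G[n][0]
            anchor[n] = v
            G_lambda.remove(n)
        V_lambda.remove(v)
    return anchor

def lexicalize(G):
    """Approximation algorithm: Step 1, then Step 2, retried with a
    higher threshold whenever Step 2 fails."""
    theta, anchor, G_lambda, V_lambda = step1(G)
    while True:
        result = step2(G, theta, dict(anchor), set(G_lambda), set(V_lambda))
        if result is not None:
            return theta, result        # theta_subopt and lambda
        theta += 1

On grammars where Step 1 leaves rules unanchored, lexicalize returns the suboptimal threshold θ_subopt together with a complete anchoring λ.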
Example³ The algorithm has been applied to
a test grammar G_2 obtained from 41 terms with
11 potential anchors. The algorithm fails to
lexicalize G_2 with the minimal threshold
θ_min = 12, but succeeds with θ_subopt = 13.
This value of θ_subopt can be compared with the
optimal one by running the exhaustive search.
There are 2³² (≈ 4·10⁹) possible lexicalizations,
among which 35,336 are optimal ones with a
threshold of 13. This result shows that the
approximation algorithm brings forth one of the
optimal solutions, which represent only a
proportion of 8·10⁻⁶ of the possible
lexicalizations. In this case the optimal and the
suboptimal threshold coincide.
Time-Complexity of the Approximation
Algorithm A grammar G on a vocabulary V
can be represented by a |G| × |V| matrix of
Boolean values for the set of potential anchors
and a 1 × |G| matrix for the weights. In order
to evaluate the complexity of the algorithms as
a function of the size of the grammar, we
assume that |V| and |G| are of the same order
of magnitude n. Step 1 of the algorithm
corresponds to products and sums on the
preceding matrices and takes O(n³) time. The
worst-case time-complexity for Step 2 of the
algorithm is also O(n³) when using a naive
O(n²) algorithm to sort the items and the rules
by decreasing priority. In all, the time required
by the approximation algorithm is cubic in the
size of the grammar.
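This matrix view can be made concrete. With numpy (an implementation choice of ours, not the paper's), the global weights of all items come from one product of the weight vector with the Boolean anchor matrix, using the G_1 encoding from the earlier sketches:

import numpy as np

# M[i, j] = True iff item items[j] is a potential anchor of rule i.
items = sorted(V)
M = np.array([[v in A for v in items] for _, A in G1.values()])
w = np.array([weight for weight, _ in G1.values()])

GW = w @ M   # all global weights in one O(|G| x |V|) product
# -> array([5, 7, 3, 6]) for items ['grade', 'high', 'steel', 'time']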
This order of magnitude ensures that the
algorithm can be applied to large real-world
grammars such as terminological grammars.
On a Sparc 2, the lexicalization of a
terminological grammar composed of 6,622
rules and 3,256 words requires 3 seconds (real
time) and the lexicalization of a very large
terminological grammar of 71,623 rules and
38,536 single words takes 196 seconds. The
two grammars used for these experiments were
generated from two lists of terms provided by
the documentation center INIST/CNRS.

³ The exhaustive grammar and more details about this
example and the computations of the following
section are in (Jacquemin, 1991).
Evaluation of the 
Approximation Algorithm 
Benchmarks on Artificial Grammars In
order to check the quality of the lexicalization
on different kinds of grammars, the algorithm
has been tested on eight randomly generated
grammars of 4,000 rules having from 2 to 10
potential anchors (Table 1). The lexicon of the
first four grammars is 40 times smaller than the
grammar while the lexicon of the last four ones
is 4 times smaller than the grammar (this
proportion is close to the one of the real-world
grammar studied in the next subsection). The
eight grammars differ in their distribution of
the items onto the rules. The uniform
distribution corresponds to a uniform random
choice of the items which build the set of
potential anchors, while the Gaussian one
corresponds to a choice taking some items
more frequently. The higher the parameter s,
the flatter the Gaussian distribution.
The last two columns of Table 1 give the
minimal threshold θ_min after Step 1 and the
suboptimal threshold θ_subopt found by the
approximation algorithm. As mentioned when
presenting Step 1, the optimal threshold θ_opt is
necessarily greater than or equal to θ_min after
Step 1. Table 1 reports that the suboptimal
threshold θ_subopt is not over 2 units greater than
θ_min after Step 1. The suboptimal threshold
yielded by the approximation algorithm on
these examples has a high quality because it is
at worst 2 units greater than the optimal one.
A Comparison with Linguistic Lexicalization
on a Real-World Grammar This evaluation
consists of applying the algorithm to a natural
language grammar composed of 6,622 rules
(terms from the domain of metallurgy
provided by INIST/CNRS) and a lexicon of
3,256 items. Figure 3 depicts the distribution of
the weights with the natural linguistic
lexicalization. The frequent head words such as
alloy are heavily loaded because of the
numerous terms "N alloy", with N being the
name of a metal. Conversely, in Figure 4 the
distribution of the weights from the
approximation algorithm is much more
uniform. The maximal weight of an item is 241
with the linguistic lexicalization while it is only
34 with the optimized lexicalization. The
threshold after Step 1 being 34, the suboptimal
threshold yielded by the approximation
algorithm is equal to the optimal one.
Lexicon size   Distribution of the      θ_min           θ_min          θ_subopt
               items onto the rules     before Step 1   after Step 1   (suboptimal threshold)
100            uniform                  143             143            143
100            Gaussian (s = 30)        141             143            144
100            Gaussian (s = 20)        141             260            261
100            Gaussian (s = 10)        141             466            468
1,000          uniform                   15              15             16
1,000          Gaussian (s = 30)         14             117            118
1,000          Gaussian (s = 20)         15             237            238
1,000          Gaussian (s = 10)         14             466            467
Table 1: Benchmarks of the approximation algorithm on eight randomly generated grammars.
[Figure: histogram of the number of items (log scale) per weight, weights ranging from 15 to 240.]
Figure 3: Distribution of the weights of the lexical items with the lexicalization on head words.
[Figure: histogram of the number of items (log scale) per weight, weights ranging from 1 to 36.]
Figure 4: Distribution of the weights of the lexical items with the optimized lexicalization.
Conclusion 
As mentioned in the introduction, the
improvement of the lexicalization through an
optimization algorithm is currently used in
FASTR, a parser for terminological extraction
through NLP techniques where terms are
represented by lexicalized rules. In this
framework, as in top-down parsing with LTAGs
(Schabes and Joshi, 1990), the first phase of
parsing is a filtering of the rules with their
anchors in the input sentence. An unbalanced
distribution of the rules onto the lexical items
has the major computational drawback of
selecting an excessive number of rules when
the input sentence includes a common head
word such as "alloy" (127 rules have "alloy"
as head). The use of the optimized
lexicalization allows us to filter out 57% of the
rules selected by the linguistic lexicalization.
This reduction is comparable to the filtering
induced by linguistic lexicalization itself, which
is around 85% (Schabes and Joshi, 1990).
Correlatively, the parsing speed is multiplied by
2.6, confirming the computational saving of the
optimization reported in this study.
There are many directions in which this
work could be refined and extended. In
particular, this optimization could itself be
tuned by testing different weight assignments
in correlation with the parsing algorithm.
Thus, the computational lexicalization would
speed up both the preprocessing and the
parsing algorithm.
Acknowledgments 
I would like to thank Alain Colmerauer for his 
valuable comments and a long discussion on a 
draft version of my PhD dissertation. I also 
gratefully acknowledge Chantal Enguehard 
and two anonymous reviewers for their remarks 
on earlier drafts. The experiments on industrial 
data were done with term lists from the 
documentation center INIST/CNRS. 

REFERENCES 
Abeillé, Anne, and Yves Schabes. 1989. Parsing
Idioms in Tree Adjoining Grammars. In
Proceedings, 4th Conference of the
European Chapter of the Association for
Computational Linguistics (EACL'89),
Manchester, UK.
Baase, Sara. 1978. Computer Algorithms. 
Addison Wesley, Reading, MA. 
Jacquemin, Christian. 1991. Transformations
des noms composés. PhD Thesis in
Computer Science, University of Paris 7.
Unpublished.
Jacquemin, Christian. 1994. FASTR: A
unification grammar and a parser for
terminology extraction from large corpora.
In Proceedings, IA-94, Paris, EC2, June
1994.
Jacquemin, Christian and Jean Royauté. 1994.
Retrieving terms and their variants in a
lexicalized unification-based framework. In
Proceedings, 17th Annual International
ACM SIGIR Conference (SIGIR'94), Dublin,
July 1994.
Pollard, Carl and Ivan Sag. 1987. Information- 
Based Syntax and Semantics. Vol 1: 
Fundamentals. CSLI, Stanford, CA. 
Schabes, Yves, Anne Abeillé, and Aravind K.
Joshi. 1988. Parsing strategies with
'lexicalized' grammars: Application to tree
adjoining grammar. In Proceedings, 12th
International Conference on Computational
Linguistics (COLING'88), Budapest,
Hungary.
Schabes, Yves and Aravind K. Joshi. 1990. 
Parsing strategies with 'lexicalized' 
grammars: Application to tree adjoining 
grammar. In Masaru Tomita, editor, Current 
Issues in Parsing Technologies. Kluwer 
Academic Publishers, Dordrecht. 
Schabes, Yves and Richard C. Waters. 1993.
Lexicalized Context-Free Grammars. In
Proceedings, 31st Meeting of the
Association for Computational Linguistics
(ACL'93), Columbus, Ohio.
Sommerhalder, Rudolph and S. Christian van 
Westrhenen. 1988. The Theory of 
Computability: Programs, Machines, 
Effectiveness and Feasibility. Addison- 
Wesley, Reading, MA. 
