Polynomial Time and Space Shift-Reduce Parsing 
of Arbitrary Context-free Grammars.* 
Yves Schabes 
Dept. of Computer & Information Science 
University of Pennsylvania 
Philadelphia, PA 19104-6389, USA 
e-mail: schabes@linc.cis.upenn.edu
Abstract 
We introduce an algorithm for designing a predictive 
left to right shift-reduce non-deterministic push-down 
machine corresponding to an arbitrary unrestricted 
context-free grammar and an algorithm for efficiently 
driving this machine in pseudo-parallel. The performance
of the resulting parser is formally proven to be superior
to that of Earley's parser (Earley, 1970).
The technique employed consists of constructing, before
run-time, a parsing table that encodes a non-deterministic
machine in which the predictive behavior has been compiled
out. At run time, the machine is driven in pseudo-parallel
with the help of a chart.
The recognizer behaves in the worst case in $O(|G|^2 n^3)$-time
and $O(|G| n^2)$-space. However, in practice it is always
superior to Earley's parser since the prediction steps have
been compiled before run-time.
Finally, we explain how other more efficient variants of
the basic parser can be obtained by determinizing portions
of the basic non-deterministic push-down machine while
still using the same pseudo-parallel driver.
1 Introduction 
Predictive bottom-up parsers (Earley, 1968; Earley, 1970;
Graham et al., 1980) are often used for natural language
processing because of their superior average performance
compared to purely bottom-up parsers such as CKY-style
parsers (Kasami, 1965; Younger, 1967). Their practical
superiority is mainly obtained because of the top-down
filtering accomplished by the predictive component of the
parser. Compiling out as much of this predictive component
as possible before run-time will result in a more efficient
parser, so long as the worst case behavior does not
deteriorate.
*We are extremely indebted to Fernando Pereira and Stuart
Shieber for providing valuable technical comments during
discussions about earlier versions of this algorithm. We are
also grateful to Aravind Joshi for his support of this
research. We also thank Robert Frank. All remaining errors
are the author's responsibility alone. This research was
partially funded by ARO grant DAAL03-89-C0031PRI and DARPA
grant N00014-90-J-1863.
Approaches in this direction have been investigated
(Earley, 1968; Lang, 1974; Tomita, 1985; Tomita, 1987), but
none of them is satisfactory, either because the worst case
complexity deteriorates (it becomes worse than that of
Earley's parser) or because the technique is not general.
Furthermore, none of these approaches has been formally
proven to behave better than well known parsers such as
Earley's parser.
Earley himself ([1968], pages 69-89) proposed to precompile
the state sets generated by his algorithm to make it as
efficient as LR(k) parsers (Knuth, 1965) when used on LR(k)
grammars, by precomputing all possible state sets that the
parser could create. However, some context-free grammars,
including most likely most natural language grammars,
cannot be compiled using his technique, and the problem of
knowing if a grammar can be compiled with this technique is
undecidable (Earley [1968], page 99).
Lang (1974) proposed a technique for evaluating in
pseudo-parallel non-deterministic push down automata.
Although this technique achieves a worst case complexity of
$O(n^3)$-time with respect to the length of input, it
requires that at most two symbols are popped from the stack
in a single move. When the technique is used for
shift-reduce parsing, this constraint requires that the
context-free grammar is in Chomsky normal form (CNF). As
far as the grammar size is concerned, an exponential worst
case behavior is reached when used with the characteristic
LR(0) machine.¹
Tomita (1985; 1987) proposed to extend LR(0) parsers to
non-deterministic context-free grammars by explicitly using
a graph structured stack which represents the
pseudo-parallel evaluation of the moves of a
non-deterministic LR(0) push-down automaton. Tomita's
encoding of the non-deterministic push-down automaton
suffers from an exponential time and space worst case
complexity with respect to the input length and also with
respect to the grammar size (Johnson [1989] and also page 72
in Tomita [1985]). Although Tomita reports experimental data
that seem to show that the parser behaves in practice better
than Earley's parser (which is proven to take in the worst
case $O(|G|^2 n^3)$-time), the duplication of the same
experiments shows no conclusive outcome. Modifications to
Tomita's algorithm have been proposed in order to alleviate
the exponential complexity with respect to the input length
(Kipps, 1989) but, according to Kipps, the modified
algorithm does not lead to a practical parser. Furthermore,
the algorithm is doomed to behave in the worst case in
exponential time with respect to the grammar size for some
ambiguous grammars and inputs (Johnson, 1989).² So far,
there is no formal proof showing that Tomita's parser can
be superior to Earley's parser for some grammars and
inputs, and its worst case complexity seems to contradict
the experimental data.
As explained, the previous attempts to compile the
predictive component are either not general or achieve a
worst case complexity (with respect to the grammar size and
the input length) worse than that of standard parsers.
The methodology we follow in order to compile the
predictive component of Earley's parser is to define a
predictive bottom-up pushdown machine equivalent to the
given grammar, which we drive in pseudo-parallel. Following
Johnson's (1989) argument, any parsing algorithm based on
the LR(0) characteristic machine is doomed to behave in
exponential time with respect to the grammar size for some
ambiguous grammars and inputs. This is a result of the fact
that the number of states of an LR(0) characteristic
machine can be exponential and that there are some grammars
and inputs for which an exponential number of states must
be reached (see Johnson [1989] for examples of such
grammars and inputs). One must therefore design a different
pushdown machine which can be driven efficiently in
pseudo-parallel.
¹ The same argument for the exponential grammar size
complexity of Tomita's parser (Johnson, 1989) holds for
Lang's technique.
² This problem is particularly acute for natural language
processing since in this context the input length is
typically small (10-20 words) and the grammar size very
large (hundreds or thousands of rules and symbols).
Given an arbitrary context-free grammar, we construct a
non-deterministic predictive push-down machine whose number
of states is proportional to the size
of the grammar. Then, at run time, we efficiently drive
this machine in pseudo-parallel. Even if all the states 
of the machine are reached for some grammars and 
inputs, a polynomial complexity will still be obtained 
since the number of states is bounded by the gram- 
mar size. We therefore introduce a shift-reduce driver 
for this machine in which all of the predictive compo- 
nent has been compiled in the finite state control of 
the machine. The technique makes no requirement on 
the form of the context-free grammar and it behaves 
in the worst case as well as Earley's parser (Earley, 
1970). The push-down machine is built before run-time and
it is encoded as parsing tables in which the predictive
behavior has been compiled out.
In the worst case, the recognizer behaves in the same
$O(|G|^2 n^3)$-time and $O(|G| n^2)$-space as Earley's
parser. However, in practice it is always superior to
Earley's parser since the prediction steps have been
eliminated before run-time. We show that the items produced
in the chart correspond to equivalence classes on the items
produced for the same input by Earley's parser. This
mapping formally shows its superior behavior in practice.³
Finally, we explain how other more efficient vari- 
ants of the basic parser can be obtained by deter- 
minizing portions of the basic non-deterministic push- 
down machine while still using the same pseudo- 
parallel driver. 
2 The Parser 
The parser we propose handles any context-free gram- 
mar; the grammar can be ambiguous and need not be 
in any normal form. The parser is a predictive shift- 
reduce bottom-up parser that uses compiled top down 
prediction information in the form of tables. Before 
run-time, a non-deterministic push down automa- 
ton (NPDA) is constructed from a given context-free 
grammar. The parsing tables encode the finite state 
control and the moves of the NPDA. At run-time, 
the NPDA is then driven in pseudo-parallel with the 
help of a chart. We show the construction of a basic 
machine which will be driven non-deterministically. 
³ The characteristic LR(0) machine is the result of
determinizing the machine we introduce. Since this
procedure introduces exponentially more states, the LR(0)
machine can be exponentially large.
In the following, the input string is $w = a_1 \cdots a_n$
and the context-free grammar being considered is
$G = (\Sigma, NT, P, S)$, where $\Sigma$ is the set of
terminal symbols, $NT$ the set of non-terminal symbols, $P$
a set of production rules, and $S$ the start symbol. We will
need to refer to the subsequence of the input string
$w = a_1 \cdots a_n$ from position $i$ to $j$, $w_{]i,j]}$,
which we define as follows:
$$w_{]i,j]} = \begin{cases} a_{i+1} \cdots a_j, & \text{if } i < j \\ \varepsilon, & \text{if } i \geq j \end{cases}$$
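The half-open indexing convention of $w_{]i,j]}$ lines up with slicing on a 0-indexed sequence. The following is a minimal sketch of that correspondence; the function name `substring` and the choice of storing $a_k$ at index $k - 1$ are assumptions of this illustration, not part of the original formulation.

```python
def substring(w, i, j):
    """Return w]i,j] = a_{i+1} ... a_j, or the empty sequence when i >= j.

    w is a 0-indexed Python sequence holding a_1 ... a_n, so the token
    a_k is stored at w[k - 1].
    """
    return w[i:j] if i < j else w[:0]
```

With this convention, an item ending at position $j$ looks ahead at `w[j]`, which is the token $a_{j+1}$.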
We explain the data-structures used by the parser, 
the moves of the parser, and how the parsing tables 
are constructed for the basic NPDA. Then, we study 
the formal characteristics of the parser. 
The parser uses two moves: shift and reduce. As in 
standard shift-reduce parsers, shift moves recognize 
new terminal symbols and reduce moves perform the 
recognition of an entire context-free rule. However, in
the parser we propose, shift and reduce moves behave 
differently on rules whose recognition has just started 
(i.e. rules that have been predicted) than on rules 
of which some portion has been recognized. This be- 
havior enables the parser to efficiently perform reduce 
moves when ambiguity arises. 
2.1 Data-Structures and the Moves of 
the Parser 
The parser collects items into a set called the chart, 
C. Each item encodes a well formed substring of the 
input. The parser proceeds until no more items can 
be added to the chart C. 
An item is defined as a triple $(s, i, j)$, where $s$ is a
state in the control of the NPDA and $i$ and $j$ are indices
referring to positions in the input string
($i, j \in [0, n]$).
In an item (s,i,j), j corresponds to the current 
position in the input string and i is a position in the 
input which will facilitate the reduce move. 
A dotted rule of a context-free grammar $G$ is defined as a
production of $G$ associated with a dot at some position of
the right hand side: $A \to \alpha \bullet \beta$ with
$A \to \alpha\beta \in P$.
We distinguish two kinds of dotted rules. Kernel dotted
rules, which are of the form $A \to \alpha \bullet \beta$
with $\alpha$ non-empty, and non-kernel dotted rules, which
have the dot at the leftmost position in the right hand
side ($A \to \bullet \beta$). As we will see, non-kernel
dotted rules correspond to the predictive component of the
parser.
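One possible encoding of dotted rules, given here only as a hedged sketch: a named tuple holding the left hand side, the right hand side and the dot position. The class name `DottedRule` and its method names are assumptions made for illustration; the paper does not prescribe a representation.

```python
from typing import NamedTuple, Optional, Tuple


class DottedRule(NamedTuple):
    lhs: str               # A in A -> alpha . beta
    rhs: Tuple[str, ...]   # the symbols of alpha followed by beta
    dot: int               # number of right-hand-side symbols before the dot

    def is_kernel(self) -> bool:
        # Kernel dotted rules have a non-empty prefix before the dot.
        return self.dot > 0

    def is_complete(self) -> bool:
        # The dot is at the right end: the whole rule has been recognized.
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        # Symbol immediately after the dot, or None for a complete rule.
        return None if self.is_complete() else self.rhs[self.dot]
```

Because a named tuple is hashable, such rules can be collected into a `frozenset` to represent a state.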
We will see later that each state $s$ of the NPDA
corresponds to a set of dotted rules for the grammar $G$.
The set of all possible states in the control of the NPDA
is written $\mathcal{S}$. Section 2.2 explains how the
states are constructed.
The algorithm maintains the following property (which
guarantees its soundness)⁴: if an item $(s, i, j)$ is in the
chart $C$, then for all dotted rules
$A \to \alpha \bullet \beta \in s$ the following is
satisfied:
(i) if $\alpha \in (\Sigma \cup NT)^+$, then
$\exists \gamma \in (NT \cup \Sigma)^*$ such that
$S \Rightarrow^* w_{]0,i]} A \gamma$ and
$\alpha \Rightarrow^* w_{]i,j]}$;
(ii) if $\alpha$ is the empty string, then
$\exists \gamma \in (NT \cup \Sigma)^*$ such that
$S \Rightarrow^* w_{]0,j]} A \gamma$.
The parser uses three tables to determine which move(s) to
perform: an action table, ACTION, and two goto tables, the
kernel goto table, $\mathrm{GOTO}_k$, and the non-kernel
goto table, $\mathrm{GOTO}_{nk}$.
The goto tables are accessed by a state and a non-terminal
symbol. They each contain a set of states:
$\mathrm{GOTO}_k(s, X) = \{r\}$,
$\mathrm{GOTO}_{nk}(s, X) = \{r'\}$ with
$r, r', s \in \mathcal{S}$, $X \in NT$. The use of these
tables is explained below.
The action table is accessed by a state and a terminal
symbol. It contains a set of actions. Given an item
$(s, i, j)$, the possible actions are determined by the
content of $\mathrm{ACTION}(s, a_{j+1})$, where $a_{j+1}$ is
the $(j+1)$-th input token. The possible actions contained
in $\mathrm{ACTION}(s, a_{j+1})$ are the following:
• KERNEL SHIFT $s'$ ($\mathrm{ksh}(s')$ for short), for
$s' \in \mathcal{S}$. A new token is recognized in a kernel
dotted rule $A \to \alpha \bullet a\beta$ and a push move is
performed. The item $(s', i, j+1)$ is added to the chart,
since $\alpha a$ spans in this case $w_{]i,j+1]}$.
• NON-KERNEL SHIFT $s'$ ($\mathrm{nksh}(s')$ for short),
for $s' \in \mathcal{S}$. A new token is recognized in a
non-kernel dotted rule of the form $A \to \bullet a\beta$.
The item $(s', j, j+1)$ is added to the chart, since $a$
spans in this case $w_{]j,j+1]}$.
• REDUCE $X \to \beta$ ($\mathrm{red}(X \to \beta)$ for
short), for $X \to \beta \in P$. The context-free rule
$X \to \beta$ has been totally recognized. The rule spans
the substring $a_{i+1} \cdots a_j$. For all items in the
chart of the form $(s', k, i)$, perform the following two
steps:
- for all $r_1 \in \mathrm{GOTO}_k(s', X)$, add the item
$(r_1, k, j)$ to the chart. In this case, a dotted rule of
the form $A \to \alpha \bullet X\beta$ is combined with
$X \to \beta \bullet$ to form $A \to \alpha X \bullet \beta$;
since $\alpha$ spans $w_{]k,i]}$ and $X$ spans $w_{]i,j]}$,
$\alpha X$ spans $w_{]k,j]}$.
- for all $r_2 \in \mathrm{GOTO}_{nk}(s', X)$, add the item
$(r_2, i, j)$ to the chart. In this case, a dotted rule of
the form $A \to \bullet X\beta$ is combined with
$X \to \beta \bullet$ to form $A \to X \bullet \beta$; in
this case $X$ spans $w_{]i,j]}$.
⁴ This property holds for all machines derived from the
basic NPDA.
The recognizer follows:
begin (* recognizer *)
Input:
    $a_1 \cdots a_n$ (* input string *)
    ACTION (* action table *)
    $\mathrm{GOTO}_k$ (* kernel goto table *)
    $\mathrm{GOTO}_{nk}$ (* non-kernel goto table *)
    $start \in \mathcal{S}$ (* start state *)
    $\mathcal{F} \subseteq \mathcal{S}$ (* set of final states *)
Output: acceptance or rejection of the input string.
Initialization: $C := \{(start, 0, 0)\}$
Perform the following three operations until no
more items can be added to the chart $C$:
(1) KERNEL SHIFT: if $(s, i, j) \in C$ and
    if $\mathrm{ksh}(s') \in \mathrm{ACTION}(s, a_{j+1})$, then
    $(s', i, j+1)$ is added to $C$.
(2) NON-KERNEL SHIFT: if $(s, i, j) \in C$
    and if $\mathrm{nksh}(s') \in \mathrm{ACTION}(s, a_{j+1})$, then
    $(s', j, j+1)$ is added to $C$.
(3) REDUCE: if $(s, i, j) \in C$, then for all
    $X \to \beta$ s.t. $\mathrm{red}(X \to \beta) \in \mathrm{ACTION}(s, a_{j+1})$
    and for all $(s', k, i) \in C$, perform the following:
    • for all $r_1 \in \mathrm{GOTO}_k(s', X)$, $(r_1, k, j)$ is
      added to $C$;
    • for all $r_2 \in \mathrm{GOTO}_{nk}(s', X)$, $(r_2, i, j)$ is
      added to $C$.
If $\{(s, 0, n) \mid (s, 0, n) \in C \text{ and } s \in \mathcal{F}\} \neq \emptyset$
then return acceptance
otherwise return rejection.
end (* recognizer *)
In the above algorithm, non-determinism arises from
multiple entries in $\mathrm{ACTION}(s, a)$ and also from
the fact that $\mathrm{GOTO}_k(s, X)$ and
$\mathrm{GOTO}_{nk}(s, X)$ contain a set of states.
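The recognizer can be sketched in Python as a naive fixpoint over the chart. This is an illustration under stated assumptions, not the efficient array-based implementation the complexity bounds rely on: the tables are plain dictionaries of sets, actions are encoded as tuples `("ksh", s)`, `("nksh", s)` and `("red", lhs, rhs)`, `END` stands for the end marker, and the grammar is assumed to have no empty productions. All of these names are hypothetical. The tables shown were built by hand for the toy grammar S → S a | a.

```python
END = "$"  # assumed end marker, used as lookahead after the last token


def recognize(tokens, action, goto_k, goto_nk, start, final):
    """Drive the NPDA in pseudo-parallel; all tables are dicts of sets."""
    n = len(tokens)
    chart = {(start, 0, 0)}
    changed = True
    while changed:  # naive fixpoint: repeat until no item can be added
        changed = False
        for (s, i, j) in list(chart):
            lookahead = tokens[j] if j < n else END
            new_items = set()
            for act in action.get((s, lookahead), ()):
                if act[0] == "ksh":     # kernel shift keeps the start index i
                    new_items.add((act[1], i, j + 1))
                elif act[0] == "nksh":  # non-kernel shift starts a rule at j
                    new_items.add((act[1], j, j + 1))
                else:                   # reduce X -> beta, which spans ]i, j]
                    x = act[1]
                    for (s2, k, end) in list(chart):
                        if end != i:
                            continue
                        for r in goto_k.get((s2, x), ()):
                            new_items.add((r, k, j))
                        for r in goto_nk.get((s2, x), ()):
                            new_items.add((r, i, j))
            if not new_items <= chart:
                chart |= new_items
                changed = True
    return any(s in final for (s, i, j) in chart if i == 0 and j == n)


# Hand-built tables for the toy grammar S -> S a | a (states numbered by hand):
# 0 = {S -> .S a, S -> .a}, 1 = {S -> S .a}, 2 = {S -> a.}, 3 = {S -> S a.}
RED_A, RED_SA = ("red", "S", ("a",)), ("red", "S", ("S", "a"))
ACTION = {(0, "a"): {("nksh", 2)}, (1, "a"): {("ksh", 3)},
          (2, "a"): {RED_A}, (2, END): {RED_A},
          (3, "a"): {RED_SA}, (3, END): {RED_SA}}
GOTO_K = {}
GOTO_NK = {(0, "S"): {1}}
FINAL = {2, 3}
```

The repeated rescan of the chart keeps the sketch short; an indexed agenda would be needed to approach the stated worst case bounds.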
2.2 Construction of the Parsing Tables 
We shall give an LR(0)-like method for constructing 
the parsing tables corresponding to the basic NPDA. 
Several other methods (such as LR(k)-like, SLR(k)- 
like) can also be used for constructing the parsing 
tables and are described in (Schabes, 1991). 
To construct the LR(0)-like finite state control 
for the basic non-deterministic push-down automaton 
that the parser simulates, we define three functions, 
closure, $\mathrm{goto}_k$ and $\mathrm{goto}_{nk}$.
If $s$ is a state, then closure($s$) is the state
constructed from $s$ by the two rules:
(i) Initially, every dotted rule in $s$ is added to
closure($s$);
(ii) If $A \to \alpha \bullet B\beta$ is in closure($s$) and
$B \to \gamma$ is a production, then add the dotted rule
$B \to \bullet \gamma$ to closure($s$) (if it is not already
there). This rule is applied until no more new dotted rules
can be added to closure($s$).
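The two closure rules translate into a short worklist loop. In this sketch a dotted rule is assumed to be a tuple `(lhs, rhs, dot)`, the grammar a list of `(lhs, rhs)` pairs with `rhs` a tuple of symbols, and a state a `frozenset` of dotted rules; these encodings are choices made for illustration only.

```python
def closure(state, productions):
    """Close a set of dotted rules (lhs, rhs, dot) under rule (ii)."""
    result = set(state)          # rule (i): start from the rules of the state
    work = list(result)
    while work:
        lhs, rhs, dot = work.pop()
        if dot == len(rhs):
            continue             # nothing follows the dot
        for b, gamma in productions:
            item = (b, gamma, 0)  # the non-kernel dotted rule B -> . gamma
            if b == rhs[dot] and item not in result:
                result.add(item)
                work.append(item)
    return frozenset(result)
```

Returning a `frozenset` makes states hashable, so they can be stored in sets and used as table keys.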
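If $s$ is a state and if $X$ is a non-terminal or terminal
symbol, $\mathrm{goto}_k(s, X)$ and $\mathrm{goto}_{nk}(s, X)$
are the sets of states defined as follows:
$$\mathrm{goto}_k(s, X) = \{\, \mathrm{closure}(\{A \to \alpha X \bullet \beta\}) \mid A \to \alpha \bullet X\beta \in s \text{ and } \alpha \in (\Sigma \cup NT)^+ \,\}$$
$$\mathrm{goto}_{nk}(s, X) = \{\, \mathrm{closure}(\{A \to X \bullet \beta\}) \mid A \to \bullet X\beta \in s \,\}$$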
The goto functions we define differ from the one defined
for the LR(0) construction in two ways: first, we have
distinguished transitions on symbols from kernel items and
non-kernel items; second, each state in
$\mathrm{goto}_k(s, X)$ and $\mathrm{goto}_{nk}(s, X)$
contains exactly one kernel item whereas for the LR(0)
construction they may contain more than one.
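The difference from the LR(0) goto can be made concrete with a small sketch. Dotted rules are again assumed to be tuples `(lhs, rhs, dot)`, and the closure operation is passed in as a one-argument callable named `closure`; both the function names and this parameterization are assumptions of the illustration.

```python
def goto_k(state, x, closure):
    """States reached on symbol x from the kernel dotted rules of `state`."""
    return {closure({(lhs, rhs, dot + 1)})
            for (lhs, rhs, dot) in state
            if 0 < dot < len(rhs) and rhs[dot] == x}


def goto_nk(state, x, closure):
    """States reached on symbol x from the non-kernel dotted rules of `state`."""
    return {closure({(lhs, rhs, 1)})
            for (lhs, rhs, dot) in state
            if dot == 0 and len(rhs) > 0 and rhs[0] == x}
```

Each advanced dotted rule is closed on its own, so two rules that share the symbol after the dot yield two separate states instead of the single merged state an LR(0) construction would produce.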
We are now ready to compute the set of states $\mathcal{S}$
defining the finite state control of the parser.
The set of states is constructed as follows:
procedure states(G)
begin
    $\mathcal{S} := \{\mathrm{closure}(\{S \to \bullet \alpha \mid S \to \alpha \in P\})\}$
    repeat
        for each state $s$ in $\mathcal{S}$
            for each $X \in \Sigma \cup NT$
                for each $r \in \mathrm{goto}_k(s, X) \cup \mathrm{goto}_{nk}(s, X)$
                    add $r$ to $\mathcal{S}$
    until no more states can be added to $\mathcal{S}$
end
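The repeat-until loop of the procedure can be written as a worklist traversal. In this sketch the two goto functions are assumed to be supplied as two-argument callables returning sets of hashable states, and `build_states` is a hypothetical name.

```python
def build_states(start, symbols, goto_k, goto_nk):
    """Collect every state reachable from `start` through either goto function."""
    states = {start}
    work = [start]
    while work:
        s = work.pop()
        for x in symbols:  # x ranges over terminals and non-terminals
            for r in goto_k(s, x) | goto_nk(s, x):
                if r not in states:
                    states.add(r)
                    work.append(r)
    return states
```

Each state is expanded once, so the loop terminates as soon as no new state appears.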
PARSING TABLES. Now we construct the LR(0)-like parsing
tables ACTION, $\mathrm{GOTO}_k$ and $\mathrm{GOTO}_{nk}$
from the finite state control constructed above. Given a
context-free grammar $G$, we construct $\mathcal{S}$, the
set of states for $G$, with the procedure given above. We
construct the action table ACTION and the goto tables using
the following algorithm.
begin (CONSTRUCTION OF THE PARSING TABLES)
Input: A context-free grammar
$G = (\Sigma, NT, P, S)$.
Output: The parsing tables ACTION, $\mathrm{GOTO}_k$
and $\mathrm{GOTO}_{nk}$ for $G$, the start state $start$
and the set of final states $\mathcal{F}$.
Step 1. Construct $\mathcal{S} = \{s_0, \ldots, s_m\}$, the
set of states for $G$.
Step 2. The parsing actions for state $s_i$ are determined
for all terminal symbols $a \in \Sigma$ as follows:
    (i) for all $r \in \mathrm{goto}_k(s_i, a)$, add
    $\mathrm{ksh}(r)$ to $\mathrm{ACTION}(s_i, a)$;
    (ii) for all $r \in \mathrm{goto}_{nk}(s_i, a)$, add
    $\mathrm{nksh}(r)$ to $\mathrm{ACTION}(s_i, a)$;
    (iii) if $A \to \alpha \bullet$ is in $s_i$, then add
    $\mathrm{red}(A \to \alpha)$ to $\mathrm{ACTION}(s_i, a)$
    for all terminal symbols $a$ and for the end marker $\$$.
Step 3. The kernel and non-kernel goto tables for state
$s_i$ are determined for all non-terminal symbols $X$ as
follows:
    (i) $\forall X \in NT$, $\mathrm{GOTO}_k(s_i, X) := \mathrm{goto}_k(s_i, X)$
    (ii) $\forall X \in NT$, $\mathrm{GOTO}_{nk}(s_i, X) := \mathrm{goto}_{nk}(s_i, X)$
Step 4. The start state of the parser is
$start := \mathrm{closure}(\{S \to \bullet \alpha \mid S \to \alpha \in P\})$
Step 5. The set of final states of the parser is
$\mathcal{F} := \{s \in \mathcal{S} \mid \exists\, S \to \alpha \in P \text{ s.t. } S \to \alpha \bullet \in s\}$
end (CONSTRUCTION OF THE PARSING TABLES)
Appendix A gives an example of a parsing table. 
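The table construction can be sketched as follows, under several assumptions made only for illustration: states are `frozenset`s of dotted rules `(lhs, rhs, dot)`, the goto functions are supplied as two-argument callables, actions are encoded as tuples, `END` stands for the end marker, and the grammar has no empty productions. None of these names come from the paper.

```python
END = "$"  # assumed end-marker symbol


def build_tables(states, terminals, nonterminals, goto_k, goto_nk, start_symbol):
    """Fill ACTION and the two goto tables from an already computed state set."""
    action, goto_k_table, goto_nk_table = {}, {}, {}
    for s in states:
        for a in terminals:  # shift actions on terminal symbols
            acts = {("ksh", r) for r in goto_k(s, a)}
            acts |= {("nksh", r) for r in goto_nk(s, a)}
            if acts:
                action.setdefault((s, a), set()).update(acts)
        for lhs, rhs, dot in s:  # reduce actions for completed rules
            if dot == len(rhs):
                for a in list(terminals) + [END]:
                    action.setdefault((s, a), set()).add(("red", lhs, rhs))
        for x in nonterminals:  # goto entries on non-terminal symbols
            if goto_k(s, x):
                goto_k_table[(s, x)] = set(goto_k(s, x))
            if goto_nk(s, x):
                goto_nk_table[(s, x)] = set(goto_nk(s, x))
    final = {s for s in states  # states holding a completed start rule
             if any(lhs == start_symbol and dot == len(rhs)
                    for lhs, rhs, dot in s)}
    return action, goto_k_table, goto_nk_table, final
```

Entries are sets throughout, which is where the non-determinism of the machine shows up in the tables.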
3 Complexity 
The recognizer requires in the worst case $O(|G| n^2)$-space
and $O(|G|^2 n^3)$-time; $n$ is the length of the input
string, $|G|$ is the size of the grammar computed as the sum
of the lengths of the right hand side of each production:
$$|G| = \sum_{A \to \alpha \in P} |\alpha|, \text{ where } |\alpha| \text{ is the length of } \alpha.$$
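As a quick check of this definition, the sketch below computes $|G|$ and enumerates the kernel dotted rules of a grammar; a production of right-hand-side length $m$ has exactly $m$ kernel dotted rules, so the two counts coincide. The grammar encoding as `(lhs, rhs)` pairs and both function names are assumptions of this illustration.

```python
def grammar_size(productions):
    """|G|: the sum of the right-hand-side lengths of all productions."""
    return sum(len(rhs) for _lhs, rhs in productions)


def kernel_dotted_rules(productions):
    """All dotted rules (lhs, rhs, dot) with a non-empty prefix before the dot."""
    return [(lhs, rhs, dot)
            for lhs, rhs in productions
            for dot in range(1, len(rhs) + 1)]
```

For the toy grammar S → S a | a this gives $|G| = 3$ and three kernel dotted rules.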
One of the objectives for the design of the non- 
deterministic machine was to make sure that it was 
not possible to reach an exponential number of states, 
a property without which the machine is doomed to 
have exponential complexity (Johnson, 1989). First 
we observe that the number of states of the finite 
state control of the non-deterministic machine that 
we constructed in Section 2.2 is proportional to the 
size of the grammar, $|G|$. By construction, each state
(except for the start state) contains exactly one kernel
dotted rule. Therefore, the number of states is bounded by
the maximum number of kernel rules of the form
$A \to \alpha \bullet \beta$ (with $\alpha$ non-empty), and
is $O(|G|)$. We conclude that the algorithm requires in the
worst case $O(|G| n^2)$-space since the maximum number of
items $(s, i, j)$ in the chart is proportional to
$|G| n^2$.
A close look at the moves of the parser reveals that the
reduce move is the most complex one, since it involves a
pair of items $(s, i, j)$ and $(s', k, i)$. This move can
be instantiated at most $O(|G|^2 n^3)$ times since
$i, j, k \in [0, n]$ and there are in the worst case
$O(|G|^2)$ pairs of states involved in this move.⁵ The
parser therefore behaves in the worst case in
$O(|G|^2 n^3)$-time.
One should however note that in order to bound the worst
case complexity as stated above, arrays similar to the ones
needed for Earley's parser must be used to implement the
shift and reduce moves efficiently.⁶
As for Earley's parser, it can also be shown that the
algorithm requires in the worst case $O(|G|^2 n^2)$-time for
unambiguous context-free grammars and behaves in linear
time on a large class of grammars.
4 Retrieving a Parse 
The algorithm that we described in Section 2 is a rec- 
ognizer. However, if we include pointers from an item 
to the other items (to a pair of items for the reduce 
moves or to an item for the shift moves) which caused 
it to be placed in the chart, the recognizer can be 
modified to record all parse trees of the input string. 
The representation is similar to a shared forest. 
The worst case time complexity of the parser is the same as
for the recognizer ($O(|G|^2 n^3)$-time) but, as for
Earley's parser, the worst case space complexity increases
to $O(|G|^2 n^3)$ because of the additional bookkeeping.
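One way to picture the additional bookkeeping is to let the chart map each item to the set of ways it was derived. This is a sketch only: the name `add_item` and the encoding of a cause as `("shift", predecessor)` or `("reduce", left_item, completed_item)` are assumptions, not the representation used by the author.

```python
def add_item(chart, item, cause):
    """Record `cause` as one derivation of `item`; return True if item is new.

    chart maps each item (s, i, j) to the set of its causes, where a cause
    is ("shift", predecessor) or ("reduce", left_item, completed_item).
    """
    is_new = item not in chart
    chart.setdefault(item, set()).add(cause)
    return is_new
```

An item reached in several ways keeps one entry with several causes, which is what gives the shared-forest flavor: following the causes backwards from an accepting item enumerates the parses.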
5 Correctness and Comparison 
with Earley's Parser 
We derive the correctness of the parser by showing 
how it can be mapped to Earley's parser. In the pro- 
cess, we will also be able to show why this parser can 
be more efficient than Earley's parser. The detailed 
proofs are given in (Schabes, 1991). 
We are also interested in formally characterizing the
differences in performance between the parser we propose
and Earley's parser. We show that the parser behaves in the
worst case as well as Earley's parser by mapping it into
Earley's parser. The parser behaves better than Earley's
parser because it has eliminated the prediction step, which
takes in the worst case $O(|G| n)$-time for Earley's
parser. Therefore, in the most favorable scenario, the
parser we propose will require $O(|G| n)$ less time than
Earley's parser.
⁵ Kernel shift and non-kernel shift moves both require at
most $O(|G| n^2)$-time.
⁶ Due to the lack of space, the details of the
implementation are not given in this paper but they are
given in (Schabes, 1991).
For a given context-free grammar $G$ and an input string
$a_1 \cdots a_n$, let $C$ be the set of items produced by
the parser and $C_{earley}$ be the set of items produced by
Earley's parser. Earley's parser (Earley, 1970) produces
items of the form $(A \to \alpha \bullet \beta, i, j)$ where
$A \to \alpha \bullet \beta$ is a single dotted rule and not
a set of dotted rules.
The following lemma shows how one can map the 
items that the parser produces to the items that Ear- 
ley's parser produces for the same grammar and in- 
put: 
Lemma 1 If $(s, i, j) \in C$ then we have:
(i) for all kernel dotted rules
$A \to \alpha \bullet \beta \in s$, we have
$(A \to \alpha \bullet \beta, i, j) \in C_{earley}$
(ii) and for all non-kernel dotted rules
$A \to \bullet \beta \in s$, we have
$(A \to \bullet \beta, j, j) \in C_{earley}$
The proof of the above lemma is by induction on 
the number of items added to the chart C. 
This shows that an item is mapped into a set of 
items produced by Earley's parser. 
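The mapping of Lemma 1 is easy to state as code. In this sketch a state is assumed to be a set of dotted rules `(lhs, rhs, dot)` and an Earley item a triple of a dotted rule and two positions; the function name `earley_items` is hypothetical.

```python
def earley_items(item):
    """Map a chart item (s, i, j) to the Earley items named in Lemma 1."""
    state, i, j = item
    # Kernel rules (dot > 0) keep the start index i; non-kernel rules are
    # predictions made at position j, so they map to (rule, j, j).
    return {((lhs, rhs, dot), i if dot > 0 else j, j)
            for (lhs, rhs, dot) in state}
```

A single chart item thus stands for one kernel Earley item together with all the predicted items that Earley's parser would store separately.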
By construction, in a given state $s \in \mathcal{S}$,
non-kernel dotted rules have been introduced before
run-time by the closure of kernel dotted rules. It follows
that Earley's parser can require $O(|G| n)$ more space,
since in the parser we propose Earley's items of the form
$(A \to \bullet \alpha, i, i)$ ($i \in [0, n]$) are not
stored separately from the kernel dotted rule which
introduced them.
Conversely, each kernel item in the chart created by 
Earley's parser can be put into correspondence with 
an item created by the parser we propose. 
Lemma 2 If $(A \to \alpha \bullet \beta, i, j) \in C_{earley}$
and if $\alpha \neq \varepsilon$, then $(s, i, j) \in C$
where $s = \mathrm{closure}(\{A \to \alpha \bullet \beta\})$.
The proof of the above lemma is by induction on 
the number of kernel items added to the chart created 
by Earley's parser. 
The correctness of the parser follows from Lemma 1 
and its completeness from Lemma 2 since it is well 
known that the items created by Earley's parser are 
characterized as follows (see, for example, page 323 in 
Aho and Ullman \[1973\] for a proof of this invariant): 
Lemma 3 The item
$(A \to \alpha \bullet \beta, i, j) \in C_{earley}$ if and
only if $\exists \gamma \in (V_{NT} \cup V_T)^*$ such that
$S \Rightarrow^* w_{]0,i]} A \gamma$ and
$\alpha \Rightarrow^* w_{]i,j]}$.
The parser we propose is therefore more efficient than
Earley's parser since it has compiled out prediction before
run time. How much more efficient it is depends on how
prolific the prediction is and therefore on the nature of
the grammar and the input string.
6 Optimizations 
The parser can be easily extended to incorporate stan- 
dard optimization techniques proposed for predictive 
parsers. 
The closure operation which defines how a state is
constructed already optimizes the parser on chain
derivations in a manner very similar to the techniques
originally proposed by Graham et al. (1980) and later also
used by Leiss (1990).
In addition, the closure operation can be designed to
optimize the processing of non-terminal symbols that derive
the empty string in a manner very similar to the one
proposed by Graham et al. (1980) and Leiss (1990). The idea
is to perform the reduction of symbols that derive the empty
string at compilation time, i.e. to include this type of
reduction in the definition of closure by adding (iii):
If $s$ is a state, then closure($s$) is now the state
constructed from $s$ by the three rules:
(i) Initially, every dotted rule in $s$ is added to
closure($s$);
(ii) if $A \to \alpha \bullet B\beta$ is in closure($s$) and
$B \to \gamma$ is a production, then add the dotted rule
$B \to \bullet \gamma$ to closure($s$) (if it is not already
there);
(iii) if $A \to \alpha \bullet B\beta$ is in closure($s$)
and if $B \Rightarrow^* \varepsilon$, then add the dotted
rule $A \to \alpha B \bullet \beta$ to closure($s$) (if it
is not already there).
Rules (ii) and (iii) are applied until no more new dotted
rules can be added to closure($s$).
The rest of the parser remains as before. 
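The extended closure can be sketched by first computing which symbols derive the empty string and then applying rules (ii) and (iii) in one worklist loop. As before, dotted rules are assumed to be tuples `(lhs, rhs, dot)` and productions `(lhs, rhs)` pairs, with an empty production written as an empty tuple; the helper name `nullable_symbols` is hypothetical.

```python
def nullable_symbols(productions):
    """Non-terminals that derive the empty string, found by fixpoint iteration."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in nullable and all(x in nullable for x in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


def closure(state, productions):
    """Closure under rules (i), (ii) and (iii) for dotted rules (lhs, rhs, dot)."""
    nullable = nullable_symbols(productions)
    result = set(state)
    work = list(result)

    def add(item):
        if item not in result:
            result.add(item)
            work.append(item)

    while work:
        lhs, rhs, dot = work.pop()
        if dot == len(rhs):
            continue
        b = rhs[dot]
        for left, gamma in productions:
            if left == b:
                add((b, gamma, 0))      # rule (ii): predict B -> . gamma
        if b in nullable:
            add((lhs, rhs, dot + 1))    # rule (iii): skip over a nullable B
    return frozenset(result)
```

Note that with rule (iii) a state may hold more than one dotted rule with a non-empty prefix before the dot.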
7 Variants on the Basic Machine
In the previous sections we have constructed a machine
whose number of states is in the worst case proportional to
the size of the grammar. This requirement is essential to
guarantee that the complexity of the resulting parser with
respect to the grammar size is not exponential or worse
than the $O(|G|^2)$-time of other well known parsers.
However, we may have to use some non-determinism in the
machine to guarantee this property. The non-determinism of
the machine is not a problem since we have shown how the
non-deterministic machine can be efficiently driven in
pseudo-parallel (in $O(|G|^2 n^3)$-time).
We can now ask the question of whether it is possible to
determinize the finite state control of the machine while
still being able to bound the complexity of the parser to
$O(|G|^2 n^3)$-time. Johnson (1989) exhibits grammars for
which the full determinization of the finite state control
(the LR(0) construction) leads to a parser with exponential
complexity, because the finite state control has an
exponential number of states and also because there are
some input strings for which an exponential number of
states will be reached. However, there are also cases where
the full determinization either will not increase the
number of states or will not lead to a parser with
exponential complexity because no input requires reaching
an exponential number of states. We are currently studying
the classes of grammars for which this is the case.
One can also try to determinize portions of the finite
state automaton from which the control is derived while
making sure that the number of states does not become
larger than $O(|G|)$.
All these variants of the basic parser obtained by 
determinizing portions of the basic non-deterministic 
push-down machine can be driven in pseudo-parallel 
by the same pseudo-parallel driver that we previously 
defined. These variants lead to a set of more efficient 
machines since the non-determinism is decreased. 
8 Conclusion 
We have introduced a shift-reduce parser for unre- 
stricted context-free grammars based on the construc- 
tion of a non-deterministic machine and we have for- 
mally proven its superior performance compared to 
Earley's parser. 
The technique which we employed consists of constructing
before run-time a parsing table that encodes a
non-deterministic machine in which the predictive behavior
has been compiled out. At run time, the machine is driven
in pseudo-parallel with the help of a chart.
By defining two kinds of shift moves (on kernel dot- 
ted rules and on non-kernel dotted rules) and two 
kinds of reduce moves (on kernel and non-kernel dot- 
ted rules), we have been able to efficiently evaluate in 
pseudo-parallel the non-deterministic push down ma- 
chine constructed for the given context-free grammar. 
The same worst case complexity as Earley's recognizer is
achieved: $O(|G|^2 n^3)$-time and $O(|G| n^2)$-space.
However, in practice, it is superior to Earley's parser
since all the prediction steps and some of the completion
steps have been compiled before run-time.
The parser can be modified to simulate other types of
machines (such as LR(k)-like or SLR-like automata). It can
also be extended to handle unification based grammars using
a method similar to that employed by Shieber (1985) for
extending Earley's algorithm.
Furthermore, the algorithm can be tuned to a particular
grammar and therefore be made more efficient by carefully
determinizing portions of the non-deterministic machine
while making sure that the number of states is not
increased. These variants lead to more efficient parsers
than the one based on the basic non-deterministic push-down
machine. Furthermore, the same pseudo-parallel driver can
be used for all these machines.
We have adapted the technique presented in this paper to
other grammatical formalisms such as tree-adjoining
grammars (Schabes, 1991).

Bibliography 
A. V. Aho and J. D. Ullman. 1973. Theory of Pars- 
ing, Translation and Compiling. Vol I: Parsing. 
Prentice-Hall, Englewood Cliffs, NJ. 
Jay C. Earley. 1968. An Efficient Context-Free Pars- 
ing Algorithm. Ph.D. thesis, Carnegie-Mellon Uni- 
versity, Pittsburgh, PA. 
Jay C. Earley. 1970. An efficient context-free parsing 
algorithm. Commun. ACM, 13(2):94-102. 
S.L. Graham, M.A. Harrison, and W.L. Ruzzo. 1980. 
An improved context-free recognizer. ACM Trans- 
actions on Programming Languages and Systems, 
2(3):415-462, July. 
Mark Johnson. 1989. The computational complexity of
Tomita's algorithm. In Proceedings of the International
Workshop on Parsing Technologies, Pittsburgh, August.
T. Kasami. 1965. An efficient recognition and syn- 
tax algorithm for context-free languages. Technical 
Report AF-CRL-65-758, Air Force Cambridge Re- 
search Laboratory, Bedford, MA. 
James R. Kipps. 1989. Analysis of Tomita's al- 
gorithm for general context-free parsing. In Pro- 
ceedings of the International Workshop on Parsing 
Technologies, Pittsburgh, August. 
D. E. Knuth. 1965. On the translation of languages 
from left to right. Information and Control, 8:607- 
639. 
Bernard Lang. 1974. Deterministic techniques for efficient
non-deterministic parsers. In Jacques Loeckx, editor,
Automata, Languages and Programming, 2nd Colloquium,
University of Saarbrücken. Lecture Notes in Computer
Science, Springer Verlag.
Hans Leiss. 1990. On Kilbury's modification of Ear- 
ley's algorithm. ACM Transactions on Program- 
ming Languages and Systems, 12(4):610-640, Oc- 
tober. 
Yves Schabes. 1991. Polynomial time and space 
shift-reduce parsing of context-free grammars and 
of tree-adjoining grammars. In preparation. 
Stuart M. Shieber. 1985. Using restriction to extend
parsing algorithms for complex-feature-based formalisms. In
23rd Meeting of the Association for Computational
Linguistics (ACL '85), Chicago, July.
Masaru Tomita. 1985. Efficient Parsing for Natural 
Language, A Fast Algorithm for Practical Systems. 
Kluwer Academic Publishers. 
Masaru Tomita. 1987. An efficient augmented- 
context-free parsing algorithm. Computational 
Linguistics, 13:31-46. 
D. H. Younger. 1967. Recognition and parsing of
context-free languages in time $n^3$. Information and
Control, 10(2):189-208.
