Graph-structured Stack and Natural Language Parsing 
Masaru Tomita
Center for Machine Translation 
and 
Computer Science Department 
Carnegie-Mellon University
Pittsburgh, PA 15213 
Abstract 
A general device for handling nondeterminism in stack 
operations is described. The device, called a 
Graph-structured Stack, can eliminate duplication of 
operations throughout the nondeterministic processes. 
This paper then applies the graph-structured stack to
various natural language parsing methods, including
ATN, LR parsing, categorial grammar and
principle-based parsing. The relationship between the
graph-structured stack and a chart in chart parsing is
also discussed.
1. Introduction 
A stack plays an important role in natural language
parsing. It is the stack that gives a parser context-free
(rather than regular) power by permitting recursion.
Most parsing systems make explicit use of the stack. An
Augmented Transition Network (ATN) [10] employs a
stack for keeping track of return addresses when it
visits a sub-network. Shift-reduce parsing uses a stack
as its primary device; sentences are parsed only by
pushing an element onto the stack or by reducing the
stack in accordance with grammatical rules.
Implementations of principle-based parsing [9, 1, 4]
and categorial grammar [2] also often require a stack
for storing partial parses already built. These parsing
systems usually introduce backtracking or
pseudo-parallelism to handle nondeterminism, taking
exponential time in the worst case.
This paper describes a general device, the
graph-structured stack. The graph-structured stack
was originally introduced in Tomita's generalized LR
parsing algorithm [7, 8]. This paper applies it to
various other parsing methods. Using the
graph-structured stack, a system is guaranteed not to
replicate the same work and can run in polynomial
time. This is true for all of the parsing systems
mentioned above (ATN, shift-reduce parsing,
principle-based parsing) and perhaps for any other
parsing system that employs a stack.
The next section describes the graph-structured
stack itself. Sections 3, 4, 5 and 6 then describe the
use of the graph-structured stack in shift-reduce LR
parsing, ATN, categorial grammars, and
principle-based parsing, respectively. Section 7
discusses the relationship between the
graph-structured stack and the chart [5],
demonstrating that chart parsing may be viewed as a
special case of shift-reduce parsing with a
graph-structured stack.
2. The Graph-structured Stack 
In this section, we describe three key notions of the 
graph-structured stack: splitting, combining and local 
ambiguity packing. 
2.1. Splitting
When a stack must be reduced (or popped) in more 
than one way, the top of the stack is split. Suppose 
that the stack is in the following state. The left-most 
element, A, is the bottom of the stack, and the right- 
most element, E, is the top of the stack. In a graph- 
structured stack, there can be more than one top, 
whereas there can be only one bottom. 
A --- B --- C --- D --- E
Suppose that the stack must be reduced in the 
following three different ways. 
F <-- D E
G <-- D E
H <-- C D E
Then after the three reduce actions, the stack looks
like:
              /--- F
A --- B --- C ---- G
       \
        \--------- H
2.2. Combining 
When an element needs to be shifted (pushed)
onto two or more tops of the stack, it is done only
once by combining the tops of the stack. For
example, if "I" is to be shifted to F, G and H in the
above example, then the stack will look like:
              /--- F ---\
A --- B --- C ---- G ---- I
       \                /
        \--------- H --/
2.3. Local Ambiguity Packing 
If two or more branches of the stack turn out to
be identical, then they represent local ambiguity; the
identical state of the stack has been obtained in two or
more different ways. They are merged and treated as
a single branch. Suppose we have two rules:
J <-- F I
J <-- G I
After applying these two rules to the example above,
the stack will look like:
A --- B --- C --- J
       \
        \-- H --- I
The branch of the stack, "A-B-C-J", has been
obtained in two ways, but they are merged and only
one is shown in the stack.
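
The three operations above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in [7, 8]: the names `Node`, `paths`, `reduce_top`, `shift` and `branches` are hypothetical, rules are assumed to have at least two right-hand-side symbols, and a reduction is assumed to consume the links it uses, as in the figures of this section.

```python
class Node:
    """One stack element; `preds` links point toward the bottom."""
    def __init__(self, label, preds=()):
        self.label = label
        self.preds = list(preds)

def paths(node, length):
    """All paths of `length` nodes from `node` toward the bottom (top first)."""
    if length == 1:
        return [[node]]
    return [[node] + rest for p in node.preds for rest in paths(p, length - 1)]

def reduce_top(tops, top, rules):
    """Reduce `top` with every matching rule (lhs, rhs); rhs has >= 2 symbols."""
    new_tops = [t for t in tops if t is not top]
    used = set()                      # links of `top` consumed by a reduction
    for lhs, rhs in rules:
        for path in paths(top, len(rhs)):
            if [n.label for n in reversed(path)] != list(rhs):
                continue
            used.add(id(path[1]))
            for base in path[-1].preds:
                # Local ambiguity packing: reuse an identical branch.
                if not any(t.label == lhs and base in t.preds for t in new_tops):
                    new_tops.append(Node(lhs, [base]))   # splitting
    top.preds = [p for p in top.preds if id(p) not in used]
    return new_tops + ([top] if top.preds else [])

def shift(tops, label):
    """Combining: a single new node is pushed onto all tops at once."""
    return [Node(label, tops)]

def branches(tops):
    """Spell out every bottom-to-top branch, e.g. 'A-B-C-F'."""
    def walk(n):
        if not n.preds:
            return [n.label]
        return [b + "-" + n.label for p in n.preds for b in walk(p)]
    return sorted(b for t in tops for b in walk(t))

# The running example of this section.
node = None
for label in "ABCDE":
    node = Node(label, [node] if node else [])
tops = reduce_top([node], node, [("F", "DE"), ("G", "DE"), ("H", "CDE")])
after_split = branches(tops)
tops = shift(tops, "I")
after_combine = branches(tops)
tops = reduce_top(tops, tops[0], [("J", "FI"), ("J", "GI")])
after_packing = branches(tops)
```

After the last step `after_packing` lists only two branches, "A-B-C-J" and "A-B-H-I", although J was derived twice.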
3. Graph-structured Stack and Shift-reduce LR Parsing
In shift-reduce parsing, an input sentence is parsed 
from left to right. The parser has a stack, and there
are two basic operations (actions) on the stack: shift 
and reduce. The shift action pushes the next word in 
the input sentence onto the top of the stack. The 
reduce action reduces top elements of the stack 
according to a context-free phrase structure rule in the 
grammar. 
One of the most efficient shift-reduce parsing
algorithms is LR parsing. The LR parsing algorithm
pre-compiles a grammar into a parsing table; at run
time, shift and reduce actions operating on the stack
are deterministically guided by the parsing table. No
backtracking or search is involved, and the algorithm
runs in linear time. This standard LR parsing
algorithm, however, can deal with only a small subset
of context-free grammars called LR grammars, which
are often sufficient for programming languages but
clearly not for natural languages. If, for example, a
grammar is ambiguous, then its LR table would have
multiple entries, and hence deterministic parsing
would no longer be possible.
Figures 3-1 and 3-2 show an example of a non-LR
grammar and its LR table. Grammar symbols starting
with "*" represent pre-terminals. Entries "sh n" in the
action table (the left part of the table) indicate that the
action is to "shift one word from the input buffer onto
the stack, and go to state n". Entries "re n" indicate that
the action is to "reduce constituents on the stack using
rule n". The entry "acc" stands for the action "accept",
and blank spaces represent "error". The goto table
(the right part of the table) decides to which state the
parser should go after a reduce action. The LR
parsing algorithm pushes state numbers (as well as
constituents) onto the stack; the state number on the
top of the stack indicates the current state. The exact
definition and operation of the LR parser can be found
in Aho and Ullman [3].
We can see that there are two multiple entries in
the action table, in the rows of states 11 and 12 at the
column labeled "*prep". Roughly speaking, this is the
situation where the parser encounters a preposition of
a PP right after an NP. If this PP does not modify the
NP, then the parser can go ahead and reduce the NP to
a higher nonterminal such as PP or VP, using rule 6
or 7, respectively (re6 and re7 in the multiple entries).
If, on the other hand, the PP does modify the NP, then
(1) S --> NP VP 
(2) S --> S PP 
(3) NP --> *n 
(4) NP --> *det *n 
(5) NP --> NP PP 
(6) PP --> *prep NP 
(7) VP --> *v NP 
Figure 3-1: An Example Ambiguous Grammar 
State  *det  *n    *v    *prep     $     NP   PP   VP   S
  0    sh3   sh4                         2              1
  1                      sh6       acc        5
  2                sh7   sh6                  9    8
  3          sh10
  4                re3   re3       re3
  5                      re2       re2
  6    sh3   sh4                         11
  7    sh3   sh4                         12
  8                      re1       re1
  9                re5   re5       re5
 10                re4   re4       re4
 11                re6   re6,sh6   re6        9
 12                      re7,sh6   re7        9
Figure 3-2: LR Parsing Table with Multiple Entries
(derived from the grammar in Figure 3-1)
Figure 3-3: A Graph-structured Stack
the parser must wait (sh6) until the PP is completed 
so it can build a higher NP using rule 5. 
With a graph-structured stack, these nondeterministic
phenomena can be handled efficiently in polynomial
time. Figure 3-3 shows the graph-structured stack right
after shifting the word "with" in the sentence "I saw a
man on the bed in the apartment with a telescope."
Further description of the generalized LR parsing
algorithm may be found in Tomita [7, 8].
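
To make the table-driven use of the graph-structured stack concrete, the following Python sketch encodes the grammar of Figure 3-1 and the table of Figure 3-2 as plain data and runs a recognizer over them. It is a simplified illustration, not the algorithm of [7, 8]: it only accepts or rejects (no parse forest is built), stack nodes are assumed to be identified by the hypothetical pair (input position, state), and reductions are simply repeated until nothing new is added. The names `RULES`, `ACTION`, `GOTO`, `CONFLICTS` and `recognize` are chosen here for illustration.

```python
# rule number -> (left-hand side, length of right-hand side), from Figure 3-1
RULES = {1: ("S", 2), 2: ("S", 2), 3: ("NP", 1), 4: ("NP", 2),
         5: ("NP", 2), 6: ("PP", 2), 7: ("VP", 2)}

# Action and goto tables as read from Figure 3-2 (multiple entries are lists).
ACTION = {
    0:  {"*det": ["sh3"], "*n": ["sh4"]},
    1:  {"*prep": ["sh6"], "$": ["acc"]},
    2:  {"*v": ["sh7"], "*prep": ["sh6"]},
    3:  {"*n": ["sh10"]},
    4:  {"*v": ["re3"], "*prep": ["re3"], "$": ["re3"]},
    5:  {"*prep": ["re2"], "$": ["re2"]},
    6:  {"*det": ["sh3"], "*n": ["sh4"]},
    7:  {"*det": ["sh3"], "*n": ["sh4"]},
    8:  {"*prep": ["re1"], "$": ["re1"]},
    9:  {"*v": ["re5"], "*prep": ["re5"], "$": ["re5"]},
    10: {"*v": ["re4"], "*prep": ["re4"], "$": ["re4"]},
    11: {"*v": ["re6"], "*prep": ["re6", "sh6"], "$": ["re6"]},
    12: {"*prep": ["re7", "sh6"], "$": ["re7"]},
}
GOTO = {0: {"NP": 2, "S": 1}, 1: {"PP": 5}, 2: {"PP": 9, "VP": 8},
        6: {"NP": 11}, 7: {"NP": 12}, 11: {"PP": 9}, 12: {"PP": 9}}

# Cells holding more than one action are the nondeterministic points.
CONFLICTS = sorted((s, t) for s, row in ACTION.items()
                   for t, acts in row.items() if len(acts) > 1)

def recognize(tags):
    """Return (accepted, number of stack nodes) for a list of pre-terminals."""
    bottom = (0, 0)                     # a node is (input position, state)
    preds = {bottom: set()}             # node -> nodes directly below it
    frontier = {bottom}                 # current tops of the stack
    for pos, tag in enumerate(list(tags) + ["$"]):
        changed = True
        while changed:                  # reduce until nothing new appears
            changed = False
            for node in list(frontier):
                for act in ACTION[node[1]].get(tag, []):
                    if not act.startswith("re"):
                        continue
                    lhs, n = RULES[int(act[2:])]
                    bases = {node}
                    for _ in range(n):  # walk n links toward the bottom
                        bases = {p for b in bases for p in preds[b]}
                    for base in bases:
                        new = (pos, GOTO[base[1]][lhs])
                        if base not in preds.setdefault(new, set()):
                            preds[new].add(base)   # split, or pack if node exists
                            changed = True
                        frontier.add(new)
        if tag == "$":
            ok = any("acc" in ACTION[s].get("$", []) for _, s in frontier)
            return ok, len(preds)
        nxt = set()
        for node in frontier:           # shift: equal target states are combined
            for act in ACTION[node[1]].get(tag, []):
                if act.startswith("sh"):
                    new = (pos + 1, int(act[2:]))
                    preds.setdefault(new, set()).add(node)
                    nxt.add(new)
        frontier = nxt

sentence = "*n *v *det *n *prep *det *n *prep *det *n *prep *det *n".split()
accepted, node_count = recognize(sentence)
```

Here `sentence` is the pre-terminal sequence assumed for "I saw a man on the bed in the apartment with a telescope". Because nodes are shared by position and state, the number of nodes grows with the table size times the sentence length rather than with the number of parses.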
4. Graph-structured Stack and ATN 
An ATN parser employs a stack for saving local
registers and a state number when it visits a
subnetwork recursively. In general, an ATN is
nondeterministic, and the graph-structured stack is
viable, as may be seen in the following example.
Consider the simple ATN, shown in figure 4-1, for the
sentence "I saw a man with a telescope."
After parsing "I saw", the parser is in state S3 and
about to visit the NP subnetwork, pushing the current
environment (the current state symbol and all
registers) onto the stack. After parsing "a man", the
stack is as shown in figure 4-2 (the top of the stack
represents the current environment).
Now, we are faced with a nondeterministic choice:
whether to return from the NP network (as state NP3
is final), or to stay in the NP network, expecting PP
postnominals. In the case of returning from NP, the
top element (the current environment) is popped from
the stack and the second element of the stack is
reactivated as the current environment. The DO
register is assigned the result from the NP network,
and the current state becomes S4.
At this moment, two processes (one in state NP3
and the other in state S4) are alive
nondeterministically, and both of them are looking for
a PP. When "with" is parsed, both processes visit the
PP network, pushing the current environment onto the
stack. Since both processes are to visit the same
network PP, the current environment is pushed only
once to both NP3 and S4, and the rest of the PP is
parsed only once as shown in figure 4-3.
Eventually, both processes get to the final state S4,
and two sets of registers are produced as its final
results (figure 4-4).
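
The sharing of the PP sub-network visit can be sketched as follows. This is a recognizer only and a simplification of the description above: registers, tests and actions are omitted, and a pushed environment is represented by a hypothetical "frame" (network name, input position), so that two processes entering the same network at the same position share one frame with two return links. The arcs follow Figure 4-1; the names `NETS`, `ARCS`, `recognize` and the word categories are assumptions of this sketch, and no network is assumed to accept the empty string.

```python
# network -> (start state, final states); arcs: state -> [(label, next state)]
NETS = {"S": ("S1", {"S4"}), "NP": ("NP1", {"NP3", "NP4"}), "PP": ("PP1", {"PP3"})}
ARCS = {
    "S1": [("NP", "S2")], "S2": [("v", "S3")], "S3": [("NP", "S4")],
    "S4": [("PP", "S4")],
    "NP1": [("det", "NP2"), ("pron", "NP4")], "NP2": [("n", "NP3")],
    "NP3": [("PP", "NP3")],
    "PP1": [("p", "PP2")], "PP2": [("NP", "PP3")],
}

def recognize(tags):
    """Return (accepted, callers); `callers` shows which pushes were shared."""
    top = ("S", 0)                        # a frame is (network, start position)
    callers = {top: set()}                # frame -> {(return state, caller frame)}
    active = {(NETS["S"][0], top)}        # live processes: (state, frame)
    for pos, tag in enumerate(list(tags) + [None]):
        agenda = list(active)
        while agenda:
            state, frame = agenda.pop()
            new = []
            for label, nxt in ARCS.get(state, []):
                if label in NETS:         # push: one frame per network and position
                    sub = (label, pos)
                    callers.setdefault(sub, set()).add((nxt, frame))
                    new.append((NETS[label][0], sub))
            if state in NETS[frame[0]][1]:    # pop: resume every caller
                new.extend(callers[frame])
            for cfg in new:
                if cfg not in active:
                    active.add(cfg)
                    agenda.append(cfg)
        if tag is None:                   # end of input
            accepted = any(f == top and s in NETS["S"][1] for s, f in active)
            return accepted, callers
        # Consume one word along every matching lexical arc.
        active = {(nxt, frame) for state, frame in active
                  for label, nxt in ARCS.get(state, []) if label == tag}

ok, callers = recognize("pron v det n p det n".split())
shared_pp = callers[("PP", 4)]
```

For "I saw a man with a telescope" (tagged here as pron v det n p det n), `shared_pp` holds two return links, one to NP3 and one to S4, while the PP network itself is traversed once.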
5. Graph-structured Stack and Categorial Grammar
Parsers based on categorial grammar can be
implemented as shift-reduce parsers with a stack.
Unlike in phrase-structure rule based parsers,
information about how to reduce constituents is
encoded in the complex category symbol of each
constituent with functor and argument features.
Basically, the parser parses a sentence strictly from
left to right, shifting words one-by-one onto the stack.
In doing so, two elements from the top of the stack are
inspected to see whether they can be reduced. The
two elements can be reduced in the following cases:
• X/Y Y -> X (Forward Functional Application)
• Y X\Y -> X (Backward Functional Application)
• X/Y Y/Z -> X/Z (Forward Functional Composition)
• Y\Z X\Y -> X\Z (Backward Functional Composition)
When it reduces a stack, it does so non-destructively;
that is, the original stack is kept alive even after the
reduce action. An example categorial grammar is
presented in figure 5-1.
I          NP
saw        (S\NP)/NP
a          NP/N
man        N
with       (NP\NP)/NP, ((S\NP)\(S\NP))/NP
a          NP/N
telescope  N
Figure 5-1: An Example Categorial Grammar
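
The four reduction cases listed above can be sketched in Python as follows. The representation is an assumption made for illustration: an atomic category is a string, and a complex category is a tuple (result, slash, argument). The function name `combine` is hypothetical.

```python
def combine(left, right):
    """Return every category obtainable by reducing the pair `left right`."""
    out = []
    l_fun = isinstance(left, tuple)     # complex categories are tuples
    r_fun = isinstance(right, tuple)
    if l_fun and left[1] == "/" and left[2] == right:
        out.append(left[0])                         # X/Y Y -> X
    if r_fun and right[1] == "\\" and right[2] == left:
        out.append(right[0])                        # Y X\Y -> X
    if l_fun and r_fun and left[1] == "/" and right[1] == "/" \
            and left[2] == right[0]:
        out.append((left[0], "/", right[2]))        # X/Y Y/Z -> X/Z
    if l_fun and r_fun and left[1] == "\\" and right[1] == "\\" \
            and right[2] == left[0]:
        out.append((right[0], "\\", left[2]))       # Y\Z X\Y -> X\Z
    return out

# Categories from the example grammar.
VP = ("S", "\\", "NP")          # S\NP
SAW = (VP, "/", "NP")           # (S\NP)/NP
A = ("NP", "/", "N")            # NP/N

saw_a = combine(SAW, A)         # forward composition gives (S\NP)/N
a_man = combine(A, "N")         # forward application gives NP
i_vp = combine("NP", VP)        # backward application gives S
```

Because `combine` returns a list, a shift-reduce parser can keep the original pair on the stack and add each result as a new branch, which is the non-destructive behavior described above.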
The category (S\NP) represents a verb phrase, as
it becomes S if there is an NP on its left. The
categories (NP\NP) and (S\NP)\(S\NP) represent
prepositional phrases, as they become a noun phrase or
a verb phrase if there is a noun phrase or a verb
phrase on their left, respectively. Thus, a preposition
such as "with" has two complex categories, as in the
         NP            v            NP          PP /---\
(S1) -------> (S2) -------> (S3) -------> [S4] <---/

          det            n            PP /---\
(NP1) -------> (NP2) -------> [NP3] <---/
    \
     \  pron
      \------> [NP4]

           p             NP
(PP1) -------> (PP2) -------> [PP3]

S1-NP-S2       A: Subj <-- *
               C: (Subj-verb-agreement)
S2-v-S3        A: MV <-- *
S3-NP-S4       A: DO <-- *
S4-PP-S4       A: Mods <-- *
NP1-det-NP2    A: Det <-- *
NP2-n-NP3      A: Head <-- *
NP3-PP-NP3     A: Qual <-- *
NP1-pron-NP4   A: Head <-- *
PP1-p-PP2      A: Prep <-- *
PP2-NP-PP3     A: PrepObj <-- *

[]: final states
(): non-final states
Figure 4-1: A Simple ATN for "I saw a man with a telescope"
bottom --- S3 ------------------ NP3
           [Subj: I              [Det: a
            MV: [root: see        Head: [root: man
                 tense: past]]           Num: single]]
Figure 4-2: Graph-structured Stack in ATN Parsing "I saw a man"
bottom --- S3 ------------------ NP3 -----------\
   \       [Subj: I              [Det: a         \
    \       MV: [root: see        Head: man]      \--- PP2 ----------- NP2
     \           tense: past]]                    /    [Prep: with]    [Det: a]
      \                                          /
       \-- S4 ----------------------------------/
           [Subj: I
            MV: [root: see
                 tense: past]
            DO: [Det: a
                 Head: man]]
Figure 4-3: Graph-structured Stack in ATN Parsing "I saw a man with a"
bottom --- S4
           [Subj: I
            MV:   [root: see
                   tense: past]
            DO:   [Det: a
                   Head: man]
            Mods: [Prep: with
                   PrepObj: [Det: a
                             Head: telescope]]]

           [Subj: I
            MV:   [root: see
                   tense: past]
            DO:   [Det: a
                   Head: man]
            Qual: [Prep: with
                   PrepObj: [Det: a
                             Head: telescope]]]
Figure 4-4: Graph-structured Stack in ATN Parsing "I saw a man with a telescope"
                 /---------- (S\NP)/N
bottom --- NP --- (S\NP)/NP --- NP/N
Figure 5-1: Graph-structured Stack in CG parsing "I saw a"

                 /---------- (S\NP)/N ----\
bottom --- NP --- (S\NP)/NP --- NP/N ----- N
   \        \              \
    \        \              \--- NP
     \        \--- S\NP
      \--- S
Figure 5-2: Graph-structured Stack in CG parsing "I saw a man"

Figure 5-3: Graph-structured Stack in CG parsing "I saw a man with"
example above. Nondeterminism in this formalism
can be similarly handled with the graph-structured
stack. After parsing "I saw a", there is only one way to
reduce the stack: (S\NP)/NP and NP/N into
(S\NP)/N with Forward Functional Composition. The
graph-structured stack at this moment is shown in
figure 5-1.
After parsing "man", a sequence of reductions takes
place, as shown in figure 5-2. Note that S\NP is
obtained in two ways ((S\NP)/N N --> S\NP and
(S\NP)/NP NP --> S\NP), but packed into one node
with Local Ambiguity Packing described in section 2.3.
The preposition "with" has two complex categories;
both of them are pushed onto the graph-structured
stack, as in figure 5-3.
This example demonstrates that categorial
grammars can be implemented as shift-reduce
parsing with a graph-structured stack. It is interesting
that this algorithm is almost equivalent to "lazy chart
parsing" described in Pareschi and Steedman [6].
The relationship between the graph-structured stack
and a chart in chart parsing is discussed in section 7.
6. Graph-structured Stack and Principle-based Parsing
Principle-based parsers, such as one based on the
GB theory, also use a stack to temporarily store partial
trees. These parsers may be seen as shift-reduce
parsers, as follows. Basically, the parser parses a
sentence strictly from left to right, shifting words onto
the stack one-by-one. In doing so, two elements from
the top of the stack are always inspected to see
whether there are any ways to combine them with one
of the principles, such as argument attachment,
specifier attachment and pre- and post-head adjunct
attachment (remember, there are no outside phrase
structure rules in principle-based parsing).
Sometimes these principles conflict and there is
more than one way to combine constituents. In that
case, the graph-structured stack can handle the
nondeterminism without repetition of work. Although
we do not present an example, the implementation of
principle-based parsing with a graph-structured stack
is very similar to the implementation of categorial
grammars with a graph-structured stack. The only
difference is that, in categorial grammars, information
about when and how to reduce two constituents on
the top of the graph-structured stack is explicitly
encoded in category symbols, while in principle-based
parsing, it is defined implicitly as a set of principles.
7. Graph-structured Stack and Chart 
Some parsing methods, such as chart parsing, do
not explicitly use a stack. It is interesting to
investigate the relationship between such parsing
methods and the graph-structured stack, and this
section discusses the correlation of the chart and the
graph-structured stack. We show that chart parsing
may be simulated as an exhaustive version of
shift-reduce parsing with the graph-structured stack,
as described informally below.
1. Push the next word onto the graph- 
structured stack. 
2. Non-destructively reduce the graph- 
structured stack in all possible ways with 
all applicable grammar rules; repeat 
until no further reduce action is 
applicable. 
3. Go to 1. 
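
The loop above can be sketched in Python, using the grammar of Figure 3-1. The sketch assumes a particular representation that is not prescribed by the text: each stack element is stored as a triple (start, end, category), meaning that the element spans the input between the two positions, so a set of such triples is at once a graph-structured stack and a chart. The names `GRAMMAR` and `exhaustive_shift_reduce` are hypothetical.

```python
GRAMMAR = [("S", ("NP", "VP")), ("S", ("S", "PP")), ("NP", ("*n",)),
           ("NP", ("*det", "*n")), ("NP", ("NP", "PP")),
           ("PP", ("*prep", "NP")), ("VP", ("*v", "NP"))]

def exhaustive_shift_reduce(tags):
    """Return all (start, end, category) elements built for `tags`."""
    edges = set()
    for end, tag in enumerate(tags, start=1):
        agenda = [(end - 1, end, tag)]        # step 1: push the next word
        while agenda:                         # step 2: reduce in all ways
            edge = agenda.pop()
            if edge in edges:
                continue                      # already built: packed
            edges.add(edge)                   # non-destructive: nothing is removed
            mid, _, cat = edge
            for lhs, rhs in GRAMMAR:
                if rhs == (cat,):             # unary rule
                    agenda.append((mid, end, lhs))
                elif len(rhs) == 2 and rhs[1] == cat:
                    # Binary rule: look for a left neighbour ending at `mid`.
                    agenda.extend((s, end, lhs) for s, e, c in edges
                                  if e == mid and c == rhs[0])
    return edges                              # step 3 is the outer loop

edges = exhaustive_shift_reduce("*n *v *det *n *prep *det *n".split())
```

For the tag sequence assumed here for "I saw a man with a telescope", `edges` contains a single S element spanning the whole input even though it is derivable in two ways, together with the intermediate elements a chart parser would record.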
A snapshot of the graph-structured stack in the
exhaustive shift-reduce parser after parsing "I saw a
man on the bed in the apartment with" is presented in
figure 7-1 (slightly simplified, ignoring determiners, for
example). A snapshot of a chart parser after parsing
the same fragment of the sentence is also shown in
figure 7-2 (again, slightly simplified). It is clear that the
graph-structured stack in figure 7-1 and the chart in
figure 7-2 are essentially the same; in fact they are
topologically identical if we ignore the word boundary
symbols, "*", in figure 7-2. It is also easy to observe
that the exhaustive version of shift-reduce parsing is
essentially a version of chart parsing which parses a
sentence from left to right.
Figure 7-1: A Graph-structured Stack in an Exhaustive Shift-Reduce Parser
"I saw a man on the bed in the apartment with"
Figure 7-2: Chart in Chart Parsing
"I saw a man on the bed in the apartment with"
8. Summary 
The graph-structured stack was introduced in the
Generalized LR parsing algorithm [7, 8] to handle
nondeterminism in LR parsing. This paper extended
the general idea to several other parsing methods:
ATN, principle-based parsing and categorial grammar.
We suggest considering the graph-structured stack for
any problem which employs a stack
nondeterministically. It would be interesting to see
whether such problems are found outside the area of
natural language parsing.

Bibliography 
[1] Abney, S. and Cole, J.
A Government-Binding Parser.
In Proceedings of the North Eastern Linguistic
Society. XVI, 1985.
[2] Ades, A. E. and Steedman, M. J.
On the Order of Words.
Linguistics and Philosophy 4(4):517-558,
1982.
[3] Aho, A. V. and Ullman, J. D.
Principles of Compiler Design.
Addison Wesley, 1977.
[4] Barton, G. E. Jr.
Toward a Principle-Based Parser.
A.I. Memo 788, MIT AI Lab, 1984.
[5] Kay, M.
The MIND System.
Natural Language Processing.
Algorithmics Press, New York, 1973, pages
155-188.
[6] Pareschi, R. and Steedman, M.
A Lazy Way to Chart-Parse with Categorial
Grammars.
25th Annual Meeting of the Association for
Computational Linguistics :81-88, 1987.
[7] Tomita, M.
Efficient Parsing for Natural Language.
Kluwer Academic Publishers, Boston, MA,
1985.
[8] Tomita, M.
An Efficient Augmented-Context-Free Parsing
Algorithm.
Computational Linguistics 13(1-2):31-46,
January-June, 1987.
[9] Wehrli, E.
A Government-Binding Parser for French.
Working Paper 48, Institut pour les Etudes
Semantiques et Cognitives, Universite de
Geneve, 1984.
[10] Woods, W. A.
Transition Network Grammars for Natural
Language Analysis.
CACM 13:591-606, 1970.
