Sentence Disambiguation
by a Shift-Reduce Parsing Technique*
Stuart M. Shieber
Artificial Intelligence Center
SRI International
333 Ravenswood Avenue
Menlo Park, CA 94025
Abstract
Native speakers of English show definite and consistent 
preferences for certain readings of syntactically ambiguous sen- 
tences. A user of a natural-language-processing system would 
naturally expect it to reflect the same preferences. Thus, such 
systems must model in some way the linguistic performance as 
well as the linguistic competence of the native speaker. We 
have developed a parsing algorithm--a variant of the LALR(1)
shift-reduce algorithm--that models the preference behavior of
native speakers for a range of syntactic preference phenomena 
reported in the psycholinguistic literature, including the recent 
data on lexical preferences. The algorithm yields the preferred 
parse deterministically, without building multiple parse trees 
and choosing among them. As a side effect, it displays ap- 
propriate behavior in processing the much discussed garden-path 
sentences. The parsing algorithm has been implemented and has 
confirmed the feasibility of our approach to the modeling of these 
phenomena. 
1. Introduction 
For natural language processing systems to be useful, they 
must assign the same interpretation to a given sentence that a 
native speaker would, since that is precisely the behavior users 
will expect. Consider, for example, the case of ambiguous
sentences. Native speakers of English show definite and consistent
preferences for certain readings of syntactically ambiguous
sentences [Kimball, 1973, Frazier and Fodor, 1978, Ford et al., 1982].
A user of a natural-language-processing system would naturally
expect it to reflect the same preferences. Thus, such systems
must model in some way the linguistic performance as well as
the linguistic competence of the native speaker.
This idea is certainly not new in the artificial-intelligence
literature. The pioneering work of Marcus [Marcus, 1980] is perhaps
the best-known example of linguistic-performance modeling
in AI. Starting from the hypothesis that "deterministic" parsing
of English is possible, he demonstrated that certain performance
constraints, e.g., the difficulty of parsing garden-path sentences,
could be modeled. His claim about deterministic parsing was
quite strong. Not only was the behavior of the parser required
to be deterministic, but, as Marcus claimed,
    The interpreter cannot use some general rule to take
    a nondeterministic grammar specification and impose
    arbitrary constraints to convert it to a deterministic
    specification (unless, of course, there is a
    general rule which will always lead to the correct
    decision in such a case). [Marcus, 1980, p.14]
*This research was supported by the Defense Advanced Research Projects
Agency under Contract N00039-80-C-0575 with the Naval Electronic
Systems Command. The views and conclusions contained in this document
are those of the author and should not be interpreted as representative of
the official policies, either expressed or implied, of the Defense Advanced
Research Projects Agency or the United States government.
We have developed and implemented a parsing system
that, given a nondeterministic grammar, forces disambiguation
in just the manner Marcus rejected (i.e., through general rules);
it thereby exhibits the same preference behavior that
psycholinguists have attributed to native speakers of English for a
certain range of ambiguities. These include structural ambiguities
[Frazier and Fodor, 1978, Frazier and Fodor, 1980, Wanner, 1980]
and lexical preferences [Ford et al., 1982], as well as the
garden-path sentences as a side effect. The parsing system is based on
the shift-reduce scheduling technique of Pereira [forthcoming].
Our parsing algorithm is a slight variant of LALR(1)
parsing, and, as such, exhibits the three conditions postulated by
Marcus for a deterministic mechanism: it is data-driven, reflects 
expectations, and has look-ahead. Like Marcus's parser, our 
parsing system is deterministic. Unlike Marcus's parser, the 
grammars used by our parser can be ambiguous. 
2. The Phenomena to be Modeled 
The parsing system was designed to manifest preferences
among structurally distinct parses of ambiguous sentences. It
does this by building just one parse tree--rather than building
multiple parse trees and choosing among them. Like the
Marcus parsing system, ours does not do disambiguation requiring
"extensive semantic processing," but, in contrast to Marcus,
it does handle such phenomena as PP-attachment insofar as
there exist a priori preferences for one attachment over another.
By a priori we mean preferences that are exhibited in contexts
where pragmatic or plausibility considerations do not tend to
favor one reading over the other. Rather than make such value
judgments ourselves, we defer to the psycholinguistic literature
(specifically [Frazier and Fodor, 1978], [Frazier and Fodor, 1980]
and [Ford et al., 1982]) for our examples.
The parsing system models the following phenomena: 
Right Association
    Native speakers of English tend to prefer readings in which
    constituents are "attached low." For instance, in the sentence
        Joe bought the book that I had been trying to obtain for
        Susan.
    the preferred reading is one in which the prepositional
    phrase "for Susan" is associated with "to obtain" rather
    than "bought."
Minimal Attachment
    On the other hand, higher attachment is preferred in certain
    cases such as
        Joe bought the book for Susan.
    in which "for Susan" modifies "bought" rather than
    "the book." Frazier and Fodor [1978] note that these are
    cases in which the higher attachment includes fewer nodes
    in the parse tree. Our analysis is somewhat different.
Lexical Preference
    Ford et al. [1982] present evidence that attachment
    preferences depend on lexical choice. Thus, the preferred
    reading for
        The woman wanted the dress on that rack.
    has low attachment of the PP, whereas
        The woman positioned the dress on that rack.
    has high attachment.
Garden-Path Sentences
    Grammatical sentences such as
        The horse raced past the barn fell.
    seem actually to receive no parse by the native speaker
    until some sort of "conscious parsing" is done. Following
    Marcus [Marcus, 1980], we take this to be a hard failure
    of the human sentence-processing mechanism.
It will be seen that all these phenomena are handled in our
parser by the same general rules. The simple context-free grammar
used¹ (see Appendix I) allows both parses of the ambiguous
sentences as well as one for the garden-path sentences. The parser
disambiguates the grammar and yields only the preferred
structure. The actual output of the parsing system can be found
in Appendix II.
3. The Parsing System 
The parsing system we use is a shift-reduce parser.
Shift-reduce parsers [Aho and Johnson, 1974] are a very general class
of bottom-up parsers characterized by the following architecture.
They incorporate a stack for holding constituents built up during
the parse and a shift-reduce table for guiding the parse. At each
step in the parse, the table is used for deciding between two basic
types of operations: the shift operation, which adds the next
word in the sentence (with its preterminal category) to the top
of the stack, and the reduce operation, which removes several
elements from the top of the stack and replaces them with a
new element--for instance, removing an NP and a VP from the
top of the stack and replacing them with an S. The state of the
parser is also updated in accordance with the shift-reduce table
at each stage. The combination of the stack, input, and state of
the parser will be called a configuration and will be notated as,
for example,
    | NP V || Mary | 10 |
where the stack contains the nonterminals NP and V, the input
contains the lexical item Mary and the parser is in state 10.
¹We make no claims as to the accuracy of the sample grammar. It is
obviously a gross simplification of English syntax. Its role is merely to
show that the parsing system is able to disambiguate the sentences under
consideration correctly.
By way of example, we demonstrate the operation of the
parser (using the grammar of Appendix I) on the oft-cited
sentence "John loves Mary." Initially the stack is empty and no
input has been consumed. The parser begins in state 0.
    |  || John loves Mary | 0 |
As elements are shifted to the stack, they are replaced by their
preterminal category.² The shift-reduce table for the grammar
of Appendix I states that in state 0, with a proper noun as the
next word in the input, the appropriate action is a shift. The
new configuration, therefore, is
    | PNOUN || loves Mary | 4 |
The next operation specified is a reduction of the proper noun
to a noun phrase, yielding
    | NP || loves Mary | 2 |
The verb and second proper noun are now shifted, in accordance
with the shift-reduce table, exhausting the input, and the proper
noun is then reduced to an NP.
    | NP V       || Mary | 10 |
    | NP V PNOUN ||      | 4  |
    | NP V NP    ||      | 14 |
Finally, the verb and noun phrase on the top of the stack are
reduced to a VP
    | NP VP ||  | 6 |
which is in turn reduced, together with the subject NP, to an S.
    | S ||  | 1 |
This final configuration is an accepting configuration, since all
the input has been consumed and an S derived. Thus the sentence
is grammatical in the grammar of Appendix I, as expected.
²But see Section 3.2 for an exception.
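The walk-through above can be replayed with a minimal sketch in Python. The paper does not reproduce the LALR(1) table, so the sketch assumes the sequence of actions is supplied by hand and omits the state numbers; the names `Configuration`, `run`, and `ACTIONS` are hypothetical, and the verb is given the category V as in the prose (the Appendix I grammar distinguishes V0 through V5).

```python
from dataclasses import dataclass, field

# Hypothetical mini-lexicon and rule set covering only "John loves Mary".
LEXICON = {"John": "PNOUN", "loves": "V", "Mary": "PNOUN"}
RULES = {
    "NP -> PNOUN": ("NP", ["PNOUN"]),
    "VP -> V NP": ("VP", ["V", "NP"]),
    "S -> NP VP": ("S", ["NP", "VP"]),
}


@dataclass
class Configuration:
    stack: list = field(default_factory=list)
    words: list = field(default_factory=list)

    def shift(self):
        """Move the next word onto the stack as its preterminal category."""
        word = self.words.pop(0)
        self.stack.append(LEXICON[word])

    def reduce(self, rule_name):
        """Replace a rule's right-hand side on top of the stack by its left-hand side."""
        lhs, rhs = RULES[rule_name]
        if self.stack[-len(rhs):] != rhs:
            raise ValueError(f"cannot apply {rule_name} to {self.stack}")
        del self.stack[-len(rhs):]
        self.stack.append(lhs)

    def accepted(self):
        """Accept when all input is consumed and a single S remains."""
        return self.stack == ["S"] and not self.words


def run(sentence, actions):
    """Apply the given actions in order and record each (stack, input) pair."""
    config = Configuration(words=sentence.split())
    trace = [(list(config.stack), list(config.words))]
    for action in actions:
        if action == "shift":
            config.shift()
        else:
            config.reduce(action)
        trace.append((list(config.stack), list(config.words)))
    return config, trace


# The action sequence described in the text, written out by hand.
ACTIONS = ["shift", "NP -> PNOUN", "shift", "shift", "NP -> PNOUN",
           "VP -> V NP", "S -> NP VP"]
```

Calling `run("John loves Mary", ACTIONS)` returns the final configuration together with a trace whose entries correspond to the stack and input columns of the configurations shown above. In the actual parser the choice of action at each step comes from the shift-reduce table rather than from a fixed list.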
3.1 Differences from the Standard LR Techniques 
The shift-reduce table mentioned above is generated 
automatically from a context-free grammar by the standard
algorithm [Aho and Johnson, 1974]. The parsing algorithm differs,
however, from the standard LALR(1) parsing algorithm in two 
ways. First, instead of assigning preterminal symbols to words 
as they are shifted, the algorithm allows the assignment to be 
delayed if the word is ambiguous among preterminals. When 
the word is used in a reduction, the appropriate preterminal is 
assigned. 
Second, and most importantly, since true LR parsers exist 
only for unambiguous grammars, the normal algorithm for deriv- 
ing LALR(1) shift-reduce tables yields a table that may specify 
conflicting actions under certain configurations. It is through the 
choice made from the options in a conflict that the preference 
behavior we desire is engendered. 
3.2 Preterminal Delaying 
One key advantage of shift-reduce parsing that is critical
in our system is the fact that decisions about the structure to
be assigned to a phrase are postponed as long as possible. In
keeping with this general principle, we extend the algorithm
to allow the assignment of a preterminal category to a lexical
item to be deferred until a decision is forced upon it, so to
speak, by an encompassing reduction. For instance, we would not
want to decide on the preterminal category of the word "that,"
which can serve as either a determiner (DET) or complementizer
(THAT), until some further information is available. Consider
the sentences
    That problem is important.
    That problems are difficult to solve is important.
Instead of assigning a preterminal to "that," we leave open the
possibility of assigning either DET or THAT until the first
reduction that involves the word. In the first case, this reduction
will be by the rule NP -> DET NOM, thus forcing, once and for
all, the assignment of DET as preterminal. In the second case,
the DET NOM analysis is disallowed on the basis of number
agreement, so that the first applicable reduction is the COMP S
reduction to S-bar, forcing the assignment of THAT as preterminal.
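As a rough illustration of preterminal delaying, the following Python sketch keeps the set of candidate categories open on the stack and fixes one only when a reduction uses the word. The `Item` class, the `shift` and `commit` helpers, and the lexicon fragment are assumptions made for this sketch, not the author's implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lexicon fragment: "that" is ambiguous between DET and THAT.
LEXICON = {"that": {"DET", "THAT"}, "problem": {"N"}, "problems": {"N"}}


@dataclass
class Item:
    word: str
    candidates: frozenset            # preterminals still possible
    assigned: Optional[str] = None   # fixed by the first reduction using the word


def shift(stack, word):
    """Shift a word without committing to one preterminal category."""
    stack.append(Item(word, frozenset(LEXICON[word])))


def commit(item, category):
    """Fix the preterminal when a reduction uses the word; never revised later."""
    if category not in item.candidates:
        raise ValueError(f"{item.word!r} cannot be a {category}")
    item.assigned = category
    item.candidates = frozenset({category})


stack = []
shift(stack, "that")
open_before = set(stack[0].candidates)   # both DET and THAT are still open
commit(stack[0], "DET")                  # e.g. forced by the rule NP -> DET NOM
```

Once `commit` has run, a later attempt to assign THAT to the same item raises an error, which mirrors the irrevocability of the assignment.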
Of course, the question arises as to what state the parser
goes into after shifting the lexical item "that." The answer
is quite straightforward, though its interpretation vis-à-vis the
determinism hypothesis is subtle. The simple answer is that
the parser enters into a state corresponding to the union of the
states entered upon shifting a DET and upon shifting a THAT
respectively, in much the same way as the deterministic
simulation of a nondeterministic finite automaton enters a "union"
state when faced with a nondeterministic choice. Are we then
merely simulating a nondeterministic machine here? The answer
is equivocal. Although the implementation acts as a simulator
for a nondeterministic machine, the nondeterminism is a priori
bounded, given a particular grammar and lexicon.³ Thus, the
nondeterminism could be traded in for a larger, albeit still finite,
set of states, unlike the nondeterminism found in other
parsing algorithms. Another way of looking at the situation is to
note that there is no observable property of the algorithm that
would distinguish the operation of the parser from a
deterministic one. In some sense, there is no interesting difference between
the limited nondeterminism of this parser, and Marcus's notion
of strict determinism. In fact, the implementation of Marcus's
parser also embodies a bounded nondeterminism in much the
same way this parser does.
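The "union" state can be sketched in the style of the subset construction for finite automata. The table fragment below is hypothetical: the state numbers 3 and 5 are invented for illustration (only state 4 for a proper noun appears in the text), and `shift_state` is an assumed helper name.

```python
# Hypothetical fragment of a shift table: (state, preterminal) -> next state.
GOTO = {(0, "DET"): 3, (0, "THAT"): 5, (0, "PNOUN"): 4}
LEXICON = {"that": {"DET", "THAT"}, "John": {"PNOUN"}}


def shift_state(states, word):
    """Union of the states reached by shifting `word` under each of its categories."""
    return frozenset(
        GOTO[(state, category)]
        for state in states
        for category in LEXICON[word]
        if (state, category) in GOTO
    )


after_that = shift_state(frozenset({0}), "that")   # a two-member union state
after_john = shift_state(frozenset({0}), "John")   # an ordinary single state
```

Because each union is a subset of a finite set of table states, the number of distinct unions is itself finite, which is the sense in which the nondeterminism could be traded for a larger set of states.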
The differentiating property between this parser and that 
of Marcus is a slightly different one, namely, the property of 
quasi-real-time operation.⁴ By quasi-real-time operation, Marcus
means that there exists a maximum interval of parser operation 
for which no output can be generated. If the parser operates for 
longer than this, it must generate some output. For instance, 
the parser might be guaranteed to produce output (i.e., struc- 
ture) at least every three words. However, because preterminal 
assignment can be delayed indefinitely in pathological grammars, 
there may exist sentences in such grammars for which arbitrary 
numbers of words need to be read before output can be produced. 
It is not clear whether this is a real disadvantage or not, and, 
if so, whether there are simple adjustments to the algorithm 
that would result in quasi-real-time behavior. In fact, it is a 
property of bottom-up parsing in general that quasi-real-time 
behavior is not guaranteed. Our parser has a less restrictive but 
similar property, fairness; that is, our parser generates output
linear in the input, though there is no constant over which out- 
put is guaranteed. For a fuller discussion of these properties, see 
Pereira and Shieber [forthcoming].
To summarize, preterminal delaying, as an intrinsic part 
of the algorithm, does not actually change the basic properties 
of the algorithm in any observable way. Note, however, that 
preterminal assignments, like reductions, are irrevocable once 
they are made (as a byproduct of the determinism of the
algorithm). Such decisions can therefore lead to garden paths, as
they do for the sentences presented in Section 3.6.
We now discuss the central feature of the algorithm,
namely, the resolution of shift-reduce conflicts.
³The boundedness comes about because only a finite amount of information
is kept per state (an integer) and the nondeterminism stops at the
preterminal level, so that the splitting of states does not propagate.
⁴I am indebted to Mitch Marcus for this observation and the previous
comparison with his parser.
3.3 The Disambiguation Rules
Conflicts arise in two ways: shift-reduce conflicts, in which
the parser has the option of either shifting a word onto the stack
or reducing a set of elements on the stack to a new element, and
reduce-reduce conflicts, in which reductions by several grammar
rules are possible. The parser uses two rules to resolve these
conflicts:⁵
    (1) Resolve shift-reduce conflicts by shifting.
    (2) Resolve reduce-reduce conflicts by performing
        the longer reduction.
These two rules suffice to engender the appropriate be- 
havior in the parser for cases of right association and minimal 
attachment. Though we demonstrate our system primarily with 
PP-attachment examples, we claim that the rules are generally 
valid for the phenomena being modeled [Pereira and Shieber,
forthcoming].
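The two rules can be written down as a small selection function. This is a sketch under the assumption that the table lookup returns a list of conflicting actions; the `Action` type and the `resolve` function are hypothetical names, not part of the original system.

```python
from typing import NamedTuple, Optional


class Action(NamedTuple):
    kind: str                    # "shift" or "reduce"
    lhs: Optional[str] = None    # left-hand side, for reductions
    rhs: tuple = ()              # right-hand side, for reductions


def resolve(actions):
    """Pick one action from the conflicting options offered by the table."""
    if not actions:
        raise ValueError("no action available: the parse fails")
    shifts = [a for a in actions if a.kind == "shift"]
    if shifts:
        return shifts[0]                                  # rule (1): prefer shifting
    return max(actions, key=lambda a: len(a.rhs))         # rule (2): longer reduction


# Shift-reduce conflict: shift the preposition rather than reduce V to VP/NP.
choice_1 = resolve([Action("reduce", "VP/NP", ("V1",)), Action("shift")])

# Reduce-reduce conflict on the stack NP V NP PP: build the VP, not the larger NP.
choice_2 = resolve([
    Action("reduce", "NP", ("NP", "PP")),
    Action("reduce", "VP", ("V2", "NP", "PP")),
])
```

Here `choice_1` is the shift and `choice_2` is the three-element VP reduction, corresponding to right association and minimal attachment respectively.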
3.4 Some Examples 
Some examples demonstrate these principles. Consider the
sentence
    Joe took the book that I bought for Susan.
After a certain amount of parsing has been completed
deterministically, the parser will be in the following configuration:
    | NP V NP that NP V || for Susan |    |
with a shift-reduce conflict, since the V can be reduced to a
VP/NP⁶ or the P can be shifted. The principles presented would
solve the conflict in favor of the shift, thereby leading to the
following derivation:
    | NP V NP that NP V P     || Susan | 12 |
    | NP V NP that NP V P NP  ||       | 19 |
    | NP V NP that NP V PP    ||       | 24 |
    | NP V NP that NP VP/NP   ||       | 22 |
    | NP V NP that S/NP       ||       | 10 |
    | NP V NP S-bar/NP        ||       | 7  |
    | NP V NP                 ||       | 14 |
    | NP VP                   ||       | 8  |
    | S                       ||       | 1  |
which yields the structure:
    [S Joe [VP took [NP [NP the book][S-bar that I bought for Susan]]]]
⁵The original notion of using a shift-reduce parser and general scheduling
principles to handle right association and minimal attachment, together
with these two rules, is due to Fernando Pereira [Pereira, 1982].
The formalization of preterminal delaying and the extensions to the
lexical-preference cases and garden-path behavior are due to the author.
⁶The "slash-category" analysis of long-distance dependencies used here is
loosely based on the work of Gazdar [1981]. The Appendix I grammar
does not incorporate the full range of slashed rules, however, but merely a
representative selection for illustrative purposes.
The sentence
    Joe bought the book for Susan.
demonstrates resolution of a reduce-reduce conflict. At some
point in the parse, the parser is in the following configuration:
    | NP V NP PP ||  | 20 |
with a reduce-reduce conflict. Either a more complex NP or a
VP can be built. The conflict is resolved in favor of the longer
reduction, i.e., the VP reduction. The derivation continues:
    | NP VP ||  | 8 |
    | S     ||  | 1 |
ending in an accepting state with the following generated
structure:
    [S Joe [VP bought [NP the book][PP for Susan]]]
3.5 Lexical Preference 
To handle the lexical-preference examples, we extend the
second rule slightly. Preterminal-word pairs can be stipulated as
either weak or strong. The second rule becomes
    (2) Resolve reduce-reduce conflicts by performing
        the longest reduction with the strongest leftmost
        stack element.⁷
Therefore, if it is assumed that the lexicon encodes the
information that the triadic form of "want" (V2 in the sample
grammar) and the dyadic form of "position" (V1) are both weak,
we can see the operation of the shift-reduce parser on the "dress
on that rack" sentences of Section 2. Both sentences are similar
in form and will thus have a similar configuration when the
reduce-reduce conflict arises. For example, the first sentence will
be in the following configuration:
    | NP wanted NP PP ||  | 20 |
In this case, the longer reduction would require assignment of the
preterminal category V2 to "want," which is the weak form; thus,
the shorter reduction will be preferred, leading to the derivation:
    | NP wanted NP ||  | 14 |
    | NP VP        ||  | 6  |
    | S            ||  | 1  |
and the underlying structure:
    [S the woman [VP wanted [NP [NP the dress][PP on that rack]]]]
⁷Note that strength takes precedence over length.
In the case in which the verb is "positioned," however, the longer
reduction does not yield the weak form of the verb; it will
therefore be invoked, resulting in the structure:
    [S the woman [VP positioned [NP the dress][PP on that rack]]]
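The extended rule can be sketched as a comparison on the pair (strength, length), with strength ranked first as the footnote on precedence states. The sketch makes two assumptions that the paper does not spell out: the lexicon is keyed by verb lemma, and a phrasal leftmost element (one with no preterminal-word pair) counts as strong. The names `WEAK` and `choose_reduction` are hypothetical.

```python
# Preterminal-word pairs stipulated as weak, following the text.
WEAK = {("want", "V2"), ("position", "V1")}


def choose_reduction(candidates):
    """Choose among reductions given as (lhs, rhs, leftmost_lemma) triples.

    leftmost_lemma is the word under the leftmost right-hand-side element,
    or None when that element is phrasal (assumed strong in this sketch).
    """
    def preference(candidate):
        lhs, rhs, lemma = candidate
        strong = lemma is None or (lemma, rhs[0]) not in WEAK
        return (strong, len(rhs))      # strength first, then length
    return max(candidates, key=preference)


# "The woman wanted the dress on that rack": V2 is weak for "want".
wanted = choose_reduction([
    ("NP", ("NP", "PP"), None),
    ("VP", ("V2", "NP", "PP"), "want"),
])

# "The woman positioned the dress on that rack": V2 is not weak for "position".
positioned = choose_reduction([
    ("NP", ("NP", "PP"), None),
    ("VP", ("V2", "NP", "PP"), "position"),
])
```

Under these assumptions `wanted` selects the shorter NP reduction (low attachment) and `positioned` selects the longer VP reduction (high attachment).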
3.6 Garden-Path Sentences 
As a side effect of these conflict resolution rules, certain 
sentences in the language of the grammar will receive no parse 
by the parsing system just discussed. These sentences are ap- 
parently the ones classified as "garden-path" sentences, a class 
that humans also have great difficulty parsing. Marcus's conjec- 
ture that such difficulty stems from a hard failure of the normal 
sentence-processing mechanism is directly modeled by the pars- 
ing system presented here. 
For instance, the sentence
    The horse raced past the barn fell
exhibits a reduce-reduce conflict before the last word. If the
participial form of "raced" is weak, the finite verb form will be
chosen; consequently, "raced past the barn" will be reduced to a
VP rather than a participial phrase. The parser will fail shortly,
since the correct choice of reduction was not made.
Similarly, the sentence
    That scaly, deep-sea fish should be underwater is important.
will fail, though grammatical. Before the word "should" is
shifted, a reduce-reduce conflict arises in forming an NP from
either "That scaly, deep-sea fish" or "scaly, deep-sea fish." The
longer (incorrect) reduction will be performed and the parser will
fail.
Other examples, e.g., "the boy got fat melted," or "the
prime number few" would be handled similarly by the parser,
though the sample grammar of Appendix I does not parse them
[Pereira and Shieber, forthcoming].
4. Conclusion 
To be useful, natural-language systems must model the
behavior, if not the method, of the native speaker. We have
demonstrated that a parser using simple general rules for
disambiguating sentences can yield appropriate behavior for a large
class of performance phenomena--right association, minimal
attachment, lexical preference, and garden-path sentences--and
that, moreover, it can do so deterministically without generating
all the parses and choosing among them. The parsing system
has been implemented and has confirmed the feasibility of our
approach to the modeling of these phenomena.
References 
Aho, A.V., and S.C. Johnson, 1974: "LR Parsing," Computing
Surveys, Volume 6, Number 2, pp. 99-124 (Spring).
Ford, M., J. Bresnan, and R. Kaplan, 1982: "A Competence-
Based Theory of Syntactic Closure," in The Mental
Representation of Grammatical Relations, J. Bresnan, ed.
(Cambridge, Massachusetts: MIT Press).
Frazier, L., and J.D. Fodor, 1978: "The Sausage Machine: A
New Two-Stage Parsing Model," Cognition, Volume 6, pp.
291-325.
Frazier, L., and J.D. Fodor, 1980: "Is the Human Sentence
Parsing Mechanism an ATN?" Cognition, Volume 8, pp.
411-459.
Gazdar, G., 1981: "Unbounded dependencies and coordinate
structure," Linguistic Inquiry, Volume 12, pp. 105-179.
Kimball, J., 1973: "Seven Principles of Surface Structure Parsing
in Natural Language," Cognition, Volume 2, Number 1,
pp. 15-47.
Marcus, M., 1980: A Theory of Syntactic Recognition for Natural
Language, (Cambridge, Massachusetts: MIT Press).
Pereira, F.C.N., forthcoming: "A New Characterization of
Attachment Preferences," to appear in D. Dowty,
L. Karttunen, and A. Zwicky (eds.) Natural
Language Processing. Psycholinguistic, Computational,
and Theoretical Perspectives, Cambridge, England:
Cambridge University Press.
Pereira, F.C.N., and S.M. Shieber, forthcoming: "Shift-Reduce
Scheduling and Syntactic Closure," to appear.
Wanner, E., 1980: "The ATN and the Sausage Machine: Which
One is Baloney?" Cognition, Volume 8, pp. 209-225.
Appendix I. The Test Grammar 
The following is the grammar used to test the parsing
system described in the paper. Not a robust grammar of English
by any means, it is presented only for the purpose of establishing
that the preference rules yield the correct results.
    S   -> NP VP             VP       -> V3 INF
    S   -> S-bar VP          VP       -> V4 ADJ
    NP  -> DET NOM           VP       -> V5 PP
    NP  -> NOM               S-bar    -> that S
    NP  -> PNOUN             INF      -> to VP
    NP  -> NP S-bar/NP       PP       -> P NP
    NP  -> NP PARTP          PARTP    -> VPART PP
    NP  -> NP PP             S-bar/NP -> that S/NP
    DET -> NP 's             S/NP     -> VP
    NOM -> N                 S/NP     -> NP VP/NP
    NOM -> ADJ NOM           VP/NP    -> V1
    VP  -> AUX VP            VP/NP    -> V2 PP
    VP  -> V0                VP/NP    -> V3 INF/NP
    VP  -> V1 NP             VP/NP    -> AUX VP/NP
    VP  -> V2 NP PP          INF/NP   -> to VP/NP
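To see where reduce-reduce conflicts come from in this grammar, the rules can be held as plain data and matched against the top of the stack. The sketch below uses only a subset of the rules and a hypothetical helper, `possible_reductions`; unlike the real LALR(1) table it ignores the parser state and the look-ahead, and it assumes the verb's preterminal has already been fixed.

```python
# A subset of the Appendix I rules, written as (left-hand side, right-hand side).
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("DET", "NOM")),
    ("NP", ("PNOUN",)),
    ("NP", ("NP", "PP")),
    ("NOM", ("N",)),
    ("VP", ("V1", "NP")),
    ("VP", ("V2", "NP", "PP")),
    ("PP", ("P", "NP")),
]


def possible_reductions(stack):
    """Return every rule whose right-hand side matches the top of the stack."""
    stack = tuple(stack)
    return [
        (lhs, rhs)
        for lhs, rhs in RULES
        if len(rhs) <= len(stack) and stack[-len(rhs):] == rhs
    ]


# The stack reached in "Joe bought the book for Susan" offers two reductions.
conflict = possible_reductions(["NP", "V2", "NP", "PP"])
```

The list `conflict` contains both NP -> NP PP and VP -> V2 NP PP, which is the choice that the longer-reduction rule settles in favor of the VP.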
Appendix II. Sample Runs
>> Joe bought the book that I had been trying to obtain for Susan
Accepted: [s
  (np (pnoun Joe))
  (vp
    (v1 bought)
    (np
      (np (det the)
          (nom (n book)))
      (sbar/np
        (that that)
        (s/np
          (np (pnoun I))
          (vp/np
            (aux had)
            (vp/np
              (aux been)
              (vp/np (v3 trying)
                (inf/np
                  (vp/np
                    (v2 obtain)
                    (pp (p for)
                        (np (pnoun Susan]
>> Joe bought the book for Susan
Accepted: [s (np (pnoun Joe))
             (vp (v2 bought)
                 (np (det the)
                     (nom (n book)))
                 (pp (p for)
                     (np (pnoun Susan]
>> The woman wanted the dress on that rack
Accepted: [s (np (det The)
                 (nom (n woman)))
             (vp (v1 wanted)
                 (np (np (det the)
                         (nom (n dress)))
                     (pp (p on)
                         (np (det that)
                             (nom (n rack]
>> The woman positioned the dress on that rack
Accepted: [s (np (det The)
                 (nom (n woman)))
             (vp (v2 positioned)
                 (np (det the)
                     (nom (n dress)))
                 (pp (p on)
                     (np (det that)
                         (nom (n rack]
>> The horse raced past the barn fell
Parse failed. Current configuration:
state: (1)
stack: <(0)> [s (np (det The)
                    (nom (n horse)))
                (vp (v5 raced)
                    (pp (p past)
                        (np (det the)
                            (nom (n barn]
input: (v0 fell)
       (end)
>> That scaly deep-sea fish should be underwater is important
Parse failed. Current configuration:
state: (1)
stack: <(0)> [s [np (det That)
                    (nom (adj scaly)
                         (nom (adj deep-sea)
                              (nom (n fish]
                (vp (aux should)
                    (vp (v4 be)
                        (adj underwater]
input: (v4 is)
       (adj important)
       (end)
