Two Principles of Parse Preference 
Jerry R. Hobbs and John Bear 
Artificial Intelligence Center 
SRI International 
1 Introduction 
The DIALOGIC system for syntactic analysis and semantic
translation has been under development for
over ten years, and during that time it has been used 
in a number of domains in both database interface 
and message-processing applications. In addition, it 
has been tested on a number of sentences of linguis- 
tic interest. Built into the system are facilities for 
ranking parses according to syntactic and selectional 
considerations, and over the years, as various kinds 
of ambiguity have become apparent, heuristics have 
been devised for choosing the preferred parses. Our 
aim in this paper is first to present a compendium of 
many of these heuristics and secondly to propose two 
principles that seem to underlie the heuristics. The
first will be useful to researchers engaged in building 
grammars of similarly broad coverage. The second is 
of psychological interest and may be a guide for es- 
timating parse preferences for newly discovered am- 
biguities for which we lack the experience to decide 
among on a more empirical basis. 
The mechanism for implementing parse preference 
heuristics is quite simple. Terminal nodes of a parse 
tree acquire a score (usually 0) from the lexical entry 
for the word sense. When a nonterminal node of a 
parse tree is constructed, it is given an initial score 
which is the sum of the scores of its child nodes. Var- 
ious conditions are checked during the construction 
of the node and, as a result, a score of 20, 10, 3, -3, 
-10, or -20 may be added to the initial score. The 
score of the parse is the score of its root node. The 
parses of ambiguous sentences are ranked according 
to their scores. Although simple, this method has 
been very successful. In this paper, however, rather 
than describe the heuristics in terms this detailed, we 
will describe them in terms of the preferences among 
the alternate structures that motivated our scoring 
schemes. 
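The scoring mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the DIALOGIC implementation: the `Node` class, the `rank_parses` and `word` helpers, and the +10 adjustment given to the complement reading are all assumptions made for the example. The sketch also assumes that more than one adjustment may be added at a node.

```python
from dataclasses import dataclass, field
from typing import List

# The six adjustment values named in the text.
ALLOWED_ADJUSTMENTS = {20, 10, 3, -3, -10, -20}


@dataclass
class Node:
    """A parse-tree node (hypothetical structure, for illustration only)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    lexical_score: int = 0  # terminals: taken from the lexical entry, usually 0
    adjustments: List[int] = field(default_factory=list)  # nonterminals only

    def score(self) -> int:
        # A terminal node simply carries its lexical score.
        if not self.children:
            return self.lexical_score
        for value in self.adjustments:
            if value not in ALLOWED_ADJUSTMENTS:
                raise ValueError(f"unexpected adjustment: {value}")
        # Initial score is the sum of the child scores; adjustments are added on top.
        initial = sum(child.score() for child in self.children)
        return initial + sum(self.adjustments)


def rank_parses(roots: List[Node]) -> List[Node]:
    """Order competing parses from highest to lowest root score."""
    return sorted(roots, key=lambda root: root.score(), reverse=True)


def word(text: str) -> Node:
    """Build a terminal node with the default lexical score of 0."""
    return Node(label=text)


# Two readings of "John looked for Mary"; the +10 is an invented value.
complement = Node("S", [
    word("John"),
    Node("VP", [word("looked"), Node("PP", [word("for"), word("Mary")])],
         adjustments=[10]),
])
adverbial = Node("S", [
    word("John"),
    Node("VP", [Node("VP", [word("looked")]),
                Node("PP", [word("for"), word("Mary")])]),
])
best = rank_parses([adverbial, complement])[0]
```

Under these assumed values the complement reading is ranked first, because its root score is the sum of its children's scores plus the single adjustment.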
While these heuristics have arisen primarily 
through our everyday experience with the system, we 
have done small empirical studies by hand on some 
of the ambiguities, using several different kinds of 
text, including some from the Brown corpus and some 
transcripts of spoken dialogue. We have counted the 
number of occurrences of potentially ambiguous con- 
structions that were in accord with our claims, and 
the number of occurrences that were not. Some of 
the constructions were impossible to find, not only 
because they occur so rarely but also because many 
are very difficult for anyone except a dumb parser to 
spot. But in every case where we found examples, the 
numbers supported our claims. We present our preliminary
findings below for those cases where we have
begun to accumulate a nontrivial number of examples.
2 Brief Review of the Literature
Most previous work on parse preferences has concerned
itself with the most notorious of the
ambiguities--the attachment ambiguities of postmodifiers.
Among the first linguists to address this problem
was Kimball (1973). He proposed several processing
principles in an attempt to account for why certain
readings of ambiguous sentences were more salient 
than others. Two of these principles were Right As- 
sociation and Closure. 
In the late 1970s and early 1980s there was a great
deal of work among linguists and psycholinguists (e.g.
Frazier and Fodor, 1979; Wanner and Maratsos, 1978;
Marcus, 1979; Church, 1980; Ford, Bresnan, and Kaplan,
1982) attempting to refine Kimball's initial analysis
of syntactic bias and proposing their own principles
governing attachment. Frazier and Fodor proposed
the principles of Minimal Attachment and Local
Association. Church proposed the A-over-A Early
Closure Principle; and Ford, Bresnan and Kaplan introduced
the notions of Lexical Preference and Final
Arguments.
The two ideas that dominated their hypotheses 
and discussions were Right Association, which says 
roughly that postmodifiers prefer to be attached to 
the nearest previous possible head, and a stronger 
principle stipulating that argument interpretations 
are favored over adjunct interpretations. This latter 
principle is implied by Frazier and Fodor's Minimal 
Attachment and also by Ford, Bresnan and Kaplan's 
Lexical Preference. 
In recent computational linguistics, Shieber and 
Pereira (Shieber, 1983; Pereira, 1985) proposed a 
shift-reduce parser for parsing English, and showed 
that Right Association was equivalent to preferring 
shifts over reductions, and that Minimal Attachment 
was equivalent to favoring the longest possible reduc- 
tion at each point. 
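The two conflict-resolution preferences just described can be illustrated with a small sketch. It is not the parser of Shieber or Pereira; the `Shift` and `Reduce` types and the `resolve_conflict` function are hypothetical names introduced here, and the grammar rules in the example are invented.

```python
from typing import NamedTuple, Sequence, Tuple, Union


class Shift(NamedTuple):
    token: str  # the next input word to push onto the stack


class Reduce(NamedTuple):
    lhs: str               # category produced by the reduction
    rhs: Tuple[str, ...]   # categories popped from the stack


Action = Union[Shift, Reduce]


def resolve_conflict(actions: Sequence[Action]) -> Action:
    """Pick one action when the parser has several available."""
    if not actions:
        raise ValueError("no parser action available")
    # Shift-reduce conflict: prefer the shift (Right Association).
    for action in actions:
        if isinstance(action, Shift):
            return action
    # Reduce-reduce conflict: prefer the longest reduction (Minimal Attachment).
    return max(actions, key=lambda action: len(action.rhs))


shift_choice = resolve_conflict([Reduce("VP", ("V", "NP")), Shift("in")])
reduce_choice = resolve_conflict([
    Reduce("NP", ("NP", "PP")),
    Reduce("VP", ("V", "NP", "PP")),
])
```

In the first call the shift wins over the reduction; in the second, the three-symbol reduction wins over the two-symbol one.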
More recently, there have been debates, for exam- 
ple, between Schubert (1984, 1986) and Wilks et al. 
(1985), about the interaction of syntax with seman- 
tics and the role of semantics in disambiguating the 
classical ambiguities. 
We take it for granted that, psychologically, syn- 
tax, semantics, and pragmatics interact very tightly to 
achieve disambiguation. In fact, in other work (Hobbs 
et al., 1988), we have proposed an integrated framework
for natural language processing that provides for
this tight interaction. However, in this paper, we are 
considering only syntactic factors. In the semantically 
and pragmatically unsophisticated systems of today, 
these are the most easily accessible factors, and even 
in more sophisticated systems, there will be examples 
that semantic and pragmatic factors alone will fail to 
disambiguate. 
The two principles we propose may be viewed as
generalizations of Minimal Attachment and Right As- 
sociation. 
3 Most Restrictive Context 
The first principle might be called the Most Restric- 
tive Context principle. It can be stated as follows: 
Where a constituent can be placed in two 
different structures, favor the structure that 
places greater constraints on allowable con- 
stituents. 
For example, in 
John looked for Mary. 
"for Mary" can be interpreted as an adverbial signal- 
ing the beneficiary of the action or as a complement of 
the verb "look". Since virtually any verb phrase can
take an adverbial whereas only a very few verbs can
take a "for" prepositional phrase as their complement,
the latter interpretation has the more restrictive context
and therefore is favored.
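One rough way to picture this comparison is to measure restrictiveness by how few lexical items license a context. The sketch below does this for the "for" example only; the toy verb lists and the names `candidate_contexts` and `most_restrictive` are assumptions made for illustration, not part of any described system.

```python
# Toy lexicon: every verb allows an adverbial "for" phrase, but only a few
# verbs take a "for" phrase as a complement. Both sets are invented.
VERBS = {"look", "cook", "walk", "buy", "wait"}
FOR_COMPLEMENT_VERBS = {"look", "wait"}


def candidate_contexts(verb: str) -> dict:
    """Map each available reading to the set of verbs that license it."""
    contexts = {"adverbial": VERBS}
    if verb in FOR_COMPLEMENT_VERBS:
        contexts["complement"] = FOR_COMPLEMENT_VERBS
    return contexts


def most_restrictive(verb: str) -> str:
    """Favor the reading whose context is licensed by the fewest verbs."""
    contexts = candidate_contexts(verb)
    return min(contexts, key=lambda name: len(contexts[name]))


looked_reading = most_restrictive("look")
cooked_reading = most_restrictive("cook")
```

With this lexicon "look" receives the complement reading, while "cook", which has no complement reading available, falls back to the adverbial one.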
A large number of preferences among ambiguities 
can be subsumed under this principle. They are enu- 
merated below. 
1. As in the above example, favor argument over 
adverbial interpretations for postmodifying prepositional
phrases where possible. Thus, whereas in
John cooked for Mary. 
"for Mary" is necessarily an adverbial, in "John 
looked for Mary" it is taken as a complement. Sub- 
sumable under this heuristic is the preference of "by" 
phrases after passives to indicate the agent rather 
than a location. This heuristic, together with the next 
type, constitutes the traditional Minimal Attachment
principle. This heuristic is very strong; of 47 occurrences
examined, all were in accord with the heuristic.
2. Favor arguments over mere modifiers. Thus, in 
John bought a book from Mary. 
the favored interpretation is "bought from Mary" 
rather than "book from Mary". Where the head noun
is also subcategorized for the preposition, as in
John sold a ticket to the theater. 
this principle fails to decide among the readings, and 
the second principle, described in the next section,
becomes decisive. 
This principle was surprisingly strong, but perhaps 
for illegitimate reasons. Of 75 potential ambiguities, 
all but one were in accord with the heuristic. The one 
exception was 
HDTV provides television images with finer
detail than current systems. 
and even this is a close call. However, it is often very 
uncertain whether we should say verbs, nouns, and 
adjectives subcategorize for a certain preposition. For 
example, does "discussion" subcategorize for "with" 
and "about"? We are likely to say so when it yields 
the right parse and not to notice the possibility when 
it would yield the wrong parse. So our results here
may not be completely unbiased. 
3. Favor complement interpretations of infinitives 
over purpose adverbial interpretations. In 
John wants his driver to go to Los Angeles. 
the preferred interpretation has only the driver and 
not John going to Los Angeles. 
Of 44 examples of potential ambiguities of this sort 
that we found, 41 were complements and only 3 were 
purpose adverbials. Even these three could have been 
eliminated with the simplest selectional restrictions.
One example was the following
He pushed aside other business to devote all
his time to this issue. 
which could have been parsed analogously to 
He pushed strongly all the young researchers 
to publish papers on their work. 
A particularly intriguing example, remembering that 
"provide" can be ditransitive, is the following: 
That is weaker than what the Bush admin- 
istration needs to provide the necessary 
tax revenues. 
4. Favor the attachment of temporal prepositional 
phrases to verbs or event nouns. In the preferred read- 
ing of 
John saw the President during the cam- 
paign. 
the seeing was during the campaign, since "President" 
is not an event noun. In the preferred reading of 
The historian described the demonstrations 
during Gorbachev's visit. 
the demonstrations are during the visit. This case can 
be considered an example of Minimal Attachment if 
we assume that all verbs and event nouns have poten- 
tial temporal arguments. Of 74 examples examined, 
66 were in accord with this heuristic. Two that did 
not involved the phrase "business since August 1".
5. Favor adverbial over object interpretations of 
temporal and measure noun phrases. Thus, in 
John won one day in Hawaii.
"one day in Hawaii" is preferentially the time John
won and not his prize. In 
John walked 10 miles. 
"10 miles" is a measure of how far he walked, not 
what he walked. This is an example of Most Restric- 
tive Context because noun phrases, based on syntactic 
criteria alone, can always be the object of a transi- 
tive verb, whereas only temporal and measure noun 
phrases can function as adverbials. This case is in- 
teresting because it runs counter to Minimal Attach- 
ment. Here arguments are disfavored. 
Of fifteen examples we found of such ambiguities, 
eleven agreed with the heuristic. The reason for the 
large percentage of examples that did not is that 
sports articles were among those examined, and they
contained sentences like 
Smith gained 1240 yards last season. 
This illustrates the hidden dangers in genre selection. 
6. Favor temporal nouns as adverbials over compound
nominal heads. The latter interpretation is
possible, as seen in 
Is this a CSLI Thursday? 
But the preferred reading is the temporal one that is 
most natural in 
I saw the man Thursday. 
7. Favor "that" as a complementizer rather than as 
a determiner. Thus, in 
I know that sugar is expensive. 
we are probably not referring to "that sugar". This 
is a case of Most Restrictive Context because the 
determiner "that" can appear in any noun phrase, 
whereas the complementizer "that" can occur only 
after a small number of verbs. This is a heuristic we 
suspect everyone who has built a moderately large 
grammar has implemented, because of the frequency 
of the ambiguity. 
8. An initial "there" is interpreted as an existential, 
where possible, rather than as a locative. We interpret 
There is a man in the room. 
as an existential declarative sentence, rather than 
as an utterance with an initial locative. Locatives 
can occur virtually anyplace, whereas the existential 
"there" can occur in only a very small range of con- 
texts. Of 30 occurrences examined, 29 were in accord 
with the heuristic. The one exception was 
There, in the midst of all those casinos, is
Trump's Taj Mahal.
9. Favor predeterminers over separate noun 
phrases. In 
Send all the money. 
the reading that treats "all the" as a complex deter- 
miner is favored over the one that treats "all" as a 
separate complete noun phrase in indirect object po- 
sition. There are very many fewer loci for predeter- 
miners than for noun phrases, and hence this is also 
an example of Most Restrictive Context. 
10. Favor preprepositional lexical adverbs over sep- 
arate adverbials. Thus, in 
John did the job precisely on time. 
we favor "precisely" modifying "on time" rather than 
"did the job". Very many fewer adverbs can func- 
tion as preprepositional modifiers than can function 
as verbal or sentential adverbs. Of 28 occurrences ex- 
amined, all but one were in accord with the heuristic. 
The one was 
Who is going to type this all for you? 
11. Group numbers with prenominal unit nouns 
but not with other prenominal nouns. For example,
"10 mile runs" are taken to be an indeterminate number
of runs of 10 miles each rather than as exactly 10
runs of a mile each. Other nouns can function the
same way as unit nouns, as in "2 car garages", but it
is vastly more common to have the number attached
to the head noun instead, as in "5 wine glasses". Virtually
any noun can appear as a prenominal noun,
whereas only unit nouns can appear in the adjectival
"10-mile" construction. Hence, for unit nouns this is
the most restrictive context. While other nouns can
sometimes occur in this context, it is only through a
reinterpretation as a unit noun, as in "2 car garages".
12. Disfavor headless structures. Headless structures
impose no constraints, and are therefore never
the most restrictive context, and thus are the least favored
in cases of ambiguity. An example of this case
is the sentence
John knows the best man wins. 
which we interpret as a concise form of 
John knows (that) the best man wins. 
rather than as a concise form of
John knows the best (thing that) man wins ∅.
4 Attach Low and Parallel 
The second principle might be called the Attach Low 
and Parallel principle. It may be stated as follows: 
Attach constituents as low as possible,
and in parallel with other constituents if possible.
The cases subsumed by this principle are quite heterogeneous.
1. Where not overridden by the Most Restrictive 
Context principle, favor attaching postmodifiers to 
the closest possible site, skipping over proper nouns. 
Thus, where neither the verb nor the noun is subcat- 
egorized for the preposition, as in 
John phoned a man in Chicago. 
or where both the verb and the noun are subcatego- 
rized for the preposition, as in 
John was given a book by a famous professor.
the noun is favored as the attachment point, since that 
is the lowest possible attachment point in the parse
tree. This case is just the traditional Right Association.
The subcase of prepositional phrases with "of" is 
significant enough to be mentioned separately. We 
might say that every noun is subcategorized for "of" 
and that therefore "of" prepositional phrases are 
nearly always attached to the immediately preceding 
word. Of 250 occurrences examined, 248 satisfied this 
heuristic, and of the other two 
Since the first reports broke of the CIA's ac- 
tivities, ... 
He ordered the destruction two years ago of 
some records. 
the second would not admit an incorrect attachment
in any case.
We examined 148 instances of this case not involv- 
ing "of", temporal prepositional phrases, or preposi- 
tions that are subcategorized for by possible attach- 
ment points. Of these, 116 were in accord with the 
heuristic and 32 were not. An example where this 
heuristic failed was 
They abandoned hunting for food produc- 
tion. 
For a significant number of examples (34), it did not 
matter where the attachment was made. For instance, 
in 
John made coffee for Mary. 
both the coffee and the making are for Mary. We 
counted these cases as being in accord with the heuris- 
tic, since the heuristic would yield a correct interpre- 
tation. 
This is perhaps the place to present results on two 
very simple algorithms. The first is to attach prepo- 
sitional phrases to the closest possible attachment 
point, regardless of other considerations. Of 251 occurrences
examined, 125 attached to the nearest possibility,
109 to the second nearest, 14 to the third, and
3 to the fourth, fifth, or sixth. This algorithm is not 
especially recommended. 
The second algorithm is to attach to the nearest
possible attachment point that subcategorizes for
the preposition, if there is such, assuming verbs and 
event nouns to subcategorize for temporal preposi- 
tional phrases, and otherwise to attach to the nearest 
possible attachment point. This is essentially a sum- 
mary of our heuristics for prepositional phrases. Of 
297 occurrences examined, this yielded the right an- 
swer on 256 and the wrong one on 41. 
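The second algorithm can be sketched as below. This is a minimal illustration under stated assumptions: the `Site` record, its fields, and the `choose_site` function are hypothetical names, candidate sites are assumed to be supplied nearest first, and the subcategorization entries are invented for the examples.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class Site:
    """A possible attachment point (hypothetical record for illustration)."""
    head: str
    subcat_preps: FrozenSet[str] = frozenset()  # prepositions it subcategorizes for
    takes_temporal: bool = False                # True for verbs and event nouns


def choose_site(sites: Sequence[Site], prep: str, pp_is_temporal: bool = False) -> Site:
    """Choose an attachment point; `sites` must be ordered nearest first."""
    if not sites:
        raise ValueError("no possible attachment point")
    # Nearest site that subcategorizes for the preposition, treating verbs
    # and event nouns as subcategorizing for temporal phrases.
    for site in sites:
        if prep in site.subcat_preps or (pp_is_temporal and site.takes_temporal):
            return site
    # Otherwise fall back to the nearest possible attachment point.
    return sites[0]


# "John bought a book from Mary."
bought = Site("bought", frozenset({"from"}), takes_temporal=True)
book = Site("book")
# "John phoned a man in Chicago."
phoned = Site("phoned", takes_temporal=True)
man = Site("man")
# "John saw the President during the campaign."
saw = Site("saw", takes_temporal=True)
president = Site("President")
# "The historian described the demonstrations during Gorbachev's visit."
described = Site("described", takes_temporal=True)
demonstrations = Site("demonstrations", takes_temporal=True)

from_mary = choose_site([book, bought], "from")
in_chicago = choose_site([man, phoned], "in")
during_campaign = choose_site([president, saw], "during", pp_is_temporal=True)
during_visit = choose_site([demonstrations, described], "during", pp_is_temporal=True)
```

The fallback branch alone corresponds to the first algorithm. On the counts reported above, that algorithm is right in 125 of 251 cases (about 50 percent), while the second is right in 256 of 297 (about 86 percent).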
2. Favor preprepositional readings of measure 
phrases over readings as separate adverbials. Thus, 
in 
John walked 10 miles into the forest.
we preferentially take "10 miles" as modifying "into 
the forest" rather than "walked", so that John is 
now 10 miles from the edge of the forest, rather than 
merely somewhere in the forest but 10 miles from his
starting point. Since the preposition occurs lower in 
the parse tree than the verb, this is an example of 
Attach Low and Parallel. Note that this is a kind of 
"Left Association". 
3. Coordinate "both" with "and", if possible,
rather than treating it as a separate determiner. In
John likes both intelligent and attractive 
women. 
the interpretation in which there are exactly two 
women who are intelligent and attractive is disfavored.
Associating "both" with the coordinated adjectives
rather than attaching it to the head noun is
attaching it lower in the parse tree. 
4. Distribute prenominal nouns over conjoined
head nouns. In "oil sample and filter", we mean "oil
sample and oil filter". A principle of Attach Low
would not seem to be decisive in this case. Would
it mean that we attach "oil" low by attaching it to
"sample" or that we attach "and filter" low by attaching
it to "sample"? It is because of examples like
this (and the next case) that we propose the principle 
Attach Low and Parallel. We favor the reading that 
captures the parallelism of the two head nouns. 
5. Distribute determiners and noun complements
over conjoined head nouns. In "the salt and pepper on 
the table", we treat "salt" and "pepper" as conjoined, 
rather than "the salt" and "pepper on the table". As 
in the previous case, where we have a choice of what to 
attach low, we favor attaching parallel elements low. 
6. Favor attaching adjectives to head nouns rather 
than prenominal nouns. We take "red boat house" 
to refer to a boat house that is red, rather than to 
a house for red boats. Like all of our principles, this 
preference can be overridden by semantics or convention,
as in "high stress job". Here again we could
interpret Attach Low as telling us to attach "red" to
"boat" or to attach "boat" to "house". Attach Low 
and Parallel tells us to favor the latter. 
5 Interaction and Overriding 
There will of course be many examples where both 
of our principles apply. In the cases that occur 
with some frequency, in particular, the prepositional 
phrase attachment ambiguities, it seems that the 
Most Restrictive Context principle dominates Attach
Low and Parallel. It is unclear what the interactions
between these two principles should be, more gener- 
ally. 
These principles can be overridden by more than 
just semantics and pragmatics. Commas in written
discourse and pauses in spoken discourse (see Bear 
and Price, 1990, on the latter) often function to over- 
ride Attach Low and Parallel, as in 
John phoned the man, in Chicago. 
Specify the length, in bits, of a word. 
It is the phoning that is in Chicago, and the specifica- 
tion is in bits while the length is of a word. Similarly, 
commas and pauses can override the Most Restrictive 
Context principle, as in 
John wants his driver, to go to Los Angeles. 
Here we prefer the purpose adverbial reading in which
John and the driver both are going to Los Angeles. 
6 Cognitive Significance
The analysis of parse preferences in terms of these 
two very general principles is quite appealing, and 
more than simply because they subsume a great many
cases. They seem to relate somehow to deep princi- 
ples of cognitive economy. The Most Restrictive Con- 
text principle is a matter of taking all of the available 
information into account in constructing interpretations.
The "Low" of Attach Low and Parallel is an
instance of a general cognitive heuristic to interpret
features of the environment as locally as possible. The
"Parallel" exemplifies a general cognitive heuristic to 
see similarity wherever possible, a heuristic that pro- 
motes useful generalizations. 
Acknowledgements 
The authors would like to express their gratitude to 
Paul Martin, who is responsible for discovering some 
of the heuristics, and to Mark Liberman for sending 
us some of the data. The research was funded by the
Defense Advanced Research Projects Agency under 
Office of Naval Research contract N00014-85-C-0013, 
and by a gift from the Systems Development Foundation.

References 

[1] Bear, John, and Jerry Hobbs, 1988. "Localizing
Expression of Ambiguity", Proceedings of the Second
Conference on Applied Natural Language Processing,
Austin, Texas, pp. 235-241.

[2] Bear, John, and Patti Price, 1990. "Prosody, Syntax
and Parsing", Proceedings, 28th Annual Meeting
of the Association for Computational Linguistics,
Pittsburgh, Pennsylvania.

[3] Church, Kenneth, 1980. "On Memory Limitations
in Natural Language Processing", MIT Technical
Report MIT/LCS/TR-245.

[4] Ford, Marylyn, Joan Bresnan, and Ronald Kaplan,
1982. "A Competence-Based Theory of Syntactic
Closure," in J. Bresnan (Ed.) The Mental
Representation of Grammatical Relations, MIT
Press: Cambridge, Massachusetts.

[5] Frazier, Lyn and Janet Fodor, 1979. "The Sausage
Machine: A New Two-Stage Parsing Model", Cognition,
Vol. 6, pp. 291-325.

[6] Hobbs, Jerry R., Mark Stickel, Paul Martin, and
Douglas Edwards, 1988. "Interpretation as Abduction",
Proceedings, 26th Annual Meeting of the Association
for Computational Linguistics, pp. 95-103,
Buffalo, New York, June 1988.

[7] Kimball, John, 1973. "Seven Principles of Surface
Structure Parsing in Natural Language", Cognition,
Vol. 2, No. 1, pp. 15-47.

[8] Marcus, Mitchel, 1980. A Theory of Syntactic
Recognition for Natural Language, MIT Press:
Cambridge, Massachusetts.

[9] Pereira, Fernando, 1985. "A New Characterization
of Attachment Preferences," in D. Dowty et
al. (Eds.) Natural Language Processing, Cambridge
University Press: Cambridge, England.

[10] Schubert, Lenhart, 1984. "On Parsing Preferences",
Proceedings, COLING 1984, Stanford, California,
pp. 247-250.

[11] Schubert, Lenhart, 1986. "Are There Preference
Trade-offs in Attachment Decisions?" Proceedings,
AAAI 1986, Philadelphia, Pennsylvania, pp. 601-605.

[12] Shieber, Stuart, 1983. "Sentence Disambiguation
by a Shift-Reduce Parsing Technique", Proceedings,
IJCAI 1983, Washington, D.C., pp. 699-703.

[13] Wanner, Eric, and Michael Maratsos, 1978. "An
ATN Approach to Comprehension," in Halle, Bresnan,
and Miller (Eds.) Linguistic Theory and Psychological
Reality. MIT Press: Cambridge, Massachusetts.

[14] Wilks, Yorick, Xiuming Huang, and Dan Fass,
1985. "Syntax, Preference and Right Attachment",
Proceedings, IJCAI 1985, Los Angeles, California,
pp. 779-784.
