SEARCH AND INFERENCE STRATEGIES IN
PRONOUN RESOLUTION: AN EXPERIMENTAL STUDY
Kate Ehrlich
Department of Psychology
University of Massachusetts
Amherst, MA 01003

The question of how people resolve pronouns has
been of interest to language theorists for a long time
because so much of what goes on when people find
referents for pronouns seems to lie at the heart of
comprehension. However, despite the relevance of pronouns
for comprehension and language theory, the
processes that contribute to pronoun resolution have
proved notoriously difficult to pin down.

Part of the difficulty arises from the wide range
of factors that can affect which antecedent noun phrase
in a text is understood to be co-referential with a
particular pronoun. These factors can range from simple
number/gender agreement through selectional restrictions
to quite complex knowledge that has been acquired from
the text (see Webber (1978) for a neatly illustrated
description of many of these factors). Research in
psychology, artificial intelligence and linguistics has
gone a long way toward identifying some of these factors
and their role in pronoun resolution. For instance, in
psychology, research carried out by Caramazza and his
colleagues (Caramazza et al., 1977), as well as research
that I have done (Ehrlich, 1980), has demonstrated that
number/gender agreement really can function to constrain
the choice of referent in a way that significantly
facilitates processing. Within an AI framework, there
has been some very interesting work carried out by
Sidner (1977) and Grosz (1977) that seeks to identify
the current topic of a text and to show that knowledge
of the topic can considerably simplify pronoun resolution.

It is important that people are able to select
appropriate referents for pronouns and to have some
basis for that decision. The research discussed so far
has mentioned some of the factors that contribute to
those decisions. However, part of the problem of really
understanding how people resolve pronouns is knowing how
the various factors combine. Certainly it is important
and useful to point to a particular factor as contributing
to a reference decision, but in many texts more
than one of these factors will be available to a reader
or listener. One problem for the theorist is then to
explain which factor predominates in the decision as
well as to describe the scheduling of evaluation procedures.
If it could be shown that there was a strict
ordering in which tests were applied, say, number/gender
agreement followed by selectional restrictions followed
by inference procedures, pronoun resolution might be
simpler to explain. At our present level of knowledge it is
difficult to discern ordering principles that have any
degree of generality. For instance, for every example
where the topic seems to determine choice, a similar
example can often be found where the more recent antecedent
is preferred over the one that forms part of the
topic. Moreover, even this claim begs the question of
how the topic can be identified unambiguously.

A different approach is possible. The process of
assigning a referent to a pronoun can be viewed as
utilizing two kinds of strategies. One strategy is concerned
with selecting the best referent from amongst the
candidates available. The other strategy is concerned
with searching through memory for the candidates.
These two types of strategy, which will be referred to
mnemonically as inference and search strategies, have
different kinds of characteristics. A search strategy
dictates the order in which candidates are evaluated,
but has no machinery for carrying out the evaluation.
The inference strategy helps to set up the representation
of the information in the text against which candidates
can be evaluated, but has no way of finding the
candidates. In the rest of this paper, the way these
strategies might interact will be explored and the
results of two studies will be reported that bear on
the issues.

One possible search strategy is to examine candidates
serially, beginning with the one mentioned most
recently and working back through the text. This
strategy makes some sense because, as Hobbs (1978) has
pointed out, most pronouns co-refer with antecedents
that were mentioned within the last few sentences.
Thus, a serial search strategy provides a principled
way of restricting how a text is searched. Moreover,
there is some evidence from psychological research that
it takes longer to resolve pronouns when the antecedent
with which the pronoun co-refers is far from, rather than
near, the pronoun (e.g. Clark & Sengul, 1979; Springston,
1975). Although such distance effects have been used
to argue for differences in memory retrieval, with the
nearer antecedents being easier to retrieve than the
further ones, none of the reported data rule out a
serial search strategy.
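
The division of labour between search and evaluation can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not a model proposed in this paper: the Candidate record, the serial_search function, the agrees_with_she evaluator and the example names are all hypothetical. It shows a search that fixes nothing but the order of examination (most recent candidate first) and leaves the decision to an evaluator supplied from outside.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record for an entity mentioned in the text.
    name: str
    gender: str  # "m" or "f"
    number: str  # "sg" or "pl"


def serial_search(
    candidates_in_text_order: Sequence[Candidate],
    acceptable: Callable[[Candidate], bool],
) -> Optional[Candidate]:
    """Examine candidates from the most recent mention backwards.

    The search only dictates the order of examination; whether a
    candidate is accepted is decided by the `acceptable` evaluator.
    """
    for candidate in reversed(candidates_in_text_order):
        if acceptable(candidate):
            return candidate
    return None


def agrees_with_she(candidate: Candidate) -> bool:
    # Hypothetical evaluator: number/gender agreement with the pronoun "she".
    return candidate.gender == "f" and candidate.number == "sg"


# Entities in order of mention; the last one is nearest the pronoun.
text_order = [
    Candidate("Ann", "f", "sg"),
    Candidate("Fred", "m", "sg"),
    Candidate("John", "m", "sg"),
]

referent = serial_search(text_order, agrees_with_she)
print(referent.name if referent else None)
```

In this sketch the two nearer candidates are examined and rejected before the furthest one is accepted, so the number of candidates examined grows with the distance of the antecedent. Without an evaluator the search has no basis for choosing at all.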

As argued earlier, a search strategy alone cannot
account for pronoun resolution because it lacks any
machinery for evaluation. There are, however, many
kinds of information that people can bring to bear when
evaluating candidates and some of these were discussed
earlier. A common method is to decide between alternative
candidates on the basis of information gained
through inferences. Inference is a rather ubiquitous
and often ill-defined notion, and, although it is beyond
the scope of this paper to clarify the concept, it is
worth noting that there are (at least) two kinds of
inference that play a role in anaphora generally. One
kind, which I will call 'lexical' inferences, is drawn
to establish that two different linguistic expressions
refer to the same entity. For instance, in the following
pair of sentences from Garrod and Sanford (1977):
(1) A bus came roaring round the corner
    The vehicle nearly flattened a pedestrian
a 'lexical' inference establishes that the particular
vehicle mentioned in the second sentence is in fact a
bus. Inferences can also be drawn to support the
selection of one referent over another. In a sentence
such as:
(2) John sold a car to Fred because he needed it
a series of inferences, based in part on our knowledge of
selling and needing, supports the selection of Fred
rather than John as referent for the pronoun "he". In
the experiments to be reported, it was 'lexical'
inferences rather than the other kind that were manipulated.

Subjects in the experiment were asked to read texts
such as the one given below:
(3) Fred was outside all day
    John was inside all day
    a) He had a sleep inside after lunch
    b) He had a sleep in his room after lunch
and then, immediately after, to answer a question such as
"Who had a sleep after lunch?" that was designed to
elicit the referent of the pronoun in the last sentence.
Two factors were independently varied. The antecedent
could be near or far from the pronoun, the latter
effected by switching the order of the first two sentences.
The second factor was whether a 'bridging'
inference had to be drawn to establish co-reference
between part of the predicate of the last sentence and
the target sentence. The two versions, (a) no inference
and (b) inference, are shown as alternative third sentences
in example (3) above. The principal measures
were the time to answer the question and the accuracy of
the response.

The experiment addresses two critical issues. One
is whether the 'lexical' inference is drawn as part of
the evaluation procedure, or whether it is drawn independently
of that process. The other issue concerns
the search strategy itself: do subjects examine candidates
serially, and, if so, do they still use other
criteria to reject the first candidate and choose the
second? Two distinct models of processing can be constructed
from a consideration of these issues. In the
first model, where inferences are triggered by the need to
evaluate a candidate, any effect due to extra processing
should be unaffected by whether the antecedent is near
or far from the pronoun. In either case the inference
will be drawn in response to the need to decide on the
acceptability of the candidate. In the second model,
the inference is triggered by the anaphoric expression,
e.g. "in his room" in the third sentence, and the need
to relate that expression to the location "inside" mentioned
in a previous sentence. The inference is expected
to take a certain amount of time to be drawn
(cf. Kintsch, 1974). According to the second model,
one would expect that in cases where the antecedent is
near the pronoun, there will be some effect due to
inference because the process may not be completed in
time to answer the question. When the antecedent is far
from the pronoun, however, the inference process will
be completed and hence no effect of inference should
still be detected. The two models assume rationality on
the part of the subjects; that is, they assume that
subjects will accurately select the further antecedent
where appropriate even though recency would predict
selection of the first candidate that is evaluated. If
this assumption is valid, subjects should select the
far antecedent where appropriate more often than the
(erroneous) near candidate.
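
The contrasting predictions of the two models can be illustrated with a toy calculation. This is one possible reading of the models, offered as a sketch rather than as a formalization given in the paper: it assumes that under the first model the inference cost is simply added to the search time, and that under the second model the inference runs alongside the search and delays the answer only if it is still unfinished when the search ends. The function names and all durations are hypothetical.

```python
def rt_model_1(search_ms: int, inference_ms: int, needs_inference: bool) -> int:
    """Model 1: the inference is drawn as part of evaluating a candidate,
    so its cost is added to the search time at any distance."""
    return search_ms + (inference_ms if needs_inference else 0)


def rt_model_2(search_ms: int, inference_ms: int, needs_inference: bool) -> int:
    """Model 2: the inference is triggered by the anaphoric expression and
    proceeds while the search runs; it delays the answer only if it is
    still unfinished when the search ends."""
    if not needs_inference:
        return search_ms
    return max(search_ms, inference_ms)


# Hypothetical durations in milliseconds, chosen only for illustration.
SEARCH_MS = {"near": 1000, "far": 1500}
INFERENCE_MS = 1200


def inference_effect(model, distance: str) -> int:
    """Extra response time attributable to the inference at one distance."""
    search = SEARCH_MS[distance]
    return model(search, INFERENCE_MS, True) - model(search, INFERENCE_MS, False)


for distance in ("near", "far"):
    print(
        distance,
        inference_effect(rt_model_1, distance),
        inference_effect(rt_model_2, distance),
    )
```

With these illustrative values the first model yields the same inference effect at both distances, whereas the second yields an effect for near antecedents only. The pattern to compare against the data is therefore whether the inference effect shrinks as distance increases.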

The results of the experiment, shown in Table 1,
support the second model; 'lexical' inferences are
drawn only once and in response to an anaphoric expression.
The data also provide evidence of a serial search
strategy by showing that there are more errors and
longer latencies associated with far rather than near
antecedents. The data further show that even when the
correct choice is far from the pronoun, subjects will
choose it in preference to the nearer candidate, thus
demonstrating that a serial search strategy alone cannot
predict the choice of referent.
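
The pattern described above can be checked by simple arithmetic on the cell means of Table 1. The sketch below is illustrative only: the dictionary layout and function names are hypothetical, the values are as read from Table 1, and since the table does not state the unit of R.T., treating it as seconds is an assumption.

```python
# (distance, condition) -> (mean response time, percent correct),
# as read from Table 1. The unit of response time is assumed to be seconds.
TABLE_1 = {
    ("near", "no_inference"): (1.32, 95),
    ("near", "inference"): (1.42, 87),
    ("far", "no_inference"): (1.56, 72),
    ("far", "inference"): (1.56, 70),
}


def inference_effect_rt(distance: str) -> float:
    """Response-time difference, inference minus no inference."""
    with_inf = TABLE_1[(distance, "inference")][0]
    without_inf = TABLE_1[(distance, "no_inference")][0]
    return round(with_inf - without_inf, 2)


def distance_effect_rt(condition: str) -> float:
    """Response-time difference, far minus near."""
    return round(TABLE_1[("far", condition)][0] - TABLE_1[("near", condition)][0], 2)


def distance_effect_pc(condition: str) -> int:
    """Difference in percent correct, far minus near."""
    return TABLE_1[("far", condition)][1] - TABLE_1[("near", condition)][1]


print("inference effect on R.T.:", inference_effect_rt("near"), inference_effect_rt("far"))
print("distance effect on R.T.:", distance_effect_rt("no_inference"), distance_effect_rt("inference"))
print("distance effect on P.C.:", distance_effect_pc("no_inference"), distance_effect_pc("inference"))
```

On these means the inference effect on response time is present for near antecedents and absent for far ones, while far antecedents are slower and less accurate in both inference conditions. No significance tests are implied by this arithmetic.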

The inferences that subjects had to draw in this
experiment concerned simple lexical relations. The
increase in latency due to having drawn such an inference
supports the results of earlier studies, particularly
those of Garrod and Sanford (1977). What the
present study fails to do, however, is to determine
whether that inference is drawn spontaneously, while
reading. Previous research (e.g., Kintsch, 1974; Garrod
and Sanford, 1977) has shown that inferences are more
likely to be drawn while reading than at a response
stage. It was thus of some interest to know when the
lexical inferences in the present study were drawn.
This issue was examined by modifying the previous experiment
to include both an additional measure of reading
time and a 1.5 s delay between presentation and test.
The latter modification is important since if subjects
are drawing inferences while reading, the process may
not be completed by the time the question is asked
immediately after presentation. The introduction of a
delay also allows for a further test of the two processing
models outlined earlier. If indeed 'lexical'
inferences are drawn to establish co-reference between
anaphoric expressions rather than to determine pronominal
reference, as the previous experiment indicated,
then there should be an effect of inference on reading
time but not at response when there is a delay, because
by response the inference should have been drawn. The
data were consistent with this hypothesis. However,
what also emerged from the second study was that only
some of the passages seemed to elicit inferences at
reading; the number of passages was increased in the
second experiment to counter possible repetition
effects. In fact, for half the passages subjects responded
by saying there was no answer. An example of
such a passage is given below:
(4) Jill had a newspaper in the living-room
    Ann had a book in the living-room
    She read some chemistry in the evening
It was also the case for these passages that the inferences
did not seem to be drawn while reading but
rather in response to the question. There is some
doubt here about cause and effect; nevertheless, the
observation raises some interesting questions concerning
what triggers an inference to be drawn. One
answer, supplied by Garrod & Sanford in their experiments,
is that a relation between expressions must
somehow be perceived before an inference is drawn to
determine the nature of the relation. In other words,
people do not draw inferences randomly to relate linguistic
expressions. Thus, whereas Garrod & Sanford
found that subjects would infer co-reference between
"bus" and "vehicle" in example (1), they failed to make
that connection, quite rightly, in a similar passage
shown below:
(5) A bus came roaring round the corner
    It nearly smashed some vehicles

What kinds of strategies do readers adopt when
they search their memory to find plausible referents
for pronouns? Results of the experiments reported here
point to a strategy in which entities are examined
serially, working back from the pronoun. The purpose of a
serial search strategy is to provide a principled way in which
readers can examine those entities they have stored in
memory for their appropriateness as the referent of a
particular pronoun. The strategy is thus unnecessary
when there is only one entity in memory that qualifies by
virtue of simple criteria such as number and gender agreement
with the pronoun. What constitutes 'simple' criteria
is, of course, an open question; the answer, however,
will materially affect the applicability of the search
strategy.

The most important part of reference resolution is,
however, deciding on the referent. A serial search
strategy has no machinery for evaluating candidates; it
can only direct the order in which candidates are
examined. The process of selecting a plausible referent
depends on the inferences a reader has drawn while the
text is read. Thus, when subjects found it hard to
select a referent at all they also failed to draw many
inferences while they read the text. Moreover, because
the inferences for these passages did seem to be drawn
in response to a question eliciting the referent, the
implication is that inferences for the clearer material
are generally drawn spontaneously and before a specific
need for the information arises. One can conjecture
from these data that the selection of plausible referents
is dependent on how well a reader has understood
the preceding text. If inferences are not drawn until
a specific need arises, such as finding a referent, then
it may be too late to select a referent easily or
accurately. Thus, reference can also be viewed in terms
of what a text makes available for anaphoric reference
(cf. Webber, 1978).

The picture of pronoun resolution that emerges
from the studies reported here is one in which effects
of distance between the pronoun and its antecedent may
play some role, not as a predictor of pronominal
reference as has often been thought, but as part of a
search strategy. There certainly are cases where nearer
antecedents seem to be preferred over ones further back
in the text; however, it is more profitable to look to
concepts such as foregrounding (cf. Chafe, 1974) rather
than simple recency for explanations of the preference.
It is also of some interest to have shown that inferences
may contribute to pronoun resolution but be drawn
for other reasons.

REFERENCES
Caramazza, A., Grober, E., Garvey, C. and Yates, J.
    (1977). Comprehension of anaphoric pronouns.
    Journal of Verbal Learning and Verbal Behavior, 16,
    601-9.
Chafe, W.L. (1974). Language and consciousness.
    Language, 50, 111-133.
Clark, H.H. and Sengul, C.J. (1979). In search of
    referents for nouns and pronouns. Memory and
    Cognition, 7, 35-41.
Ehrlich, K. (1980). Comprehension of pronouns.
    Quarterly Journal of Experimental Psychology, 32, 247-
Garrod, S. and Sanford, A.J. (1977). Interpreting
    anaphoric relations: the integration of semantic
    information while reading. Journal of Verbal
    Learning and Verbal Behavior, 16, 77-90.
Grosz, B.J. (1977). The representation and use of
    focus in a system for understanding dialogs. In
    Proceedings of the Fifth International Joint
    Conference on Artificial Intelligence. Cambridge:
    MIT.
Hobbs, J.R. (1978). Resolving pronoun references.
    Lingua, 44, 311-338.
Kintsch, W. (1974). The representation of meaning in
    memory. Potomac, Md: Erlbaum.
Sidner, C. (1977). Levels of complexity in discourse
    for anaphora disambiguation and speech act
    interpretation. In Proceedings of the Fifth
    International Joint Conference on Artificial
    Intelligence. Cambridge: MIT.
Springston, F.J. (1975). Some cognitive aspects of
    presupposed coreferential anaphora. Unpublished
    doctoral dissertation, Stanford University.
Webber, B.L. (1978). A formal approach to discourse
    anaphora. BBN report no. 3761. Cambridge, Mass:
    Bolt, Beranek and Newman, Inc.

TABLE 1
Percent correct responses (P.C.) and mean response
times (R.T.).

                   Inference condition
Distance     No inference        Inference
             R.T.    P.C.      R.T.    P.C.
Near         1.32    95%       1.42    87%
Far          1.56    72%       1.56    70%

