Anaphora for Everyone: 
Pronominal Anaphora Resolution without a Parser 
Christopher Kennedy 
Board of Studies in Linguistics 
University of California 
Santa Cruz, CA 95064 
kennedy@ling.ucsc.edu
Branimir Boguraev 
Advanced Technologies Group 
Apple Computer, Inc. 
Cupertino, CA 95014 
bkb@apple.com
Abstract 
We present an algorithm for anaphora resolution which is
a modified and extended version of that developed by
(Lappin and Leass, 1994). In contrast to that work, our
algorithm does not require in-depth, full, syntactic
parsing of text. Instead, with minimal compromise in
output quality, the modifications enable the resolution
process to work from the output of a part of speech
tagger, enriched only with annotations of grammatical
function of lexical items in the input text stream.
Evaluation of the results of our implementation
demonstrates that accurate anaphora resolution can be
realized within natural language processing frameworks
which do not--or cannot--employ robust and reliable
parsing components.
1 Overview 
(Lappin and Leass, 1994) describe an algorithm for
pronominal anaphora resolution with a high rate of
correct analyses. While one of the strong points of this
algorithm is that it operates primarily on syntactic
information alone, this also turns out to be a limiting
factor for its wide use: the current state of the art in
practically applicable parsing technology still falls
short of robust and reliable delivery of syntactic
analysis of real texts to the level of detail and
precision that the filters and constraints described by
Lappin and Leass assume.
We are particularly interested in a class of text
processing applications capable of delivering content
analysis to a depth involving a non-trivial amount of
discourse processing, including anaphora resolution.
The operational context prohibits us from making any
assumptions concerning domain, style, and genre of
input; as a result, we have developed a text processing
framework which builds its capabilities entirely on the
basis of a considerably shallower linguistic analysis of
the input stream, thus trading off depth of base level
analysis for breadth of coverage.
In this paper, we present work on modifying the
Lappin/Leass algorithm in a way which enables it to work
off a flat morpho-syntactic analysis of the sentences of
a text, while retaining a degree of quality and accuracy
in pronominal anaphora resolution comparable to that
reported in (Lappin and Leass, 1994). The modifications
discussed below make the algorithm available to a wide
range of text processing frameworks, which, due to the
lack of full syntactic parsing capability, normally
would have been unable to use this high precision
anaphora resolution tool. The work is additionally
important, we feel, as it shows that information about
the content and logical structure of a text, in
principle a core requirement for higher level semantic
and discourse processes, can be effectively approximated
by the right mix of constituent analysis and inferences
about functional relations.
2 General outline of the algorithm 
The base level linguistic analysis for anaphora
resolution is the output of a part of speech tagger,
augmented with syntactic function annotations for each
input token; this kind of analysis is generated by the
morphosyntactic tagging system described in (Voutilainen
et al., 1992), (Karlsson et al., 1995) (henceforth
LINGSOFT). In addition to extremely high levels of
accuracy in recall and precision of tag assignment
((Voutilainen et al., 1992) report 99.77% overall recall
and 95.54% overall precision, over a variety of text
genres, and in comparison with other state-of-the-art
tagging systems), the primary motivation for adopting
this system is the requirement to develop a robust text
processor--with anaphora resolution being just one of
its discourse analysis functions--capable of reliably
handling arbitrary kinds of input.
The tagger provides a very simple analysis of the
structure of the text: for each lexical item in each
sentence, it provides a set of values which indicate the
morphological, lexical, grammatical and syntactic
features of the item in the context in which it appears.
In addition, the modified algorithm we present requires
annotation of the input text stream by a simple
position-identification function which associates an
integer with each token in a text sequentially (we will
refer to a token's integer value as its offset).
As an example, given the text
"For 1995 the company set up its headquarters
in Hall 11, the newest and most prestigious
of CeBIT's 23 halls."
the anaphora resolution algorithm would be presented
with the following analysis stream. Note, in particular,
the grammatical function information (e.g., @SUBJ,
@+FMAINV) and the integer values (e.g., "off139")
associated with each token.
"For/off139" "for" PREP @ADVL
"1995/off140" "1995" NUM CARD @<P
"the/off141" "the" DET CENTRAL ART SG/PL @DN>
"company/off142" "company" N NOM SG/PL @SUBJ
"set/off143" "set" V PAST VFIN @+FMAINV
"up/off144" "up" ADV ADVL @ADVL
"its/off145" "it" PRON GEN SG3 @GN>
"headquarters/off146" "headquarters" N NOM SG/PL @OBJ
"in/off147" "in" PREP @<NOM @ADVL
"Hall/off148" "hall" N NOM SG @NN>
"11/off149" "11" NUM CARD @<P
"$,/off150" "," PUNCT
"the/off151" "the" DET CENTRAL ART SG/PL @DN>
"newest/off152" "new" A SUP @PCOMPL-O
"and/off153" "and" CC @CC
"most/off154" "much" ADV SUP @AD-A>
"prestigious/off155" "prestigious" A ABS @<P
"of/off156" "of" PREP @<NOM-OF
"CeBIT's/off157" "cebit" N GEN SG @GN>
"23/off158" "23" NUM CARD @QN>
"halls/off159" "hall" N NOM PL @<P
"$./off160" "." PUNCT
2.1 Data collection 
Although LINGSOFT does not provide specific information
about constituent structure, partial
constituency--specifically, identification of sequences
of tokens as phrasal units--can be inferred from the
analysis by running the tagged text through a set of
filters, which are stated as regular expressions over
metatokens such as the ones illustrated above.
For the purposes of anaphora resolution, the pri- 
mary data set consists of a complete listing of all noun 
phrases, reduced to modifier-head sequences. This 
data set is obtained by means of a phrasal grammar 
whose patterns characterize the composition of a noun 
phrase (NP) in terms of possible token sequences. The 
output of NP identification is a set of token/feature 
matrix/offset sequences, where offset value is deter- 
mined by the offset of the first token in the sequence. 
The offset indicates the position of the NP in the text, 
and so provides crucial information about precedence 
relations. 
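The idea of stating NP patterns as regular expressions over metatokens can be illustrated with a minimal Python sketch. It is not the phrasal grammar used in our system: the line format assumed by `parse_line`, the one-letter encoding in `pos_code`, and the single pattern `A*N+` (adjectival modifiers followed by nouns) are simplifying assumptions made only for this illustration, and all names are hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    form: str
    offset: int
    lemma: str
    tags: Tuple[str, ...]       # morphological tags, e.g. ("N", "NOM", "SG/PL")
    functions: Tuple[str, ...]  # syntactic function labels, e.g. ("@SUBJ",)


# Assumed line shape: "form/offN" "lemma" TAG ... @FUNCTION ...
LINE = re.compile(r'^"(.+)/off(\d+)"\s+"([^"]*)"\s*(.*)$')


def parse_line(line: str) -> Token:
    """Turn one line of the analysis stream into a Token record."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized analysis line: {line!r}")
    form, offset, lemma, rest = m.groups()
    fields = rest.split()
    tags = tuple(f for f in fields if not f.startswith("@"))
    functions = tuple(f for f in fields if f.startswith("@"))
    return Token(form, int(offset), lemma, tags, functions)


def pos_code(tok: Token) -> str:
    """Encode a token as one character so a regex can run over the sequence."""
    head = tok.tags[0] if tok.tags else ""
    return {"A": "A", "N": "N"}.get(head, "x")


# Hypothetical NP pattern: zero or more adjectives followed by one or more nouns.
NP = re.compile(r"A*N+")


def noun_phrases(tokens: List[Token]) -> List[Tuple[str, int]]:
    """Return (text, offset of first token) for every modifier-head sequence."""
    codes = "".join(pos_code(t) for t in tokens)
    found = []
    for m in NP.finditer(codes):
        span = tokens[m.start():m.end()]
        found.append((" ".join(t.form for t in span), span[0].offset))
    return found


SAMPLE = [
    '"For/off139" "for" PREP @ADVL',
    '"1995/off140" "1995" NUM CARD @<P',
    '"the/off141" "the" DET CENTRAL ART SG/PL @DN>',
    '"company/off142" "company" N NOM SG/PL @SUBJ',
    '"set/off143" "set" V PAST VFIN @+FMAINV',
    '"up/off144" "up" ADV ADVL @ADVL',
    '"its/off145" "it" PRON GEN SG3 @GN>',
    '"headquarters/off146" "headquarters" N NOM SG/PL @OBJ',
]

tokens = [parse_line(line) for line in SAMPLE]
phrases = noun_phrases(tokens)
```

On the sample lines, `phrases` holds the two nominal sequences `company` (offset 142) and `headquarters` (offset 146). Because each token is encoded as exactly one character, a match span over the code string maps directly back to token positions, which is what makes the offset of the first token easy to recover.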
A secondary data set consists of observations about 
the syntactic contexts in which the NPs identified by 
the phrasal grammar appear. These observations are 
derived using a set of patterns designed to detect nom- 
inal sequences in two subordinate syntactic environ- 
ments: containment in an adverbial adjunct and con- 
tainment in an NP (i.e., containment in a prepositional 
or clausal complement of a noun, or containment in a 
relative clause). This is accomplished by running a set 
of patterns which identify NPs that occur locally to ad- 
verbs, relative pronouns, and noun-preposition or noun- 
complementizer sequences over the tagged text in con- 
junction with the basic NP patterns described above. 
Because the syntactic patterns are stated as regular
expressions, misanalyses are inevitable. In practice,
however, the extent to which incorrect analyses of
syntactic context affect the overall accuracy of the
algorithm is not large; we will return to a discussion
of this point in section 4.
A third set of patterns identifies and tags occurrences 
of "expletive" it. These patterns target occurrences of 
the pronoun it in certain contexts, e.g., as the subject of 
members of a specific set of verbs (seem, appear, etc.), or 
as the subject of adjectives with clausal complements. 
Once the extraction procedures are complete and the 
results unified, a set of discourse referents--abstract ob- 
jects which represent the participants in the discourse-- 
is generated from the set of NP observations. A particu- 
larly convenient implementation of discourse referents 
is to represent them as objects in the Common Lisp 
Object System, with slots which encode the following 
information parameters (where ADJUNCT and EMBED 
indicate whether a discourse referent was observed in 
either of the two syntactic contexts discussed above): 
TEXT: text form 
TYPE: referential type (e.g., REF, PRO, RFLX) 
AGR: person, number, gender 
GFUN: grammatical function 
ADJUNCT: T or NIL 
EMBED: T or NIL 
POS: text position 
Note that each discourse referent contains information 
about itself and the context in which it appears, but 
the only information about its relation to other dis- 
course referents is in the form of precedence relations 
(as determined by text position). The absence of explicit 
information about configurational relations marks the 
crucial difference between our algorithm and the Lap- 
pin/Leass algorithm. (Lappin and Leass, 1994) use 
configurational information in two ways: as a factor in 
the determination of the salience of a discourse refer- 
ent (discussed below), and as input to a set of disjoint 
reference filters. Our implementation seeks to perform 
exactly the same tasks by inferring hierarchical rela- 
tions from a less rich base. The modifications and 
assumptions required to accomplish this goal will be 
highlighted in the following discussion. 
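For readers who do not work in the Common Lisp Object System, the information parameters listed above can be approximated by a Python dataclass. This is only a sketch: the field types, the `precedes` helper, and the example values are assumptions made for illustration, not a transcription of our implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DiscourseReferent:
    text: str            # TEXT: text form
    type: str            # TYPE: referential type, e.g. "REF", "PRO", "RFLX"
    agr: Tuple[Optional[int], Optional[str], Optional[str]]  # AGR: person, number, gender
    gfun: Optional[str]  # GFUN: grammatical function
    adjunct: bool        # ADJUNCT: observed inside an adverbial adjunct
    embed: bool          # EMBED: observed inside another NP
    pos: int             # POS: text position (offset of the first token)


def precedes(a: DiscourseReferent, b: DiscourseReferent) -> bool:
    """Precedence is the only relation between referents that is stored."""
    return a.pos < b.pos


# Illustrative values only; the agreement features are guesses.
company = DiscourseReferent("company", "REF", (3, "sg", None), "subject", False, False, 142)
its = DiscourseReferent("its", "PRO", (3, "sg", None), "possessive", False, False, 145)
in_text_order = sorted([its, company], key=lambda r: r.pos)
```

Note that nothing in the record refers to another referent: every hierarchical relation has to be inferred later from `gfun`, `adjunct`, `embed`, and `pos`.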
2.2 Anaphora resolution 
Once the representation of the text has been recast as a 
set of discourse referents (ordered by offset value), it is 
sent to the anaphora resolution algorithm proper. The 
basic logic of the algorithm parallels that of the Lap- 
pin/Leass algorithm. The interpretation procedure in- 
volves moving through the text sentence by sentence 
and interpreting the discourse referents in each sen- 
tence from left to right. There are two possible in- 
terpretations of a discourse referent: either it is taken 
to introduce a new participant in the discourse, or it 
is taken to refer to a previously interpreted discourse 
referent. Coreference is determined by first eliminating 
from consideration those discourse referents to which 
an anaphoric expression cannot possibly refer, then se- 
lecting the optimal antecedent from the candidates that 
remain, where optimality is determined by a salience 
measure. 
In order to present the details of anaphora resolution, 
we define below our notions--and implementations-- 
of coreference and salience. 
2.2.1 Coreference 
As in the Lappin and Leass algorithm, the
anaphor-antecedent relation is established between two
discourse referents (cf. (Heim, 1982), (Kamp, 1981)), while
the more general notion of coreference is represented 
in terms of equivalence classes of anaphorically re- 
lated discourse referents, which we will refer to as 
"COREF classes". Thus, the problem of interpreting an 
anaphoric expression boils down to the problem of es- 
tablishing an anaphoric link between the anaphor and 
some previously interpreted discourse referent (pos- 
sibly another anaphor); a consequence of establishing 
this link is that the anaphor becomes a member of the 
COREF class already associated with its antecedent. 
In our implementation, COREF classes are repre- 
sented as objects in the Common Lisp Object System 
which contain information about the COREF class as 
a whole, including canonical form (typically deter- 
mined by the discourse referent which introduces the 
class), membership, and, most importantly, salience 
(discussed below). 1 The connection between a dis- 
course referent and its COREF class is mediated through 
the COREF object as follows: every discourse referent 
includes an information parameter which is a pointer 
to a COREF object; discourse referents which have been 
determined to be coreferential share the same COREF 
value (and so literally point to the same object). Imple- 
menting coreference in this way provides a means of 
getting from any discourse referent in a COREF class to 
information about the class as a whole. 
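The pointer-sharing arrangement described above can be sketched in Python as follows. The class and function names (`Coref`, `Referent`, `introduce`, `link`) are hypothetical; the only design points taken from the text are that coreferential referents point to one shared object and that the class records member offsets rather than the members themselves.

```python
from typing import List, Optional


class Coref:
    """Information about a coreference class as a whole."""

    def __init__(self, canonical: str, first_offset: int) -> None:
        self.canonical = canonical                      # canonical form
        self.member_offsets: List[int] = [first_offset]  # offsets, not objects
        self.salience = 0.0


class Referent:
    def __init__(self, text: str, pos: int) -> None:
        self.text = text
        self.pos = pos
        self.coref: Optional[Coref] = None  # pointer to the class object


def introduce(referent: Referent) -> Coref:
    """A non-anaphoric referent opens a new class."""
    referent.coref = Coref(referent.text, referent.pos)
    return referent.coref


def link(anaphor: Referent, antecedent: Referent) -> None:
    """An anaphor joins the class of its antecedent by sharing its pointer."""
    if antecedent.coref is None:
        raise ValueError("antecedent has not been interpreted yet")
    antecedent.coref.member_offsets.append(anaphor.pos)
    anaphor.coref = antecedent.coref


apple = Referent("Apple", 101)
its = Referent("its", 103)
introduce(apple)
link(its, apple)
```

After `link`, both referents hold the very same `Coref` object, so any update to its salience is visible from every member.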
2.2.2 Salience 
The information parameter of a COREF object most cru- 
cial to anaphora resolution is its salience, which is de- 
termined by the status of the members of the COREF 
class it represents with respect to 10 contextual,
grammatical, and syntactic constraints. Following (Lappin
and Leass, 1994), we will refer to these constraints as 
"salience factors". Individual salience factors are asso- 
ciated with numerical values; the overall salience, or 
"salience weight" of a COREF is the sum of the values of 
the salience factors that are satisfied by some member 
of the COREF class (note that values may be satisfied at 
most once by each member of the class). The salience 
factors used by our algorithm are defined below with 
their values. Our salience factors mirror those used by 
(Lappin and Leass, 1994), with the exception of POSS-S,
discussed below, and CNTX-S, which is sensitive to the 
context in which a discourse referent appears, where a 
context is a topically coherent segment of text, as deter- 
mined by a text-segmentation algorithm which follows 
(Hearst, 1994). 
SENT-S: 100 iff in the current sentence 
CNTX-S: 50 iff in the current context 
SUBJ-S: 80 iff GFUN = subject 
EXST-S: 70 iff in an existential construction 
POSS-S: 65 iff GFUN = possessive 
ACC-S: 50 iff GFUN = direct object 
DAT-S: 40 iff GFUN = indirect object 
OBLQ-S: 30 iff the complement of a preposition 
HEAD-S: 80 iff EMBED = NIL 
ARG-S: 50 iff ADJUNCT = NIL 
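As a worked illustration of how these values combine, the following Python sketch sums the factors satisfied by at least one member of a class. The `Member` record, the mapping from GFUN values to factor names (in particular, treating "oblique" as the trigger for OBLQ-S), and the boolean flags are assumptions made for the example.

```python
from typing import Iterable, NamedTuple, Optional, Set

SALIENCE = {
    "SENT-S": 100, "CNTX-S": 50, "SUBJ-S": 80, "EXST-S": 70, "POSS-S": 65,
    "ACC-S": 50, "DAT-S": 40, "OBLQ-S": 30, "HEAD-S": 80, "ARG-S": 50,
}

# Assumed mapping from grammatical function values to factor names.
GFUN_FACTOR = {
    "subject": "SUBJ-S", "possessive": "POSS-S", "direct object": "ACC-S",
    "indirect object": "DAT-S", "oblique": "OBLQ-S",
}


class Member(NamedTuple):
    gfun: Optional[str]
    embed: bool = False
    adjunct: bool = False
    existential: bool = False
    in_current_sentence: bool = True
    in_current_context: bool = True


def satisfied(m: Member) -> Set[str]:
    """Names of the salience factors that one class member satisfies."""
    factors = set()
    if m.in_current_sentence:
        factors.add("SENT-S")
    if m.in_current_context:
        factors.add("CNTX-S")
    if m.existential:
        factors.add("EXST-S")
    if m.gfun in GFUN_FACTOR:
        factors.add(GFUN_FACTOR[m.gfun])
    if not m.embed:
        factors.add("HEAD-S")
    if not m.adjunct:
        factors.add("ARG-S")
    return factors


def salience_weight(members: Iterable[Member]) -> int:
    """Sum each factor once if some member of the class satisfies it."""
    factors: Set[str] = set()
    for m in members:
        factors |= satisfied(m)
    return sum(SALIENCE[f] for f in factors)


subject_only = salience_weight([Member("subject")])
subject_and_object = salience_weight([Member("subject"), Member("direct object")])
```

Under these assumptions a class whose only member is a non-embedded, non-adjunct subject in the current sentence weighs 100 + 50 + 80 + 80 + 50 = 360; adding a direct-object member contributes only the ACC-S value, giving 410, because the shared factors are counted once.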
Note that the values of salience factors are arbitrary; 
what is crucial, as pointed out by (Lappin and Leass, 
1994), is the relational structure imposed on the factors 
by these values. The relative ranking of the factors is 
justified both linguistically, as a reflection of the role 
of the functional hierarchy in determining anaphoric 
relations (cf. (Keenan and Comrie, 1977)), as well as 
by experimental results--both Lappin and Leass' and 
our own. For all factors except CNTX-S and POSS-S, we
adopt the values derived from a series of experiments
described in (Lappin and Leass, 1994) which used
different settings to determine the relative importance
of each factor as a function of the overall success of
the algorithm. Our values for CNTX-S and POSS-S were
determined using similar tests.
1 The implementation of a COREF object needs to be aware
of potential circularities; thus a COREF does not
actually contain its member discourse referents, but
rather a listing of their offsets.
An important feature of our implementation of 
salience, following that of Lappin and Leass, is that it 
is variable: the salience of a COREF class decreases and 
increases according to the frequency of reference to the 
class. When an anaphoric link is established between a 
pronoun and a previously introduced discourse refer- 
ent, the pronoun is added to the COREF class associated 
with the discourse referent, its COREF value is set to the 
COREF value of the antecedent (i.e., to the COREF ob- 
ject which represents the class), and the salience of the 
COREF object is recalculated according to how the new 
member satisfies the set of salience factors. This final 
step raises the overall salience of the COREF, since the 
new member will minimally satisfy SENT-S and CNTX-S. 
Salience is not stable, however: in order to realisti- 
cally represent the local prominence of discourse ref- 
erents in a text, a decay function is built into the algo- 
rithm, so that salience weight decreases over time. If 
new members are not added, the salience weight of a 
COREF eventually reduces to zero. The consequence of 
this variability in salience is that a very general heuris- 
tic for anaphora resolution is established: resolve a 
pronoun to the most salient candidate antecedent. 
2.2.3 Interpretation 
As noted above, in terms of overall strategy, the resolu- 
tion procedure follows that of Lappin and Leass. The 
first step in interpreting the discourse referents in a new 
sentence is to decrease the salience weights of the COREF 
classes that have already been established by a factor of 
two. Next, the algorithm locates all non-anaphoric dis- 
course referents in the sentence under consideration, 
generates a new COREF class for each one, and calcu- 
lates its salience weight according to how the discourse 
referent satisfies the set of salience factors. 
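The decay step is simple enough to state as a short Python sketch. The dictionary representation, the function names, and the threshold value used below are hypothetical; only the halving at each sentence boundary comes from the description above.

```python
from typing import Dict


def start_new_sentence(weights: Dict[str, float]) -> Dict[str, float]:
    """Halve the salience weight of every established COREF class."""
    return {name: weight / 2 for name, weight in weights.items()}


def sentences_until_below(weight: float, threshold: float) -> int:
    """Count sentence boundaries until a class without new members drops below threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    steps = 0
    while weight >= threshold:
        weight /= 2
        steps += 1
    return steps


# Weights assumed for a subject (360) and a direct object (330) introduced
# in the sentence that has just been processed.
before = {"subject class": 360, "object class": 330}
after = start_new_sentence(before)
```

Here `after` is `{"subject class": 180.0, "object class": 165.0}`. The first value agrees with the weight of 180 listed in Section 3 for a subject introduced in the previous sentence. With an illustrative threshold of 1, a class starting at 360 falls below it after nine sentence boundaries if no new members are added.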
The second step involves the interpretation of lexical 
anaphors (reflexives and reciprocals). A list of candi- 
date antecedent-anaphor pairs is generated for every 
lexical anaphor, based on the hypothesis that a lexical 
anaphor must refer to a coargument. In the absence 
of configurational information, coarguments are iden- 
tified using grammatical function information (as de- 
termined by LINGSOFT) and precedence relations. A 
reflexive can have one of three possible grammatical 
function values: direct object, indirect object, or oblique. 
In the first case, the closest preceding discourse referent 
with grammatical function value subject is identified as 
a possible antecedent. In the latter cases, both the clos- 
est preceding subject and the closest preceding direct 
object that is not separated from the anaphor by a sub- 
ject are identified as possible antecedents. If more than 
one possible antecedent is located for a lexical anaphor, 
the one with the highest salience weight is determined 
to be the actual antecedent. Once an antecedent has 
been located, the anaphor is added to the COREF class 
associated with the antecedent, and the salience of the 
COREF class is recalculated accordingly.
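The candidate search for lexical anaphors can be sketched in Python as below. The `Ref` record and the function names are hypothetical, the salience figures are invented for the example, and the sketch searches over whatever list of referents it is handed, since the description above does not state a search domain.

```python
from typing import List, NamedTuple, Optional


class Ref(NamedTuple):
    text: str
    gfun: str
    pos: int
    salience: float = 0.0


def reflexive_candidates(anaphor: Ref, referents: List[Ref]) -> List[Ref]:
    """Collect possible antecedents from grammatical function and precedence."""
    # Preceding referents, closest to the anaphor first.
    preceding = sorted(
        (r for r in referents if r.pos < anaphor.pos),
        key=lambda r: r.pos,
        reverse=True,
    )
    candidates = []
    subject = next((r for r in preceding if r.gfun == "subject"), None)
    if subject is not None:
        candidates.append(subject)
    if anaphor.gfun in ("indirect object", "oblique"):
        # Closest preceding direct object with no intervening subject.
        for r in preceding:
            if r.gfun == "subject":
                break
            if r.gfun == "direct object":
                candidates.append(r)
                break
    return candidates


def resolve_reflexive(anaphor: Ref, referents: List[Ref]) -> Optional[Ref]:
    """Pick the candidate with the highest salience weight, if any."""
    candidates = reflexive_candidates(anaphor, referents)
    return max(candidates, key=lambda r: r.salience) if candidates else None


# "Mary told John about herself." (illustrative weights)
mary = Ref("Mary", "subject", 0, 360)
john = Ref("John", "direct object", 2, 330)
herself = Ref("herself", "oblique", 4)
antecedent = resolve_reflexive(herself, [mary, john])
```

In the example both `Mary` and `John` are candidates for the oblique reflexive, and the subject is selected because its class carries the higher weight.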
The final step is the interpretation of pronouns. The 
basic resolution heuristic, as noted above, is quite sim- 
ple: generate a set of candidate antecedents, then es- 
tablish coreference with the candidate which has the 
greatest salience weight (in the event of a tie, the
closest candidate is chosen). In order to generate the
candidate set, however, those discourse referents to
which a pronoun cannot refer must be eliminated from
consideration. This is accomplished by running the
overall candidate pool (the set of interpreted discourse
referents whose salience values exceed an arbitrarily
set threshold) through two sets of filters: a set of
morphological agreement filters, which eliminate from
consideration any discourse referent which disagrees in
person, number, or gender with the pronoun, and a set of
disjoint reference filters.
The determination of disjoint reference represents a 
significant point of divergence between our algorithm 
and the Lappin/Leass algorithm, because, as is well 
known, configurational relations play a prominent role 
in determining which constituents in a sentence a pro- 
noun may refer to. Three conditions are of particular 
relevance to the anaphora resolution algorithm: 
Condition 1: A pronoun cannot corefer with a
coargument. 
Condition 2: A pronoun cannot corefer with a 
nonpronominal constituent which it both 
commands and precedes. 
Condition 3: A pronoun cannot corefer with a 
constituent which contains it. 
In the absence of configurational information, our
algorithm relies on inferences from grammatical function
and precedence to determine disjoint reference. In
practice, even without accurate information about
constituent structure, the syntactic filters described
below are extremely accurate (see the discussion of this
point in section 4).
Condition 1 is implemented by locating all discourse
referents with GFUN value direct object, indirect object, or 
oblique which follow a pronoun with GFUN value subject 
or direct object, as long as no subject intervenes (the 
hypothesis being that a subject indicates the beginning 
of the next clause). Discourse referents which satisfy 
these conditions are identified as disjoint. 
Condition 2 is implemented by locating for ev- 
ery non-adjunct and non-embedded pronoun the set 
of non-pronominal discourse referents in its sentence 
which follow it, and eliminating these as potential an- 
tecedents. In effect, the command relation is inferred 
from precedence and the information provided by the 
syntactic patterns: an argument which is neither con- 
tained in an adjunct nor embedded in another nominal 
commands those expressions which it precedes. 
Condition 3 makes use of the observation that a dis- 
course referent contains every object to its right with a 
non-nil EMBED value. The algorithm identifies as dis- 
joint a discourse referent and every pronoun which fol- 
lows it and has a non-nil EMBED value, until a discourse 
referent with EMBED value NIL is located (marking the 
end of the containment domain). Condition 3 also
rules out coreference between a genitive pronoun and 
the NP it modifies. 
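A literal reading of the implementations of Conditions 1 and 2 can be sketched in Python as follows. The `Ref` record, the `sent` field, and the representation of disjointness as pairs of offsets are assumptions made for the illustration; Condition 3 is omitted for brevity.

```python
from typing import List, NamedTuple, Set, Tuple


class Ref(NamedTuple):
    text: str
    type: str   # "PRO" for pronouns, "REF" for non-pronominal referents
    gfun: str
    pos: int
    sent: int   # index of the sentence containing the referent
    embed: bool = False
    adjunct: bool = False


OBJECT_LIKE = {"direct object", "indirect object", "oblique"}


def condition1(refs: List[Ref]) -> Set[Tuple[int, int]]:
    """Pair a subject or direct-object pronoun with following objects and
    obliques, stopping at the next subject (taken to open a new clause)."""
    ordered = sorted(refs, key=lambda r: r.pos)
    pairs = set()
    for i, pro in enumerate(ordered):
        if pro.type != "PRO" or pro.gfun not in ("subject", "direct object"):
            continue
        for later in ordered[i + 1:]:
            if later.gfun == "subject":
                break
            if later.gfun in OBJECT_LIKE:
                pairs.add((pro.pos, later.pos))
    return pairs


def condition2(refs: List[Ref]) -> Set[Tuple[int, int]]:
    """Pair a non-adjunct, non-embedded pronoun with every non-pronominal
    referent that follows it in the same sentence."""
    pairs = set()
    for pro in refs:
        if pro.type != "PRO" or pro.embed or pro.adjunct:
            continue
        for other in refs:
            if other.sent == pro.sent and other.pos > pro.pos and other.type == "REF":
                pairs.add((pro.pos, other.pos))
    return pairs


# "She sent him a letter."
refs = [
    Ref("She", "PRO", "subject", 0, 0),
    Ref("him", "PRO", "indirect object", 1, 0),
    Ref("letter", "REF", "direct object", 3, 0),
]
disjoint = condition1(refs) | condition2(refs)
```

For the example sentence the sketch marks `She` as disjoint from both `him` and `letter`, and `him` as disjoint from `letter`. Each pair is a pronoun offset followed by the offset of a referent it may not corefer with.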
After the morphological and syntactic filters have 
been applied, the set of discourse referents that remain 
constitute the set of candidate antecedents for the pro- 
noun. The candidate set is subjected to a final evalu- 
ation procedure which performs two functions: it de- 
creases the salience of candidates which the pronoun 
precedes (cataphora is penalized), and it increases the 
salience of candidates which satisfy either a locality or a
parallelism condition (described below), both of which 
apply to intrasentential candidates. 
The locality heuristic is designed to negate the effects
of subordination when both candidate and anaphor appear
in the same subordinate context, the assumption
being that the prominence of a candidate should be de- 
termined with respect to the position of the anaphor. 
This is a point of difference between our algorithm and 
the one described in (Lappin and Leass, 1994). The 
salience of a candidate which is determined to be in the 
same subordinate context as a pronoun (determined 
as a function of precedence relations and EMBED and 
ADJUNCT values) is temporarily increased to the level 
it would have were the candidate not in the subordi- 
nate context; the level is returned to normal after the 
anaphor is resolved. 
The parallelism heuristic rewards candidates which 
are such that the pair consisting of the GFUN values of 
candidate and anaphor are identical to GFUN values of 
a previously identified anaphor-antecedent pair. This 
parallelism heuristic differs from a similar one used 
by the Lappin/Leass algorithm, which rewards candi- 
dates whose grammatical function is identical to that 
of an anaphor. 
Once the generation and evaluation of the candidate 
set is complete, the candidates are ranked according 
to salience weight, and the candidate with the high- 
est salience weight is determined to be the antecedent 
of the pronoun under consideration. In the event of 
a tie, the candidate which most immediately precedes 
the anaphor is selected as the antecedent (where
precedence is determined by comparing offset values). The
COREF value of the pronoun is set to that of the
antecedent, adding it to the antecedent's COREF class,
and the salience of the class is recalculated accordingly. 
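The final ranking and tie-breaking step can be sketched in Python as below. The `Candidate` record and `select_antecedent` are hypothetical names, and ranking a tied cataphoric candidate below a tied preceding one is an assumption of the sketch rather than something stated above. The weights are those of the candidates for the pronoun at offset 133 in Section 3.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    text: str
    pos: int
    weight: float


def select_antecedent(anaphor_pos: int, candidates: List[Candidate]) -> Candidate:
    """Highest salience weight wins; ties go to the closest preceding candidate."""
    if not candidates:
        raise ValueError("no candidate antecedents remain after filtering")

    def rank(c: Candidate):
        comes_before = c.pos < anaphor_pos
        closeness = -abs(anaphor_pos - c.pos)
        return (c.weight, comes_before, closeness)

    return max(candidates, key=rank)


candidates = [
    Candidate("Apple", 131, 432),
    Candidate("Apple", 101, 352),
    Candidate("its", 103, 352),
    Candidate("Apple's", 115, 352),
]
best = select_antecedent(133, candidates)
tie_break = select_antecedent(133, candidates[1:])
```

With all four candidates the referent at offset 131 is selected on weight alone. If it is removed, the three remaining candidates tie at 352 and the one at offset 115, which most immediately precedes the anaphor, is chosen.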
3 Example output 
The larger context from which the sample analysis in 
the beginning of Section 2 was taken is as follows: 
"...while Apple and its PowerPC partners 
claimed some prime real estate on the show 
floor, Apple's most interesting offerings de- 
buted behind the scenes. Gone was the nar- 
row corner booth that Apple shoehorned its 
products into last year. For 1995 the com- 
pany set up its headquarters in Hall 11, the 
newest and most prestigious of CeBIT's 23
halls." 
The anaphora resolution algorithm generates the following
analysis for the first italicized pronoun (its, at
offset 133). For each candidate,2 the annotation in
square brackets indicates its offset value, and the
number to the right indicates its salience weight at the
point of interpretation of the pronoun.
ANA: its [@off/133]
CND: Apple [@off/131] 432
     Apple [@off/101] 352
     its [@off/103] 352
     Apple's [@off/115] 352
     prime real estate [@off/108] 165
     show floor [@off/112] 155
     year [@off/137] 310/3
The candidate set illustrates several important points.
First, the equality in salience weights of the
candidates at offsets 101, 103, and 115 is a consequence
of the fact that these discourse referents are members
of the same COREF class. Their unification into a single
class indicates both successful anaphora resolution (of
the pronoun at offset 103), as well as the operation of
higher-level discourse processing designed to identify
all references to a particular COREF class, not just the
anaphoric ones (cf. (Kennedy and Boguraev, 1996)).
The higher salience of the optimal candidate--which
is also a member of this COREF class--shows the effect
of the locality heuristic described in section 2.2.3.
Both the pronoun and the candidate appear in the same
subordinate context (within a relative clause); as a
result the salience of the candidate (but not of the
class to which it belongs) is temporarily boosted to
negate the effect of subordination.
2 Note that our syntactic filters are quite capable of
discarding a number of configurationally inappropriate
antecedents, which appear to satisfy the precedence
relation.
An abbreviated candidate set for the second itali- 
cized pronoun is given below: 
ANA: its [@off/145]
CND: company [@off/142] 310
     Apple [@off/131] 192
     its [@off/133] 192
This set is interesting because it illustrates the
prominent role of SENT-S in controlling salience: company
is correctly identified as the antecedent of the
pronoun, despite the frequency of mention of members of
the COREF class containing Apple and its, because it
occurs in the same sentence as the anaphor. Of course,
this example also indicates the need for additional
heuristics designed to connect company with Apple, since
these discourse referents clearly make reference to the
same object. We are currently working towards this goal;
see (Kennedy and Boguraev, 1996) for discussion.
The following text segment illustrates the resolution
of intersentential anaphora.
"Sun's prototype Internet access device uses
a 110-Mhz MicroSPARCprocessor, and is
diskless. Its dimensions are 5.5 inches x 9
inches x 2 inches."
ANA: Its [@off/347]
CND: Internet access device [@off/335] 180
     MicroSPARCprocessor [@off/341] 165
     Sun's [@off/333] 140
The first sentence in this fragment introduces three
discourse referents bearing different grammatical
functions, none of which appear in subordinate contexts.
Since the sentence in which the anaphor occurs does not
contain any candidates (the discourse referent
introduced by dimensions is eliminated from
consideration by both the morphological and disjoint
reference filters), only those from the previous
sentence are considered (each is compatible with the
morphological requirements of the anaphor). These are
ranked according to salience weight, where the crucial
factor is grammatical function value. The result of the
ranking is that Internet access device--the candidate
which satisfies the highest-weighted salience factor,
SUBJ-S--is the optimal candidate, and so correctly
identified as the antecedent.
4 Evaluation 
Quantitative evaluation shows the anaphora resolution
algorithm described here to run at a rate of 75%
accuracy. The data set on which the evaluation was based
consisted of 27 texts, taken from a random selection of
genres, including press releases, product announcements,
news stories, magazine articles, and other documents
existing as World Wide Web pages. Within these texts, we
counted 306 third person anaphoric pronouns; of these,
231 were correctly resolved to the discourse referent
identified as the antecedent by the first author. 3 This
rate of accuracy is clearly comparable to that of the
Lappin/Leass algorithm, which (Lappin and Leass, 1994)
report as 85%.
Several observations about the results and the
comparison with (Lappin and Leass, 1994) are in order.
First, and most obviously, some deterioration in quality
is to be expected, given the relatively impoverished
linguistic base we start with.
Second, it is important to note that this is not just a
matter of simple comparison. The results in (Lappin and
Leass, 1994) describe the output of the procedure
applied to a single text genre: computer manuals.
Arguably, this is an example of a particularly well
behaved text; in any case, it is not clear how the
figure would be normalized over a wide range of text
types, some of them not completely 'clean', as is the
case with our data.
Third, close analysis of the most common types of error
our algorithm currently makes reveals two specific
configurations in the input which confuse the procedure
and contribute to the error rate: gender mismatch (35%
of errors) and certain long range contextual (stylistic)
phenomena, best exemplified by text containing quoted
passages in-line (14% of errors).
Implementing a gender (dis-)agreement filter is not
technically complex; as noted above, the current
algorithm contains one. The persistence of gender
mismatches in the output simply reflects the lack of a
consistent gender slot in the LINGSOFT tagger output.
Augmenting the algorithm with a lexical database which
includes more detailed gender information will result in
improved accuracy.
Ensuring proper interpretation of anaphors both within
and outside of quoted text requires, in effect, a method
of evaluating quoted speech separately from its
surrounding context. Although this is a complex problem,
we feel that a solution is possible, given that our
input data stream embodies a richer notion of position
and context, as a result of an independent text
segmentation procedure adapted from (Hearst, 1994) (and
discussed above in section 2.2.2).
What is worth noting is the small number of errors which
can be directly attributed to the absence of
configurational information. Of the 75 misinterpreted
pronouns, only 2 involved a failure to establish
configurationally determined disjoint reference (both of
these involved Condition 3), and only a few additional
errors could be unambiguously traced to a failure to
correctly identify the syntactic context in which a
discourse referent appeared (as determined by a misfire
of the salience factors sensitive to syntactic context,
HEAD-S and ARG-S).
Overall, these considerations lead to two conclusions.
First, with the incorporation of more explicit
morphological and contextual information, it should be
possible to increase the overall quality of our output,
bringing it much closer in line with Lappin and Leass'
results. Again, straight comparison would not be
trivial, as e.g. quoted text passages are not a natural
part of computer manuals, and are, on the other hand, an
extremely common occurrence in the types of text we are
dealing with.
Second, and most importantly, the absence of explicit
configurational information does not result in a
substantial degradation in the accuracy of an anaphora
resolution algorithm that is otherwise similar to that
described in (Lappin and Leass, 1994).
3 The set of 306 "anaphoric" pronouns excluded 30
occurrences of "expletive" it not identified by the
expletive patterns (primarily occurrences in object
position), as well as 6 occurrences of it which referred
to a VP or propositional constituent. We are currently
refining the existing expletive patterns for improved
accuracy.
5 Conclusion 
Lappin and Leass' algorithm for pronominal anaphora 
resolution is capable of high accuracy, but requires in- 
depth, full, syntactic parsing of text. The modifications 
of that algorithm that we have developed make it avail- 
able to a larger set of text processing frameworks, as 
we assume a considerably 'poorer' analysis substrate. 
While adaptations to the input format and interpreta- 
tion procedures have necessarily addressed the issues 
of coping with a less rich level of linguistic analysis, 
there is only a small compromise in the quality of the 
results. Our evaluation indicates that the problems 
with the current implementation do not stem from the 
absence of a parse, but rather from factors which can 
be addressed within the constraints imposed by the 
shallow base analysis. The overall success of the algo- 
rithm is important, then, not only for the immediate 
utility of the particular modifications, but also because 
the strategy we have developed for circumventing the 
need for full syntactic analysis is applicable to other in- 
terpretation tasks which, like the problem of anaphora 
resolution, lie in the space of higher level semantic and 
discourse analysis. 
References 
Marti Hearst. 1994. Multi-paragraph segmentation of 
expository text. In 32nd Annual Meeting of the Associ- 
ation for Computational Linguistics, Las Cruces, New 
Mexico. Association for Computational Linguistics, 
Morristown, New Jersey. 
Irene Heim. 1982. The Semantics of Definite and Indefinite 
Noun Phrases. Doctoral dissertation, University of 
Massachusetts, Amherst. 
Hans Kamp. 1981. A theory of truth and semantic 
representation. In J. Groenendijk, T. Janssen, and M. 
Stokhof (eds.), Formal Methods in the Study of Lan- 
guage. Mathematisch Centrum Tracts, Amsterdam. 
Fred Karlsson, Atro Voutilainen, Juha Heikkilä, and
Arto Anttila. 1995. Constraint grammar: A
language-independent system for parsing free text.
Mouton de Gruyter, Berlin/New York.
Edward Keenan and Bernard Comrie. 1977. Noun 
phrase accessibility and universal grammar. Linguis- 
tic Inquiry, 8:62-100. 
Christopher Kennedy and Branimir Boguraev. 1996.
Anaphora in a wider context: Tracking discourse
referents. In W. Wahlster (ed.), 12th European
Conference on Artificial Intelligence. John Wiley and
Sons, Ltd, London/New York.
Shalom Lappin and Herb Leass. 1994. An algorithm 
for pronominal anaphora resolution. Computational 
Linguistics, 20(4):535-561. 
Atro Voutilainen, Juha Heikkilä, and Arto Anttila.
1992. A constraint grammar of English: A
performance-oriented approach. University of Helsinki,
Publication No. 21, Helsinki, Finland.
