Generating Anaphoric Expressions: Pronoun or Definite Description? 
Kathleen F. McCoy 
Dept. of Computer and Information Sciences 
University of Delaware 
103 Smith Hall 
Newark, DE 19716, USA 
mccoy@cis.udel.edu
Michael Strube
Institute for Research in Cognitive Science
University of Pennsylvania
3401 Walnut Street, Suite 400A
Philadelphia, PA 19104, USA
strube@linc.cis.upenn.edu
Abstract 
In order to produce coherent text, natural language generation
systems must have the ability to generate pronouns in the
appropriate places. In the past, pronoun usage was primarily
investigated with respect to the accessibility of referents. We
argue that generating appropriate referring expressions requires
looking at factors beyond accessibility. Also important are
sentence boundaries, distance from last mention, discourse
structure and ambiguity. We present an algorithm for generating
appropriate anaphoric expressions which takes the temporal
structure of texts and knowledge about ambiguous contexts into
account. We back up our hypotheses with some empirical results
indicating that our algorithm chooses the right referring
expression in 85% of the cases.
1 Introduction 
Anaphoric expressions are an important component of generating
coherent discourses. While there has been some work on generating
appropriate referring expressions, little attention has been given
to the problem of when a pronoun should be used to refer to an
object. In most instances the assumption has been that a pronoun
should be generated when referring to a discourse entity that is
highly prominent (accessible). However, a study of naturally
occurring texts reveals that factors beyond accessibility must be
taken into account in order to explain the patterns of pronoun use
found.
Other researchers have indicated that fuller descriptions tend to
be found at the beginning of discourse segments (Grosz & Sidner,
1986; Reichman, 1985), even when the object being referred to is
extremely prominent in the preceding sentence. There may be two
reasons for this: (1) it could be that the item is not accessible
since the "focus space" associated with the previous sentence is
"popped" at the boundary (and perhaps an older focus space, with
respect to which accessibility should be judged, is restored in
its place (Passonneau, 1996)), or (2) the use of a fuller definite
description is actually marking the discourse segment boundary. If
(2) is the case, because other methods for marking boundaries are
also possible (and the writer need not use multiple markings),
this may help explain why the correlation between discourse
segment boundaries and fuller referring expressions is not
perfect.
Supposing that discourse segments are an important factor in
choosing anaphoric expressions, in order to take advantage of them
we must have a clear definition of what a discourse segment is.
For generation, the discourse segment boundary must be part of the
input to
a sentence generator (which, we assume, is responsible 
for making the referring expression choice). To evaluate 
proposals for generating referring expressions, the dis- 
course segment boundary must also be recognizable. 
While discourse segment boundaries may be impor- 
tant, note that fuller noun phrases are sometimes used 
when there is no discourse segment boundary (under 
any reasonable definition of boundary). This may oc- 
cur when the referent is not accessible because the near- 
est antecedent is too far away or because it is confusable 
with another referent. We attempt to explain such in- 
stances as well. 
In order to determine the circumstances under which to choose a
pronoun versus a definite description,¹ our tack has been to study
naturally occurring examples and to try to hypothesize rules that
explain the reference forms in those examples. To date we have
concentrated our study on New York Times news articles. We hope to
generalize some of our findings to other text genres as well.
¹ We use the term definite description to mean either a definite
noun phrase or a name.
Consider the following passage from the first several lines of one
of the stories we analyzed:
Example 1:
When Kenneth L. Curtis was wheeled into court nine years ago,
mute, dull-eyed and crippled, it seemed clear to nearly everyone
involved that it would be pointless to put him on trial for the
murder of his former girlfriend, Donna Kalson, and the wounding of
her companion.
It had been a year since Mr. Curtis had slammed his pickup truck
into them, breaking their legs. He then shot them both and,
finally, fired a bullet into his own brain. Mr. Curtis lingered in
a coma for months, then awoke to a world of paralysis, pain and
mental confusion from which psychiatric experts said he would
never emerge.
One expert calculated his I.Q. at 62. ...
For convenience, we have indicated all references to 
the main character in bold. 
A surprising thing to note about this passage is that not 
all of the anaphoric references to Mr. Curtis are pronouns 
even though he is arguably the focus of every sentence 
included. Some previous work on pronoun generation 
would predict that a pronoun should be used if the same 
item remains in focus. Thus it appears that something 
other than a straightforward application of focusing or 
other pronoun resolution algorithms is necessary. 
A second thing to note about this passage is that the sentences
are generally long and complex and often contain several
references to the same character. These
types of sentences are very different from those that 
have been considered by any generation system that has 
rules for generating pronouns. In addition, it is not clear 
how most focusing or pronoun interpretation algorithms 
would handle them. 
One hypothesis that might be made is that the under- 
lying structure of the text might affect pronoun gener- 
ation. Care must be taken in choosing such a structure. 
The chosen structure must (1) explain the patterns of pronouns
found in naturally occurring text, and (2) be based
on information available to a sentence generation system. 
Notice that constructs such as paragraph breaks meet nei- 
ther of these criteria. 
In this work we hypothesize that discourse structure
(segmentation) is indeed vital in the decision of whether or not
to generate a pronoun. However, we argue that a single definition
of discourse segment is not sufficient to explain the patterns of
pronoun use found, and seek a more general notion. Here, to
distinguish our notion from other notions of discourse
segmentation found in the literature (e.g., Reichman (1985) or
Grosz & Sidner (1986)), we use the term discourse thread to
capture the structuring notion to which we refer. We propose that
a discourse generally contains multiple threads which run through
the discourse and can serve to structure the discourse. In
general, a single thread is evident at a particular point in the
discourse, but this thread may be replaced by another thread and
then picked up again at another point in the discourse (cf. Rosé
et al. (1995)). The "threading device" used to structure will be
different for different kinds of discourses. For instance, in the
kinds of discourses studied in Grosz & Sidner (1986) the threading
device may be the intentional structures (and each of their
discourse segments would constitute a thread of the discourse); in
the discourses that we studied (New York Times articles), threads
defined in terms of the time referenced in a clause appeared to be
quite prominent.
In this paper we present our preliminary work in un- 
covering factors that affect pronoun generation deci- 
sions. Our work so far has led us to hypothesize several 
factors including: 
Sentence Boundaries - pronouns appear to be the preferred
referring form for subsequent reference to an item within the
same sentence.
Distance from Last Mention - when the last mention of an item is
several sentences back in the text, a definite description is
preferred.
Discourse Structure in terms of multiple threads - when the
previous reference to an item is in the same thread as the
current reference, a pronoun is preferred. A definite description
is preferred when the threads are different.
Ambiguity - potential ambiguity must be taken into account when
choosing an anaphoric expression in that a pronoun should only be
generated if it can be resolved correctly.
In the next section we discuss previous research on pronoun
generation. This is followed by a discussion of some anaphoric
expressions that can be generated without considering discourse
structure. We then introduce time as a structuring device which
affects pronoun generation. Next we investigate ambiguous
anaphoric references. We follow this with an algorithm which
decides when to use a pronoun versus a definite description when
referring to some discourse entity. After this, we report on
empirical results of the application of our algorithm to a corpus
of New York Times articles. Finally, we discuss related work,
future research, and conclusions.
2 Previous Work on Pronoun Generation 
Few researchers have given serious consideration to the problem of
pronoun generation. The most common factor considered has been the
accessibility of the referent. If the referent is sufficiently
prominent in the preceding text, a pronoun is used. Some early
generation work (e.g., McDonald (1980), McKeown (1983), McKeown
(1985), Appelt (1981)) used a simple rule to implement this idea
based on focus (Sidner, 1979) that roughly stated that if the
current sentence is about the same thing that the previous
sentence was about, use a pronoun to refer to that thing. As was
pointed out above, this rule does not provide a very good match
with the referring expressions in our corpus.
Dale (1992) also addressed the generation of pronouns in the
context of work on generating referring expressions (Appelt, 1985;
Reiter, 1990). Dale specified an algorithm that essentially
generated the smallest referring expression that distinguished the
object in question from
all others in the context.² He generated a pronoun (or ellipsis)
if one were adequate and if the object being referred to was the
center of the last utterance (where the notion of center was
defined in a domain-dependent fashion). As an example of the kinds
of texts he generated consider: "Soak the butterbeans. Drain and
rinse them." Such an account of pronoun generation, based on
center constancy, appears to work quite well in the domain Dale
considered. However, as pointed out in Example 1, it does not seem
to explain the patterns found in the texts we analyzed.
The centering model (Grosz et al., 1995) itself makes predictions
about pronoun generation only in a specific instance: that where
Rule 1 is applicable. Centering's Rule 1 states that if any
element of the previous utterance's forward-looking center list is
realized in the current utterance as a pronoun, then the
backward-looking center must be realized as a pronoun as well
(Grosz et al., 1995, p. 214). Notice that the Mr. Curtis at the
beginning of the second sentence in Example 1 is an apparent
violation of this rule. But, more generally, we must have a theory
that is able to handle all cases of pronoun use.
A pronoun interpretation algorithm based on centering which relied
on centering transition preferences was developed in Brennan et
al. (1987). Using transition preferences in a pronoun generation
rule would cover more cases of pronoun use than are covered by
Rule 1, but the application of such transition preferences also
proved unhelpful in explaining pronoun patterns in our corpus.
Reichman (1985) and Grosz & Sidner (1986) indi- 
cate that discourse segmentation has an effect on the lin- 
guistic realization of referring expressions. While this 
is intuitively appealing, it is unclear how to apply this 
to the generation problem (in part because it is unclear 
how to define discourse segments to a generation sys- 
tem). Passonneau (1996) argues for the use of the prin- 
ciples of information adequacy and economy. Her algo- 
rithm takes discourse segmentation into account through 
the use of focus spaces which are associated with dis- 
course segments. Thus, Passonneau explains that a fuller 
description might be used at a boundary because the set of
accessible objects changes at discourse segment boundaries (though
she combines this consideration with centering theory which may
override the decisions due to segment boundaries). While
Passonneau's algorithm seems quite appealing, notice that it
provides no explanation of how a discourse segment should be
defined. The evaluation that she provided used the discourse
segments provided by a set of naive subjects who indicated
discourse segment boundaries in her texts. Without such boundaries
provided, it is impossible to apply her algorithm. In some sense,
the work presented here is consistent with Passonneau's theory.
What we attempt to add is a genre-dependent definition of
discourse segment (thread) which is well-defined and can be
derived from input that any sentence generator must have in order
to generate a sentence. On the other hand, we differ from
Passonneau in that we do not attempt to make use of focus spaces
in generation. Rather we argue for evaluating informational
adequacy on the basis of confusable objects near the current
sentence in a discourse.
² This algorithm was later revised in Dale & Reiter (1995) to more
adequately reflect human-generated referring expressions and to be
more computationally tractable.
In the following section we hypothesize that discourse 
structure in terms of multiple threads does have an effect on
appropriate anaphoric expression choice and that
the temporal structure of a discourse is a particular in- 
stantiation of these threads (in the stories that we have 
analyzed). We hypothesize that if there is a difference in 
time between the last reference to an entity and the cur- 
rent reference, a definite description is used (and when 
the time of the previous and the current reference is the 
same, a pronoun is used). 
3 Long- and Short-Distance Anaphoric Expressions
In studying naturally occurring texts in order to identify
patterns of pronoun usage, we found consistent patterns 
over both long distances (i.e., where the last reference 
to the entity being referred to was more than two sen- 
tences back in the discourse) and over short distances 
(i.e., where the last reference to the entity being referred 
to was in the current sentence). 
In long-distance situations we have found that a defi- 
nite description is almost always used. In short-distance 
situations a pronoun is almost always used (except in sit- 
uations where this pronoun is ambiguous). Thus, the sen- 
tence seems to be a very important construct to consider 
in choosing anaphoric expressions. 
We now turn to consider the factors that might be 
affecting the expression choice in situations other than 
these. We hypothesize that discourse structure must be 
considered for such cases. 
4 Time-Threaded Discourse Structure
In this section, we describe our approach to using discourse
structure for choosing the right referring expression. Since we
are working with stories from newspapers we were not able to
identify the kind of discourse structure assumed by Grosz & Sidner
(1986), whose dialogues are more task-oriented and have clear
intentional goals. Instead, the texts have a structure consisting
of multiple story lines which we call threads (cf. also Rosé et
al. (1995)). A thread describes a particular part of the story. It
can be interrupted by other threads and continued later. Thus the
threaded structure is more complicated than the hierarchical
(tree-like) structure posited in Grosz & Sidner (1986) and Mann &
Thompson (1988). Therefore, we do not think that the texts we
looked at can be analyzed using a stack (for example).
We needed to find a structuring device that was part 
of the input to a sentence generation system and that was 
recognizable on the surface (so that we could evaluate 
our algorithm on naturally occurring text). After investigating
some work on narrative structure (Genette, 1980; Prince, 1982;
Vogt, 1990), we determined that changes in the deictic center of
the story (Nakhimovsky, 1988; Wiebe, 1994) not only must be part
of the input to a sentence generator (e.g., for appropriate tense
generation), but were also both well marked in the text and seemed
to have an influence on anaphoric expression choice.
A shift in the deictic center can be signaled by a 
shift of topic, a shift of time scale, a shift in spatial 
scale, or a shift in perspective (Nakhimovsky, 1988; 
Wiebe, 1994). Since a shift in time scale is often indi- 
cated by linguistic means and the time being referred to 
must be part of the input to a sentence generator, we con- 
centrated on this point. We also acknowledge that the 
changes in time seem quite important in the news stories 
that we analyzed. Other genres of text might depend on 
other kinds of structuring devices. 
Changes in time scale or time, as we redefined the category, may
require world knowledge reasoning to recognize but are often
indicated by either cue words and phrases (e.g., "nine years ago",
"a year", "for months", "several months ago"), a change in
grammatical time of the verb (e.g., past tense versus present
tense), or changes in aspect (e.g., atomic versus extended events
versus states as defined by Moens & Steedman (1988)).
In considering how time change might affect 
anaphoric expression choice, we consider the choice for 
the first mention of a discourse entity in a sentence where 
that entity has recently been referred to in the discourse. 
Our hypothesis is that: Changes in time reliably signal 
changes of the thread in newspaper articles; definite de- 
scriptions should appear when the current reference to a 
discourse entity is in a different thread from the last reference
to that entity and pronouns should occur when the previous mention
is in the same thread.³
In order to evaluate this hypothesis, we mapped out 
the time being referenced in our texts on a clause-by- 
clause basis. For each clause in the texts we indicated 
the time which was referred to. We distinguished between events
that occurred at a single instance in time (atomic events) and
events or states that occurred over a
span of time (repeated atomic events, extended events, 
and states). For atomic events we allowed for both a spe- 
cific time at which it occurred and for a non-specific time 
that indicated the range of uncertainty. We allowed time 
spans to have both specific end points and unspecified 
end points as well. 
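The clause-level annotation just described can be pictured as one small record per clause. The sketch below is a hypothetical encoding, not the annotation scheme actually used: the names TimeRef, same_thread and first_mention_form, and the use of string labels with None for unspecified end points, are assumptions made for illustration. It treats two clauses as belonging to the same thread when they reference the same time, which is the simplest reading of the hypothesis above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimeRef:
    """Time referenced by one clause (hypothetical encoding)."""
    start: Optional[str]  # label of the start point; None if unspecified
    end: Optional[str]    # label of the end point; None if unspecified
    atomic: bool          # True for atomic events, False for spans/states


def same_thread(previous: TimeRef, current: TimeRef) -> bool:
    # Assumption: clauses share a thread when they reference the same time.
    return previous == current


def first_mention_form(previous: TimeRef, current: TimeRef) -> str:
    """Form for the first mention of an entity in a sentence, given the time
    of the clause holding its last mention and the time of the current clause."""
    if same_thread(previous, current):
        return "pronoun"
    return "definite description"


# Illustrative labels only: the "now" of a story and a span in the past.
NOW = TimeRef(start="now", end="now", atomic=True)
PAST_SPAN = TimeRef(start=None, end=None, atomic=False)
```

With these labels, a reference whose previous mention sits in a clause with the same time label comes out as a pronoun, and one whose previous mention sits in a clause with a different time label comes out as a definite description.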
An example from our corpus with its associated tem- 
poral structure may illustrate these labels, the complex- 
ity of the texts under consideration, and how we propose 
pronoun generation is affected. 
³ Note that the discourse thread may change between the two
references in question. This would be signaled by a change in time
in clauses between the two references in question. We are
interested in whether the two references are in the same (use a
pronoun) or different (use a definite description) thread.
Example 2: 
(47a) Questioned about the criminal activities 
of the football club, 
(47b) Mrs. Mandela maintained 
(47c) that she had never had any control over 
them. 
(48a) This despite testimony from a half 
dozen former members 
(48b) that they even had to get permission to 
go in and out of her yard. 
(49a) Mrs. Mandela also said 
(49b) she had disbanded the club 
(49c) after her husband asked her to, despite 
evidence to the contrary. 
(50) Mrs. Mandela faced questions from 
more than 10 lawyers representing vari- 
ous victims and the panel of commission- 
ers and their investigators. 
[Figure 1: time referenced (vertical axis: 1985, 1990, 1991,
1 year, 1 month, 1 week, now) plotted for each clause of sentences
47-50 (horizontal axis)]
Figure 1: Temporal Structure for Example 2
Figure 1 contains the temporal structure for (each clause) of
sentences 47-50 of one of our texts. We observe two threads in
this discourse fragment, one dealing with events at the "now"-time
of the story, the other telling past events (1985-1991). The
corresponding sentences (also broken into clauses⁴) are contained
in Example 2.
Notice that sentence 47 consists of three clauses. The first two
(47a and 47b) describe atomic events that are taking place at the
"now" time of the story (during the proceedings against Mrs.
Mandela). The third clause (47c) refers to an indefinite span of
time in the past (during which Mrs. Mandela's football club
existed). Note the use of the past perfect in (47c) indicating the
change in time and setting.
As Figure 1 illustrates, there is a name (N) reference to Mrs.
Mandela in (47b), and a pronominal (P) reference to her in (47c).
This pronoun is used even though there is a change in time between
(47b) and (47c) and is explained because this is a short-distance
reference since it is the second reference to Mrs. Mandela within
the same sentence (a condition that overrules the time change
hypothesis).
⁴ We take a clause to contain a finite verb.
(48a) represents a change back to the time of the proceedings
(note the discourse-deictic reference (Webber, 1991) "This").
(48b) again points to the time in the past, though this is not
explicitly marked linguistically as it was in (47c). Here world
knowledge must be used to understand the time referenced in this
clause. Note, however, that the time would have to be part of the
input to a generator, and thus our rules are completely
well-defined from the generation perspective.
The use of a pronoun to refer to Mrs. Mandela in (48b) is
warranted by our hypothesis, because the previous mention of Mrs.
Mandela in (47c) references the same time as is referenced in
(48b).
Because there is a time change between (48b) and (49a), our
hypothesis explains the appearance of the proper name in (49a)
even though it occurs just after a pronoun (in (48b)) co-specifying
the same character.⁵ The pronouns in the remainder of (49) are
explained because they are subsequent references within the same
sentence (despite the fact that they refer to an unspecified time
in the past which is different from the time referenced in (49a)).
Finally, the use of a name in (50) is again indicated by the
change in time between (49c) and (50).
5 Ambiguities 
Of course, the choice of referring expression is not only 
guided by discourse structure, there is also an influence 
due to ambiguities. Dale (1992) generated referring ex- 
pressions so that their referents could be distinguished 
from the other discourse entities mentioned in the con- 
text. This strategy can be interpreted as: Generate a pro- 
noun whenever it is not ambiguous. However, how one 
should define context is not quite clear. For this definition we
choose a span of text considered important in our previous work on
anaphora resolution (Strube, 1998), and define a referring
expression as ambiguous if there is a competing antecedent (i.e.,
another discourse entity matching in number and gender) mentioned
in the previous sentence or to the left of the referring
expression in the current sentence. Of the 437 referring
expressions
in the texts we analyzed, 104 were considered ambigu- 
ous by this definition. Of these only 51 were realized as 
a definite description. Thus a rule which specifies use 
of a definite description if a pronoun would be ambigu- 
ous according to this definition appears to be too strict. 
Therefore we need to consider ambiguous cases in more 
detail. 
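The ambiguity definition above (a competing antecedent in the previous sentence, or to the left in the current one, that matches in number and gender) can be sketched as a simple check. This is a minimal illustration under stated assumptions: the Mention record, its field names and the function is_ambiguous are hypothetical, and number and gender are taken to be given as plain string labels.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Mention:
    """One mention of a discourse entity (hypothetical record)."""
    entity: str    # identifier of the entity referred to
    sentence: int  # index of the sentence containing the mention
    position: int  # left-to-right position within that sentence
    number: str    # e.g. "sg" or "pl"
    gender: str    # e.g. "m", "f", "n"


def is_ambiguous(target: Mention, mentions: Iterable[Mention]) -> bool:
    """True if some other entity matching in number and gender is mentioned
    in the previous sentence or to the left in the current sentence."""
    for m in mentions:
        if m.entity == target.entity:
            continue  # mentions of the same entity do not compete
        if (m.number, m.gender) != (target.number, target.gender):
            continue  # a pronoun for the target could not refer to m
        in_previous = m.sentence == target.sentence - 1
        to_the_left = (m.sentence == target.sentence
                       and m.position < target.position)
        if in_previous or to_the_left:
            return True
    return False
```

Under this definition a check of this kind flags 104 of the 437 referring expressions, and since only 51 of those 104 (roughly 49%) were realized as definite descriptions, ambiguity alone cannot decide the form.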
⁵ Note that the use of this definite description cannot be
explained by a topic shift since there is no topic shift in be- 
tween the previous text and (49). At least two discourse en- 
tities ("Mrs. Mandela" and the "football club") are constant, 
only Mrs. Mandela's husband does not occur in the immedi- 
ately preceding sentences. 
Consider the following excerpt taken from later in the same story
that contains Example 1. For convenience, references to the main
character appear in bold, while those to a competing antecedent
appear in italics.
Example 3:
Mr. Curtis might have lived out his life in obscurity if it were
not for a New Haven television reporter, Jim Hoffer, of station
WTNH, who got an anonymous tip last summer that Mr. Curtis had
been attending college. Using a hidden camera, a crew taped Mr.
Curtis going to his classes at Southern Connecticut State
University, and a producer recorded a conversation with him in a
cafeteria. In a clear, steady voice, Mr. Curtis can be heard to
say that he...
Concentrating on the references to Mr. Curtis, we question when a
pronoun can be used, and when the name should be used. To handle
cases where competing antecedents occurred, we turned to pronoun
resolution algorithms. Our intuition was that a pronoun could be
generated to refer to a particular discourse entity if a pronoun
resolution algorithm would choose that entity as a referent for
the pronoun. To our knowledge, there are only two focus-based
pronoun resolution algorithms that are specified in enough detail
to work on unrestricted naturally occurring text: Brennan et al.
(1987) using the definition of utterance according to Kameyama
(1998), and Strube (1998). Strube evaluated the effectiveness of
these two algorithms on the task of pronoun resolution in some
naturally occurring texts. Because Strube's algorithm showed
significantly better results, we have turned to it for guidance in
pronoun generation.
The idea is that if we want to refer to a discourse entity, E, but
there is a competing antecedent, C, we look to Strube's algorithm
in the following way. If Strube's algorithm would resolve a
pronoun to be E, we use a pronoun. If, instead, Strube's algorithm
would prefer C as the referent of the pronoun, we would use a
definite description to refer to E.
We evaluated this idea along with several other alternatives
(e.g., using discourse structure alone, using long- and
short-distance rules ignoring the ambiguity, distinguishing
between first and subsequent reference within a sentence, using a
definite description whenever there was ambiguity) on the
ambiguous examples in our corpus. Our analysis showed that the use
of Strube's algorithm showed improvement, but it seemed to be too
liberal with suggesting pronouns when the competing antecedent was
in the previous sentence. Assuming Strube's algorithm reflects
human processing of referring expressions (something NOT claimed),
what this means is that in our texts the writer chose to generate
a definite description even though a pronoun would have been
informationally adequate to resolve the referent correctly. This
occurred most frequently over sentence boundaries. When a
competing antecedent is within a sentence, however, Strube's
algorithm appears to be quite effective.
Thus the rule we settled on acknowledged the importance of
sentence boundaries and is shown in Figure 2.
1. if this is the first occurrence of X in the current sentence
   and
   (a) if there is a competing antecedent in the previous
       sentence, use a definite description;
   (b) if there is a competing antecedent in the same sentence
       (i.e., to the left) and
       i.  if Strube's algorithm would resolve a pronoun in this
           position to be X, use a pronoun;
       ii. else use a definite description;
2. if this is a subsequent occurrence of X in the current sentence
   and
   (a) if there is an intervening competing antecedent, use a
       definite description;
   (b) if there is no intervening competing antecedent, use a
       pronoun.
Figure 2: Realization of X when Competing Antecedents Exist
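The rule in Figure 2 can be written down as a small decision function. The following is a sketch under stated assumptions rather than the implementation used in the study: the function name, the keyword parameters and the two string constants are hypothetical, and Strube's (1998) resolution algorithm is not implemented here; its verdict is simply passed in as the boolean resolver_prefers_x. Checking clause (a) before clause (b) when both apply is also an assumption about the intended order.

```python
PRONOUN = "pronoun"
DEFINITE = "definite description"


def realize_with_competitor(*,
                            first_in_sentence: bool,
                            competitor_in_previous: bool = False,
                            competitor_to_left: bool = False,
                            intervening_competitor: bool = False,
                            resolver_prefers_x: bool = False) -> str:
    """Choose the form for entity X when a competing antecedent exists.

    resolver_prefers_x stands in for the outcome of a pronoun resolution
    algorithm: True if it would resolve a pronoun at this position to X.
    """
    if first_in_sentence:
        # Rule 1(a): competitor in the previous sentence.
        if competitor_in_previous:
            return DEFINITE
        # Rule 1(b): competitor to the left in the same sentence.
        if competitor_to_left:
            return PRONOUN if resolver_prefers_x else DEFINITE
        # The figure only covers cases where a competitor exists.
        raise ValueError("no competing antecedent: rule does not apply")
    # Rule 2: subsequent occurrence of X within the current sentence.
    return DEFINITE if intervening_competitor else PRONOUN
```

Passing the resolver's verdict as a boolean keeps the sketch independent of any particular resolution algorithm; a fuller version would call the resolver on the discourse context instead.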
6 Anaphoric Referring Expression 
Generation Algorithm 
In the previous sections we have argued that in some 
instances of anaphoric expression choice, the threaded 
discourse structure must be taken into account. For the 
particular texts that we analyzed, we argue that threads 
defined in terms of the time referenced in a clause are 
appropriate for use. Other kinds of discourses will 
also exhibit a threaded structure, but the threads them- 
selves might be defined by different means. In addition 
to discourse structure, the potential for ambiguity must be
considered. Additionally, cases of long- and short-distance
anaphora should be handled independently of the threaded discourse
structure considerations.
Based on these findings, we propose the algorithm for realizing
anaphoric expressions shown in Figure 3. Note in this algorithm we
refer to the notion of a discourse thread which might be defined
differently for different kinds of texts.
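The five-step algorithm of Figure 3 is an ordered cascade of tests, which can be sketched as below. This is an illustrative reading, not the authors' code: the function and parameter names are hypothetical, the notion of thread is reduced to a boolean same_thread supplied by the caller, and the Figure 2 rule for ambiguous cases is passed in as a callable (ambiguity_rule) so that the sketch stays self-contained.

```python
from typing import Callable

PRONOUN = "pronoun"
DEFINITE = "definite description"


def choose_expression(*,
                      sentences_back: int,
                      first_in_sentence: bool,
                      has_competitor: bool,
                      same_thread: bool,
                      ambiguity_rule: Callable[[], str]) -> str:
    """Pick the anaphoric form for entity X following the order of Figure 3.

    sentences_back is how many sentences ago X was last mentioned
    (0 means the current sentence). ambiguity_rule is a stand-in for
    the Figure 2 rule and is only called when step 4 is reached.
    """
    # 1. Long-distance reference: last mention more than two sentences back.
    if sentences_back > 2:
        return DEFINITE
    # 2. Unambiguous intra-sentential anaphor.
    if not has_competitor and not first_in_sentence:
        return PRONOUN
    # 3. Thread change between the previous and the current reference.
    if not same_thread:
        return DEFINITE
    # 4. Competing antecedent: defer to the rule for ambiguous cases.
    if has_competitor:
        return ambiguity_rule()
    # 5. Remaining cases: unambiguous, same thread.
    return PRONOUN
```

Because the tests are ordered, a thread change takes precedence over the ambiguity rule, and the ambiguity rule is consulted only when the previous mention lies in the same thread.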
7 Empirical Data 
We applied the algorithm described in the previous section to
three texts from the New York Times. Articles ranged from a
front-page article to local news. We applied the algorithm to all
references to persons in these texts. The algorithm was correct in
370 cases (84.7%), and wrong in 67 cases (15.3%). In Figure 4 we
show the distribution over the rules specified in the algorithm.
1. if this is a long-distance anaphoric reference (i.e., if the
   previous reference to X was more than two sentences prior) use
   a definite description;
2. else
   if this is an unambiguous reference (i.e., there is no
   competing antecedent) and this is an intra-sentential anaphor
   (i.e., this is not the first mention of X in the current
   sentence) use a pronoun;
3. else
   if this is a thread change (i.e., the previous reference to X
   occurred in a thread different from the one in which the
   current reference occurs) use a definite description;
4. else
   if there is a competing antecedent (i.e., another object in the
   previous or current sentence that matches the type and number
   of X) use the rule found in Figure 2;
5. else
   for the remaining cases (i.e., unambiguous cases when time
   remains the same) use a pronoun.
Figure 3: Algorithm for Generating an Anaphoric Referring
Expression for Discourse Entity X
In order to interpret the results of the algorithm, we must have
some comparison. We use a simple scheme which does not consider
threaded discourse structure and which handles ambiguous cases
very conservatively. This scheme is shown in Figure 5 for
comparison purposes.
The results of applying these rules give 343 correct cases (78.5%)
and 94 incorrect ones (21.5%). Hence our algorithm reduces the
error rate by 28.7% (from 94 errors to 67).
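The accuracy figures and the relative error reduction follow directly from the raw counts reported in this section (437 references, 67 errors for the algorithm, 94 for the simple scheme). The short check below recomputes them; the variable and function names are illustrative only.

```python
def percent(part: int, total: int) -> float:
    """Return part/total as a percentage."""
    return 100.0 * part / total


TOTAL = 437            # references to persons in the three texts
ALGORITHM_WRONG = 67   # errors made by the algorithm of Figure 3
BASELINE_WRONG = 94    # errors made by the simple comparison scheme

algorithm_accuracy = percent(TOTAL - ALGORITHM_WRONG, TOTAL)
baseline_accuracy = percent(TOTAL - BASELINE_WRONG, TOTAL)
# Relative reduction: errors avoided, as a share of the baseline's errors.
error_reduction = percent(BASELINE_WRONG - ALGORITHM_WRONG, BASELINE_WRONG)

print(f"algorithm accuracy: {algorithm_accuracy:.1f}%")   # 84.7%
print(f"baseline accuracy:  {baseline_accuracy:.1f}%")    # 78.5%
print(f"error reduction:    {error_reduction:.1f}%")      # 28.7%
```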
8 Related Research 
A significant amount of work in linguistics has investi- 
gated the use of different kinds of anaphoric referring 
expressions in discourse and their relationship to ease 
of comprehension. See Arnold (1998) fora discussion 
of several of the factors involved in refem'ng expression 
• choice. In many cases the various factors seem to af- 
Rule Name               number  percentage 
All Rules 
  correct                  370       84.7% 
  wrong                     67       15.3% 
Long Distance (1) 
  correct                   46       97.9% 
  wrong                      1        2.1% 
Intra-sentential (2) 
  correct                  168         96% 
  wrong                      7          4% 
Time Rules (3 & 5) 
  correct                  116       72.5% 
  wrong                     44       27.5% 
Ambiguous (4) 
  correct                   40       72.7% 
  wrong                     15       27.2% 
Figure 4: Results of the Algorithm 
1. If this is a long distance anaphoric reference 
(i.e., if the previous reference to this item is 
greater than two sentences prior) use a definite 
description; 
2. else 
if there is a competing antecedent (i.e., another 
object in the previous or current sentences that 
matches the pronoun which would be used to refer 
to this entity) use a definite description; 
3. else 
the anaphoric expression would be unambiguous 
so use a pronoun. 
Figure 5: Simple Algorithm 
fect the accessibility of a referent (where accessibility 
is intended in a broad sense to cover both "topic acces- 
sibility" (Givon, 1983) and accessibility due to factors 
such as recency of mention). Basically, the more accessible 
a referent, the more underspecified a referring expression 
should be. Accessibility explains the apparent 
"name-name penalty" as examined in Gordon & Hendrick 
(1998), for example. 
Our work argues that factors beyond accessibility must 
be considered in anaphoric expression choice. It is 
consistent with work such as Vonk et al. (1992), whose 
experiments indicate that a referring expression "... that is 
more specific than is necessary for the recovery of the 
intended referent ... marks the beginning of a new theme 
concerning the same discourse referent" (Vonk et al., 
1992, page 304). They argue that such overspecified 
expressions serve the discourse function of indicating 
boundaries. Their work does not define what a discourse 
segment boundary actually is. If, however, a time change 
is taken as the boundary condition, our work is consistent 
with their hypothesis. Interestingly, Vonk et al. (1992) 
found that in discourses where a theme change was well 
marked by other means (e.g., by a preposed adverbial 
phrase or a subordinate clause indicating time or place), 
pronouns were much more common even though a new theme 
was begun. Presumably such phrases mark the theme change 
well, and thus it is not necessary to also mark the change 
via an overspecified description. 
Approaches which define discourse segments on the 
basis of reference resolution (Sidner, 1979; Suri & McCoy, 
1994; Strube & Hahn, 1997) are not useful for our 
purposes because they require referring expressions for 
recognizing segment boundaries. In contrast to these 
approaches, we define segment boundaries independently 
from reference resolution, so that in this respect our work 
is in line with Grosz & Sidner's (1986) definitions. 
9 Future Directions 
In analyzing our data, there are several places for further 
consideration. One problem is that our rule which in- 
dicates a definite description should be used in a time 
change overgenerates definite descriptions. Following 
Vonk et al. (1992) we plan to investigate whether definite 
descriptions might best be viewed as boundary markers 
and whether other markers of discourse boundaries (e.g., 
preposed adverbial phrases) are found in places where 
our algorithm suggests a definite description because of 
a time change but a pronoun appears in the text. 
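The planned check could be set up roughly as follows. This is a minimal sketch under loud assumptions: the `Instance` fields, the rule label, the toy list of time adverbials, and the three example sentences are all invented for illustration and are not taken from our corpus or annotation scheme.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One evaluated reference. Field names and rule labels are hypothetical."""
    sentence: str   # sentence containing the reference
    rule: str       # label of the rule that fired
    predicted: str  # expression type chosen by the algorithm
    observed: str   # expression type used by the human writer


# Toy cue list of preposed time adverbials (an assumption, not an inventory).
TIME_CUES = ("the next day", "a year later", "that evening")


def overgenerated_by_time_rule(instances):
    """Cases where a time change led to a definite description but the text has a pronoun."""
    return [i for i in instances
            if i.rule == "time change"
            and i.predicted == "definite description"
            and i.observed == "pronoun"]


def starts_with_time_adverbial(sentence, cues=TIME_CUES):
    """True if the sentence opens with one of the cue phrases."""
    return sentence.lower().startswith(cues)


# Invented examples, for illustration only.
data = [
    Instance("The next day, she flew to Boston.", "time change",
             "definite description", "pronoun"),
    Instance("She had met the senator in 1990.", "time change",
             "definite description", "pronoun"),
    Instance("The senator smiled.", "time change",
             "definite description", "definite description"),
]

misses = overgenerated_by_time_rule(data)
marked = [i for i in misses if starts_with_time_adverbial(i.sentence)]
print(len(misses), len(marked))  # 2 1
```

If a large share of the overgenerated cases turned out to carry such a marker, that would support treating definite descriptions as one boundary marker among several.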
In addition to evaluating more texts under this scheme, 
there are several places where we will attempt to tighten 
our methodology. One of these is in the time analysis. 
Our current analysis distinguishes between four types of 
time and is driven by both semantic cues in the text (e.g., 
adverbial time phrases) and changes in tense. Nakhimovsky 
(1988) also uses changes in "time scale" as a 
marker for changes in time. We plan to investigate this 
to see whether it explains more of the examples. Nakhi- 
movsky also describes several other markers for a setting 
change, and these will also be investigated to see if they 
are indicative of definite description use. 
Another line of future research involves further in- 
vestigation of the ambiguous cases. Our current rule 
was developed by evaluating several different possibil- 
ities (e.g., using time change rules, different pronoun 
resolution algorithms) and selecting a rule that explains 
most of the cases. Still, the number of ambiguous cases 
is fairly small and analyzing more texts and concentrat- 
ing on cases where the current rule makes an incorrect 
prediction may lead us to a more robust rule. 
Finally, further thought must be put into evaluating the 
algorithm. In particular, our current evaluation method- 
ology presupposes that the human writer has chosen the 
best anaphoric expression. It may be interesting to see if 
there are differences between reading time, eye move- 
ments, or comprehension of stories with the human- 
generated expressions and those using our algorithm. 
Thought must be put into setting up such experiments 
and into interpreting results. For example, a faster read- 
ing time may not indicate a better choice of referring ex- 
pression (since the writer may have been interested in an 
effect other than ease of comprehension). On the other 
hand, such experiments have the potential for produc- 
ing a more adequate evaluation of the methodology than 
does mimicking the human-produced text and should be 
looked into further. 
10 Conclusions 
Pronouns occur frequently in texts and have been hy- 
pothesized to play a significant role in text coherence. 
Yet, pronoun generation has not been studied in detail. 
If future natural language generation systems are to pro- 
duce coherent, natural texts, they must use rules for gen- 
erating pronouns that produce pronouns in roughly the 
same places that human-produced texts do. Moreover, 
the rules must be based on information that would be 
available to a sentence generator. At the same time, in 
order to evaluate rules, they must be based on informa- 
tion that can be gleaned from a text. 
In this work we have argued that discourse structure 
in terms of multiple threads provides an explanation for 
patterns of pronoun use in naturally occurring text. As a 
particular instantiation of a threaded discourse structure, 
we looked at changes in setting, as indicated by changes 
in time. That is, even in places where a pronoun would be 
unambiguous, a definite description might be used when 
the time of the sentence is different from the time of the 
sentence in which the previous mention was made. This 
hypothesis provides an explanation for many of the uses 
of definite descriptions found in the studied texts. Other 
uses of definite descriptions occur because of ambigu- 
ities. We have suggested a rule which addresses when 
such ambiguities should not preclude the generation of 
a pronoun. Our scheme appears to be a reasonable ex- 
planation for the patterns of pronoun use found in our 
corpus. 
Acknowledgments. This work was done while the first 
author was visiting the Institute for Research in Cog- 
nitive Science and while the second author was a post- 
doctoral fellow there (NSF SBR 8920230). We would 
like to thank Jennifer Arnold and the centering group at 
UPenn for helpful discussions. We would also like to 
thank the anonymous reviewers. 

References 
Appelt, D. E. (1981). Planning Natural-Language Utterances 
to Satisfy Multiple Goals, (Ph.D. thesis). 
Stanford University. Also appeared as: SRI International 
Technical Note 259, March 1982. 
Appelt, D. E. (1985). Planning English referring expressions. 
Artificial Intelligence, 26(1):1-33. 
Arnold, J. E. (1998). Reference Form and Discourse Pat- 
terns, (Ph.D. thesis). Stanford University, Depart- 
ment of Linguistics. 
Brennan, S. E., M. W. Friedman & C. J. Pollard (1987). 
A centering approach to pronouns. In Proceedings 
of the 25th Annual Meeting of the Association for 
Computational Linguistics, Stanford, Cal., 6-9 July 
1987, pp. 155-162. 
Dale, R. (1992). Generating Referring Expressions: 
Constructing Descriptions in a Domain of Objects 
and Processes. Cambridge, Mass.: MIT Press. 
Dale, R. & E. Reiter (1995). Computational interpretations 
of the Gricean maxims in the generation of 
referring expressions. Cognitive Science, 18:233-263. 
Genette, G. (1980). Narrative Discourse: An Essay in 
Method. Ithaca, N.Y.: Cornell University Press. 
Givon, T. (1983). Topic continuity in spoken English. 
In T. Givon (Ed.), Topic Continuity in Discourse: 
A Quantitative Cross-Language Study. Amsterdam, 
Philadelphia: John Benjamins. 
Gordon, P. C. & R. Hendrick (1998). The representation 
and processing of coreference in discourse. Cognitive 
Science, 22(4):389-424. 
Grosz, B. J., A. K. Joshi & S. Weinstein (1995). Centering: 
A framework for modeling the local coherence 
of discourse. Computational Linguistics, 
21(2):203-225. 
Grosz, B. J. & C. L. Sidner (1986). Attention, intentions, 
and the structure of discourse. Computational Linguistics, 
12(3):175-204. 
Kameyama, M. (1998). Intrasentential centering: A case 
study. In M. Walker, A. Joshi & E. Prince (Eds.), 
Centering Theory in Discourse, pp. 89-112. Oxford, 
U.K.: Oxford University Press. 
Mann, W. C. & S. A. Thompson (1988). Rhetorical structure 
theory: Toward a functional theory of text organization. 
Text, 8(3):243-281. 
McDonald, D. D. (1980). Natural Language Production 
as a Process of Decision Making Under Constraint, 
(Ph.D. thesis). MIT. 
McKeown, K. R. (1983). Focus constraints on language 
generation. In Proceedings of the 8th International 
Joint Conference on Artificial Intelligence, Karlsruhe, 
Germany, August 1983, pp. 582-587. 
McKeown, K. R. (1985). Text Generation: Using Discourse 
Strategies and Focus Constraints to Generate 
Natural Language Text. Cambridge, U.K.: Cambridge 
University Press. 
Moens, M. & M. Steedman (1988). Temporal ontology 
and temporal reference. Computational Linguistics, 
14(2):15-28. 
Nakhimovsky, A. (1988). Aspect, aspectual class, and 
the temporal structure of narrative. Computational 
Linguistics, 14(2):29-43. 
Passonneau, R. (1996). Using centering to relax Gricean 
constraints on discourse anaphoric noun phrases. 
Language and Speech, 39(2):229-264. 
Prince, G. (1982). Narratology: The Form and Functioning 
of Narrative. Berlin: Mouton. 
Reichman, R. (1985). Getting Computers to Talk like You 
and Me. Cambridge, Mass.: MIT Press. 
Reiter, E. (1990). Generating descriptions that exploit a 
user's domain knowledge. In R. Dale, C. Mellish & 
M. Zock (Eds.), Current Research in Natural Lan- 
guage Generation. London: Academic Press. 
Rosé, C. P., B. Di Eugenio, L. S. Levin & C. Van Ess-Dykema 
(1995). Discourse processing of dialogues 
with multiple threads. In Proceedings of the 33rd 
Annual Meeting of the Association for Computational 
Linguistics, Cambridge, Mass., 26-30 June 
1995, pp. 31-38. 
Sidner, C. L. (1979). Towards a Computational Theory 
of Definite Anaphora Comprehension in English. 
Technical Report AI-Memo 537, Cambridge, 
Mass.: Massachusetts Institute of Technology, AI 
Lab. 
Strube, M. (1998). Never look back: An alternative to 
centering. In Proceedings of the 17th International 
Conference on Computational Linguistics and 36th 
Annual Meeting of the Association for Computational 
Linguistics, Montréal, Quebec, Canada, 10-14 
August 1998, Vol. 2, pp. 1251-1257. 
Strube, M. & U. Hahn (1997). Centered segmentation: 
Scaling up the centering model to global referential 
discourse structure. In Proceedings of the 19th Annual 
Conference of the Cognitive Science Society, 
Palo Alto, Cal., 7-10 August 1997. 
Suri, L. Z. & K. F. McCoy (1994). RAFT/RAPR and centering: 
A comparison and discussion of problems 
related to processing complex sentences. Computational 
Linguistics, 20(2):301-317. 
Vogt, J. (1990). Aspekte erzählender Prosa: Eine 
Einführung in Erzähltechnik und Romantheorie 
(7th ed.). Opladen: Westdeutscher Verlag. 
Vonk, W., L. G. Hustinx & W. H. Simons (1992). The use 
of referential expressions in structuring discourse. 
Language and Cognitive Processes, 7(3/4):301- 
333.
Webber, B. L. (1991). Structure and ostension in the in- 
terpretation of discourse deixis. Language and Cog- 
nitive Processes, 6(2):107-135. 
Wiebe, J. M. (1994). Tracking point of view in narrative. 
Computational Linguistics, 20(2):233-287. 
