Knowledge-Lean Coreference Resolution and its Relation to 
Textual Cohesion and Coherence 
Sanda M. Harabagiu 
Southern Methodist University 
Dallas, TX 75275-0122 
sanda@seas.smu.edu
Steven J. Maiorano 
AAT 
Washington, D.C. 20505 
stevejm@ucia.gov
Abstract 
In this paper we present a new empirical method for coreference
resolution, implemented in the COCKTAIL system. The results of
COCKTAIL are used for lightweight abduction of cohesion and
coherence structures. We show that referential cohesion can be
integrated with lexical cohesion to produce pragmatic knowledge.
Coherence abduction takes place upon this knowledge.
1 Motivation
Coreference evaluation was introduced as a new domain-independent
task at the 6th Message Understanding Conference (MUC-6) in 1995.
The task focused on a subset of coreference, namely the identity
coreference, established between nouns, pronouns and noun phrases
(including proper names) that refer to the same entity. In defining
the coreference task (cf. (Hirschman and Chinchor, 1997)) special
care was taken to use the coreference output not only for supporting
Information Extraction (IE), the central task of the MUCs, but also
to create means for research on coreference and discourse phenomena
independent of IE.
Annotated corpora were made available, using SGML tagging within
the text stream. The annotated texts served as training examples for
a variety of coreference resolution methods, which had to focus not
only on precision and recall, but also on robustness. Two general
classes of approaches were distinguished. The first class is
characterized by adaptations of previously known reference
algorithms (e.g. (Lappin and Leass, 1994), (Brennan et al., 1987))
to the scarce syntactic and semantic knowledge available in an IE
system (e.g. (Kameyama, 1997)). The second class is based on
statistical and machine learning techniques that rely on the tagged
corpora to extract features of the coreferential relations (e.g.
(Aone and Bennett, 1994), (Kehler, 1997)).
In the past two MUC competitions, the high scoring systems achieved
a recall in the high 50's to low 60's and a precision in the low
70's (cf. (Hirschman et al., 1998)). A study¹ of the contribution of
each form of coreference to the overall performance shows that,
generally, proper name anaphora resolution has the highest precision
(69%), followed by pronominal reference (62%). The worst precision
is obtained by the resolution of definite nominal anaphors (46%).
However, these results need to be contrasted with the distribution
of coreferential links on the tagged corpora. The largest share of
coreference links (38.42%) connects names of people, organizations
or locations. In addition, 19.68% of the tagged coreference links
are accounted for by appositives. Only 16.35% of the tagged
coreferences are pronominal. Nominal anaphors account for 25.55% of
the coreference links, and their resolution is generally poorly
represented in IE systems.
Due to the distribution of coreference links in newswire texts, a
coreference module that is merely capable of handling recognition of
appositives with high precision and incorporates rules of name alias
identification can achieve a baseline coreference precision of up to
58.1%, without sophisticated syntactic or discourse information.
Further precision gains are obtained by extending high-performance
pronoun resolution methods (e.g. (Lappin and Leass, 1994)) to
nominal coreference as well. Such enhancements rely on semantic and
discourse knowledge.
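The 58.1% baseline is consistent with simply adding the two link shares reported above. This is our reading rather than a derivation stated in the text, and it assumes a module that resolves exactly the name links and the appositive links, and resolves all of them correctly:

```latex
\[
  \underbrace{38.42\%}_{\text{name links}}
  \;+\;
  \underbrace{19.68\%}_{\text{appositive links}}
  \;=\; 58.10\%
\]
```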
In this paper we describe COCKTAIL, a high-performance coreference
resolution system that operates on a mixture of heuristics that
combine semantic and discourse information. The resulting
¹ The study, reported in (Kameyama, 1997), was performed on the
coreference module of SRI's FASTUS (Appelt et al., 1993), an IE
system representative of today's IE technology.
coreference chains are shown to contribute to the derivation of
cohesive chains and coherence graphs. Both cohesive and coherence
structures are considered, partly because of their incremental
complexity and partly because of the tradition (started with
(Hobbs, 1979)) of studying the interaction of coreference and
coherence. Section 2 presents COCKTAIL and the coreference methods
it builds upon. Sections 3 and 4 describe the derivation of the
cohesion and coherence structures.
2 Coreference Resolution 
Coreference resolution relies on a combination of linguistic and
cognitive aspects of language. Linguistic constraints are provided
mostly by the syntactic modeling of language, whereas computational
models of discourse bring forward the cognitive assumptions of
anaphora resolution. Three different methods of combining anaphoric
constraints are known to date. The first one integrates anaphora
resolution in computational models of discourse interpretation.
Dynamic properties of discourse, especially focusing and centering,
are invoked as the primary basis for identifying antecedents. Such
computational methods were presented in (Grosz et al., 1995) and
(Webber, 1988).
A second category of approaches combines a variety of syntactic,
semantic and discourse factors as a multi-dimensional metric for
ranking antecedent candidates. Anaphora resolution is determined by
a composite of several distinct scoring procedures, each of which
scores the prominence of the candidate with respect to a specific
type of information. The systems described in (Asher and Wada,
1988), (Carbonell and Brown, 1988) and (Rich and Luperfoy, 1988) are
examples of the mixed evaluation strategy.
Alternatively, other discourse-based methods consider coreference
resolution a by-product of the recognition of coherence relations
between sentences. Such methods were presented in (Hobbs et al.,
1993) and (Wilensky, 1978). Although AI-complete, this approach has
the appeal that it resolves the most complicated cases of
coreference, those left uncovered by syntactic or semantic cues. We
have revisited these methods by setting the relation between
coreference and coherence on empirical grounds.
2.1 Pronominal Coreference 
Two tendencies characterize current pronominal coreference
algorithms. The first one makes use of the advances in parsing
technology or of the availability of large parsed corpora (e.g.
Treebank (Marcus et al., 1993)) to produce algorithms inspired by
Hobbs' baseline method (Hobbs, 1978). For example, the Resolution of
Anaphora Procedure (RAP) introduced in (Lappin and Leass, 1994)
combines syntactic information with agreement and salience
constraints. Recently, a probabilistic approach to pronominal
coreference resolution was also devised (Ge et al., 1998), using the
parsed data available from Treebank. The knowledge-based method of
Lappin and Leass produces better results. Nevertheless, RAPSTAT, a
version of RAP obtained by using statistically measured preference
patterns for the antecedents, produced a slight enhancement of
performance over RAP.
Other pronominal resolution approaches promote knowledge-poor
methods (Mitkov, 1998), either by using an ordered set of general
heuristics or by combining scores assigned to candidate antecedents.
The CogNIAC algorithm (Baldwin, 1997) uses six heuristic rules to
resolve coreference, whereas the algorithm presented in (Mitkov,
1998) is based on a limited set of preferences (e.g. definiteness,
lexical reiteration or immediate reference). Both these algorithms
rely only on part-of-speech tagging of texts and on patterns for NP
identification. Their performance (close to 90% for certain types of
pronouns) indicates that full syntactic knowledge is not required by
certain forms of pronominal coreference. The same claim is made in
(Kennedy and Boguraev, 1996) and (Kameyama, 1997), where algorithms
approximating RAP for poorer syntactic input obtain precision of 75%
and 71%, respectively, a surprisingly small precision decay from
RAP's 86%. These results prompted us to devise COCKTAIL, a
coreference resolution system, as a mixture of heuristics operating
on the various syntactic, semantic and discourse cues. COCKTAIL is a
composite of heuristics learned from the tagged corpora, which has the
following novel characteristics:
1. COCKTAIL covers both nominal and pronoun coreference, but distinct
sets of heuristics operate for different forms of anaphors. We have
devised separate heuristics for reflexive, possessive, relative, 3rd
person and 1st person pronouns. Similarly, definite nominals are
treated differently than bare or indefinite nominals.
2. COCKTAIL performs semantic checks between antecedents and
anaphors. These checks combine sortal constraints from WordNet with
co-occurrence information from (a) Treebank and (b) conceptual
3. In COCKTAIL antecedents are sought not only in the accessible text
region, but also throughout the current coreference chains. In this
way cohesive information, represented in coreference chains, is
employed in the resolution process.
4. The heuristics of COCKTAIL allow for lexicalizations (e.g. when
the anaphor is an adjunct of a communication verb) and for
simplified coherence cues (e.g. when the anaphor is the subject of
the verb add, the antecedent may be a preceding subject of a
communication verb).
To exemplify some COCKTAIL heuristics that resolve pronominal
coreference, we first present heuristics applicable to reflexive
pronouns and then we list heuristics for possessive pronoun and 3rd
person pronoun resolution. Brevity imposes the omission of
heuristics for other forms of pronoun resolution. COCKTAIL operates
by successively applying the following heuristics to the pronoun
Pron:
if (Pron is reflexive) then apply successively:
o Heuristic 1-Reflexive (H1R)
    Search for PN, the closest proper name from Pron
    in the same sentence, in right to left order.
    if (PN agrees in number and gender with Pron)
        if (PN belongs to coreference chain CC)
            then Pick the element from CC which is
                 closest to Pron in Text.
            else Pick PN.
o Heuristic 2-Reflexive (H2R)
    Search for a sequence Noun-Relative_Pronoun,
    in the same sentence, in right to left order.
    if (Noun agrees in number and gender with Pron)
        if (Noun belongs to coreference chain CC)
            then Pick the element from CC which is
                 closest to Pron in Text.
            else Pick Noun.
o Heuristic 3-Reflexive (H3R)
    Search for Pron', the closest pronoun from Pron
    in the same sentence, in right to left order.
    if (Pron' agrees in number and gender with Pron)
        if (Pron' belongs to coreference chain CC)
            then Pick the element from CC which is
                 closest to Pron in Text.
            else Pick Pron'.
o Heuristic 4-Reflexive (H4R)
    Search for Noun_c, the closest noun from Pron
    in the same sentence, in right to left order.
    if (Noun_c agrees in number and gender with Pron)
        then Pick Noun_c.
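The cascade of reflexive heuristics can be sketched in a few lines of Python. This is only an illustrative sketch, not the COCKTAIL implementation: the `Mention` record, its feature names, the `before_relative` flag and the restriction of chain elements to those preceding the pronoun are all assumptions made here for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention record; all field names are assumptions."""
    text: str
    pos: int                       # token offset in the text
    kind: str                      # "proper", "pronoun" or "noun"
    number: str                    # "sg" or "pl"
    gender: str                    # "m", "f" or "n"
    chain: Optional[int] = None    # id of its coreference chain, if any
    before_relative: bool = False  # noun directly followed by a relative pronoun


def agrees(a: Mention, b: Mention) -> bool:
    return a.number == b.number and a.gender == b.gender


def closest_in_chain(m: Mention, pron: Mention,
                     chains: Dict[int, List[Mention]]) -> Mention:
    """Return the chain element nearest to pron (assumed to precede it), else m."""
    if m.chain is None:
        return m
    earlier = [c for c in chains[m.chain] if c.pos < pron.pos]
    return max(earlier, key=lambda c: c.pos) if earlier else m


def resolve_reflexive(pron: Mention, sentence: List[Mention],
                      chains: Dict[int, List[Mention]]) -> Optional[Mention]:
    """Apply H1R..H4R in order to the mentions of the pronoun's sentence."""
    # Right-to-left search: the nearest preceding candidate comes first.
    left = sorted((m for m in sentence if m.pos < pron.pos),
                  key=lambda m: -m.pos)
    tests = [
        lambda m: m.kind == "proper",                      # H1R
        lambda m: m.kind == "noun" and m.before_relative,  # H2R
        lambda m: m.kind == "pronoun",                     # H3R
        lambda m: m.kind == "noun",                        # H4R
    ]
    for i, test in enumerate(tests):
        cand = next((m for m in left if test(m)), None)
        if cand is not None and agrees(cand, pron):
            # H4R picks the noun itself, without consulting the chains.
            return cand if i == 3 else closest_in_chain(cand, pron, chains)
    return None


# Usage on a simplified version of the first example in Table 1.
texaco = Mention("Texaco", 5, "proper", "sg", "n")
liedtke = Mention("Mr. Liedtke", 10, "proper", "sg", "m")
himself = Mention("himself", 20, "pronoun", "sg", "m")
antecedent = resolve_reflexive(himself, [texaco, liedtke, himself], {})
```

Each heuristic inspects only the closest candidate of its type; if that candidate fails the agreement test, control falls through to the next heuristic, which mirrors the "apply successively" wording above.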
Resolution examples for reflexive pronouns are illustrated in
Table 1. The antecedents produced by COCKTAIL are boldfaced, whereas
the referring expressions are emphasized. Both referring expressions
and resolved antecedents are underlined. Precision results are
listed in Table 2.
Before Pennzoil's court fight with Texaco over the Getty purchase,
Mr. Liedtke - one of the ploy's foremost practitioners - portrayed
himself as something of an oil-patch rube, a notable feat
considering his diplomas from Amherst College and Harvard Business
School.

The woman who is known to me as hard-working and responsible,
clearly isn't herself.

Unlike many of her peers, most of whom are males in their 30s, she
never takes herself too seriously.

Table 1: Examples of reflexive pronouns

Heuristic                          H1R    H2R    H3R    H4R
Precision on a test set of 100
randomly selected pronouns         95%    92%    98%    89%

Table 2: Coreference precision (reflexive pronouns)

Antecedents of reflexive pronouns are always sought in the same
sentence. Antecedents of other types of pronouns are sought in
preceding sentences too, starting from the immediately preceding
sentence. Inside the sentence, the search for a specific word is
performed from the current position towards the beginning of the
sentence, whereas in the preceding sentences, the search starts at
the beginning of the sentence and proceeds in a left to right
fashion. The same search order was used in (Kameyama, 1997). From
now on, we indicate this search by Search1. This search is employed
by heuristics for possessive pronoun resolution:
if (Pron is possessive) (i.e. we have a sequence [Pron noun0],
where noun0 is the head of the NP containing Pron)
then apply successively:
o Heuristic 1-Possessive (H1Pos)
    Search1 for a possessive construct of the form [noun1's noun2]
    if ([Pron noun0] and [noun1's noun2] agree in gender, number
        and are semantically consistent)
        then if (noun2 belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Pron in Text, Pick that element.
            else Pick noun1.
o Heuristic 2-Possessive (H2Pos)
    Search1 for PN, the closest proper name from Pron
    if (PN agrees in number and gender with Pron)
        if (PN belongs to coreference chain CC)
            then Pick the element from CC which is
                 closest to Pron in Text.
            else Pick PN.
o Heuristic 3-Possessive (H3Pos)
    Search for Pron', the closest pronoun from Pron
    if (Pron' agrees in number and gender with Pron)
        if (Pron' belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Pron in Text, Pick that element.
            else Pick Pron'.
o Heuristic 4-Possessive (H4Pos)
    Search for Noun, the closest common noun from Pron
    if (Noun agrees in number and gender with Pron)
        if (Noun belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Pron in Text, Pick that element.
            else Pick Noun.
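The Search1 order used by these heuristics can be sketched as a generator. This is a minimal sketch under stated assumptions: the function name `search1`, the token-level granularity and the list-of-sentences input are ours, since the paper does not specify a data representation.

```python
from typing import Iterator, List


def search1(sentences: List[List[str]], sent_idx: int,
            tok_idx: int) -> Iterator[str]:
    """Yield candidates in Search1 order for the anaphor at (sent_idx, tok_idx)."""
    # Current sentence: from the anaphor back towards the sentence start.
    for tok in reversed(sentences[sent_idx][:tok_idx]):
        yield tok
    # Preceding sentences: nearest sentence first, each scanned left to right.
    for s in range(sent_idx - 1, -1, -1):
        yield from sentences[s]


# Usage: the anaphor "his" is the third token of the third sentence.
doc = [["A", "B"], ["C", "D"], ["E", "F", "his"]]
order = list(search1(doc, 2, 2))
```

A heuristic would consume this stream and stop at the first candidate of the type it is looking for (possessive construct, proper name, pronoun or common noun).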
Examples and precision results are listed in Ta- 
ble 3 and Table 4, respectively. 
The timing of Mr. Shad's departure is likely to depend on how
rapidly the Senate Banking Committee moves to confirm his successor.

Ronald Reagan sends him a list of his film roles.

The 20-minute flight helps him forget his troubles.

The president renewed his promise to veto "tax-rate increases."

Table 3: Examples of possessive pronouns

Heuristic                          H1Pos  H2Pos  H3Pos  H4Pos
Precision on 100 random pronouns   96%    93%    78%    86%

Table 4: Coreference precision (possessive pronouns)
Given a possessive pronoun in a sequence [Pron Noun0], the
antecedent Ante of Pron is semantically consistent if the same
possessive relationship can be established between Ante and Noun0.
The problem is that the possessive relation semantically corresponds
to an open list of relations. For example, Noun0 may be a feature of
Ante, Ante may own Noun0, or Ante may have performed the action
lexicalized by the nominalization Noun0.
COCKTAIL's test of semantic consistency blends together information
available from WordNet and statistics gathered from Treebank.
Different consistency checks are modeled for each of the heuristics.
We detail here the check that applies to heuristic H1Pos, which
resolves the possessive from the first example listed in Table 3.
For this heuristic, we have to test whether from the possessive
[Ante Noun1] we can grant the possessive [Ante Noun0] as well. There
are three cases that allow us to do so:
• Case 1: Noun1 and Noun0 corefer.
• Case 2: There is a sense s1 of Noun1 and a sense s0 of Noun0 such
that a synonym of Noun1 in sense s1, or of its immediate hypernym,
is found in the gloss of Noun0 in sense s0, or vice versa.
• Case 3: There is a sense s1 of Noun1 and a sense s0 of Noun0 such
that a common concept is found in their glosses.
Cases 2 and 3 extend to synsets obtained through derivational
morphology as well (e.g. nominalizations). For cases 2 and 3
COCKTAIL reinforces the coreference hypothesis by using a possessive
similarity metric based on Resnik's similarity measures for noun
groups (Resnik, 1995). From a subset of Treebank, we collect all
possessives, and measure whether the similarity class of Noun0,
Noun1 and their eventual common concept is above a threshold
produced off-line.
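The three cases can be illustrated with a small gloss-overlap check. The sketch below rests on several assumptions: the `Sense` record and the toy lexicon are invented for the example (the glosses are not WordNet's), glosses are reduced to sets of content words, a "common concept" is approximated by a shared gloss word, and the Resnik-based similarity threshold is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Sense:
    """Hypothetical sense record; a stand-in for a WordNet synset."""
    synonyms: Set[str]   # words of the synset
    hypernym: Set[str]   # words of the immediate hypernym synset
    gloss: Set[str]      # content words of the gloss


Lexicon = Dict[str, List[Sense]]


def case2(noun1: str, noun0: str, lex: Lexicon) -> bool:
    """A synonym of one noun (or of its hypernym) occurs in the other's gloss."""
    for s1 in lex.get(noun1, []):
        for s0 in lex.get(noun0, []):
            if (s1.synonyms | s1.hypernym) & s0.gloss:
                return True
            if (s0.synonyms | s0.hypernym) & s1.gloss:
                return True
    return False


def case3(noun1: str, noun0: str, lex: Lexicon) -> bool:
    """Some pair of senses shares a concept (here: a word) in their glosses."""
    return any(s1.gloss & s0.gloss
               for s1 in lex.get(noun1, [])
               for s0 in lex.get(noun0, []))


def consistent(noun1: str, noun0: str, lex: Lexicon,
               corefer: bool = False) -> bool:
    """Case 1 (coreference, supplied by the caller), then cases 2 and 3."""
    return corefer or case2(noun1, noun0, lex) or case3(noun1, noun0, lex)


# Toy lexicon with invented glosses, for illustration only.
TOY: Lexicon = {
    "departure": [Sense({"departure", "leaving"}, {"act"},
                        {"act", "leaving", "position"})],
    "successor": [Sense({"successor"}, {"person"},
                        {"person", "follows", "position"})],
}
ok = consistent("departure", "successor", TOY)
```

With these invented glosses the pair from the first example of Table 3 is accepted through case 3, because both glosses mention "position". A real check would operate on WordNet glosses and apply the similarity threshold described above.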
Other pronominal coreference heuristics employ Search2, a search
procedure that enhances Search1, since it prefers antecedents that
are immediately succeeded by relative pronouns. This search is
incorporated in COCKTAIL's heuristics that resolve 3rd person
pronominal coreference:
o Heuristic 1-Pronoun (H1Pron)
    Search2 in the same sentence for the same
    3rd person pronoun Pron'
    if (Pron' belongs to coreference chain CC)
        and there is an element from CC which is
        closest to Pron in Text, Pick that element.
        else Pick Pron'.
o Heuristic 2-Pronoun (H2Pron)
    Search2 for PN, the closest proper name from Pron
    if (PN agrees in number and gender with Pron)
        if (PN belongs to coreference chain CC)
            then Pick the element from CC which is
                 closest to Pron in Text.
            else Pick PN.
o Heuristic 3-Pronoun (H3Pron)
    if Pron collocates with a communication verb
    then Search1 for pronoun Pron' = I
        if (Pron' belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Pron in Text, Pick that element.
            else Pick Pron'.
o Heuristic 4-Pronoun (H4Pron)
    if Pron collocates with a communication verb
    then Search1 for communicator Noun
        if (Noun belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Pron in Text, Pick that element.
            else Pick Noun.
o Heuristic 5-Pronoun (H5Pron)
    Search2 for Pron', the closest pronoun from Pron
    if (Pron' agrees in number and gender with Pron)
        if (Pron' belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Pron in Text, Pick that element.
            else Pick Pron'.
o Heuristic 6-Pronoun (H6Pron)
    Search2 for Noun, the closest noun from Pron
    if (Noun agrees in number and gender with Pron)
        if (Noun belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Pron in Text, Pick that element.
            else Pick Noun.
COCKTAIL doesn't employ semantic consistency checks for this form of
pronominal coreference resolution. From our initial experiments, we
do not see the need for special semantic consistency checks, since
all heuristics performed with precision in excess of 90%. Part of
this is explained by our usage of pleonastic filters and of
recognizers of idiomatic usage. Table 5 illustrates some of the
successful coreference resolutions.
He says that in many years as a banker he has grown accustomed to
"dealing with honest people 99% of the time."

Sen. Byrd takes pains to reassure the voter that he will see to it
that the trade picture improves.

A nurse who deals with the new patient admits she isn't afraid of
her temper.

Table 5: Examples of 3rd person pronouns
2.2 Nominal Coreference 
Noun phrases can represent referring expressions in a variety of
cases. For example, it is known that not all definite NPs are
anaphoric. Conditions that define anaphoric NPs are still under
research (cf. (Poesio and Vieira, 1998)). In the tagged corpora, we
have found only 20.93% of the nominal coreference cases to be
definites, the majority (78.85%) being bare nominals², and only
1.32% were indefinites. However, more than 50% of the nominal
referring expressions were names of people, organizations or
locations. Adding to this, 15.22% of nominal coreference links are
accounted for by appositives. Based on this evidence, COCKTAIL
implements special rules for name alias identification and for
robust recognition of appositions. Moreover, the heuristics for
nominal coreference resolution apply Search3, an enhancement of
Search2 that searches starting with the coreference chains, and then
with the accessible text. To resolve nominal coreference, COCKTAIL
successively applies the following heuristics:
o Heuristic 1-Nominal (H1Nom)
    if (Noun is the head of an appositive)
        then Pick the preceding NP.
o Heuristic 2-Nominal (H2Nom)
    if (Noun belongs to an NP, Search3 for NP'
        such that Noun' = same_name(head(NP), head(NP'))
        or Noun' = same_name(adj(NP), adj(NP')))
    then if (Noun' belongs to coreference chain CC)
            then Pick the element from CC which is
                 closest to Noun in Text.
            else Pick Noun'.
o Heuristic 3-Nominal (H3Nom)
    if Noun is the head of an NP
    then Search3 for proper name PN
        such that head(PN) = Noun
        if (PN belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Noun in Text, Pick that element.
            else Pick PN.
o Heuristic 4-Nominal (H4Nom)
    Search3 for a proper name PN with the same
    category as Noun
    if (PN belongs to coreference chain CC)
        and there is an element from CC which is
        closest to Noun in Text, Pick that element.
        else Pick PN.
o Heuristic 5-Nominal (H5Nom)
    Search3 for Noun', a synonym or hyponym of Noun
    if (Noun' belongs to coreference chain CC)
        and there is an element from CC which is
        closest to Noun in Text, Pick that element.
        else Pick Noun'.
o Heuristic 6-Nominal (H6Nom)
    Search3 for Noun (either in definites or
    in NPs having adjuncts in coreference chain CC)
    if Ante semantically consistent with Noun
        if (Ante belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Noun in Text, Pick that element.
            else Pick Ante.
o Heuristic 7-Nominal (H7Nom)
    if (Noun or one of its hypernyms
        or holonyms is a nominalization N)
    then Search for the verb V deriving N
        (or one of its synonyms)
    then Pick NP, the closest adjunct of V
        if (NP belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Noun in Text, Pick that element.
            else Pick NP.
o Heuristic 8-Nominal (H8Nom)
    if (Noun is the head of a prepositional
        phrase preceded by a nominalization N)
    then Search for the verb V deriving N
        (or one of its synonyms)
    if (Noun' is an adjunct of V) and
       (Noun' and Noun have the same category)
        if (Noun' belongs to coreference chain CC)
            and there is an element from CC which is
            closest to Noun in Text, Pick that element.
            else Pick Noun'.
o Heuristic 9-Nominal (H9Nom)
    Search3 for Noun', a metonymy whose
    coercion is Noun
    Pick Noun'.

² We count as bare nominals coreferring adjuncts as well.
IBM and Mr. York wouldn't discuss his compensation package, which
could easily reach into seven figures. The subject is sensitive at a
time when IBM is laying off thousands of employees.

Mr. Iacocca led Chrysler through one of the largest stock sales ever
for a U.S. industrial company, raising $1.78 billion. Chrysler is
using most of the proceeds to reduce its $4.4 billion unfunded
pension liability.

We read where the Clinton White House is seeking a deputy to chief
of staff Mack McLarty to impose some disciplined coherence on the
place's rambunctious young staff.

Table 6: Examples of nominal coreference

Heuristic H1Nom resolves coreference expressed by appositions,
whereas heuristic H2Nom promotes the term repetition indicator, when
consistency checks apply. For this heuristic, consistency checks are
conservative, imposing that either the adjuncts be identical or
coreferring, or the adjunct of the referent be less specific than
that of the antecedent. Specificity principles apply also to H5Nom,
where hyponymy is promoted, similarly to (Poesio and Vieira, 1998).
Heuristic H3Nom allows coreference between "the Securities and
Exchange Commission" and "the commission" but it bans links between
"Reardon Steel Co." and "tons of steel".
Many times coreferring nominals also share semantic relations (e.g.
synonymy). Heuristic H5Nom identifies such cases, by applying
consistency checks. Based on experiments with the coreference module
of FASTUS, where this heuristic was initially implemented, we
require that the most frequent senses of nouns be promoted. The same
precedence of frequent senses is implemented in the assignment of
categories, defined as the immediate WordNet hypernym. The category
of proper names is dictated by the proper name recognizer, allowing
such categories as Person, Organization or Location.
In this way, coreference between "IBM" and "the wounded computer
giant" can be established, since sense 3 of noun giant is
Organization, the category of "IBM". Similar category-based semantic
checks allow the recognition of the antecedent of proceeds from the
second example listed in Table 6. The hypernym of proceeds is gain,
whose gloss genus is amount, the category of $1.78 billion. Semantic
checks are also required in H7Nom and H8Nom, heuristics that rely on
derivational morphology. The first example from Table 6 is resolved
by H7Nom, since discussion, the nominalization of discuss, has the
category communication, a hypernym of subject. The antecedent is the
object of the verb discuss.
The last heuristic, H9Nom identifies coreferring 
links with coerced entities of nominals. Coercions 
are obtained as paths of meronyms or hypernyms. 
(Harabagiu, 1998) discusses a coercion methodol- 
ogy based on WordNet and Treebank. Since in our 
test corpus there were very few cases of metonymic
anaphors, Table 7 lists the precision of the other 
heuristics only. 
Heuristic                  H1Nom  H2Nom  H3Nom  H4Nom
Precision on 100 random    98%    95%    82%    88%
Heuristic                  H5Nom  H6Nom  H7Nom  H8Nom

Table 7: Nominal coreference precision
The empirical methods employed in COCKTAIL are an alternative to the
inductive approaches described in (Cardie and Wagstaff, 1999) and
(McCarthy and Lehnert, 1995). Our results show that high-precision
empirical techniques can be ported from pronominal coreference
resolution to the more difficult problem of nominal coreference.
3 Lexical Cohesion 
The heuristics encoded in COCKTAIL make light use of textual
cohesion, i.e. the property of texts to "stick together"³ by using
related words. Both pronominal and nominal coreference resolution
heuristics use cohesion cues indicated by term repetition, while
nominal coreference also relies on semantic relations between
anaphors and their antecedents. In addition, coreference chains are
a form of textual cohesion, known as referential cohesion (cf.
(Halliday and Hasan, 1976)).
Until now, lexical cohesion, arising from semantic connections
between words, was successfully used as the only form of textual
cohesive structure, known as lexical chains. At present there are
three methods of generating lexical chains. The first one,
implemented in the TextTiling algorithm (Hearst, 1997), counts the
frequencies of term repetitions and is an ideal, lightweight tool
for segmenting texts. The second method adds knowledge from semantic
dictionaries (e.g. Roget's Thesaurus in the work of (Morris and
Hirst, 1991) or WordNet in the methods presented in (Barzilay and
Elhadad, 1997), (Hirst and St-Onge, 1998)). Besides term repetition,
this approach recognizes relations between text words that are
connected in the dictionaries with predefined patterns. This method
was applied to the generation of text summaries, the recognition of
the intentional structure of texts and the detection of
malapropisms. The third method is based on a path-finding algorithm
detailed in (Harabagiu and Moldovan, 1998). This method creates a
richer
³ Definition introduced in (Halliday and Hasan, 1976) and (Morris
and Hirst, 1991).
structure, useful for the abduction of coherence relations from the
knowledge encoded in WordNet.
Here we describe a new cohesion structure that (a) incorporates both
lexical and referential cohesion and (b) produces a unique chain
that contains not only single words, but also textual entities
encompassing head-adjunct lists. We use the finite-state parses of
FASTUS (Appelt et al., 1993) for recognizing these entities, but the
method extends to any basic phrasal parser⁴.
We produce this novel cohesive structure to exploit the close
relation between text cohesion and coherence. It is known (cf.
(Harabagiu, 1999)) that cohesion, as a surface indicator of the text
coherence, can indicate the lexico-semantic knowledge upon which
coherence is inferred. Our aim is to use this cohesive chain for
producing axiomatic knowledge for CICERO, a TACITUS-like system that
abducts coherence relations. TACITUS (Hobbs et al., 1993) is a
successful abductive system when provided with extensive pragmatic
and linguistic knowledge. CICERO is designed as a lightweight
version of TACITUS that performs reliable abductions, with minimal
knowledge and effective searches. Translating all the lexical,
morphological, syntactic and semantic ambiguities from texts would
make the search intractable. Our solution for CICERO is to use a
cohesive chain to create manageable knowledge upon which the
abduction can be performed. Section 4 describes this knowledge and
the operation of CICERO.
Our cohesive chain is a linked structure consisting of three parts:
(1) the connected text entity, (2) its incoming and outgoing
pointers and (3) a lexico-semantic graph, containing paths of
WordNet concepts and relations. The lexico-semantic structure is
later translated into the axiomatic knowledge that supports
coherence inference. To exemplify the cohesion chain, we use the
following text, spanned by the coreference chains produced with
COCKTAIL:
[Toys R Us]1 named Michael Goldstein [chief executive officer]2,
ending years of speculation about who will succeed [Charles
Lazarus]3, [the [toy retailer]1's founder and chief architect.]3
[Robert Nakasone]4, [former vice chairman and widely regarded as the
other serious contender for [the top executive]2's job]4, was named
president and chief operating officer, both new positions.
The indexes indicate the four coreference chains. This text has only
two repeating terms, the verb name and the noun executive, thus it
generates little information with the TextTiling algorithm. The
cohesion method detailed in (Barzilay and Elhadad,
⁴ Such a parser operates on part-of-speech tagged text, with several noun and verb grouping rules.
1997) can detect one lexical chain: [chief executive officer,
chairman, executive, president]. We would like to obtain richer
lexico-semantic information, thus we build a cohesion chain that
contains larger textual entities. To recognize the entities, we use
the coreference chains and the following parse, produced by FASTUS:
#<PHRASE(ORGANIZATION-NAME): "Toys R Us">
#<PHRASE(BASIC): "named">
#<PHRASE(PERSON-NAME): "Michael Goldstein">
#<PHRASE(NG): "chief executive officer">
#<PHRASE(BASIC): "ending">
#<PHRASE(NG): "years of speculation">
#<PHRASE(PREP): "about">
#<PHRASE(BASIC): "will succeed">
#<PHRASE(PERSON-NAME): "Charles Lazarus, the toy retailer's
  founder and chief architect">
#<PHRASE(PERSON-NAME): "Robert Nakasone,
  formerly vice chairman">
#<PHRASE(CONJ): "and">
#<PHRASE(BASIC): "widely regarded">
#<PHRASE(PREP): "as">
#<PHRASE(NG): "the other serious contender">
#<PHRASE(NG): "the top executive's job">
#<PHRASE(COMMA): ",">
#<PHRASE(BASIC): "was named">
#<PHRASE(NG): "president and chief operating officer,
  both new positions">
Textual entities are either basic phrases contained 
in the coreference chaln.q or lists of phrases collected 
from the parse, by scanning for all NGs or NAME- 
phrases directly connected to a verb phrase through 
a S~bject, Ob3ect or prepositional relations. For ex- 
ample, as phrase "Toys R (.ramie the antecedent from 
a coreference chain, its corresponding textual entity 
is: 
\["Toys R Us" -Subject-> "name"\]
\[name -Object1- "Michael Goldstein"\]
\[name -Object2- "chief executive officer"\]
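As an illustration only, the collection of such a textual entity can be sketched in Python. The phrase list below is an abridged copy of the parse; the `Phrase` class, the `textual_entity` function and the positional rules that assign Subject and Object roles are simplifying assumptions made for this sketch, not the rules actually used by FASTUS or by our system. The sketch also keeps the surface form "named" where the entity above uses the lemma "name".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phrase:
    kind: str  # phrase label from the parse, e.g. "NG" or "BASIC"
    text: str


# Labels treated as nominal (NGs and NAME-phrases).
NOMINAL = {"NG", "ORGANIZATION-NAME", "PERSON-NAME"}


def textual_entity(phrases, antecedent):
    """Collect (head, relation, dependent) triples around the antecedent's verb."""
    start = next(i for i, p in enumerate(phrases) if p.text == antecedent)
    # Assumption: the governing verb is the first BASIC phrase after the antecedent.
    verb_at = next(i for i in range(start + 1, len(phrases))
                   if phrases[i].kind == "BASIC")
    verb = phrases[verb_at].text
    triples = [(antecedent, "Subject", verb)]
    # Assumption: nominal phrases immediately following the verb are its objects.
    for n, p in enumerate(phrases[verb_at + 1:], start=1):
        if p.kind not in NOMINAL:
            break
        triples.append((verb, f"Object{n}", p.text))
    return triples


# Abridged parse: only the phrases needed for the first textual entity.
parse = [
    Phrase("ORGANIZATION-NAME", "Toys R Us"),
    Phrase("BASIC", "named"),
    Phrase("PERSON-NAME", "Michael Goldstein"),
    Phrase("NG", "chief executive officer"),
    Phrase("PREP", "about"),
]
entity = textual_entity(parse, "Toys R Us")
```

Here `entity` holds one Subject triple and two Object triples, mirroring the three lines of the textual entity shown above.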
The cohesion chain for our text is illustrated in
Figure 1. The algorithm that generates cohesion
chains is:
Algorithm Cohesion-Chain-Builder
1. if (current NG belongs to a coreference chain)
     Create its textual entity TE and place it on the Chain
2. if (the antecedent is already in the chain)
     Place the coreference pointer between the two TEs
3. if (the coreference is not an appositive)
     Populate the lexico-semantic structure(TE)
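The three steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: steps 2 and 3 are taken to apply only to noun groups that passed the test in step 1, the names `TextualEntity`, `build_cohesion_chain` and `populate_lss` are hypothetical, and the coreference links and the appositive marking in the demo data are illustrative rather than the output of COCKTAIL.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TextualEntity:
    phrase: str
    lss: list = field(default_factory=list)      # lexico-semantic structure
    antecedent: Optional["TextualEntity"] = None  # coreference pointer


def build_cohesion_chain(noun_groups, antecedent_of, appositives, populate_lss):
    """Walk the noun groups in text order and build the cohesion chain."""
    # Every phrase that appears in some coreference link belongs to a chain.
    mentions = set(antecedent_of) | set(antecedent_of.values())
    chain, placed = [], {}
    for ng in noun_groups:
        if ng not in mentions:
            continue
        te = TextualEntity(ng)             # step 1: create the TE, place it
        chain.append(te)
        placed[ng] = te
        ante = antecedent_of.get(ng)
        if ante in placed:                 # step 2: antecedent already placed
            te.antecedent = placed[ante]
        if ng not in appositives:          # step 3: skip appositives
            te.lss = populate_lss(te)
    return chain


# Illustrative data (hypothetical links, not system output).
noun_groups = ["Toys R Us", "Michael Goldstein", "chief executive officer",
               "speculation", "the toy retailer"]
antecedent_of = {"chief executive officer": "Michael Goldstein",
                 "the toy retailer": "Toys R Us"}
appositives = {"chief executive officer"}
chain = build_cohesion_chain(noun_groups, antecedent_of, appositives,
                             lambda te: [te.phrase.lower()])
```

The `populate_lss` callback stands in for the derivation of the lexico-semantic structure, which is kept separate so that the chain-building logic stays independent of the knowledge base.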
The derivation of the lexico-semantic structure 
(LSS) follows the steps: 
1. for every relation r(w1,w2) from a TE
     if (there is s1 a sense of w1 and s2 a sense of
     w2 such that the same relation r'(w3,w4) is found
     in a gloss from the hierarchies of sense s1 of w1
     or sense s2 of w2)
       Add relation r' to LSS
2. for every word w in a TE
     if (there is a concept C in LSS such that there is a
     collocation \[w C\] in a gloss from the hierarchy(w))
       Add w to LSS
3. if (word w is already in LSS)
     Add new connection to w in LSS
For example, in the first TE illustrated in Figure 1,
we have the relation Object(name, CEO). We
find an Object relation also in the gloss of appoint,
the hypernym of sense 3 of verb name. The new Object
relation connects verb assume with the synset
{duty, responsibility, obligation}. A hypernym of
CEO is manager, collocating with position in the
gloss of managership. Noun position belongs to the
hierarchy of duty, thus the new Object relation can
be added to the LSS.
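The first step of the derivation can be illustrated with a small Python sketch. The two dictionaries are a hypothetical miniature stand-in for WordNet that encodes only the facts of the example above; word senses are ignored, and the check that position belongs to the hierarchy of duty is omitted, so this is a simplified illustration rather than the procedure as implemented.

```python
# Hypothetical miniature knowledge base standing in for WordNet.
# word -> direct hypernyms
HYPERNYMS = {"name": ["appoint"], "CEO": ["manager"]}
# word -> relations found in its gloss, as (relation, head, dependent)
GLOSS_RELATIONS = {"appoint": [("Object", "assume", "duty")]}


def hierarchy(word):
    """Return the word followed by all of its transitive hypernyms."""
    seen, stack = [], [word]
    while stack:
        w = stack.pop()
        if w not in seen:
            seen.append(w)
            stack.extend(HYPERNYMS.get(w, []))
    return seen


def derive_relations(te_relations):
    """Step 1: copy gloss relations of the same type into the LSS."""
    lss = []
    for rel, w1, w2 in te_relations:
        for w in hierarchy(w1) + hierarchy(w2):
            for gloss_rel in GLOSS_RELATIONS.get(w, []):
                # Keep a gloss relation only if it has the same type as r.
                if gloss_rel[0] == rel and gloss_rel not in lss:
                    lss.append(gloss_rel)
    return lss


lss = derive_relations([("Object", "name", "CEO")])
```

With this toy data, `lss` contains the single relation Object(assume, duty), which corresponds to the new Object relation discussed in the example.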
Figure 1: Cohesion chain 
4 Text Coherence 
We base our consideration of textual coherence on
the definitions introduced in (Hobbs, 1985). The
formal definition of relations that capture the coherence
between textual assertions is based on the relations
between the states they infer, their changes
and their logical connections. States, changes and
logical connections can be retrieved from pragmatic
knowledge, accessible in lexical knowledge bases like
WordNet. The complex structure of our cohesion
chains helps guide these inferences.
For each textual unit, defined from the parse of the
text, axiomatic knowledge is produced. The acquisition
of axiomatic knowledge is cued by the concepts
and relations from the LSS portion of the cohesion
chain, and is mined from WordNet. CICERO, our system,
adds to this knowledge axioms that feature the
characteristics of every coherence relation. CICERO's
job is to abduct the coherence structure of a text.
To do so, it follows the steps:
1. for every textual unit TUi
2.   Derive pragmatic knowledge for TUi
3. for every pair (TUi,TUj), i ≠ j
4.   for every coherence relation Rk
5.     hypothesize Rk(TUi,TUj)
6.     Perform abduction Rk(TUi,TUj)
7. Choose cheapest abduction
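The control flow of these seven steps can be sketched in Python. The sketch assumes that the cheapest abduction is chosen separately for each ordered pair of textual units; the function names `cheapest_coherence`, `derive_knowledge` and `abduction_cost` are hypothetical, and the cost values in the demo are invented for illustration and are not results produced by CICERO.

```python
from itertools import permutations


def cheapest_coherence(units, relations, derive_knowledge, abduction_cost):
    """Return, for each ordered pair of units, the cheapest coherence relation."""
    # Steps 1-2: derive pragmatic knowledge for every textual unit.
    knowledge = {u: derive_knowledge(u) for u in units}
    best = {}
    # Step 3: every ordered pair (TUi, TUj) with i != j.
    for tu_i, tu_j in permutations(units, 2):
        # Steps 4-6: hypothesize each relation and cost its abduction.
        costs = {r: abduction_cost(r, knowledge[tu_i], knowledge[tu_j])
                 for r in relations}
        # Step 7: keep the cheapest abduction for this pair.
        best[(tu_i, tu_j)] = min(costs, key=costs.get)
    return best


# Invented costs, for illustration only.
DEMO_COSTS = {
    ("Elaboration", "TUa", "TUb"): 2.0,
    ("Result", "TUb", "TUa"): 5.0,
}


def demo_cost(relation, k_i, k_j):
    # Any hypothesis not listed above gets a high default cost.
    return DEMO_COSTS.get((relation, k_i, k_j), 10.0)


graph = cheapest_coherence(["TUa", "TUb"], ["Elaboration", "Result"],
                           lambda unit: unit, demo_cost)
```

In a fuller version the cost function would run the abductive proof over the axioms derived for the two units; here it is reduced to a table lookup so that the selection logic can be read in isolation.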
For the text illustrated in Section 3, this procedure
generates the coherence graph illustrated in
Figure 2.
Figure 2: Coherence graph 
We exemplify the operation of CICERO on this text
by presenting the way it derives the Elaboration relation
between the textual unit from the first sentence
that announces the nomination of Michael Goldstein
(TUa) and the textual unit from the same sentence
that deals with the succession of Charles Lazarus
(TUb). First, CICERO generates the knowledge upon
which the abductions can be performed. This knowledge
is represented in axiomatic form, using the notation
proposed in (Hobbs et al., 1993) and previously
implemented in TACITUS. In this formalism
each text unit represents an event or a state, and thus
has a special variable e associated with it. Events
are lexicalized by verbs, which are mapped into predicates
verb(e,x,y), where x represents the subject of
the event and y represents its object (in the case of
intransitive verbs, y is not attached to the predicate,
whereas in the case of bitransitive verbs, y is mapped
into y1 and y2). Moreover, predicates from the text
are related to other predicates, derived from a knowledge
base. These relations are captured in first order
predicate calculus. For example, the pragmatic
knowledge used for the derivation of the Elaboration
relation between TUa and TUb is:
TUa:
assiqn( e~ , z~ )&positionz~ =~ 
vac'ant-position( el ) ~ aJsign( e~ , z l )& Positionz l 
TUb: . . I l~e(e~, zl, z2),t'l~rson(zl)&~jmon(z2) =~ 
I ~acanLposition(e~) 
\[ lea~'e(et , zh z~)k~rs(m(z~ )&position(z,)& 
\[ a~sume(e2, zs, z2)&person(zs) =~ 
In the next step, all coherence relations are hypothesized,
and the cost of their abduction is obtained.
The appendix lists the LISP function created
on the fly by CICERO that produces the abduction
of the Elaboration relation. Because of the
computational expense, an intermediary step simplifies
the axiomatic knowledge. The appendix also lists
the full abduction and its cost. CICERO is a system
still under development, and at present we have
not evaluated the precision of its results.
5 Conclusion 
We have introduced a new empirical method for
coreference resolution, implemented in the COCKTAIL
system. The results of this algorithm are used to
guide the abduction of coherence relations, as performed
in our CICERO system. In an intermediary
step, a rich cohesion structure is produced. This
novel relation between coreference and coherence
contrasts with the traditional view that coreference
is a by-product of coherence resolution. Moreover,
we reiterate the belief that coherence builds up from
cohesion.

References 
Chinatsu Aone and Scott W. Bennett. 1997. Evaluating
automated and manual acquisition of anaphora resolution
strategies. In Proceedings of the 35th Annual
Meeting of the Association for Computational Linguistics
(ACL-97), pages 122-129, Madrid, Spain.
Douglas E. Appelt, Jerry R. Hobbs, John Bear, David
Israel, Megumi Kameyama and Mabry Tyson. 1993.
The SRI MUC-5 JV-FASTUS Information Extraction
System. In Proceedings of the Fifth Message Understanding
Conference (MUC-5).
Nicholas Asher and Henri Wada. 1988. A computational
account of syntactic, semantic and discourse principles
for anaphora resolution. Journal of Semantics,
6:309-344.
Breck Baldwin. 1997. CogNIAC: high precision coreference
with limited knowledge and linguistic resources.
In Proceedings of the ACL'97/EACL'97 Workshop on
Operational factors in practical, robust anaphora
resolution, pages 38-45, Madrid, Spain.
Regina Barzilay and Michael Elhadad. 1997. Using Lexical
Chains for Text Summarization. In Proceedings of
the ACL'97/EACL'97 Workshop on Intelligent Scalable
Text Summarization, Madrid, Spain.
Susan E. Brennan, Marilyn Walker Friedman and Carl
J. Pollard. 1987. A centering approach to pronouns.
In Proceedings of the 25th Annual Meeting of the ACL
(ACL-87), pages 155-162.
Jaime Carbonell and Richard Brown. 1988. Anaphora
Resolution: A Multi-Strategy Approach. In Proceedings
of the 12th International Conference on Computational
Linguistics, pages 96-101.
Claire Cardie and Kiri Wagstaff. 1999. Noun phrase
coreference as clustering. In Proceedings of the Joint
Conference on Empirical Methods in NLP and Very
Large Corpora.
Niyu Ge, John Gale and Eugene Charniak. 1998.
Anaphora Resolution: A Multi-Strategy Approach. In
Proceedings of the 6th Workshop on Very Large Corpora,
(COLING/ACL '98).
Barbara J. Grosz, Aravind K. Joshi and Scott Weinstein.
1995. Centering: A Framework for Modeling the Local
Coherence of Discourse. Computational Linguistics,
21(2).
M.A.K. Halliday and R. Hassan. 1976. Cohesion in
English. Longman, London.
Sanda M. Harabagiu. 1998. Deriving metonymic coercions
from WordNet. In Proceedings of the Workshop
of the Usage of WordNet in Natural Language Processing
Systems, COLING-ACL '98, pages 142-148.
Sanda M. Harabagiu and Dan I. Moldovan. 1998. A Parallel
System for Text Inference Using Marker Propagations.
IEEE Transactions on Parallel and Distributed Systems, 9(8):729-747.
Sanda M. Harabagiu. 1999. From Lexical Cohesion to
Textual Coherence: A Data Driven Perspective.
International Journal of Pattern Recognition and Artificial
Intelligence, 13(2):1-18.
Marti A. Hearst. 1997. TextTiling: Segmenting Text
into Multi-paragraph Subtopic Passages. Computational
Linguistics, 23(1):33-64.
Lynette Hirschman and Nancy Chinchor. 1997. MUC-7
Coreference Task Definition.
Lynette Hirschman, Patricia Robinson, John Burger and
Marc Vilain. 1998. The role of Annotated Training
Data.
Graeme Hirst and David St-Onge. 1998. Lexical Chains
as Representations of Context for the Detection and
Correction of Malapropism. In WordNet - An Electronic
Lexical Database, Edited by Christiane Fellbaum,
MIT Press.
Jerry R. Hobbs. Resolving pronoun references. Lingua,
44:311-338.
Jerry R. Hobbs. 1979. Coherence and coreference.
Cognitive Science, 3(1):67-90.
Jerry R. Hobbs. 1985. On the coherence and structure
of discourse. Technical Report CSLI-85-37, Stanford
University.
Jerry R. Hobbs, Mark Stickel, Doug E. Appelt, and Paul
Martin. 1993. Interpretation as abduction. Artificial
Intelligence, 63:69-142.
Shalom Lappin and Herbert Leass. 1994. An algorithm
for pronominal anaphora resolution. Computational
Linguistics, 20(4):535-662.
Megumi Kameyama. 1997. Recognizing Referential
Links: An Information Extraction Perspective. In
Proceedings of the Workshop on Operational Factors
in Practical, Robust Anaphora Resolution for Unrestricted
Texts, (ACL-97/EACL-97), pages 46-53, Madrid, Spain.
Andrew Kehler. 1997. Probabilistic Coreference in Information
Extraction. In Proceedings of the Second
Conference on Empirical Methods in Natural Language
Processing (SIGDAT), pages 163-173.
Christopher Kennedy and Branimir Boguraev. 1996.
Anaphora for everyone: Pronominal anaphora resolution
without a parser. In Proceedings of the 16th
International Conference on Computational Linguistics
(COLING-96).
William C. Mann and Sandra A. Thompson. 1988.
Rhetorical Structure Theory: Toward a functional
theory of text organization. Text, 8:243-281.
M. Marcus, B. Santorini and M.A. Marcinkiewicz.
1993. Building a large annotated corpus of English:
The Penn Treebank. Computational Linguistics, 19(2):313-330.
Joseph F. McCarthy and Wendy Lehnert. 1995. Using
decision trees for coreference resolution. In Proceedings
of the 14th International Joint Conference on
Artificial Intelligence (IJCAI-95), pages 1050-1055.
Kathy McKeown. 1985. Discourse strategies for generating
natural-language text. Artificial Intelligence,
27:1-41.
George A. Miller. 1995. WordNet: A Lexical Database.
Communication of the ACM, 38(11):39-41.
Ruslan Mitkov. 1998. Robust pronoun resolution
with limited knowledge. In Proceedings of COLING-ACL'98,
pages 869-875.
Jane Morris and Graeme Hirst. 1991. Lexical cohesion
computed by thesaural relations as an indicator of the
structure of text. Computational Linguistics,
17:21-48.
Massimo Poesio and Renata Vieira. 1998. A corpus-based
investigation of definite description use.
Computational Linguistics, 24(2):183-216.
Philip Resnik. 1995. Using information content to evaluate
semantic similarity in a taxonomy. In Proceedings
of the 14th International Joint Conference on Artificial
Intelligence (IJCAI-95), pages 448-453.
Elaine Rich and Susan Luperfoy. 1988. An architecture
for anaphora resolution. In Proceedings of the 26th
Annual Meeting of the Association for Computational
Linguistics (ACL-88), pages 18-24.
Bonnie Webber. Discourse deixis: Reference to discourse
segments. In Proceedings of the 26th Annual Meeting
of the Association for Computational Linguistics
(ACL-88), pages 113-121.
Robert Wilensky. 1987. Understanding Goal-Based Stories.
PhD thesis, Yale University, New Haven, CT.
