Anaphora Resolution in Multi-Person Dialogues
Prateek Jain and Manav Ratan Mital and Sumit Kumar and Amitabha Mukerjee and Achla M. Raina
Indian Institute of Technology Kanpur,
Kanpur 208016 INDIA
{pjain,sumit,manavrm,amit,achla}@iitk.ac.in
Abstract
Anaphora resolution for dialogues is a difficult
problem because of the several kinds of com-
plex anaphoric references generally present in
dialogic discourses. It is nevertheless a criti-
cal first step in the processing of any such dis-
course. In this paper, we describe a system for
anaphora resolution in multi-person dialogues.
This system aims to bring together a wide array
syntactic, semantic and world knowledge based
techniques used for anaphora resolution. In
this system, the performance of the heuristics is
optimized for specific dialogues using genetic
algorithms, which relieves the programmer of
hand-crafting the weights of these heuristics. In
our system, we propose a new technique based
on the use of anaphora chains to enable reso-
lution of a large variety of anaphors, including
plural anaphora and cataphora.
1 Introduction
Anaphoric references abound in natural language dis-
courses and their resolution has often been identified as
the first step towards any serious discourse processing re-
lated tasks. However, any comprehensive anaphora reso-
lution scheme is expected to entail the use of rich seman-
tic and pragmatic knowledge representation and process-
ing, and is, therefore, a complex problem. As a result of
such problems, several heuristics-based approaches have
been developed and adopted over the years to achieve par-
tial solutions to the problem.
The pioneering work in the area of anaphora resolu-
tion was done by Hobbs (Jerry R. Hobbs, 1978) who
designed several early syntactic and semantic heuristics
for the same. (Hirst, 1981) discusses several early ap-
proaches to anaphora resolution in discourses. (Denber,
1998) and (Lappin and Leass, 1994) describe several syn-
tactic heuristics for reflexive, reciprocal and pleonastic
anaphora, among others. Often domain-specific heuris-
tics are used for anaphora resolution and fine tuned to
perform well on a limited corpus, such as in (Mitkov,
1998). (Ng and Cardie, 2002) proposes a machine learn-
ing approach to Anaphora Resolution but generally sta-
tistical learning approaches suffer from the problems of
small corpuses and corpus dependent learning. A more
general and comprehensive overview of state-of-the-art
in anaphora resolution is given in (Mitkov, 1999) and also
in (Mitkov et al., 2001).
Few systems have been developed that are specifically
aimed at the task of anaphora resolution in discourses.
ROSANA, an algorithm for anaphora resolution that fo-
cuses on robustness against information deficiency in the
parsed output, is described in (Stuckardt, 2001). MARS,
the Mitkov Anaphora Resolution System, is another au-
tomatic, knowledge-poor anaphora resolution system that
has been implemented for several languages including
English, Bulgarian and Japanese.
In this paper, we describe the design and implementa-
tion of Jepthah1, a rule-based system for resolving a wide
variety of anaphora occurring in multi-person dialogues
in English. In this system, we integrate several different
knowledge-poor constraints and heuristics, and operate
them over a naive character model of the entire dialogue
to perform effective anaphora resolution. In addition to
using standard heuristics, we have developed our own se-
mantic and pragmatic heuristics, specific to dialogue sit-
uations, that operate on this character model. There is
a weight assigned to each of these heuristics and these
weights are fine-tuned using a learning mechanism im-
plemented by genetic algorithms. We use the linguistic
feature of anaphoric chains, present in dialogues, to re-
solve a relatively large class of anaphora.
1name of a wise Isreali judge in the Bible
2 Jepthah
In Jepthah, we adopt an integrated approach towards re-
solving various different kinds of anaphors occurring in
dialogue situations. In this approach we fuse together
several heuristics with a new kind of computational lin-
guistic insight – that of the deployment of anaphora
chains and we develop a graph-based technique for han-
dling the resolution of various anaphors. An anaphora
chain may be described as a referential chain compris-
ing series of mutually co-referential anaphoric elements,
generally of more than one type, headed by a referential
element.
The class of anaphors that we aim to resolve is
fairly large and includes pronouns, reflexives and deic-
tic anaphors. In terms of distribution, we are dealing with
anaphors in subject, object and modifier positions, pos-
sessive reflexive, and cataphora. It is may be mentioned
here that we deal only with unambiguous cases of plural
pronouns, such as both of us, two of you, etc. These are
the cases in which the domain of the pronouns is clearly
quantified, unlike the case of such instances as all of us
or they, etc.
2.1 Graph-theoretic Approach
The entire operation is centered around a graph formu-
lation of the resolution problem in the perspective of the
dialogue. We extract all the nouns and pronouns present
in the dialogue. Assume there are n nouns and p pro-
nouns in the dialogue. Let the ith noun be represented as
Ni, with i ≤ n and that Pi represents the ith pronoun,
with i ≤ p. Now, we construct a graph representation for
the problem as follows. Let G be the graph that we are
interested in formulating, comprising of a node for every
Ni and Pj.Let NGi be the node corresponding to the noun
Ni and PGj be the node corresponding to the pronoun Pj.
Thus, we can split the set of vertices of this graph VG into
two parts, the set consisting of NGi , ∀i ≤ n and the set
consisting of PGj , ∀j ≤ p. The set of edges EG for this
graph G comprises of two types of directed edges and is
constructed as follows. Construct a set of edges E1 which
includes a directed edge Ei→j from PGi to NGj , for all
pairs PGi and NGj . The other set E2 includes a directed
edge Eprimei→j from PGi to PGj for all pair of nodes PGi and
PGj such that i negationslash= j. Clearly, we have EG = E1 ∪E2. Let
us define a property L on the paths in this graph as fol-
lows – a path p satisfies the property L iff it consists of a
sequence of edges Ei ∈ EG (i ≤ length(p)) with exactly
one edge Ef from the set E1 and the condition that this is
the last edge in the sequence, i.e., Elength(p) ≡ Ef.
Intuitively, this graph represents the set of possible
anaphor-antecedent relationships. The set of possible ref-
erents of an anaphor represented by the node PGi in the
graph G consists of all possible distinct nodes NGk that
can be reached from PGi using paths that satisfy the prop-
erty L. Let this set be represented as Si. Note here
that paths as above of length ≥ 2 represent anaphoric
chains present in the dialogue. One or more edges in
these paths are from one anaphor to another and represent
co-reference amongst these anaphors. The antecedent
space of an anaphor Pi consists of all nouns and pronouns
whose corresponding nodes in the graph G are reachable
from PGi by traversing a single edge belonging to EG.
Now, the idea here is to process this antecedent space and
rank all the nodes in Si to determine the most likely an-
tecedent for the anaphor Pi. This ranking is done by at-
taching weights to the edges present in the graph.
Every edge is awarded a particular weight (less than
1.0), that is evaluated for every edge using a set of heuris-
tics described in section 2.4. The rank of each node NGk
in the set Si is determined by the total weight Wik for that
node. Wik is computed as follows – let the weight Wp of
each path p be defined as the product of the weights of
all the edges lying on that path. Then, Wik is the sum of
the weights of all the paths from PGi to NGk , i.e.,summationtextp Wp.
Hence, for anaphora resolution, we need to basically de-
sign an algorithm or a function to compute the weight for
each edge in the graph.
2.2 System Design
The input dialogue is passed to the front end which com-
prises of the Stanford Serialized Parser and PoS tagger.
The parser gives the best parse for every input sentence,
each of which are then subsequently processed. In the
first step we extract all the proper nouns present in the
dialogue and initialize our character model base and the
graph G that was explained in section 2.1. We then
take the sequence of parses corresponding to each sub-
sequent dialogue by a speaker and process them sequen-
tially. Techniques for anaphora resolution are then ap-
plied in two phases. In the first phase, a set of constraints
is applied to this graph, to prune out edges that represent
any unfeasible co-references. In the second phase, a set
of heuristics are applied to award weights to edges repre-
senting these relationships. After the processing is over
and all weights have been obtained, the permissible an-
tecedents for each anaphor are ranked and the most likely
antecedent for each is outputted. In case there is a plu-
ral anaphor, with quantification over x nouns, the top x
likely antecedents are outputted.
While processing the dialogue, a naive character build-
ing is implemented. This is done mainly by focusing on
the verbs in the sentences. The subject and object nouns
associated with these verbs are selected and their relation-
ship is put in the character model base associated with the
speaker of the corresponding dialogue. The system main-
tains an apriori knowledge base with it containing infor-
mation like ontology and functionalities of several nouns.
This combination of apriori and assimilated knowledge
is then used to apply certain semantic and pragmatic con-
straints/heuristics on the graph, as shown in the following
sections.
2.3 Constraints
We apply the set of restrictions prior to the set of prefer-
ences, thereby narrowing down the candidate set as early
as possible. The list of constraints that implement these
restrictions in Jepthah are listed as follows –
1. Deictic Constraint: This is a set of simple con-
straints that are specific to dialogue settings because
in such settings we can have the concept of frames
of reference with regard to the various speakers in-
volved in the dialogue action.
2. Non-coreference (Mitkov, 1999): Syntactic fea-
tures present in a sentence often lend themselves
to be expressed as constraints on anaphora refer-
ence. These features are captured by our non-
coreference constraints which stipulate that certain
pairs of anaphor and noun phrases within the same
sentence cannot refer to the same antecedent.
3. Gender, Number and Person Agreement: This is
a low level constraint which requires that anaphors
and their antecedents must agree in gender, number
and person respectively.
4. Constraint on Reflexive Pronoun: A reflexive pro-
noun such as himself, herself, etc must refer to the
subject or the object of the verb in whose clause it
lies. In case of ellipsis, however, it may refer to the
subject or object of the next higher verb to which the
clause is attached.
5. Semantic Consistency (Mitkov, 1999): This con-
straint enforces same semantics of the antecedent as
the anaphor under consideration.
2.4 Heuristics
Each preference or heuristic, has a certain weight and
awards certain points to every anaphor-antecedent rela-
tionship. These points are a measure of the likelihood of
that anaphor-antecedent relationship. The weight of an
edge is the sum total of the weights awarded by each in-
dividual heuristic to the anaphor-antecedent relationship.
The heuristics used in our system are enumerated as fol-
lows –
1. Definiteness (Lappin and Leass, 1994): Accord-
ing to this heuristic, nouns that are preceded by a
demonstrative pronoun or a definite article are more
likely to be antecedents and are awarded higher
credibilities.
2. Non-prepositional NP (Lappin and Leass, 1994):
This heuristic states that a noun phrase which occurs
within a prepositional phrase is less probable to be
an antecedent to an anaphor and consequently, it is
awarded less credibility.
3. Pleonastic (Lappin and Leass, 1994): This heuris-
tic is based on the observation that there exist some
syntactic patterns such that every it anaphor occur-
ring in any of those patterns must be pleonastic.
4. Syntactic Parallelism (Lappin and Leass, 1994):
As per this heuristic, preference is given to noun
phrases with the same syntactic function as the
anaphor.
5. Recency (Mitkov, 1999): This is a very simple
heuristic according to which, everything else being
comparable, a higher credibility is awarded to the
antecedent nearer to the anaphor.
6. Semantic Parallelism (Lappin and Leass, 1994):
This heuristic gives preference to those noun phrases
which have the same semantic role as the anaphor
in question. This is a useful heuristic and can be
implemented by a system that can identify semantic
roles.
7. Pragmatic Heuristics: We use certain pragmatic
heuristics that we have identified to be very spe-
cific to dialogue settings. These are of the following
kinds
• If one speaker asks a question, then the next
speaker is likely to be the antecedent of the you
that may occur in the former’s sentence.
• If a speaker makes an exclamation then he is
likely to be the antecedent of the you in the
speech of the speaker just before him.
8. Naive Character Building: This refers to a naive
character model that we have used to implement a
restricted knowledge-based representation of the di-
alogue, woven around all the noun entities that are
present in the dialogue. To this end, we use a certain
amount of world knowledge that is present apriori
with the system, in the form of ontology and func-
tionality of possible noun entities. For instance, we
associate actions with each character based on their
subject object relationship with the verbs that occur
in the dialogues. Now for an anaphor we see if a
possible antecedent has functionality of the action
associated with the anaphor, implied by the verb of
the sentence. if it is so, we then give higher credibil-
ity to this particular antecedent.
Table 1: Results
Corpus % Accuracy
Shaw’s play - Pygmalion 62
Shaw’s play - Man and Superman 67
Hand-Crafted Dialogue I 83
Hand-Crafted Dialogue II 81
2.5 Learning approach
In most systems ((Mitkov, 1998),(Lappin and Leass,
1994)) the weights that are assigned for different
anaphor-antecedent relationships are programmer depen-
dent. Fixing these values in a adhoc fashion can clearly
give rise to unstable behaviour. In our work, we use
manually tagged corpora to evaluate the effectiveness
of a given weight assignment; these can then be tuned
using Genetic Algorithms(Goldberg, 1989). We use 2-
point crossover and mutation which are used in Standard
Genetic Algorithm for Real Variables(Deb and Kumar,
1995).
3 Results
We used our system for anaphora resolution in the fol-
lowing types of dialogue corpora:
• Dialogues written manually, woven broadly in a stu-
dent environment
• Fragments from the plays by the writer G. B. Shaw
Our System gave nearly 65% accuracy on Shaw’s plays
and almost 80% accuracy on our own “hand crafted” dia-
logues [Table:1]. In the table, the name “hand-crafted di-
alogues” refers to sample dialogues that the authors wrote
themselves to test the performance of the system.
The genetic algorithms that we use help in fine-tuning
weights according to the particular corpus, and show ap-
preciable increase in accuracy.
4 Conclusions
We have implemented an automatic, knowledge-based
anaphora resolution system that works for dialogic dis-
courses. The lack of availability of any standard corpora
(Mitkov, 1999) is a major drawback in case of anaphora
resolution systems in general and those for dialogues in
particular. The original contribution of this system is
mainly two-fold. First, the anaphora resolution system
that we have implemented uses an innovative graph tech-
nique, based on the idea of anaphora chaining, that makes
it possible to resolve such references as cataphora and
plural anaphora. Secondly, we give an algorithm which
uses naive character building to apply various semantic
and world-knowledge based heuristics to the process of
anaphora resolution. The results obtained from the sys-
tem indicate a fairly high accuracy, though an extensive
evaluation of the various resolution algorithms as well as
the system as a whole remains to be done.

References
K. Deb and A. Kumar. 1995. Real-coded genetic al-
gorithms with simulated binary crossover: Studies on
multimodal and multiobjective problems. Complex
Systems, 9(6):431–454.
M. Denber. 1998. Automatic resolution of anaphora in
english. Technical report, Eastman Kodak Co., Imag-
ing Science Division.
D. E. Goldberg. 1989. Genetic Algorithms in Search,
Optimization, and Machine Learning. Addison-
Wesley Publishing Company, Reading, MA.
Graeme Hirst. 1981. Discourse-oriented anaphora
resolution in natural language understanding: A re-
view”. American Journal of Computational Linguis-
tics, 7(2):85–98, April-June.
Jerry R. Hobbs. 1978. Resolving pronoun references.
Lingua, 44:311–338.
Shalom Lappin and Herbert J. Leass. 1994. An algo-
rithm for pronominal anaphora resolution. Computa-
tional Linguistics, 20(4):535–561.
Ruslan Mitkov, Branimir Boguraev, and Shalom Lappin.
2001. An Introduction to the Special Issue on Com-
putational Anaphora Resolution. Computational Lin-
guistics, 27(4).
Ruslan Mitkov. 1998. Robust pronoun resolution with
limited knowledge. In COLING-ACL, pages 869–875.
R. Mitkov. 1999. Anaphora Resolution: The State
of the Art. Working paper (Based on the COL-
ING’98/ACL’98 tutorial on anaphora resolution).
Vincent Ng and Claire Cardie. 2002. Combining sample
selection and error-driven pruning for machine learn-
ing of coreference rules. In Proceedings of the 2002
Conference on Empirical Methods in Natural Lan-
guage Processing, Association for Computational Lin-
guistics.
Roland Stuckardt. 2001. Design and Enhanced Evalua-
tion of a Robust Anaphor Resolution Algorithm. Com-
putational Linguistics, 27(4):479–506, December.
