EXTENDING DRT WITH A FOCUSING MECHANISM FOR PRONOMINAL ANAPHORA AND 
ELLIPSIS RESOLUTION 
José Abraços, José Gabriel Lopes - (jea,gpl)@fct.unl.pt 
CRIA/UNINOVA, Faculdade de Ciências e Tecnologia, 2825 Monte da Caparica, Portugal 
ABSTRACT 
Cormack (1992) proposed a framework for 
pronominal anaphora resolution. Her proposal integrates 
focusing theory (Sidner et al.) and DRT (Kamp and 
Reyle). We analyzed this methodology and adjusted it to 
the processing of Portuguese texts. The scope of the 
framework was widened to cover sentences containing 
restrictive relative clauses and subject ellipsis. Tests were 
designed and applied to probe the adequacy of the proposed 
modifications when processing real-world 
texts. 
1. INTRODUCTION 
Pronominal anaphora resolution, as part of a more 
general process of anaphora resolution, is a decisive 
step in constructing a semantic representation of a text. 
Although "general cognitive processes DO play a role in 
establishing anaphoric dependencies (...)" (Kempson, 
1990 p.14), inference is, in computational terms, a very 
expensive process, both for the amount of processing 
involved and for the extension of the knowledge bases 
required. Therefore, any system aiming at efficiency in 
anaphora resolution should minimize the role of 
inference. 
As far as DRT is concerned, the construction rule for 
pronouns states that the referent introduced by the 
pronoun should be bound to a suitable referent, chosen 
among those that are accessible (Kamp and Reyle, 1993 
p.122). Accessibility is based on semantic 
constraints and is expressed by the structure of the DRS 
representing the text. However, the suitability of referents 
is ill-defined. 
Another perspective for anaphora resolution is 
founded on the principle of relevance, i.e. on "the 
presumption that every utterance is selected to convey 
the intended interpretation while imposing to the hearer 
the least amount of processing effort in constructing that 
interpretation" (Kempson 1990 p.17). Focusing/ 
centering theories (Grosz; Sidner; Brennan, Friedman and 
Pollard et al.) can be considered as having this 
perspective. They try to keep track of the focus of 
attention along the text and bind pronouns preferentially 
to focused entities. The choice of antecedents is based on 
pragmatic constraints, which impose a preference 
ordering on antecedent candidates. 
Cormack proposes the integration of focusing and 
DRT, "(...) adding semantic constraints to a model of 
attention in discourse" (Cormack, 1992 p.5). This 
integration compensates for two shortcomings of DRT: 
it considers too many possibilities for anaphoric binding 
and doesn't provide an ordering between antecedent 
candidates. From the focusing point of view, the addition 
of semantic constraints, provided by DRT, to the 
pragmatic ordering further restricts the determination of 
possible antecedents. 
We analyzed Cormack's proposal and found that 
it lacked some features that we consider 
necessary, as will be shown in the next few sections. 
Therefore we adapted it, and applied the modified version 
to the processing of texts written in Portuguese. The 
scope of those methods was widened to cover sentences 
containing restrictive relative clauses and subject ellipsis. 
Tests were designed and applied to probe the adequacy of 
the proposed modifications when processing 
real-world texts. 
2. SIMPLE SENTENCES 
2.1. Alterations to DRT 
Cormack argues that pronouns of the current 
sentence can only have access to two groups of referents: 
focused referents and those unfocused ones that were 
introduced by the preceding sentence. Referents not 
fitting either of these two groups can be forgotten. Let us 
look at an example (Cormack, 1992 p.350): 
(1a) John took apart the chest of drawers. 
(1b) It was full of clothes pegs. 
The DRS representing the first sentence will be (focused 
referents are shown on the left, unfocused ones on the 
right): 
<> | <j, c> 
John(j) 
chest_of_drawers(c) 
took_apart(j,c) 
The second sentence introduces another DRS. Anaphors 
are resolved with referents of the previous DRS, 
<> | <j, c>              <c> | <P> 
John(j)                  clothes_pegs(P) 
chest_of_drawers(c)      full_of(c,P) 
took_apart(j,c) 
and then the previous DRS can be "forgotten": 
<c> | <P> 
clothes_pegs(P) 
full_of(c,P) 
Referent John, which was introduced by (1a), was only 
available for anaphor resolution in (1b). Since it was 
never focused, it is "forgotten". This means that it is no 
longer included in the referents of the DRS representing 
the text after processing of the second sentence, 
becoming unavailable as an antecedent candidate for 
pronouns in following sentences. This claim may seem a 
little strange if we look at (1c) as an acceptable third 
sentence: 
(1c) He didn't like their color. 
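The "forgetting" step discussed here can be illustrated with a minimal sketch. It assumes a DRS reduced to plain sets of referent names; the function name `still_available` and this set-based encoding are hypothetical simplifications, not part of Cormack's formalism.

```python
def still_available(focused, introduced_by_previous):
    """Referents a pronoun may access under the scheme described above:
    focused referents plus those introduced by the preceding sentence."""
    return set(focused) | set(introduced_by_previous)


# Before (1b): nothing is focused; (1a) introduced j (John) and c (chest).
for_1b = still_available(set(), {"j", "c"})

# Before (1c): c is focused; (1b) introduced P (clothes pegs).
for_1c = still_available({"c"}, {"P"})

print(sorted(for_1b))  # ['c', 'j']
print(sorted(for_1c))  # ['P', 'c']
```

Under these assumptions, j is no longer available when (1c) is processed, so the pronoun "He" would be left without its intended antecedent.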
Two other aspects of Cormack's representation led us 
to keep to the original DRT formalism. First, 
Cormack's representation is too conditioned by 
pronominal anaphora resolution. Referents that become 
unavailable for pronominal reference, and are therefore 
"forgotten", may still be cospecified by definite 
descriptions. Eliminating them from the representation 
would limit the possibilities of expanding the 
system in the future. Second, "forgetting" conditions 
introduced by previous sentences leads to a situation 
where the DRS representing the text at a given moment 
contains little information about the text, and no 
information at all about some of the "surviving" 
referents. For instance, looking at the last DRS 
presented, we no longer know what entity introduced 
referent c.¹ 
2.2. Focusing algorithms 
Most focusing theories keep referents that can be 
relevant in future anaphora resolution in focus stores. 
Sidner considers two groups of focus stores, which in a 
very short and simplistic way can be described as: 
those related to the agent (AG) role: 
  actor focus (AF) - AG of current sentence or previous 
  AF, if current sentence has no AG; 
  potential actor focus list (PAFL) - other animate 
  referents of current sentence; 
  actor focus stack (AFS) - previous AFs; 
those related to other thematic roles: 
  discourse focus (DF) - 
  • DF of previous sentence, if referred to with a pronoun in 
    current sentence; 
  • referent of the highest ranking pronoun² in current 
    sentence; 
  • theme, in discourse initial sentences; 
  potential discourse focus list (PDFL) - referents of 
  current sentence excluding DF; 
  discourse focus stack (DFS) - previous DFs. 
¹ We can, of course, overcome this limitation by creating a 
text knowledge base where all the restrictions upon 
referents are present. 
² See (Sidner, 1979), (Cormack, 1992) for details about this 
ranking. 
In determining the antecedent of a pronoun, 
algorithms go through some preliminary considerations 
(such as recency rule) and a basic ordering of focus 
stores. 
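The focus stores listed above can be sketched as a small data structure with an update step. This is only an illustration under stated assumptions: the class `FocusStores`, the function `update` and its parameters are hypothetical names; an empty previous DF is taken to mark a discourse-initial sentence; and the DF is simply retained when no pronoun is present, a case the description above leaves open.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FocusStores:
    af: Optional[str] = None                        # actor focus
    pafl: List[str] = field(default_factory=list)   # potential actor focus list
    afs: List[str] = field(default_factory=list)    # actor focus stack
    df: Optional[str] = None                        # discourse focus
    pdfl: List[str] = field(default_factory=list)   # potential discourse focus list
    dfs: List[str] = field(default_factory=list)    # discourse focus stack


def update(prev, agent, animate, referents, pronoun_referents, theme):
    """Compute the stores after one sentence.

    pronoun_referents lists the referents cospecified by pronouns of the
    current sentence, highest ranking first (the ranking is not modelled).
    """
    new = FocusStores(afs=list(prev.afs), dfs=list(prev.dfs))
    # AF: agent of the current sentence, or the previous AF if there is none.
    new.af = agent if agent is not None else prev.af
    if prev.af is not None and new.af != prev.af:
        new.afs.insert(0, prev.af)
    new.pafl = [r for r in animate if r != new.af]
    # DF: theme (initial sentence), confirmed previous DF, or top pronoun.
    if prev.df is None:
        new.df = theme                    # assumption: discourse-initial
    elif prev.df in pronoun_referents:
        new.df = prev.df
    elif pronoun_referents:
        new.df = pronoun_referents[0]
    else:
        new.df = prev.df                  # assumption: DF kept unchanged
    if prev.df is not None and new.df != prev.df:
        new.dfs.insert(0, prev.df)
    new.pdfl = [r for r in referents if r != new.df]
    return new


# "O João escreveu um livro." followed by "A Maria leu-o."
s1 = update(FocusStores(), "João", ["João"], ["João", "livro"], [], "livro")
s2 = update(s1, "Maria", ["Maria"], ["Maria", "livro"], ["livro"], "livro")
```

After the second call, the previous actor focus has moved to the actor focus stack, while the discourse focus is confirmed by the pronoun.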
AF - DF distinction 
Although taking Sidner's algorithms as a starting 
point, Cormack renounces the distinction between actor 
focus and discourse focus in the final part of her work. 
The algorithms become simpler but lose 
discriminatory power. This is particularly 
significant in a language like Portuguese, where 
nominals can only be masculine or feminine (not 
neuter). In a text like 
(2a) O João escreveu um livro. 
John wrote a book. (AF = John, DF = a book) 
(2b) A Maria leu-o. 
Mary read it. 
eliminating the distinction between AF and DF would 
lead to João (John) being proposed as preferred antecedent 
of the masculine pronoun o (it). Rejecting this binding 
would require an appeal to inference, which is something 
that we want to minimize. Keeping the AF - DF distinction 
will also be significant in dealing with another 
phenomenon very common in Portuguese: subject (SU) 
ellipsis. 
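The effect of keeping the two groups of stores apart can be sketched as follows. The sketch assumes, as a simplification, that a pronoun in AG position consults the actor stores first and any other pronoun consults the discourse stores first; the function `candidate_order` and the dictionary layout are hypothetical.

```python
def candidate_order(stores, position):
    """Order antecedent candidates according to the pronoun's position."""
    actor = [stores["AF"], *stores["PAFL"], *stores["AFS"]]
    discourse = [stores["DF"], *stores["PDFL"], *stores["DFS"]]
    ordered = actor + discourse if position == "AG" else discourse + actor
    result = []
    for ref in ordered:
        # Skip empty stores and referents already listed.
        if ref is not None and ref not in result:
            result.append(ref)
    return result


# Stores after (2a) "O João escreveu um livro."
after_2a = {"AF": "João", "PAFL": [], "AFS": [],
            "DF": "livro", "PDFL": ["João"], "DFS": []}

# The pronoun "o" in (2b) is not in AG position.
order_for_o = candidate_order(after_2a, "non-AG")
print(order_for_o)  # ['livro', 'João']
```

Since both João and livro are masculine, gender agreement alone could not separate them; it is the ordering that proposes the book first.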
Recency rule 
"if the pronoun under consideration occurs in the 
subject position, and there is an alternate focus list 
noun phrase which occurs as the last constituent in 
the previous sentence, test that alternate focus list 
phrase for co-specification before testing the current 
focus. (...)" (Sidner, 1979 p.144). 
Sidner admits that "the recency rule makes focussing 
seem somewhat ad hoc" (ibid.), Carter states that "its 
inclusion in SPAR led to considerable inaccuracy" 
(Carter, 1987 p.114) and Cormack decides to ignore it too 
(Cormack, 1992 p.54). However, it seems that, in 
Portuguese, this rule should be considered for pronouns 
in AG position: 
(3a) A Maria_i deu um livro à Ana_j. 
Mary_i gave Ann_j a book. 
If the agent of the next sentence is Mary there are two 
possibilities of pronominalization: the pronoun ela (she) 
or the null pronoun ∅ (SU ellipsis). This last option 
will be preferred: 
(3b) ∅_i comprara-o num leilão. 
∅_i had bought it at an auction. 
But if the agent of the next sentence is Ann, the only 
possibility of pronominalization will be the explicit 
pronoun ela (she): 
(3b') Ela_j leu-o. 
She_j read it. 
So the speaker will tend to use a null pronoun in AG 
position to cospecify the agent of the previous sentence, 
reserving the explicit pronoun for a use that conforms to 
the recency rule. 
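The contrast between null and explicit pronouns in AG position can be sketched as a choice of testing order. The function `ag_candidates` and its arguments are hypothetical, and only the two candidates discussed in example (3) are modelled.

```python
def ag_candidates(actor_focus, last_constituent, null_pronoun):
    """Testing order for a pronoun in AG position.

    A null pronoun tests the actor focus first; an explicit pronoun
    follows the recency rule and tests the last constituent of the
    previous sentence first.
    """
    if null_pronoun:
        ordered = [actor_focus, last_constituent]
    else:
        ordered = [last_constituent, actor_focus]
    result = []
    for ref in ordered:
        if ref is not None and ref not in result:
            result.append(ref)
    return result


# After (3a): AF = Maria, last constituent = Ana.
null_order = ag_candidates("Maria", "Ana", null_pronoun=True)
explicit_order = ag_candidates("Maria", "Ana", null_pronoun=False)
print(null_order)      # ['Maria', 'Ana']
print(explicit_order)  # ['Ana', 'Maria']
```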
Intrasentential anaphora 
Carter inserts intrasentential candidates (ISC) between 
current foci and potential foci, in the basic ordering. 
Cormack distinguishes between focused ISC and the 
remainder of ISC. In our implementation this distinction 
seemed unnecessary and we decided to insert ISC after 
potential foci, in the basic ordering. A special case of 
ISC is the reflexive pronoun se (himself/herself/itself/ 
themselves). We always bind it to the agent of the 
sentence. 
(4) O camelo_i deitou-se_i na areia. 
The camel_i laid (itself_i) down on the sand. 
Intrasentential cataphora 
In our implementation, syntactic parsing is done 
according to a grammar development formalism based on 
barriers, movement and binding (Lopes 1991). It is an 
extension of the extraposition grammar formalism 
(Pereira 1981) and allows for movement of constituents 
of a sentence in a restricted area delimited by barriers. 
The resulting syntactic tree will always show the internal 
arguments of the verb on its right, no matter what 
positions they had in the original sentence. For instance, 
the syntactic tree for 
(5) Near her, the blond girl saw a man. 
will be: 
[S [NP the blond girl] [VP saw [NP a man] [PP near her]]] 
The anaphora resolution process works on the results 
of the syntactic parser, so this kind of cataphora will be 
treated as intrasentential anaphora. 
Subject ellipsis 
As mentioned above, this is a very common 
phenomenon in the Portuguese language. A null pronoun in 
AG position seems to behave differently from one in 
non-AG position. In the first case it cospecifies AF or a 
combination of foci including AF: 
(6a) A Maria_i decidiu oferecer aquele perfume à Ana. 
Mary_i decided to offer Ann that perfume. AF = Mary 
(6b) ∅_i gostava muito dele. 
∅_i liked it very much. 
A null pronoun in non-AG position cospecifies DF or a 
combination of foci including DF: 
(7a) O João poisou o livro_i sobre o piano. 
John put the book_i on the piano. DF = the book 
(7b) ∅_i era grande e pesado. 
∅_i was big and heavy. 
Ratification procedure 
Both Sidner and Cormack leave all verifications of 
syntactic agreement and consistency with world 
knowledge to a ratification procedure, to be applied after 
completion of the focusing process. Efficiency can be 
improved if inexpensive number and gender agreement 
and reflexivity verifications are included in the focusing 
process. Thus, several inadequate candidates can be ruled 
out without a call to the ratification procedure. 
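A cheap filter of this kind might look like the sketch below. The `Referent` record, the feature codes and the function `cheap_filter` are hypothetical; the reflexivity check is reduced to one assumed rule, namely that a non-reflexive pronoun is not bound to the agent of its own clause.

```python
from typing import NamedTuple


class Referent(NamedTuple):
    name: str
    gender: str   # "m" or "f"
    number: str   # "sg" or "pl"


def cheap_filter(candidates, gender, number, clause_agent=None):
    """Drop candidates that fail agreement or the reflexivity check,
    keeping the order produced by the focusing rules."""
    kept = []
    for ref in candidates:
        if ref.gender != gender or ref.number != number:
            continue  # gender or number mismatch
        if clause_agent is not None and ref.name == clause_agent:
            continue  # assumed rule for non-reflexive pronouns
        kept.append(ref)
    return kept


# Candidates for the pronoun "o" (masculine singular) in "Ela leu-o."
candidates = [
    Referent("Maria", "f", "sg"),
    Referent("Ana", "f", "sg"),
    Referent("livro", "m", "sg"),
]
remaining = cheap_filter(candidates, "m", "sg", clause_agent="Ana")
print([r.name for r in remaining])  # ['livro']
```

Only the surviving candidates would then be passed on to the more expensive ratification step.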
3. SENTENCES CONTAINING 
RESTRICTIVE RELATIVE CLAUSES 
Going beyond simple sentences, we widened the 
scope of the presented methods to include sentences with 
restrictive relative clauses (for short, we'll just use the 
form relative clauses in the remainder of this paper). 
Rules for focus movement and referent accessibility 
were formulated and tests were designed to probe their 
adequacy. In this section we refer to the results of a 
questionnaire answered by 40 college students. 
Focus movement 
(8a) O João leu um livro_i. 
John read a book_i. DF = a book 
(8b) O homem_j que o_i escreveu morreu. 
The man_j who wrote it_i died. 
(8c) Os eruditos enalteceram-no_(i ∨ j ?) muito. 
Erudite people praised him/it_(i ∨ j ?) much. 
According to focusing rules, the pronoun in (8c) 
cospecifies DF of (8b). If pronouns in relative clauses 
were able to influence focus then (8b) would confirm a 
book as DF and this would be the antecedent of the 
pronoun in (8c). That doesn't seem to be the case. The 
intuitively preferred antecedent is the man. Examples like 
this show that pronouns occurring within relative clauses 
don't seem to influence focus movement. This 
conclusion was confirmed by 83% of the answers to the 
above-mentioned questionnaire. 
Access of following sentences to relative 
clause referents 
Referents introduced by the relative clause are 
accessible but are not preferred to main clause referents. 
The questionnaire presented the text: 
(9) O homem a quem um ladrão roubou o relógio 
chamou a polícia. Ele ... 
The man whom a thief stole the watch from called 
the police. He ... 
58% of the continuations proposed bind the pronoun to 
the main clause referent the man while only 28% indicate 
binding with the relative clause referent a thief. 
Access of the relative clause to main clause 
referents 
(10) O João deu um livro_i ao aluno que o_i merecia. 
John gave a book_i to the student who deserved it_i. 
Pronouns in the relative clause can cospecify either main 
clause referents or focus stores. The first situation seems 
to be preferred except, perhaps, for pronouns in AG 
position, which show a weak preference (supported by 61% 
of the answers) for cospecification with AF or a member 
of PAFL. 
Access of the main clause to relative clause 
referents 
(11) O homem que escreveu um livro_i deu-o_i à Maria. 
The man who wrote a book_i gave it_i to Mary. 
Pronouns in the main clause, occurring after the relative 
clause, can cospecify its referents. Like Cormack, we 
consider access to focus stores to be more likely, but this 
preference was not confirmed by the results of the 
questionnaire (60% of the answers were against). 
Access of relative clause to relative clause 
(12) O homem que a Maria_i viu escreveu um livro que a_i 
impressionou. 
The man who was seen by Mary_i wrote a book that 
impressed her_i. 
Pronouns in the second relative clause can cospecify 
referents of the first one. However, it seems that main 
clause referents should be preferred as antecedents. The 
example used to test this preference was not very clear, 
and so we obtained 63% negative answers. 
Transitive access to a main clause 
(13) A Maria_i casou com o cliente que comprou o livro 
que ela_i escreveu. 
Mary_i married the client who bought the book that 
she_i wrote. 
Pronouns in a nested relative clause can cospecify main 
clause referents. Preference seems to be given to 
antecedent candidates of the main clause over those of the 
nesting relative clause, but this hypothesis was not 
tested. 
Transitive access to a relative clause 
(14) O cliente que comprou o livro que a empregada_i 
escreveu casou com ela_i. 
The client who bought the book that was written 
by the employee_i married her_i. 
Pronouns in the main clause can cospecify nested relative 
clause referents. Candidate antecedents occurring in the 
nesting relative clause seem to be preferred though. This 
preference is supported by 75% of the answers. 
Ordering antecedent candidates 
We can summarize this analysis in the following 
rules for predicting antecedents. These rules were 
implemented without significant changes to the 
algorithm established for simple sentences. 
Relative clause pronouns: 
  AG position: 
    null: main clause AG 
    not null: AF, PAFL, main clause refs., 
      remainder of focus stores 
  non-AG position: main clause refs., focus stores 
Main clause pronouns: 
  Preceding a relative clause: focus stores 
  Following a relative clause: idem excluding stacks, 
    relative clause refs., stacks 
Following sentence pronouns: main clause refs., relative 
  clause refs., stacks 
Nested relative clauses: They have transitive access to 
  main clause refs. Main clause pronouns prefer 
  nesting clause refs. to nested clause ones. 
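These ordering rules can be transcribed as a simple lookup, sketched below. The function `source_order` and its string labels are hypothetical; the sketch treats the first AG entry as the null-pronoun case, returns only the names of the candidate sources in order of preference, and does not model nested relative clauses.

```python
def source_order(clause, position="non-AG", null_pronoun=False):
    """Return the candidate sources, most preferred first, for a pronoun.

    clause is one of "relative", "main, before relative",
    "main, after relative" or "following sentence".
    """
    if clause == "relative":
        if position == "AG":
            if null_pronoun:
                return ["main clause AG"]
            return ["AF", "PAFL", "main clause refs",
                    "remainder of focus stores"]
        return ["main clause refs", "focus stores"]
    if clause == "main, before relative":
        return ["focus stores"]
    if clause == "main, after relative":
        return ["focus stores excluding stacks",
                "relative clause refs", "stacks"]
    if clause == "following sentence":
        return ["main clause refs", "relative clause refs", "stacks"]
    raise ValueError("unknown clause type: " + clause)


# The pronoun "o" in (11) occurs in the main clause after the relative clause.
order_11 = source_order("main, after relative")
# The pronoun "o" in (10) occurs in the relative clause, in non-AG position.
order_10 = source_order("relative", position="non-AG")
```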
Relative clauses as conditionals 
Both Kamp (1993 p.81) and Cormack (1992 p.347) 
propose a "flat" treatment of relative clauses. Both its 
referents (with the possible exception of proper names) 
and conditions are introduced in the current DRS. 
(15) Jones owns a book which Smith adores. 
(Kamp and Reyle, 1993 p.78-83) 
x y z 
Jones(x) 
book(y) 
Smith(z) 
z adores y 
x owns y 
(16) A man who owns a donkey pays. 
(Cormack, 1992 p.347) 
<> | <m, d> 
man(m) 
donkey(d) 
m owns d 
m pays 
According to Mateus et al. (1989 p.289) the interpretation 
conveyed by this kind of representation wouldn't be 
adequate to all kinds of relative clauses in Portuguese, 
namely those whose verb is in subjunctive mood. 
(17) Um agricultor que tenha um burro bate-lhe. 
A farmer who (subjunctive of own) a donkey beats 
it. 
This kind of sentence is associated with non-factual, 
hypothetical presuppositions and is semantically 
equivalent to an implication relation between two 
clauses: 
(17') Se um agricultor tem um burro então bate-lhe. 
If a farmer owns a donkey then he beats it. 
So, our implementation represents this kind of sentence 
as a conditional: 
x y                  z w 
farmer(x)      =>    z = x 
donkey(y)            w = y 
owns(x,y)            beats(z,w) 
Our rules for anaphora resolution will then be applied as 
usual, taking into consideration both focusing and 
semantic (DRT-determined) accessibility constraints. 
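The conditional representation and the accessibility constraint it induces can be sketched with nested boxes. The class `DRS`, the field `accessible_from` and the function `accessible` are hypothetical names; the sketch encodes only the standard DRT assumption that the consequent box sees the antecedent box, which in turn sees the enclosing DRS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DRS:
    referents: List[str]
    conditions: List[object] = field(default_factory=list)
    # Boxes whose referents are visible from this box.
    accessible_from: List["DRS"] = field(default_factory=list)


def accessible(drs):
    """Collect the referents a pronoun inside drs may be bound to."""
    seen = list(drs.referents)
    for outer in drs.accessible_from:
        for ref in accessible(outer):
            if ref not in seen:
                seen.append(ref)
    return seen


# (17') "Se um agricultor tem um burro então bate-lhe."
main = DRS([])
antecedent = DRS(["x", "y"],
                 ["farmer(x)", "donkey(y)", "owns(x,y)"],
                 [main])
consequent = DRS(["z", "w"],
                 ["z = x", "w = y", "beats(z,w)"],
                 [antecedent])
main.conditions.append(("=>", antecedent, consequent))

print(accessible(consequent))  # ['z', 'w', 'x', 'y']
print(accessible(main))        # []
```

Under these assumptions the farmer and the donkey are available to pronouns of the consequent, but not to pronouns of a later sentence added to the main DRS.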
4. TESTING 
Tests were conceived with the sole purpose of 
probing the adequacy of the proposed modifications. One of 
the tests, the questionnaire, has already been mentioned. 
It consisted of two parts. In the first one there were short 
texts (2-4 sentences) where some referents were 
introduced. The last sentence was always incomplete and 
contained a pronoun. The continuation proposed by the 
student was supposed to show which co-specification he 
had chosen. Since the evaluation of this part might be 
influenced by intuition, it was committed to 3 
independent evaluators, who were found to agree on 80% 
of the answers. The second part consisted of texts of the 
same kind, but where all sentences were complete. The 
student was asked to identify explicitly the 
co-specification of a pronoun introduced by the last 
sentence. The results concerning relative clauses were 
presented in the previous section. The recency rule and the 
rules for subject ellipsis were confirmed respectively by 
77% and 85% of the answers. 
The two other tests consisted of applying the rules 
for relative clauses to all anaphors found in real-world 
texts whose antecedent or anaphor was introduced 
by a relative clause. The first target text was a novel by a 
famous Portuguese writer of the end of the last century, Eça 
de Queiroz (1900). The news of a Portuguese news 
agency (Lusa, 1993) provided 637 kbytes of fresh 
(June 93) raw material for the last test. The rules 
performed correctly in respectively 96% and 92% of the 
cases. 
5. CONCLUSION 
We developed and implemented a mechanism for 
pronominal anaphora resolution, integrating focusing and 
DRT, and adjusted to Portuguese language processing. 
Modifications to other authors' proposals included 
recovering the AF - DF distinction and the recency rule, 
handling intrasentential anaphora, cataphora, subject 
ellipsis, restrictive relative clauses and, in particular, 
those containing subjunctives. 
Focusing mechanisms enabled the reduction and 
ordering of the set of possible antecedents for each 
anaphor. Final ratification or rejection of each suggested 
co-specification would require the use of world 
knowledge and reasoning. That was beyond the aim of 
this work. 
The analysis made for restrictive relative clauses 
should be extended to other constructions of 
subordination and coordination, in order to establish 
more general rules. We believe that many questions 
raised here might be relevant to the processing of other 
Romance languages. 
REFERENCES 
Brennan, S., M. Friedman and C. Pollard (1987). A 
centering approach to pronouns. In Proceedings of the 
25th Annual Meeting of ACL, Stanford University, 
California, p. 155-162. 
Carter, D. (1987). Interpreting anaphors in natural 
language texts, E. Horwood Limited Ed., Chichester, 
England. 
Cormack, S. (1992). Focus and Discourse 
Representation Theory, Ph.D. Thesis, University of 
Edinburgh. 
Grosz, B. and C. Sidner (1986). Attention, intentions, 
and the structure of discourse. Computational 
Linguistics, 12 (3), p. 175-204. 
Kamp, H. and U. Reyle (1993). From discourse to logic, 
Kluwer Academic Publishers. 
Kempson, R. (1990). Anaphora: a unitary account. In 
Proceedings of the Workshop on Anaphora, Ofir, 
Portugal, p. 1-36. 
Lopes, J. (1991). Movement, barriers and binding 
formalism for logic grammars development: parsers, 
Master Course Lectures on NLP, Universidade Nova 
de Lisboa (manuscript). 
Lusa (1993). Files containing news of Lusa news 
agency, from the 4th to the 11th June 93. 
Mateus, M. et al. (1989). Gramática da Língua 
Portuguesa. Caminho Ed., 3rd ed., Lisboa, Portugal. 
Pereira, F. (1981). Extraposition Grammars. American 
Journal of Computational Linguistics, 7(4), p. 
243-255. 
Queiroz, E. (1900). A ilustre casa de Ramires, Livros do 
Brasil Ed., Lisbon, Portugal. 
Sidner, C. (1979). Towards a computational theory of 
definite anaphora comprehension in English discourse, 
MIT Artificial Intelligence Laboratory. 
Sidner, C. (1986). Focusing in the comprehension of 
definite anaphora. In Readings in Natural Language 
Processing, Morgan Kaufmann Ed., p. 363-394. 
