Reference Hashed 
Frank Schilder 
Department for lnformatics 
University of Hamburg 
Vogt-K611n-Str. 30 
22527 Hamburg, Germany 
schi Ider@ in formatik, uni - hamburg, de 
Abstract 
This paper argues for a novel data structure for the 
representation of discourse referents. A so-called 
hashing list is employed to store discourse referents 
according to their grammatical features. The ac- 
count proposed combines insights from several the- 
odes of discourse comprehension. Segmented Dis- 
course Representation Theory (Asher, 1993) is en- 
riched by the ranking system developed in center- 
ing theory (Grosz et al., 1995). In addition, a tree 
logic is used to represent underspecification within 
the discourse structure (Schilder, 1998). 
1 Introduction 
Discourse referents are represented quite differently 
by current diso~urse theories. Discourse Represen- 
tation Theory (DRT), for example, employs a rather 
unstructured data structure for the domain of dis- 
course referents: a set (Kamp and Reyle, 1993). 
A DRT-implementation by Asherand Wada (1988), 
however, employs a more complex data type: a tree 
representation. In further work by Asher (i 993) ref- 
erents are grouped together into segments depend- 
ing on Ihe discourse structure. His Segmented DRT 
(SDRT) uses a tree-like representation for the dis- 
course sUuctui~ I 
Centering Theory (CD proposes a//st structure 
for the entities one preferably refers to in subse- 
quent sentences. In order to cover coreference over 
discourse segments the centering model was ex- 
tended by a stack mechanism (Grosz and Sidner, 
1986). Recently, these data structures have been 
criticized by Walker (1998), because they seem to 
be too restrictive. She proposes a cache storage for 
the referenis in the focus of attention. 
! propose instead a novel data structure for the 
representation of discourse referents. A/rushing list 
tSimilarly, Rhetorical Structure Theory (RST) makes use of 
a tree representation (Mann et aL,1988). 
100 
is used to distinguish between the different types 
of referents according to their grammatical features 
wrt. number or gender. This list structure is further- 
more combined with a hierarchical tree structure. 
The remaining pan of the paper is organised as 
follows. Section 2 introduces the main claims made 
by past theories. Focusing on SDRT and CT, I 
will highlight the (dis.-...) advantages of these two ap- 
proaches. Section 3 provides the reader with an 
introduction to hashing lists and how they can be 
used for linguistic dam. Section 4 discusses how 
the different advantages of former approaches can 
be combined. First, DRT will be amended by us- 
ing a hashing list for the discourse referents instead 
of a simple set. Second, the centering model will 
be applied to the representation gained. Finally, the 
shortcomings of a flat representation are presented 
and the introduction of discourse segments is dis- 
cussed. Subsequently, section 5 describes a detailed 
formalisation of one example sequences by the rep- 
resentation proposed, before section 6 concludes. 
2 Background 
It has been commonly agreed that a discourse is 
hierarchically organised. However. this is already 
the lowest common denominator among current ap- 
proaches to discourse grammars and text compre- 
hension. There is a wide range of views of what 
a formal representation of a discourse should look 
like. The following sections give a short introduc- 
tion to two suands of research concerned with dis- 
course processing. The first one, DRT and follow- 
ers, is a linguistically-oriented approach that. gen- 
erally speaking, captures the hierarchical structure 
of a discourse by a tree-like representation. The 
second strand, based on CT, is motivated by psy- 
chological experiments and models the structure of 
a discourse as a list representation of possible dis- 
course referents. Further developments of CT have 
employed a stack structure or a cache storage. 
0 
t 
0 
0 
0 
0 
0 
0 
0 Q 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
2.1 Hierarchical discourse grammars 
SDRT (and RST) assume that so-called discourse 
(or rhetorical) relations are the links between dis- 
course segments. A discourse relation has to be de- 
rived in order to achieve a coherent discourse. More 
importantly, the choice of this relation has a crucial 
influence on possible antecedents for anaphoric ex- 
pressions, z Asher (i 993) defines in SDRT the terms 
subordination and openness that specify where open 
attachment sites are in a discourse structure. A tree- 
like representation illustrates the hierarchical struc- 
ture of the discourse. Basically, the nodes on the 
so-called "right frontier" of the discourse structure 
are assumed to be available for further attachment 
(Webber. 199 i ). 
Generally speaking, all nodes which dominate the 
current node of the newly processed sentence are 
open (i.e.D.Subordination). However, a restriction 
is introduced by the term D.Freedom which applies 
to all nodes that are directly dominated by a topic 
(i.e. ~ ~/3), unless it is the current node (see fig- 
ure 1). An informal definition for possible attach- 
ment sites looks like the following: 
I. The last clause represented as a Discourse Rep- 
resentation Structure (DRS) K. 
2. Any DRSs that are embedded in K. 
3. Any DRSs that dominate the DRSs in I. and 2. 
through Explanation, Elaboration or ~\[ 
K0:a~ d-free 
Kt s :~ Kto :& 
KlOl:~ 
KI0ll:~ KI010:~ 
Rgure i: Openness and D-Freedom 
SDgr exploits discourse relations to establish a 
hierachical ordering of discourse segments. A so- 
called constituent graph indicates the dependencies 
Zl will concentrate on how SORT deals xvith this i.,~ue in the 
folltm, ing. A study that .shows how RST can be u.~d to make 
predictions regarding anaphora resolution in a text can be found 
in Fox (1987). 
between ~cgments,e.,~pecially highlighting the ()pen 
attachment pt>h\]ls. 
SDRT has been sucessful whet) phenomena are 
considered thai are explainable because of the hier- 
archical structure of the discourse. This approach is 
too restrictive when an anaphoric reference is drawn 
over .~egment. boundaries: 
(I) (a) Mary once organised a party..(b) Tqm 
bought the beer. (c) Pete~" was in charge of 
the food. (d) Years later Mary still com- 
plained that it was too spicy. 
The sentence (Id) continues at the top level of the 
discourse, but the antecedent of it (i.e. food) is still 
available even though it is deeply embedded in an 
Elaboruzirm segmenr(i.e. (! a-c)). 
Other shortcomings concern formal, features. 
First," SDRT is not capable Of expressing under- 
specification for ambiguous sequences. Second, the 
derivation of the di~'~turse structure is not mono- 
tonic. Once derived, SDRSs are overwritten by an 
uLulate. 
2.2 Centering 
CT proposed by Grosz et al. (1995) offers a text 
comprehension model that describes the relation be- 
tween the focus of attention and the choices of re- 
ferring expressions within a discourse segment. The 
main idea of this theory is that a sentence possesses 
a center arid that normally one Continues to write 
(or talk) about this center. Each utterance 0~ gets a 
list of forward-looking centers C/(U~) assigned to 
it. Basically, all the entities mentioned by the sen- 
tence are ranked according to their degree of being 
in the center of the utterance. Each sentence also has 
a unique backward.looking center C'b(Ui). A main 
claim by the theory is that the most likely C6(Ui+t) 
is the most highly ranked Cl(\[/i)~ Hence, the cri- 
teria for ranking the entities on the forward-looking 
center list are crucial for the predicative power of 
this theory. No tm~ly, the grammatical relations 
subject, object etc. determine the preferred Cp(Ui) 
(i.e. the first entity on the Cl(Ui ) list). 
As mentioned earlier in this section, the initial ac- 
count to centering is only concerned with the choice 
of referring expressions within a discourse segment. 
Since a more general theory to referring expressions 
is needed, an extension is presented by Grosz and 
Sidner (1986). They use a stack mechanism for rep- 
resenting the different discourse segments. If one 
segment is closed off. the information regarding the 
101 
forward- and backward looking centers is popped 
off the stack. The new top clement of the stack 
contains the centering information from the old seg- 
ment that the subsequent discourse continues with. 
This simple stack mechanism has been criticised. 
In particular Walker (1998) points out that (i) a long 
intervening discourse segment Can make it difficult 
to return back to earlier mentioned discourse refer- 
ents and (ii) discourse referents introduced in a sub- 
ordinated segment can easily be carded over to a 
higher segment (e.g. (I)). Note that a stack model 
would discard the information of a closed-off dis- 
course segment. Walker proposes acache storage 
that keeps often-used discourse referents within a 
storage. If reference is made to an antecedent men- 
tioned earlier in the discourse the information is re- 
stored from long term memory. 
Unfortunately, it is not quite clear how this re- 
trieval operation can be fonnalised. In addition, it 
should be acknowledged that there are structured 
constraints of the discourse structure that do not al- 
low the choosing of a recently mentioned referent. 
Data discussed within DRT, such as the sentence be- 
low, have been presented as evidence for a notion of 
(in-)accesibility. 3 Negation is a standard example 
that does not allow a reference to a discourse entity 
in the prevous sentence: 
(2) No man walks in the park. #He whistles. 
In the given example sequences the pronoun he 
cannot refer back to the discourse refett~ts intro- 
duced inthe previous sentence. Another example 
can be found in (3) that involves a conditional: 
(3) If a farmer owns a donkey, he beats it. #He 
hates it. 
Again, neither a pronominal reference by he nor 
by it is possible. It may be concluded from these 
data that a cache approach is not restrictive enough. 
The discussion so far has shown that the data 
structures used for discourse processing are either 
too restrictive or not restrictive enough. The next 
Section presents a novel way of representing dis: 
course referents introduced by a text. The data 
structure pre.sented is called a hashing list and al- 
lows for an efficient way to access stored informa- 
tion. 
"Gordon et al. (1998), for example, blend DRT with CT. 
102 
3 Hashing lists 
The following section describes how a hashing list 
works before the subsequent section shows how this 
data structure can be used for discourse processing. 
3.1 Data hashed 
One of the main problems for the de~,;ign of com- 
puter systems is the question of how data is stored 
and efficiently accessed. Hashing lists are often 
used for this purpose since this data structure is 
specifically designed for easy retrieval of stored 
data. 
I will first describe the data structure in more de- 
tail and then I will give an example of how data can 
be retrieved from a hashing list. 
Hashing lists. The basic data structure for a hash- 
ing list is an array A\[min..max\] (i.e. an indexed 
• list A that has a preset length of n elements). An 
array with the name year could be defined as fol- 
lows: 
TYPE hash = ARRAY\[0..99\] of 
integer; 
The random access structure of this data type al- 
lows the programmer to assig n a single cell of the 
array directly (e.g. hash \[99 \] : = 9; ). This is an 
advantage over other data structures such as trees. 
Hashing functions. A function has to be designed 
that tells us how to store data on the hashing list. 
This function takes the item to be stored and gives 
back an appropriate key k. The item can now be 
stored at the fight place on the list. 
Suppose we want the program to store the inte- 
ger 2000 on the hashing list year defined earlier. 
A hashing function H ( i ) has to be be chosen such 
that this function gives back an index k. With this 
information the assignment hash \[k\] : =2000; 
can take place. A hashing function for integers may 
be the Modulo function. For the given example the 
key k would be 2 0 (i.e. 2000 rood 99 = 20). 
The hashing function can also give back an in- 
dex k for a new item that has already been taken by 
another item (e.g. 119 has the same key). For the 
case of a-collision a special treatment is required. 
The most common one is the administration of an 
overflow area. The single places on the hashing list 
are lists that Would handle colliding • items. Figure 2 
shows a pan of the hash list hash \[ 19.. 21 \] with 
two items 2000 and 119 inserted. 
0 Q 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
\[19\] 
l'J 
\[20\] 
2o'oo 
119 
° 
\[l 
\[21\] 
6 
Figure 2: hash \[ 19.. 21\] with collision resolu- 
tion for two items 
3.2 Discourse hashed 
I now show how a hashing list can be employed as 
a data structure for linguistic data. This may not 
be obvious after using only integers for storing on a 
hashing list. 
Domains of referents. Natural language process- 
ing requires a richer data structure than storing in- 
tegers. However, in the end a hashing function for 
linguistic data will also consist of an array. 
Considering the different types of discourse refer- 
ents, we can assume at least the following list of mr. 
erents to be relevant: singular male, singular female, 
singular neuter, plural and event referents. 4 We now 
take these conceivable referents and reserve each of 
them a slot in the domain array: 
domain\[sgM, sgF, sON, pl, ev\] 
Note that this way of writing the hashing list is 
actually only syntactic sugar for a normal definition 
such as domain \[ 1.. 5 \]. 
Referent function. A function is needed that 
can assign a cell on the array domin to a newly 
introduced discourse referent. The semantic and 
syntactic information that comes with the a dis- 
course referent gives us the key for this. Take 
for example a proper name such as Peter. The 
information that comes with it could be encoded 
as a feature value matrix such as proposed by 
Dale (1992) (see figure 3). The hashing function 
INDEX: ;CI 
\['AT pn 
SYN. \[NUM. sing 
" AGR:LGEND: male 
SEM: NAMED: peler\] 
Figure 3: The representation for Peter 
The function rezums sgM (or 1) as key for the array 
domain in the example given. 
Summing up, a hashing list was proposed to store 
discourse referents while processirlg natural lan- 
guage discourse. This kind of list contains several 
"slots" that await discourse referents described by a 
discourse. The grammatical features of gender and 
number distinguish the different referents. 
The following section discusses how this data 
structure is embedded into a discourse grammar. 
4 Referents in discourse 
The linguistic data presented earlier demonstrates 
the need for hierarchical constraints on anaphora 
resolution. But the data also show that previous 
approaches such as SDRT overemphasise this re- 
striction. A refusal of any discourse structure con- 
suaints, on the other hand, also does not seem to 
be appropriate. A cache storage that stores the fre- 
quently used discourse referents does not account 
for the data that were explainableby (S)DRT. 
This section describes how a hashing list can be 
used for the storage of discourse referents. The list 
is integrated into an SDRT framework. The infor- 
mation about the discourse segments is kept in order 
to cover data that is explainable by thehierarchicai 
discourse structure. In addition, the insight that a 
takes the information under the agreement feature sentence has a center as proposed by CT is also re- 
AGR and checks for the values regarding number': flected by the theow proposed. The discourse refer- 
and gender, ents are ordered according to centering preference. 
NUM: ,nng \] 
~F.JqO: nude\] 
*This is only a first list of very fundamental referents. But . 
the list can easily he extended .by more differentiated plural 
types, speech acts or types of referents. 
The following sections describe in more detail 
how the different concepts are integrated in the sys- 
tem proposed by this paper. First, the way discourse 
referents are stored via a hashing list is explained. 
Second, the ordering regarding the centering prefer- 
ence is imposed on the slots of the hashing list. And 
finally, a tree structure is presented that binds all the 
components together. 
103 
4.1 Referents hashed 
In the system proposed, a hashing list stores the 
rel'ercnts introduced by the discourse. The hashing 
list contains at least the following slots: scjl, 
sgF, sgN, pl, ev. Since the basic formalism 
is DRT, we need to incorporate the hashing list 
into the formalism. In DRT, a DRS consists of 
the domain of discourse referents and the set of 
conditions, imposed on the referents. A sentence 
such as Peter sighs is represented by the box 
notation as follows: 
4.2 Referents re-centered 
Alter blending a I)RT representation with a hash- 
ing list lor a structured representation of discourse 
referents. 1 will introduce the centering I'~ature into 
the formalism. The different slots already contain 
the ordering of the referents regarding the centering 
preference. An apparent advantage over the center- 
ing approach should become clear: the referents are 
already separated from each other. 
A discourse such as (4) without any competing 
antecedents for the pronoun she is formalised by a 
HDRS as follows: 
Xl el 
p~ter( x D 
ez : sigh(xz) 
A hashing list substitutes, for the set of discourse 
referents offering ~ffezcnt slots for the discourse 
referents to be stored in: 
Peter(x z ) 
el : sigh(xl) 
The representation of a more complex sentence 
such as Peter gave John a book containing several 
discourse referents is in the following DRS: 
• ::1 I: '11 "' 
p~.er(xl) 
john(z2) 
book(zs) 
ez : give(:l, ::, :a) 
This Hashed DRS (HDRS) contains a complex 
domain sub-box. '\['he slot for male and singular 
discourse referents is filled with the two items xt 
and zz. The two referents are on a collision list as 
described earlier. Additionally, the list reflects the 
ordering for the centering list. The subject NP Pe- 
ter was processed before the object NP John "and 
• is therefore the first entry on the preferred center- 
ing list. Note that only referents that share the same 
grammatical features are listed in the same slot. 
104 
(4) (a) Peter gave Mary a book. (b) It was about 
sailboats. (c) She was thrilled. 
xz x2 Fz el 
S2 
S3 
peter(xl) 
mary(x2) 
book(y\] ) 
et:give(xz,x2,!/I) 
s2:about(yz; s_boats) 
ss:thrilled(x2) 
CT would predict for (4.b) that the book is the 
preferred forward looking center Cp. The back- 
ward looking center of (4c) is Mary. This is called 
a rough slu'ft in CT. A continuation of the center 
(Cp(Ui) = Cs(Ui+t)) is the preferred and most co- 
herent constellation according to this theory. How- 
ever, contrary to what CT would predict, it is no 
problem to read(4). 
The HDRS format seems to work fine with 
pronominal references to persons or objects. But 
we run into problems when the slot regarding the 
descihed events and states is considez~l. The fol- 
lowing example (5) illustrates that a simple flat list 
representation as indicated above by et, a2, ss is not 
sufficient for more complex anaphori¢ expressions 
such as event anaphora (Allen 1995): 
(5) (a) When Jack entered the room, everyone 
threw balloons at him. (b) In retaliation, 
he picked up the ladle and started throwing 
punch at everyone. (c) Just then; the chair-. 
man walked into the room. (d) Jack hithim. 
with a ladleful, right in the face. (e) Every- 
one talked about it for years afterwards. 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 
0 Q 
m 
I 
I 
I 
I 
I 
I 
I 
O 
O 
O 
O 
O 
O 
O 
O 
O 
O 
O 
O 
O @ 
O 
O 
O 
O 
O @ 
@ 
@ 
O 
O 
O 
O 
O @ 
O 
O 
O 
O 
O 
O 
O 
O 
O 
KI 
I 
Km: ehzh(Ktt~t. K;~t; 
S2 
Figure 4: The discourse structure for Eluboruthm 
The pronoun h in (5e) may refer to the entire sit- 
uation described by (5a) through (5d). BUt this is 
not the only conceivable antecedent for it. The sit- 
uation described by (5d) may be referred to by it as 
well, if we consider an alternation of (5e) as in the 
following: 
(Se') It was a foolish thing to do. 
Note that the situation in (5d) is the only situation 
available from the sequence (5a-d). The list struc- 
ture for the evene slot does not reflect the structure 
of the discourse. A segmented discourse structure is 
needed here. 
4.3 Discourse segments 
The derivation of discourse structure used in this 
account is that.proposed by SDRT. This discourse 
grammar, as well as others, claims that discourse 
• segments originate from the derivation of so-called 
discourse relations (e.g. Narration, Elaboration 
etc.) due to our background or world knowledge. 
The account proposed by this paper assumes that 
HDRSs are grouped together wrt their discourse 
segment. Consider now the following sequence (6) 
with the possible continuation (6e) with a male and 
female pronoun (depending on whether a male or 
female protagonist was introduced by the first sen- 
tence). 
(6) (a) Mary/Mark once organised a party. (b) 
Tom wrote the invitation cards. (c)Peter 
bought the booze. 
(re) She/He was glad that everything worked 
out so nicely. 
The first continuation does not cause any prob- 
lems. although the antecedent for she was intro- 
thlced by the iirst sentence of the sequence. Since 
no odler COml~.'ting discourse refcrellls have been 
menlioned, the resolulion process works without 
problem. However; .substituting a male protagonist 
called Mark for the female protagonist in the first 
sentence does cause problems for the understanding 
of 16). In this case, it is unclear who was meant 
by he. Note furthermore thai only two antecedents 
are available, even though three male antecedents 
have been introduced. Only the one in the last sen- 
tence (i.e. Peter), or the one introduced by the first 
sentence (i.e. Mark) are conceivable antecedents. A 
different continuation does not show this ambiguity: 
(6e') He decided just to buy beer. 
The continuation in (6¢') is an elaboration of the 
last sentence. Hence Peter, who was responsible for 
the booze, is the only possible antecedent. 
The following sentence is the last piece of evi- 
dence that the discourse segment allows only an- 
tecedents that are available on the so..ealled right 
frontier. The following sentence shows that it is not 
possible to refer to Tom, who wrote the cards, with 
the last sentence: 
(6¢") #He decided to use thick blue paper. 
5 Formalisation 
This section is an introduction to the formalism 
used. The formalism consists oftbe following parts: 
DRT The standard DRT theory is used to obtain 
a semantic representation for the meaning of 
a clau~ (Kamp and Reyle, "i993). However, 
the set of discourse referents is more structured 
than in the standard approach. It also goes be- 
yond the approach by Asher and Wada (1988) 
(see below for further details). 
Hashing lists The data structure of hashing lists is 
used to divide the set of discourse referents up 
into different slots. Each slot contains only ref- 
erents of the same type, as there are singular 
male, female, or neuter referents, plural enti- 
ties and events. 
SDRT A hierarchical discourse structure is needed 
to explain anaphoric expressions that refer 
back over segments boundaries. In addition, a 
theory is needed that takes into account world 
knowledge for the derivation of discourse rela- 
tions (Asher. 1993). 
105 
Ixl ly, ! \]el 
mary(zz) 
party(I/t) 
et : organise(zz, Z/z) 
KT" 
I ~, l y, \] z, l el 
mary(a:z) 
party(z/z) 
ez : organise(zz, gz) 
el _3/¢;~l-e 
Zl = zl ~ ~2 • X~l.Z 
I 
KRI : elab(K~l. K~I) 
82 : 
i 
i 
tom(z2) 
invitation_cards(Z2) 
e2 "wri~z2, ~) 
Figure 5: The Segmented HDRS for (6a-b) 
Underspecified Tree Structure Data have been 
discussed thatshow an ambiguity regarding the 
dis¢oune structm~ In order to express the am- 
biguity formally an underspecification mecha- 
nism is employed (Schilder, 1998). 
• I will now present the derivation of the sequence 
in (6). 
$.1 Elaberation 
First, a HDRS representation is to be de~ved for the 
first sentence. The HDRS for (6a)looks like a nor- 
real DRS, the only difference is the hashing list that 
contains the discourse referents in different slots. 
Second, a HDRS for the second sentence is derived ~ 
and, in addition, adiscourse relation is inferred from 
our world knowledge. An elaboration relation links 
the two HDRSs inthe given case. Within an under- 
specified version of SDRT this discourse structure 
is represented as shown in figure 4. 
106 
The nodes in the tree are labels for (Segmented) 
HDRSs. The two labels st and a~ denote the seman- 
tic content of the two first sentences, respectively. 
The label KRt refers to the derived relation dab- 
oration that holds between the two segments K~x 
and K~t. Note that the left daughter node of the 
K~ is already deiermined by setting K~t equal to 
at. The right daughter node, however, is left open. 
This is indicated by ~the dotted line between K,~ l 
and s2. This fine expr~ses graphically the domi- 
nance relation between tree nodes (<Z') in contrast 
to the straight line that indicates an immediate dom- 
inance relation (<l). s 
The underspecification of the tree structure al- 
lows us to define where possible attachment points 
are on the right frontier of the discourse structure. 
The tree structure in figure 4 possesses two attach- 
Sl follow here the description of a tree logic such as that 
used by Kallmeyer ( 1996} or Muskens (1995). 
@ 
0 
0 
0 
0 
0 
0 
0 @ 
@ 
0 
0 
0 
0 
0 
0 
0 @ 
0 
0 
0 
0 
0 @ 
0 
0 
0 
0 
0 @ 
@ 
0 
0 
0 @ 
0 
0 
0 @ 
@ 
0 
e \]'(T .. " 
K : " KT 
I "r I le, 
e e / • .K': 
0 l'.i z21 I IZ21e2 / "~ e / \ 
tom(z2) 
invitadon_cards(~2) 
e2 :write(z~, Z2) 
82 : 
..." f'J 
...-°°~ 
xsl i y31 103 
peter(z2) 
s3 : booze(z/s) 
es : buy(z2, Y3) 
Figure 6: The third sentence (6c) added 
ment points: one is between K~ ~ and s:~ and the 
other one is between/CT and K~ x. This latter node 
denotes .the topic of the current discourse segment 
which is the situation described by sz for (6a-b). 
Two further remarks are to be made regarding the 
representation in figure 5 before continuing with the 
sequence. First, an additional condition is added to 
the topic node. The information about the temporal 
relation between the situation-et and the subordi- 
nated situation was added on this level of the dis- 
course tree. Note that it is still open which event e 
will finally show up !n the node referred to by K~x. ~ 
Only afterclosing off this discourse segment will it 
be clear which event(s) elaborated the situation ez. 
Second, a plural entity Zl is stored in K~z.. This 
entity combines the singular entities into a plural 
one. A more elaborate mechanism is needed here 
in order to combine only entities of the same type 
(e.g. persons). For the time being, all plural entities 
s/C'~z.g can be described as a pointer to the plural slot of 
K,~t. 
are stored in this one slot. 
5.2 Continuing the thread 
A//st relation can be derived for the sentences (6b) 
and (6c) (see figure 6). The semantic content sa is 
added to the discourse tree linked by the discourse 
relation and furthermore a common topic is added 
at KTR2. The topic information has to be an abstract 
representation of the two HDRSs a2 and as. In or- 
der to achieve that, two new discourse referents am 
introduced: a plural entity Z4 comprising za.and zs 
(i.e. Tom and Peter) and a complex situation e4 tem- 
porally covering ea and e4. 
5.3 Looking back 
After the third sentence has been processed, the next 
sentence contains a pronoun. In case of she the pro- 
noun looks for a female singular-antecedent. The 
appropriate discourse referent is found on the right 
frontier in the appropriate slot. Alternatively, if 
sequence (6) contains the male protagonist named 
Mark in the first sentence instead of Ma~., the 
107 • 
Ixzrsonal Pronoun he could I£1v¢ two Posnll~lc ;m- 
l¢cCdc:nts: I~'ler or MCar/,'. How can Ihal be L'x- 
plaincd by the formalism? 
Figure 7 depicts the formalisation of the dis- 
coupe (6a-c) cmly showing the hashing lists on the 
right frontier. The dotted arrows indicate the hash- 
ing list as it is distributed over the right frontier of 
the discourse structure. There is only one entry for a 
female singular antecedent over the levels of nodes 
on the right frontier. However, if there were a male 
protagonist, the hashing list for the referents in node 
/x'~o would contain a discourse referent in the first 
slot. The list of possible antecedents for he would 
be xa and x t. 
The separation of different reference types also 
allows us to explain sequences such as ( I ). The dis- 
course continues on the highest level, but it is possi- 
ble to refer to discourse referents that got introduced 
on a lower level of the discourse structure. The link 
between two situations can be made via a rhetorical 
rela6on, and at the same time the slots for the other 
referents at the right frontier are still accessible. 
The hashing list also models a hashed right fron- 
tier. Past approaches always collapsed discourse at- 
tachment with the restriction regarding possible an- 
tecedents for anaphora (cf. the stack mechanism in 
CT or the tree representation for (S)DRT). 
The formalisation can also provide an explana- 
tion of why competing antecedents can cause an 
ambiguity for the pronoun resolution. The acces- 
sibility of hashifig lists on different levels of the dis- 
course structure explains why, in this example, a fe- 
male antecedent can be used as an antecedent even 
over several intervening sentences. It is important 
to highlight the difference of the account presented 
here to past approaches: The discourse referents are 
grouped together according to their agreement fea- 
tures. The DRT account by Asher and Wada, for 
instance, stores the discourse referents in a tree ac- 
cording to the accessibility conditions imposed by 
DRT and singles out the appropriate antecedents ac- 
cording to number and gender information as well 
as other criteria. There the agreement information 
is used to "weed out" possible antecedents, whereas 
within a HDRS the discourse referents are already 
accordingly stored. 
It should also be clear that an embedded dis- 
course cannot be extended infinitively, as shown by 
the cache approach. A restriction has to be imposed 
on the number of levels where an antecedent can be 
looked \['or. Future research, however, has to clarify 
108 
this is.~uc Iurlhcr. 
Nt)t¢ thai Ihis I'ormali.~atio. c:q~itahzcs on the in- 
sight gained From the cache al)proacli. An elabo- 
ration cannot be continued for too long. since the 
working memory of the reader might lose track of 
the protagonist(s)introduced on the highest level. 
On the other hand, this fonnalisation also cov- 
ers mor~ data than the cache approach. A text 
comprehension theory that employs a cache stor- 
age cannot account for a discou~e such as (6) 
with a non-competing female protagonist. The dis- 
course referent for Mary would have been stored 
in long term memory, because no differentiation is 
made betwee n the grammatical types of possible an- 
tecedents according to the cache approach. 
6 Conclusion 
I have proposed a new data structure for the process- 
ing of coreference in a discourse. A hashbzg list 
was employed to store referents according to their 
grammatical features such as number or gender. Be- 
cause of this, better accessibility to non-competing 
antecedents can be modelled by the approach pre- 
sented. 
The discourse grammar used combined insights 
from different approaches to discourse processing: 
(S)DRT and CT. In addition, a tree logic was used 
to allow underspecification in the representation of 
ambiguous. ~quences. 
Future research has to focus on the evaluation of ..'-.. . 
• the proposed theory to anaphora resolution. Co- 
pora investigation and psychological experiments 
will provide more evidence. In addition, an imple- 
mentation of hashed DRT is being programmed. 
Acknowledgments 
I would like to thank the two annomynous reviewers 
to their comments and feedback. Special thanks to 
Christie Manning for providing me with all her help. 

References
Allen. James. 1995. Natural I.zmguage Under- 
standing. The Benjamin/Cummings Publishing 
Company, Redwood City. California. 
Asher, Nicholas. 1993. Reference #J abstract Ob- 
je¢'ls in Discour.ve, volume 50 of 5uulie.v in Lin- 
gui,'tic.v mul Philosophy. Kluwer Academic Pub- 
lishers. Dordrecht. 
Asber, Nichola~ and Hajime Wada. 1988. A com- 
putational account of syntactic, semantic and dis- 
course principles for anaphora resolution. Jour- 
nal of Semantics, 6:309-344. 
Dale., Robert. 1992. Generating Referring Expres- 
sions. MIT Press, Cambridge, Massachusetts. 
Fox, Barbara. 1987. Discourse structure and 
anaphora. Cambridge University Press. 
Gordon, Peter and Randall Hendrick. 1998. The 
representation and processing of comferenC¢ in 
discourse. Cognitive Science, 22(4):389-424. 
Grosz, Barbara L, Aravind Joshi, and Scott Wein- 
stein. 1995. Contusing: A framework for mod- 
elling the local cohe~nc¢ of discourse. Compu- 
tational Linguistics, 21(2):203-225. 
Grosz, Barbara J. and Candac¢ L. Sidnex. 1986. At.. 
tentien, intention, and the structure of discourse. 
Compataaonal Linguistics, 12(3): 175-.-204. 
Kallmcycr, Laura. 1996. Undenp~ification in 
Tree Descri#on Grammars. Arbeitspapiem des 
Souderfonchungsbereichs 340 81, University of 
Ttibingen, T(lbingen, December. 
Kamp, Hans and Uwe Reyl¢. 1993. From Dis- 
course to Logic: lmroduction to Modeltheoretic 
Semantics of Natural Language, volume 42 of 
Studies in Linguistics and Philosophy. Kluwer 
Academic Publishers, Dordmcht. 
Mann, William and Sandm Thompson. 1988. 
Rhetorical structure theory: Toward a functional 
theory of text organisationn. Text. 80):243-28 I. 
Muskens, R. 1995. Order-independence and under- 
specification. In J. Groenendijk, editor, Ellipsis, 
Underspecification, Events and more in Dynamic 
Semantics, number R.2.2.C in DYANA Deliver- 
able. ILLC/Department of Philosophy, University 
of Amsmrdam, Amsterdam, The Netherlands, 
pages 15--34. 
Schilder, Frank. 1998. An underspacified seg- 
meated discourse representation theory (US- 
DR'I'). In Proceedings of th e 17 ~ International 
. Conference on Computational Linguistics (COL- 
ING "98) and of the 36 th Annual Meeting of the 
Association for Computational Linguistics (ACL 
"98), pages ! 188-1192, Universit6 de Moutr6al, 
Montr~,al, Qu6bec, Canada. 
Walker, Marilyn. 1998. Centering, anaphora res, 
olution, and discourse structure. In Marilyn 
Walker, Aravind Joshi, and Ellen Prince, editors, 
Centering Theory in Discourse. Clarendon Press, 
Oxford, pages 401-435. 
Wcbber, B. L. 1991. Structure and ostension in the 
interpmtatiou of discourse deixis. Language and 
Cognitive Processes, 6(2): 107-135. 
