Lexical Marking and the Recovery of Discourse Structure 
Kathleen Dahlgren 
InQuizit Technologies, Inc. 
725 Arizona Avenue, Suite 204 
Santa Monica, California 90401 
email- kd@inquizit.com 
Introduction 
In the theory presented here, discourse rela- 
tions are equated with coherence relations. 
The relata are taken to be sets of events or 
entities introduced into the discourse, as in 
SDRT (Asher, 1993). Our empirical stud- 
ies of commentary, narrative and news texts 
have shown that coherence relations are fre- 
quently signaled syntactically or semanti- 
cally rather than lexically. In a full natural 
language understanding design, this much 
discourse structure can be recognized com- 
putationally. However, there remain dis- 
courses in which the coherence relations are 
unmarked even by syntax, tense or aspect. 
Some of these relations cannot be recognized 
computationally because they require exten- 
sive world knowledge. Ultimately any re- 
lation between events or objects which is 
common knowledge among discourse partic- 
ipants can form the basis for a felicitous 
request for an unmarked coherence infer- 
ence. In order to recognize all coherence re- 
lations, a computational system needs full 
world knowledge. 
Theory of Discourse Structure 
Other theories have defined the relata in dis- 
course structure as clauses (Trabasso and 
Sperry, 1985), pieces of text (Hobbs, 1985, 
Mann and Thompson, 1987), pieces of text 
plus connectives ( Cohen, 1984, Reichman, 
1985), propositions (Van Dijk and Kintsch, 
1983, Polanyi, 1988), plans (Lochbaum, 
Grosz and Sidner, 1990) and segmented dis- 
course representation structures, the SDRT 
theory (Asher, 1993). 
The theory that we adopt here proposes 
that the relata in discourse structure are 
sets of events, states or entities introduced 
into the discourse, along the lines of SDRT. 
The reader builds a "cognitive model" of 
the text content. In the cognitive model or 
"situation model", the events in a discourse 
are connected by inferences concerning the 
surrounding events (causes, goals, parts of 
events, enabling conditions, and so on), as 
shown in many studies by Graesser and col- 
leagues, such as Graesser and Zwaan, 1995. 
In narrative, the cognitive model forms a 
causal chain of events (Trabasso and Sperry, 
1985). A discourse is coherent to the extent 
that a cognitive model of the discourse con- 
tent can be built by a qualified reader, tha~ 
is one with the requisite background knowl- 
edge. 
In our theory, coherence inferences are 
added to the discourse representation as 
predications added to the DRS (Dahlgren, 
1996). A discourse segment is a set of dis- 
course events and entities that cohere among 
themselves, and share a single coherence re- 
lation to another discourse segment (which 
could consist of just one event or entity, 
such as the discourse topic). In this the- 
ory the same set of coherence relations re- 
late the events introduced by individual sen- 
tences and also by sets of sentences (seg- 
ments), because it was found that the same 
naive theories of the relatedness of things 
and events explained both local and global 
coherence. Surface rhetorical relations such 
as "turning to" were not considered in this 
theory. 
65 
The theory of coherence informing the em- 
pirical studies summarized below claims that 
the basis of coherence relations and discourse 
structure is the naive view (or naive the- 
ory) of events and the relatedness of objects 
in the real world. It is supposed that co- 
herence inferences during discourse under- 
standing are made according to the same 
naive theories people use to understand real 
events. A coherent discourse is then one for 
which a cognitive model of events can be 
built in which events and things relate in 
ways people naively expect them to relate. 
By "naive" we mean non-scientific and non- 
truth-conditional (Hayes, 1985). The cog- 
nitive model is built by the reader is again 
a naive theory--a belief structure about the 
way the world would be if the writers' story 
were true. 
The set of coherence relations in Table I 
are justified from the above philosophical 
point of view and also by noticing that each 
of them can be grounded in a known psycho- 
logical process, and that each of them can 
be marked by an overt lexical cue phrase, as 
summarized in Table I. 
Studies of Coherence, 
Discourse Structure and 
Anaphora Resolution 
Preliminary studies were conducted in or- 
der to facilitate the design of a computation 
text understanding system. We examined 
texts in three genres: 1) commentary text 
(13000 and 20000 words), narrative ( the 
novel Wheels by Alex Halley), and wire ser- 
vice reports (MUC-3 terrorism texts). The 
commentary corpus was drawn from Wall 
Street Journal articles which might be called 
"news", but in every case there was a key 
section that had commentary or evaluation 
of events, and the discourse structure dif- 
fered from that of the terrorism news re- 
ports. The discourse segment boundaries, 
the coherence (discourse) relations and the 
anaphoric expressions throughout the com- 
mentary and narrative corpora were labeled 
and analyzed by two individuals (the author 
and another linguist). 
The purpose of investigation was to dis- 
cover: 
1. human mechanisms for coherence rela- 
tion assignment (including between seg- 
ments of discourse) during interpreta- 
tion 
2. human mechanisms for discourse seg- 
mentation during interpretation 
3. constraints on resolution of anaphoric 
expressions during interpretation 
The goal was to clearly define the mecha- 
nisms so that they could be imitated by com- 
putational algorithms. These studies and re- 
sults are fully described in (Dahlgren, 1996, 
Lord and Dahlgren, 1997, and Iwanska, et 
all, 1991). 
Lexical Marking of Coherence 
Relations 
In our theory, as in Knott and Dale (1994), 
it is not possible to have a coherence re- 
lation which is NEVER signaled lexically. 
Membership in the set of relations is jus- 
tified first by having some overt marker in 
English. If coherence relations which never 
have an overt marker are allowed into the 
theory, a hopeless multiplication of invisible 
relations could ensue. Thus the theory triv- 
ial.ly answers the symposium question, "Are 
there any that are never lex.ical_ly signaled?" 
As for the other part of that question, "Are 
there discourse relations that are always lexi- 
cally signaled?" our study replies in the neg- 
ative. Any of the coherence relations in Ta- 
ble I can occur without overt lex.ical cues, as 
illustrated in the examples in Figure I. 
The two-sentence examples in Figure I are 
replicated with discourse segments in the 
study corpora. 
Non-lexical Marking of 
Discourse Structure 
Our corpus studies directly answer one of 
the symposium questions, "What non-lexical 
66(i.e., syntactic or prosodic) means are used 
Coherence 
Relation 
cause 
Table I: Evidence for Coherence Relations 
Cognitive Capacity 
naive theories of causation 
Connective 
because, as a result, consequently 
background 
goal 
enablement 
constituency 
contrast 
elaboration 
evaluation 
perception of figure/ground 
and salience 
naive theories of intentionality 
naive theories of causation 
recognition of part/whole 
recognition of similarity 
perception of spatio-temporal 
contiguity 
preference (goodness ratings) 
when, while 
in order to, so that 
because 
in summary, for example, first, second 
similarly, fikewise, in contrast 
then, next, that is to say 
evidently, that means, modality, 
negation 
Cause 
Background 
Goal 
Enablement 
Constituency 
Contrast 
Elaboration 
Evaluation 
Figure I: Coherence Relations without Lexical Cues 
Fred died. Harry stabbed him. 
It was raining outside. Fred rushed in the front door. 
Fred bought at book. He read it. 
Fred was well-heeled. He invested heavily. 
Fred was tried for murder. The prosecution opened the case with a diatribe. 
Fred loves rain. Mary hates it. 
Fred rushed in the front door. He threw off his raincoat. 
The networks are losing money. They should cut back on the ads. 
to signal a relation?" In our study of seg- 
ments, coherence relations, and pronouns in 
Wheels, there was a cue phrase at only 41% 
of the segment boundaries. Other indicators 
were non-lexical (see Table II). In our study 
of 13000 words of commentary genre text, 
cue phrases signaled segment boundary at 
only 16% of the segment boundaries. 
Table II: Lexical and Non-Lexical 
Discourse Markers in Wheels 
change of coherence relation 88% 
change of sentence subject 72% 
segmenting cue phrase 41% 
change in tense or aspect 58% 
By non-lexical discourse marker we mean 
an indicator of a structural boundary in the 
discourse, one that requires the reader to be- 
gin a new segment, and also find a plausible 
attachment point for the new segment in the 
67 
discourse structure tree built so far (Asher, 
1993). Examples of lex_ical discourse mark- 
ers would be "while" (indicating the begin- 
ning of a background segment), "then" (in- 
dicating the continuation of the same seg- 
ment with a constituent of the larger event 
described in the segment), and so on. 
The non-lexical discourse markers found 
in all three genres are shown in Table III. 
In the study of Wheels it was noted that 
change of sentence subject marked 72% of 
the new segments. This relates as well to the 
pattern of pronoun use. There is a complex 
relation between personal pronouns and dis- 
course structure in Wheels. Antecedents of 
personal pronouns were found either in the 
same segment, in a dominating segment, or 
in an unclosed prior segment for which the 
use of the pronoun signals that the prior seg- 
ment should be reopened (popped). Infre- 
quently, the antecedent could be found in an 
Table III: Non-lexical Discourse Markers 
Change of sentence subject (change of local topic) 
and consequently, absence of personal pronoun reference to prior sentence 
Event anaphora 
Change of coherence relation 
Change of time 
Change of place 
Tense 
Aspect 
immediately adjacent closed sister segment 
in the discourse tree. 
Event anaphora refers to the use of ad- 
verbs like "so" and demonstratives like 
"this" to close a segment and then refer to 
the summation of all of the events in the 
segment proaominally. An example is from 
a Wall Street Journal article about Brazil 
(shortened), where "this" refers to the sum 
of the events in the prior segment : 
Brazil suspended debt payments. 
Mexico followed suit. Chile threat- 
ened to cancel all of its foreign 
debt. This caught international 
bankers unprepared. 
Computational Recovery of 
Discourse Structure 
The symposium organizers ask, "In analy- 
sis, is it possible to reliably infer discourse 
relations from surface cues?" Superficially, 
the answer is "no", because as described 
above, frequently there are no surface cue 
phrases at segment boundaries. Even para- 
graph boundaries in highly edited text are 
not reliable cues of segment boundaries. 
But at a deeper level, the answer is con- 
ditional. If by "surface cues" we mean all 
of the syntactic and semantic information 
available in the sentences of the discourse, it 
is conceivable to somewhat reliably (at least 
as reliably as humans can) infer discourse 
relations from that information. This task 
requires a full linguistic interpretive module 
that parses and disambiguates the discourse, 
producing a logical form. The logical form is 
input to a formal semantic module (such as 
an SDRT module). In the resulting SDRS 
for the discourse, along with the meaning 
representations of the word senses in the dis- 
course (naive semantics), all of the informa- 
tion required to recover the discourse rela- 
tions is available. 
In this design, world knowledge is encoded 
in a "naive semantic" lexicon, which reflects 
a shallow layer of knowledge, just enough 
to interpret the text. "Just enough" means 
enough to disambiguate the word meanings 
and the syntactic structure, and enough to 
recover the antecedents of anaphoric expres- 
sions (Dahlgren, 1991). 
Naive semantic representations capture 
some of the naive theories of the world which 
people associate with word sense meanings 
in a given culture (Dahlgren, 1988). Naive 
semantics is a lexical theory which equates 
the meaning of a word sense with a con- 
cept, so that concept representations and 
word sense meaning representations have the 
same form. In the contrasting classical tra- 
dition, word meanings are conjunctions of 
primitives which form conditions for being in 
the extension (or class of objects) named by 
a word sense (Katz and Fodor, 1963). The 
meaning of "water" is a formula 
water(X) 
.: :. clear(X)&colorless(X)&liquid(X) 
The classical theory doesn't work because: 
1) true scientific theories of the nature of cat- 
egories are not necessarily known by speak- 
ers of a language; 2) the categories Concepts 
name are gradient, with some members bet- 
ter examples than others; and 3) typical 
68 
properties of objects aren't necessary prop- 
erties. For example, muddy water is still wa- 
ter. 
Naive semantics posits that word mean- 
ings are shallow, limited naive theories of 
objects and events. The meaning of "wa- 
ter" has naive propositions equivalent to the 
following: 
Water is a clear, tasteless liquid; 
you find it in rivers, you find it at 
the tap; you drink it; you wash with 
it. 
The features in the representation are psy- 
cholinguistically justified. These are the 
types of propositions subjects hst when 
asked to give the "characteristics" of nouns. 
(In our computational lexicon, features are 
represented in a first-order logic form with 
temporal markings.) 
In naive semantics, the content of verb 
concepts is based upon psycholingnistic 
studies of story comprehension (Graesser 
and Clark, 1985). A verb is understood 
and recalled in terms of other events and 
states which typical surround the event type 
it names (rather than being understood as 
a metaphor for motion, as in other theo- 
ries). For example, the verb "stab" is as- 
sociated with the goal of harming someone, 
the goal of killing someone, the constituent 
event of piercing someone with a sharp in- 
strument, the consequence of killing some- 
one, the consequence of someone bleeding, 
the enabling state of having a knife and so 
on. These surrounding events are elicited by 
the wh-questions such as "What caused X?" 
and "What was the goal of X?". The cor- 
responding features are those employed in 
our computational naive semantic lexicon, 
namely "cause", "goal", "what_next", "con- 
sequence", "time", "location", and "how", 
along with selectional restrictions. 
Lexical naive theories arise in a culture or 
subculture, and are limited to those prop- 
erties and propositions shared among the 
members of the subculture. In addition to 
the shared naive theory of an object or event, 
speakers of a dialect may hold individual be- 
liefs which are at odds with the shared naive 
theory, but they have to use the shared the- 
ory in order to communicate. In other words, 
a scientist may know that an object which 
appears to be failing is not (such as the Sun), 
but must still understand such statements 
as "The Sun is setting" in terms of the in- 
correct naive theory underlying the use of 
"set" in the context. While naive seman- 
tic representations contain far more infor- 
mation than meaning representations in the 
classical theory, they are hmited as well to 
that knowledge which is very closely associ- 
ated with a word sense, and used to recover 
the interpretation of sentence structure and 
meaning while listening or reading. Included 
are the most typical propositions describ- 
ing an object or event, those which inform 
word sense disambiguation, structural dis- 
ambiguation and anaphora resolution pro- 
cessing, but not the elements which are used 
in deep inferencing or recollection of personal 
episodes. 
The shallowness of naive semantic repre- 
sentations is particularly important in ex- 
plaining the use of lexical markers in dis- 
course. Writers tend to employ markers 
when they cannot assume that the reader 
will easily and readily draw coherence in- 
ferences without them. Readers will be 
able to do so if the shared naive theory of 
events includes enough information. If the 
naive theory says that an event E1 typically 
causes an event E2, then it is felicitous to 
write two sentences describing just the two 
events, with no discourse markers relating 
the events, i.e., E1 E2 or E2 El. But if the 
naive semantics of the events does not pro- 
vide the connection, writers tend to make 
it explicit at some point, in order to aid 
the reader in building the intended cognitive 
model. 
The surrounding events in naive seman- 
tic verb representations are precisely the in- 
formation required to trigger unmarked co- 
herence inferences. A causal relation can be 
inferred by inspection of lexical information 
alone when no other cue is available as in the 
discourse below which has no cue phrase, no 
change in tense or aspect, and the reverse of 
temporal order. 
69 
Fred died. Harry stabbed him. 
Humans know to make the required co- 
herence inference (required because all fe- 
licitous literal discourses must cohere), and 
they infer that the cause of Fred's death 
was the stabbing. The naive semantic in- 
formation associated with senses of "stab" 
and "die" enable a computational inference 
of the same kind to be made. This is re- 
flected by adding to the DRS below a coher- 
ence predicate cause e2,el). 
ul,el,u2,e2 
rl,r2,hel 
rl < now 
fred(ul) 
el die(ul) 
el included in rl 
harry(u2) 
e2 stab(u2,himl) 
rl < r2 
e2 included in r2 
After assignment of the coherence re- 
lation the segmented DRS (or cognitive 
DRS) has an added coherence predicate 
"cause(e2,el)", which indicates that the 
cause of dying was stabbing. Also, the 
anaphoric expression "him1" is resolved to 
the same entity as Fred, namely ul in the 
ec uation ul = u2 in the cognitive DRS. 
ul,el,u2,e2 
rl,r2 
rl < now 
fred(ul) 
el die(u1) 
el included in rl 
harry(u2) 
e2 stab(u2,ul) 
rl < r2 
e2 included in r2 
u2=ul 
cause(e2,el) 
In another example, the precursor of our 
current implementation was able to build 
a shallow, topic-related discourse structure 
tree for MUC-3 message number 99 by notic- 
ing change of time, change of place, or seg- 
menting cue phrase (Iwanska et al, 1991). 
However, events and individuals in the 
world relate in indefinitely many ways. No 
matter how large the naive semantic lex- 
icon would get, no matter how detailed 
the knowledge would become, a natural 
language understanding system would en- 
counter discourses which required additional 
knowledge. The gap would prevent the 
system from drawing a coherence inference 
which would be easy for humans to draw. 
When they do have difficulty building the 
cognitive model, humans have a huge store of 
knowledge, and they dig deeper (while tak- 
ing more time). Even in simple secular texts 
which require no knowledge of jargon, it is 
possible to find many segments related by co- 
herence inferences which could not be drawn 
using a shallow naive semantic lexicon. 
The problem lies in the fact that coher- 
ence inferences are based upon naive theo- 
ries of the relatedness of events and objects 
in the world. Until a computer system can 
be taught the complete system of naive the- 
ories of the world, it can't form the full cog- 
nitive model of all discourses. It can only 
guarantee the derivation of the structure in 
those cases where lexical marking, change in 
sentence subject, event anaphora, change in 
time or place, tense or aspect are present 
as indicators. Nevertheless, a capability to 
derive that much of the structure is useful 
for many computational goals, including im- 
proved anaphora resolution, temporal rea- 
soning and locative reasoning. 
Conclusion 
Discourse relations are often not marked lex- 
ically. However, other indicators, including 
syntax, semantics and world knowledge, are 
available in commentary, narrative and news 
genre texts. These can be used by a compu- 
tational system that has a full syntax, for- 
mal semantics and a naive semantic lexi- 
con, to recover much of the discourse struc- 
ture. Complete recovery of discourse struc- 
ture computationally awaits machine learn- 
ing systems which can teach computers ex- 
tensive knowledge about objects, events and 
their relations in the world. 
70 

References 
Asher, N. 1993. Reference to Abstract Ob- 
jects in English. Boston, MA: Kluwer 
Academic Publishers. 
Britton, B. and J. Black (Eds.) 1985. 
Understanding ezpository tezt. Hillsdale, 
NK: Erlbaum. 
Cohen, R. 1984. A computational theory of 
the function of clue words in argument un- 
derstanding. Proceedings off COLING-84, 
251-258. 
Dahlgren, K. 1988. Naive Semantics for 
Natural Language Understanding. Boston, 
MA: Kluwer Academic Publishers. 
Dahlgren, K. 1991. The autonomy of shal- 
low lexical knowledge. In J. Pustejovsky 
and S. Bergler (Eds.), Lezical Seman- 
tics and Knowledge Representation. New 
York: Springer Verlag. 
Dahlgren, K. 1996. Discourse coherence and 
segmentation. In E. Hovy and D. Scott 
(Eds.), Burning Issues in Discourse Hills- 
dale, N J: Erlbaum. 
Graesser, A., and L. Clark. 1985. Struc- 
tures and Procedures of Implicit Knowl- 
edge. Norwood, NJ: Ablex. 
Graesser, A., and G.H. Bower 1990. In- 
ferences and Tezt Comprehension. San 
Diego, CA: Academic Press. 
Graesser, A., and R.A. Zwaan. 1995. In- 
ference Generation and the Construction 
of Situation Models. In Weaver, C.A., 
S. Mannes and C.R. Fletcher Discourse 
Comprehension HJllsdale, N J: Erlbaum. 
Grosz, B. and C. Sidner. 1986. Atten- 
tion, Intensions and the Structure of Dis- 
course: A Review. Computational Lin- 
guistics 7:85-98; 12:175-204. 
Hayes, P.J. 1985. The Second Naive Physics 
Manifesto. In J.R. Hobbs and R.C. Moore 
(Eds.) Formal Theories of the Common- 
sense World. Norwood, N J: Ablex. 
Hobbs, J.R. 1985. On the Coherence and 
Structure of Discourse. CSLI Report 
CSLI-85-37. 
Iwanska, L., D. Appelt, D. Ayuso, 
K. Dahlgren, B. Stalls, R. Grishman, 
G. Krupka, C. Montgomery, and E. Riloff. 
1991. Computational aspects of discourse 
in the context of MUC-3. Proc. of the 
Third Message Understanding Conference 
(MUC-3), 256-282. 
Katz, J.J. and J.A. Fodor. 1963. The 
Structure of Semantic Theory. Language 
39:170-210. 
Knott, A. and R. Dale. 1994. Using Lin- 
guistic Phenomena to Motivate a Set of 
Rhetorical Relations. Discourse Processes 
18(1):35-62. 
Lochbaum, K.E., B.J. Grosz and C.L. Sid- 
ner. 1990. Models of plans to support 
communication: an initial report. Proc. 
AAAI: 485-490. 
Lord, C. and K. Dahlgren. 1997. Partic- 
ipant and event anaphora in newspaper 
articles. In J. Bybee et al. (Eds.) Essays 
on Language Function and Language Type 
Dedicated to T. Givon. Amsterdam: John 
Benjamins. 
Mann, W. and S. Thompson. 1987. Rhetor- 
ical Structure Theory: A Theory of Tezt 
Organization. ISI Reprint Series: ISI-RS- 
87-190. 
Morrow, D.G.i S.L. Greenspan, and 
G.H. Bower. 1987. Accessibility and situ- 
ation models in narrative comprehension. 
Journal of Memory and Language 26:165- 
87. 
Polanyi, L. 1988. A formal model of the 
structure of discourse. Journal of Prag- 
matics 12:601-638. 
Reichman, R. 1985. Getting Computers to 
Talk Like You and Me. Cambridge, MA: 
MIT Press. 
Trabasso, T. and L.L. Sperry. 1985. Causal 
relatedness and the importance of story 
events. Journal of Memory and Language 
24:595-611. 
van den Broek, P., P.J. Bauer, and 
T. Bourg (Eds.) 1997. Developmental 
Spans in Event Comprehension and Rep- 
resentation: Bridging Fictional and Ac- 
tual Events. Mahwah, N J: Lawrence Erl- 
baum. 
Van Dijk, T. and W. Kintsch. 1983. Strate- 
gies of Discourse Comprehension. New 
York: Academic Press. 
