Issues in Linguistic Segmentation
Janyce M. Wiebe
Department of Computer Science and the Computing Research Laboratory
Box 30001/Dept. CS 
New Mexico State University 
Las Cruces, NM 88003 
wiebe@nmsu.edu
This paper addresses discourse structure from the perspective of understanding. 
It would perhaps help us understand the nature of discourse relations if we better
understood what units of a text can be related to one another. In one major theory of discourse
structure, Rhetorical Structure Theory (Mann & Thompson 1988; hereafter simply RST), the
smallest possible linguistic units that can participate in a rhetorical relation are called units,
and "are essentially clauses, except that clausal subjects and complements and restrictive
relative clauses are considered as parts of their host clause units rather than as separate units"
[p. 248]. But both Dale and Meteer (in these proceedings) point out that rhetorical relations
can appear within clausal units. (Dale's argument will be discussed at the end of this paper.)
For example, the relation that is expressed in two clauses in (1.2) is expressed in only one
clause in (1.1) (from Meteer, these proceedings):
(1) 1.1 My flicking the switch caused the light to turn on. 
1.2 Because I flicked the switch, the light turned on. 
Similarly, Hwang & Schubert (1992), in their work on recognizing temporal relations among
episodes in discourse, argue for a "fine structure" of discourse, in which temporal relations
can be established even among episodes of subordinate clauses.
This paper points out another discourse phenomenon that calls for a "fine structure"
of discourse. In passages containing attitude reports (reports of agents' beliefs, knowledge,
intentions, perceptions, etc.), rhetorical relations can hold such that one or more of the
linguistic units involved in a relation is only part of a sentence.¹ In some cases, such a unit
may be smaller than the smallest possible unit in RST. Specifically, only the complement
of an attitude report, rather than the entire sentence, might be involved in some particular
relation. (An example is (2) below, which will be discussed shortly.) To make matters more
concrete, we will consider short passages in which an attitude report participates in a relation
indicated by the cue phrase 'but', where 'but' is being used to connect clauses.
¹ Note that this paper is meant to illustrate some complexities that I believe require attention. Undoubtedly,
similar discourse structures occur that do not involve attitude reports.
Consider the following passage:
(2) 2.1 John knew that Mary had never been introduced to Sam.
2.2 But she had been introduced to Derek.
One reading of this passage is that (2.2), as well as (2.1), presents an attitude of John.
Imagine that (2) appears in a narrative in which John is forming a plan, and whom Mary has
and has not met is somehow important to this plan. Under this reading, (2.2) is an example
of a sentence that presents the attitude of some agent X, even though neither X nor the attitude
is mentioned in the sentence. (Detecting such sentences specifically in third-person fictional
narrative text was the focus of previous work; see Wiebe 1990.)
Notice that 'But' in (2.2) is being used to connect clauses, and is not additionally marking
the beginning of a new discourse segment (as the term discourse segment is used in Grosz &
Sidner 1986). The question we are asking is: what clauses are being connected by 'But' in (2)?
Under the reading described above, the following are the clauses participating in the relation:
Mary had never been introduced to Sam. [the complement of (2.1)]
But she had been introduced to Derek.
Contrast (2) with (3): 
(3) 3.1 John thought that Richard had stabbed him in the back.
3.2 But John was often too suspicious of Richard.
In (3), the entire sentences (3.1) and (3.2) participate in the relation indicated by 'But'.
Deciding which linguistic units are involved in a relation is not sufficient for understanding
how they are related, of course. Generally speaking, knowledge about the world and/or
what the speaker or writer is trying to accomplish in the discourse (Moore & Pollack 1992)
would presumably be involved in arriving at the actual contrasts being made in (2) and (3)
(or whatever sorts of relations are being indicated by 'But'). But to hope to arrive at an
understanding of such texts, an NLU system must entertain the possibility that one or more
of the linguistic units involved in the relation may be only a clausal complement.
Following is another short passage in which 'but' indicates a relation involving an attitude
report. For this passage, RST units are sufficiently fine-grained. The passage is of interest to
us here because the main clause of the second sentence, (4.2), is not involved in the relation
(what are numbered in this passage are RST units):
(4) 4.1 The car was finally coming toward him. 
4.2 He finished his diagnostic tests, 4.3 feeling relief. 
4.4 But then the car started to turn right.
The relation indicated by 'But' is between his relief at the car coming toward him ((4.1) and
(4.3) together) and the car then turning right (4.4). That is, for the purpose of understanding
the relation indicated by 'But', (4.1) and (4.3) are grouped together, which are in turn grouped
with (4.4). But there are clearly also narrative relations to be recognized in this passage (e.g.,
(4.2) and (4.4) are in a sequence relation), which involve other groupings of the clauses. Many
(such as Moore & Pollack 1992, and Dale, Hughes & McCoy, Meteer, and Moser & Moore in
these proceedings) have noted that more than one type of relation can simultaneously hold
among elements of the discourse; we see another example of this here. In discourses presenting
attitudes, because they present states, events, and objects as well as attitudes toward them,
a linguistic unit can be involved in more than one kind of relation, possibly grouped with
different units.
I used "grouped" above for lack of a completely appropriate term from the literature. It
would be good to talk about these groupings as discourse segments of the linguistic structure
of Grosz & Sidner's theory, to distinguish linguistic structure from the non-linguistic basis for
that structure. This would be misleading, however, because discourse is structured in Grosz
& Sidner's theory on the basis of intentions. A rhetorical relation holding between pieces
of a discourse, such as the one indicated by 'But' in (4), does not necessarily make them
into a discourse segment (see Grosz & Sidner 1986, p. 188, Moore & Paris 1992, p. 46, and
Dale, these proceedings, for discussion). Another possible term is Mann & Thompson's text
span, which suggests (at least in practice) segments that are either units, or composed only of
adjacent units (excluding, for example, a segment composed of (4.1) and (4.3) together with
(4.4)).
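
To make the contrast between a "grouping" and an adjacent-units text span concrete, the following is a minimal sketch, not part of the original analysis. It records the two relations discussed for passage (4) and checks whether each grouping consists only of adjacent units. The names Relation, is_contiguous, and relations_involving are hypothetical, and the label "but" merely stands in for whatever relation the cue phrase indicates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    name: str      # illustrative label only; the text does not name the 'But' relation
    left: tuple    # positions of the units grouped on one side
    right: tuple   # positions of the units grouped on the other side

# RST units of passage (4), keyed by their position in the text.
UNITS = {
    1: "The car was finally coming toward him.",    # (4.1)
    2: "He finished his diagnostic tests,",         # (4.2)
    3: "feeling relief.",                           # (4.3)
    4: "But then the car started to turn right.",   # (4.4)
}

RELATIONS = [
    Relation("but", left=(1, 3), right=(4,)),       # relief at (4.1) vs. the turn in (4.4)
    Relation("sequence", left=(2,), right=(4,)),    # narrative sequence noted in the text
]

def is_contiguous(positions):
    """True if the positions form an unbroken run of adjacent units."""
    ordered = sorted(positions)
    return all(b - a == 1 for a, b in zip(ordered, ordered[1:]))

def relations_involving(unit):
    """Names of all recorded relations in which the given unit takes part."""
    return [r.name for r in RELATIONS if unit in r.left or unit in r.right]

if __name__ == "__main__":
    for r in RELATIONS:
        print(r.name, "left side adjacent:", is_contiguous(r.left))
    print("(4.4) takes part in:", relations_involving(4))
```

Under these assumptions, the left side of the 'But' relation skips (4.2) and so fails the adjacency check, while (4.4) appears in both relations with different partners.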
The examples given above do not illustrate the range of possible discourse structures
in which attitude reports participate. I am currently analyzing text segments (from on-line
texts) that contain both attitude terms and particular cue phrases to try to identify the various
possibilities. My goal is to develop a mechanism that uses syntactic and lexical knowledge
to identify the segments involved in the relation. Without access to world and/or intentional
knowledge, such a mechanism could produce only likely hypotheses; the idea is to see if
information extracted by the non-discourse components of an overall NLU system could be
used to constrain processing at deeper levels (Wiebe 1990, Bergler to appear, Passonneau &
Litman 1993, and Hirschberg & Litman to appear each address using one or more of syntactic,
lexical, orthographic, and intonational information to perform discourse tasks).
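
As a rough illustration of what "producing only likely hypotheses" might look like, here is a toy sketch; it is not the mechanism under development. It assumes a tiny hand-made list of attitude verbs and a single surface pattern ("AGENT VERB that COMPLEMENT"), both hypothetical, and it returns every unit of a sentence that a following 'But' clause could be related to.

```python
import re

# Hypothetical, deliberately tiny lexicon of attitude verbs (an assumption for this sketch).
ATTITUDE_VERBS = {"knew", "thought", "believed", "wanted", "saw"}

# Toy surface pattern for an attitude report: "<agent> <verb> that <complement>."
PATTERN = re.compile(r"^(?P<agent>\w+) (?P<verb>\w+) that (?P<complement>.+?)\.?$")

def candidate_units(sentence):
    """Return (kind, text) pairs for the units a following 'But' clause might attach to.

    The whole sentence is always a candidate; the complement is added when the
    sentence looks like an attitude report. No attempt is made to choose between them.
    """
    candidates = [("sentence", sentence)]
    match = PATTERN.match(sentence)
    if match and match.group("verb") in ATTITUDE_VERBS:
        candidates.append(("complement", match.group("complement")))
    return candidates

if __name__ == "__main__":
    for text in (
        "John knew that Mary had never been introduced to Sam.",   # (2.1)
        "John thought that Richard had stabbed him in the back.",  # (3.1)
        "The car was finally coming toward him.",                  # (4.1)
    ):
        print(candidate_units(text))
```

For both (2.1) and (3.1) the sketch proposes the full sentence and the complement alike. Choosing the complement in (2) but the full sentence in (3) is exactly the step that would need world or intentional knowledge.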
In summary, in discourses with attitude reports, (A) linguistic units smaller than RST
units may be involved in a discourse relation, and (B) a single linguistic unit may be involved in
more than one kind of relation, possibly together with different linguistic units. As mentioned
above, others have also noted (A) and (B). Dale makes the interesting argument that, among
other things, (A) and (B) suggest that we should banish those rhetorical relations which simply
mirror underlying knowledge-base relations. In many cases, he points out, rhetorical relations
are simply subject-matter relations: in establishing such relations, all we are really doing is
identifying knowledge-base relations between entities mentioned in the discourse. With this
in mind, the fact that we find many instances of (A) and (B) is not surprising. Since various
syntactic constituents evoke various objects, states, and events, it is not surprising that one
can find discourse relations (mirroring knowledge-base relations) that involve various pieces
of sentences.
I think that Dale makes some very good points. As I suggested above, one finds structures
such as the ones illustrated in this paper because discourses can present attitudes towards
things as well as presenting those things themselves. Thus, we find relations among parts of
sentences evoking only the objects of attitudes, as well as among those evoking the attitudes
themselves. Further, I certainly do not disagree with Dale's suggestion that focusing on those
rhetorical relations that are clearly not domain relations would be a way to better understand
communicative intentions.
Given a rich knowledge base, however, out of all of the possible knowledge-base relations
that can hold among all the things evoked in a discourse, only some are intended to be picked
out as the basis for coherence. (If we allow default inference, the number of possible relations
is astounding.) Hobbs (1979), among others, argues this. As such, certain groupings of
linguistic units, i.e., those evoking the things involved in these relations, are more important
than others for establishing coherence. Perhaps the ideal discourse model is one in which the
process of arriving at these groupings and associated relations is governed by a process of
intention recognition. But investigating local coherence directly (how it manifests itself in
various contexts in naturally-occurring texts, and how non-pragmatic information might be
exploited to recognize it) could provide important constraints for intention-based models.

References 
[1] Bergler, S. From Lexical Semantics to Text Analysis (to appear). In P. Saint-Dizier and
E. Viegas (eds.), Computational Lexical Semantics, Cambridge University Press.
[2] Grosz, B. J. & Sidner, C. (1986). Attention, Intentions, and the Structure of Discourse.
Computational Linguistics 12(3): 175-204.
[3] Hirschberg, J. and Litman, D. (to appear). Empirical Studies on the Disambiguation
of Cue Phrases. Computational Linguistics.
[4] Hobbs, J. R. (1979). Coherence and Coreference. Cognitive Science 3(1): 67-90.
[5] Hwang, C. H. & Schubert, L. K. (1992). Tense Trees as the "Fine Structure" of
Discourse. In Proc. of ACL-92: 232-240.
[6] Mann, W. & Thompson, S. (1988). Rhetorical Structure Theory: Toward a functional
theory of text organization. Text 8(3): 243-281.
[7] Moore, J. & Paris, C. (1992). Planning Text for Advisory Dialogues: Capturing
Intentional and Rhetorical Information. Technical Report 92-22 (Pittsburgh: University of
Pittsburgh Dept. of Computer Science).
[8] Moore, J. & Pollack, M. (1992). A Problem for RST: The Need for Multi-Level Discourse
Analysis. Computational Linguistics 18(4): 537-544.
[9] Passonneau, R. & Litman, D. (1993). Intention-Based Segmentation: Human Reliability
and Correlation with Linguistic Cues. In Proc. of ACL-93.
[10] Wiebe, J. (1990). Recognizing Subjective Sentences: A Computational Investigation
of Narrative Text. Ph.D. dissertation. Technical Report 90-03 (Buffalo: SUNY Buffalo
Dept. of Computer Science).
