Extending a Formal and Computational Model of Rhetorical Structure 
Theory with Intentional Structures h la Grosz and Sidner 
l)aniel Marcu 
Information Sciences Institute and Departnlent of Computer Science 
University o1' Southern California 
4676 Admiralty Way, Suite 1001 
Marina del Rey, CA 90292-6601 
marcuOis i. edu 
Abstract 
In the last decade, members of the computational lingt, is- 
tics community have adopted a perspective on discourse 
based primarily on either Rhetorical Structure Theory or 
Grosz and Sidner's Theory. However, only recently, re.+ 
searchers have started to investigate the relationship be- 
tween the two perspectives. In this paper, we use Moscr 
and Moore's (1996) work as a departure point for extend- 
ing Marcu's formalization of RST (1996). The result is 
a tirst-order axiomatization of the mathematical prol+er- 
ties o1' text structures and of the rehttionship between the 
strttcture of text and intentions. The axiomatization en- 
ables one lo use intentions for reducing the ambiguity o1' 
discourse and the structure of discourse for deriving in- 
tentional inferences. 
I Motivation 
I n the last decade, members of the computational l inguis- 
lies cotnnmnity have adopted a perspective on discourse 
based prinlarily on either l{hetorical Structure Theory 
(P, ST) (Matin and Thompson, 1988) or Grosz and Sid 
her's Theory (GST) (Grosz and Sidnet. 1986). 
In GSq, the linguistic constituents are called discom'xe 
segments (DS) and the lingt, istic discourse slructure is 
explicitly stipulated to be a tree o1' recursively embedded 
discourse segments. Each discourse segment is charac- 
tel+ized by a prinmry intention, which is called discomwe 
segment lmrpose (DSP). GST identilies only two kinds 
o1' intention-based relations that hold between the DSPs 
of two discourse segments: domittance and sati.@tction 
precedence. When a discourse segment purpose DSPI 
that characterizes discourse segment DS1 provides part 
of the satisfaction of a discourse segment purpose DSP., 
that characterizes discourse segment DS..,, with DS1 be- 
ing embedded in DS2, it is said that there exists a domi- 
nance relation between DSP~ and DSlq, i.e., DSP.e dom- 
inates DSpI. 1t' the salislhction of DSP, is a condition of 
the satisfaction oI'DSP2, it is said that DSP1 sati,@tction- 
precedes DSP.,. 
RST has a richer ontology of relations than GST: in- 
tentional and semantic rhetorical relations are considered 
to hold between non-overlaplfing textual spans. Most 
of these relations are asymmetric, i.e., they distinguish 
between their associated nuclei, which express what is 
most essential to the writer's purpose, and their satellites, 
which support tile nuclei. In RS'I, the linguisticdiscourse 
structure is modeled recursively as a tree of related seg- 
ments. Hence, unlike GSq, where relations are consid- 
ered to hold between the DSPs associated with embed- 
ded segments, relations in RST hold between adjacent, 
non-overlapping segments. 
Because RST has traditionally been applied to build 
discourse trees of liner granularity than GST, we will 
use it here as the starting point of our discussion. As- 
sume, for example, that we are given tim following text 
(in which the elementary textual units arc labelled lbr 
reference). 
(I) INo matlcr how much one wants to stay a non-smoker, ^~ \] 
\[the truth is that the pressure to smoke in junior high is 
greater than it will be any other time of one's life) q \] 
IWe know tim\[ 3,000 teens start smoking each day, q \] lal- 
though it is a fact that 90% of them once thoughl thai 
smoking was something that Ihey'd never do. D~ I 
Assume for the moment that we do not analyze this text 
as a whole, but rather, we dctcrlnine what rhetorical rela- 
tions could hold between every pair of elementary units. 
When we apply, for example, the definitions proposed 
by Mann and Thompson (1988), we obtain the set given 
below, l 
'rh, c~_,rcl(JUSTIFICNI'ION, AraB1) 
rhcl.rrel(J USTI FICATI()N, I)1 ~ I~,1 ) 
(2) r/+ct_rcl(F, Vll)ENCF, (h, I~q) 
'rhcI._.rcl(CONCFSSION, I)i, C1 ) 
'rhet_rcI(RFSTATF.MENT, 1)1, A1) 
These relations hold because the tmdcrstanding of both 
A1 (teens want to stay non-smokers) and I:h (90% o1' the 
teens think that smoking is something that they wotdd 
never do) will increase the reader's readiness to accept 
the writer's right to present Ih (the pressure on teens to 
start smoking is greater than it will be any other time 
of their lives); the understanding of c1 (3000 teens start 
smoking each day) will increase the reader's belief of 
1~1; the recognition of Ih as something compatible with 
IThroughoul this paper, we use the convention lhat rhelori- 
cal relations are represented as stated, lirst-order predicales hav- 
ing lhe fornl rhct_rel(,act.me, mLzellite, ?~mlcus). Mullint|- 
clear relalions are represented as predicales having Ihe \['orlll 
rhct_rcl( ~m',~c, n~tcle'usl , ~uclcus~ ). 
523 
AI B1 Cl Ol 
JUSIIF,CAI\[ON JUSTIFICATION JusnRcATJON 
EVIDENCE D1 JU8 TIFIOAIION Ol 
c$ B1 CONCESSION 
al B1 cI D1 nl Ci 
h) c) d) 
Figure 1 : The sot o1' all RS-trees that can be built for text (1). 
JUST\[FICAIION 
:,1 .... 
AI JUSTtFIOAItON 
gl CI 
e) 
the situation presented in c1 will increase the reader's 
negative regard for the situation presented in cl ; and the 
situation presented in D, is a restatement o1' the situation 
presented in &. 
Marcu (1996) has shown that on the basis of only the 
rhetorical judgments in (2) and without considering in- 
tentions, there are five valid RS-trces that one can build 
for text ( I ) (see figure l ). What happens though when we 
consider intentions as well? Moore and Pollack (1992) 
have already shown that different high-level intentions 
yield different RS-trces. But how do we formalize tile 
relationship between intentions and rhetorical structures? 
For example, how can we use the discourse trees in fig- 
ure 1 in order to determine the primary intention asso- 
ciated with each analysis? And how can we determine 
what would be the corresponding dominance relations in 
a GST account of tile same text'? 
Consider also a slightly difl'erent problem: assume that 
besides rhetorical judgments, such as those shown in (2), 
one can also make intentional judgments. For example, 
assume that one is interested in an interpretation in which 
one knows that the DSP of seg,nent \[&, D1\], which con- 
tains all units from A1 tO 1)1, dominates the DSP of seg- 
ment \[c1, l)~\]. Then what is the primary intention of the 
text in that case'? And how many discourse trees are both 
valid and consistent with that intentional judgment? Nei- 
ther RST nor GST can answer these questions on their 
own. However, a unified theory can. Ill this paper, we 
provide such a theory. 
2 The limits of Moser and Moore's 
approach 
In a recent proposal, Moser and Moore (1996) argued 
that the primary intentions in a GST representation can 
be derived fi'om the nuclei ot'the corresponding RST rep- 
resentation. Although their proposal is consistent with 
the cases in which each textual span is characterized by 
an explicit nucleus that encodes the primary intention of 
that span (as in the case of text (I)), it seems that an ad- 
equate account of the correspondence between GST and 
RST is somewhat more complicated. For example, in tile 
case of text (3) below, whose RST analysis is shown in 
ligure 2, we cannot apply Moser and Moore's approach 
because we can associate tile primary intention of dis- 
course segment \[a2, B2\] neither to trait A2 nor to trait B2. 
(3) \[John wanted to play squash with Janet,Aq \[but he 
NONVOLITIONAL 
CAUSE 
C2 
A2 B2 
Figure 2: A rhetorical analysis of text (3). 
also wanted to have dinner with Suzanne. '~2\] \[He went 
crazy, c2 \] 
In Grosz and Sidner's terms, we can say that the primary 
intention ot' segment \[A2, B~\] is (Intend writer (Believe 
reader "John wanted to do two things that were incom- 
patible")). But in order to recognize this relation, we 
need to recognize that the two desires given in units A~ 
and B2 are incompatible, which is captured by the CON- 
TRAST relation that holds between the two units. In other 
words, the intention associated with segment \[A2, B2\] is a 
function both el' its nuclei, A 2 and B2, and of the rhetori- 
cal relation of CONTRAST that holds between them. 
In this paper, we generalize this obserwttion by 
making use o1" the compositionality criterion proposed 
in (Marcu, 1996), which stipulates that it'a rhetorical 
relation holds between two textual spans, a si,nilar re- 
lation also holds between two salient constructs of those 
spans. 2 Similarly, we will assume that the primary inten- 
tion of a discourse segment is not given by the nucleus 
of the corresponding relation but rather that it depends 
on the corresponding relation and the salient constructs 
associated with that segment. 
3 Melding text structures and intentions 
3.1 Formulation of the problem 
Formally, the problem that we want to solve is the 
following. Given a sequence of textual units U = 
tq, u2,..., UN, a set 1U~ of rhetorical relations that hold 
among these units, and a set o1' intentional judgments IH 
that pertain to the same units, find all legal discourse 
structures (trees) of U, and determine the dominance, 
satisl'action-precedence relations, and primary intentions 
of each span of these trees. 
Following (Marcu, 1996), we use tile predicates 
posiHo,z(ui, j) and vl, eId'd(,za,,,e, s, ,z) with the fol- 
2Seclion 3 discusses in detail how the salient construcls are deler- 
mined. 
524 
lowing semantics: tim predicate posilion(ui, j) is tree 
for a textual unit ul in sequence U if and only if 
ul is the j-th element in the sequence; the predicate 
rhei_vel(namc, ui, uj) is true for textual units ul and 
uj witb respect to rhetorical relation name, if and only 
it' the detinition provided by RST for rhetorical relation 
name applies to textual units ui, in most cases a satellite, 
and uj, a nucleus. In order to enable discourse prob- 
lems to be characterized by rhetorical judgments that 
hold between large textual spans as well, we use pred- 
icate rh.cl_rel_ext(namc, s~, s~, *~, n~). This predicate 
is trl, e for textual spans \[ss, .%\] and \[,,.,, n0\] with respect 
to rhetorical relation name if and only if the detinition of 
rhetorical relation name applies for tim textual span that 
ranges over units ss--se, ill most cases a satellite, alld tex- 
tual spans that ranges over units n.~-nc, a nucleus. 3 
From a rhetorical perspective, text (I) is described at 
the minimal unit level by the relations given in (2) and (4) 
below. 
f l,ositio,,.(A1,1), 1,ositio,~(lh, 2), (4) 
1,ositio,,.( C~ , 3), 1,ositio,,O), , ,1) 
The intentional judgments 1~1 are given by the follow- 
ing functions and predicates: 
• The predicate dom(l~, lq, 1~, h-,) is true whenever 
tbe DSP of discourse segment/span \[I1, hl\] domi- 
nates ttle DSI' of discourse segment \[l~, h:~\]. A dom- 
inance relation is well-formed if segment \[/~, h~\] 
is a proper subsegment of segment \[ll, h,t\]. i.e., 
l, </~ < h., < h, A (h ¢ z~ v h~ # h~). 
• The predicate salpvec(ll, Ih, lu, h..,) is true when- 
ever an intentional satisfactiol>precedence relation 
holds between the DSI's of segments Ill, hi\] and 
\[/2, h2\]. A satisfaction-precedence relation is well- 
formed if tile segments do not overlap. 
• Tile oracle function .fl(r, aq,..., ;%) takes as at: 
guments a rhetorical relation r and a set of texttufl 
units, and returns tbe primary intention that pertains 
to that relation and those units. For example, in 
the case of segment \[A2, Be\] in text (3), the ora- 
cle function .l) (CONTRAST, A2, B2) is assullted to 
returu a Iirst-order object wltose meaning can be 
glossed as "inform the reader that John wanted to 
do two things that were incompatible". And the 
oracle function .1) (EWDI ~;NcE, B1) associated with 
segntent \[A1,1)~\] in text (1) is assuntcd to return 
a \[irst-oMer object whose nteaning can be glossed 
as "increase the reader's belief that the pressure to 
smoke in junior high is greater than it will be any 
other time of one's life". 
Without restricting the generality of the problem, dis- 
course structures are assented to be binary trees. In our 
formalization, each node era discourse structure is char- 
aclerized by l()tu" features: the status (nucleus or satel- 
lite), tim O'lJe (the rhetorical relations tlmt hold between 
3'Fhe s ~llld e subscripls COlTCgpond Io .~tm'ling ~.lll(I ending posilions. 
the text spans that that node spans over), the l)romotion 
set (the set of units that constitute the most "salient" (ira- 
pertain) part of the text that is spanned by that node), 
and tile i)rima O, intelltion. By convention, for each leaf 
node, the type is LEAF, the promotion set is tile textual 
unit to which it corresponds, and tbe primary intention 
is that of inJbmting the content of that unit. For exam- 
pie, a representation of the tree in ligure 1.a that makes 
explicit the features el' all spans that play an active role 
in the final representation is given in \[igure 3. In general, 
the salient units are computed using the comlmsitionality 
criterion proposed in (Marcu, 1996), i.e, they are given 
by the union of the salient units of the immediate sub- 
ordinated nuclei. Similarly, the primary intentions are a 
function of tbe rhetorical relation (type) and salient units 
of each span. 
The status, type, promotion set, and primary intention 
that are associated with each node in a discourse trec pro- 
vide suflieient information for a full description of an in- 
stance of a tree structure. Given the linear nature of text 
and the fact that we cannot predict in advance where the 
boundaries between various segments will be d,'awn, we 
should provide a lnethodology that permits one to enu- 
merate all possible ways in which a tree could bc built 
on the lop of a linear sequence of elementary discourse 
units. The solution we use relies on tile same intuition 
that constitutes tile foundation of chart parsing: just as a 
chart parser is capable of consklering all possible ways 
in which different words in a sentence could be chlstered 
into higher-order grammatical units, so our formalization 
is capable of considering all the possible ways in which 
different segments coukl be joined into discourse trees. 
l,et spa,tLj, or simply \[i,j\], denote a text span 
thai includes all tile elementary discourse unils be- 
tween position i and j. Then, if we consider a 
sequence of discourse units .u~, It2:... ~'lt~t, there 
are n ways in which spans o1' length one could 
be built, spa'~Zl,l, st)(tLt2,2, • • • , 'sl)(t'/tn,n; it - \] 
ways in which spans of length two could be built, 
• spa~zl,2: Sl)~Ut..&3~... , spaltn-l,n; 11 -- 2 ways 
in which spans of length three could be built, 
and one 6"\])(t?l.l ; h Sl)(tll.2,4~ . . . ~ .5\]}altn-2,n; . . . ; 
way in which a span of length n coukl be built, spa771,n. 
Since it is impossible to determine a priori the sl)ans 
that will be used to make up a discourse tree, we will 
associate with each span that could possibly become 
part of a tree a status, a type, promotion, and primary 
intention relation and let discourse and intentional 
constraints determine the valid discourse trees. In 
other words, we want to ¢tetermine from the set of 
ha- (,,.- 1)-t- (n-2) +...+ 1 = n(n4- 1)/2 potentM 
spans that pertain to a sequence of n discourse units, the 
subset that adheres to some constraints of rhetorical and 
intentional well-formedness. For example, for text 1, 
there are d + 3 -t- 2 + \[ = l0 potential spans, i.e., 
S\])(17tl ,1 ~ 8\])aTt2,2: S1)fl713,3, sl)a?).,1,4, 8P(t?tl,2~ $1)(/N.2,3, 
sPa~l:l,4: 8Payt.1,3~ s'\])a?12,,t, and 8p(I.711,.I, but 
525 
~lllS = SATELLITE 
t)'l'ype = LEAF 
l'romotion = {all 
lntclltion = f(al) 
A1 -D1 
-- Type = EVIDENCE 
~'-~- Pl'OlllO\[ion = {B1} 
~" \]IIIonIIOII =~\[ E IVlDENCE,B1) 
hi ,,-uB1 ~-Slltttls == J UST;FICATION NUCLEUS C1-DI ('~a~ ~732 ~" ~ Type ~-- --~- ~Q.y Type SlaltlS := CONCESSION 8ATELLITE 
/~'~Q~l'Oiilotion = {811 ~ Promotion = {Ol I 
~" /i,~ltion = f {JUSTIFICATION,all ~ "~hl~atiozl = I (CONCESSION,C1) \>. 
\ 
"/ DI ~__ Slat/" N ..... ATELUTE S\[a\[tlS = NUCLEUS C1 /~Slil|tlS = NUCLEUS BI 
~_. \tJTyt,o = LEAE --- Q{)'l'ype = LEgs ---- { 3)Type = LEAF 
Promolion = Ira} Promotion = {cq Promotion = {DI1 
hltcntion = f tin) Intention = f (el) Intention = f jta) 
Figure 3: A representation of tree l.a that includes the status, type, promotion, and primary intention features that 
characterize every node that does not have a NONE status. The nunlbers associated with each node denote the limits of 
the text span that that node characterizes. 
only seven of them play an act?ve role in 
the representation given in figure l.a, i.e., 
8\])(l~.1,1, SP(llZ2,2, $1)(t1~.3,3, 8\])(t1~'4,4, St)Ctl~l ,2, spa~.3,4, 
alld .5'\])a IZ 1,4- 
To formalize the constraints that pertain both to RST 
and GST, we thus assume that each potential span \[1, hi 
is characterized by the following predicates: 
• S(I, h, s lalus) provides the status of span El, h\], i.e., 
the text span that contains units / to h; staZus can 
take one of the values NUCLEUS, SATELLITE, or 
NUNS. according to the role played by that span 
in the tinal discot,rse tree. For example, for the 
tree depicted in tigure 3, some of the relations that 
hold are: ,5'(1, 2, NUCLEUS),,5'(3, 4, SATELLITE), 
,5"(1,3, NONE). 
• T(1, h, relation_ua.rn.e) provides the name of the 
rhetorical relation that holds between the text 
spans that are immediate subordinates o1' span 
El, h\] in the discourse tree. If the text span is 
not used in the construction of the final tree, 
the type assigned is NONE. For example, for 
the tree in ligure 3, some o1' the relations that 
hold are: T(I, J, LEAF), 5/'(1,2, JUSTW~CATION), 
T(3, 4, CONC~SSrON), T(1, 3, NONE). 
• P(I, h.,unit_name) provides one of the set of 
units that are salient for span El, h\]. The col- 
lection of units for which the predicate is true 
provides the promotion set of a span, i.e., all 
units that are salient for that span. If span \[1, h\] 
is not used in the tilml tree, by convention, the 
set of salient units is NONE. For example, for 
the tree in figure 3, some of the relations that 
hold are: P(1, 1., &), P(1, 2, lh), P(1,3, NONE), 
1'(3, 4, D,). 
• Ill, h, intention) provides the primary intention 
of discourse span El, h\]. The term iulenlion is 
represented using the oracle ftmction J). For ex- 
ample, for the tree in figure 3, some of the rela- 
tions that tloi(t arc: I(3, 4, f/(CONCESSION, Cj )), 
l(J,/1, .fI(P:VIDENCI~:, B\])), l(J, 3, NONE). 
3.2 An integrated formalization of RST and GST 
Using the ideas that we have discussed ill the previous 
section, we present now a first-order formalization of dis- 
course structures that makes use both of RST- and GST- 
like constraints. In this lbrmalization, wc assume a uni- 
verse that consists of the set of natural numbers fi'om J 
tO N, where N represents the number of textual units in 
the text that is considered; the set of names thai were 
defined by Mann and Thompson for each rhetorical rela- 
tion; the set of unit names that are associated with each 
textual unit; and four exlra constants: NUCLEUS, SATEL- 
LITE, NONE, and LI~2AF. The formalization is assumed lo 
provide unique name axioms for all these constants. 
The only funclion symbols that operate eve," the as- 
sumed domain are the mlditional + and - functions that 
are associated with the set of natural numbers and the or- 
acle function J). The formalization uses the traditional 
predicate symbols that pertain to the set of natural num- 
bers (<, <, >, >, =, ¢) and eight other predicate sym- 
bols: ,5', T, P and I to account for the status, type, salienl 
units, and primary intention that are associated with ev- 
ery text span; vhel_vel to account for the rhetorical rela- 
lions that hold between different textual units; position 
to account for the index of the textual units in lhe text 
dmt one considers; dora to account for dominance rela- 
tions; and satprec to account for satisfaction-precedence 
relations. 
Throughout the paper, we apply the convention that 
all unbound variables are universally quantified and that 
variables are represented in lower-case italics" and con- 
stants in SMALL CAPITALS. We also make use of the 
two extra relations, vclevaul_uni~ and relevant_tel. 
For every text span span \[/, hi, relevant_unit(l, h, u) 
describes the set ot' textual units that are relevant for 
that text span, i.e., the units whose positions in the 
initial sequence are numbers in the interval \[l, hi. It 
is only these units that can be used to label the pro- 
526 
motion set associated with a tree that subsumes all 
units in the interval \[l, hi. For every text span \[1, h.\], 
vclevcm.Z_vcl(l, h, name) describes the set of rhetorical 
relations that are relevant to that text span, i.e., the set of 
rhetorical relations that span over text units in the inter- 
val \[1, h\] and the set of extended rhetorical relations that 
span over text spans that cover the whole interval \[/, h\] 
(see (Marcu, 1996) for the formal delinitions of these re- 
httions.) 
For example, fin" text (1), which is descrihed formally 
in (2) and (4), the following is the set of all rclc'~a~zl_rel 
and vclevctn~_unil, relations that hold with respect to text 
segment \[l,3\]: {vclcvanLvcl(l,3, JUSTWlCaTtON), 
'rclcvanl_vcl(l, 3, EVII)ENCl0, relevcr, t_m~it(I, 3, &), 
,.~z~v.,,t_~,,nit(l, a, B~), ,.d~,:.,,z_,,,,it(l, :~, q)}. 
The constraints that pertain to the discourse trees that 
we formalize can be partitioned into constraints related to 
the domain of objects over which each predicate ranges, 
constraints related to the structure of the tree, and con- 
straints that relate the slrucltlral COlnponenl with the in- 
tentional component. The axioms that pertain to the do- 
mains over which predicates ,5, P, and 7' range and the 
constraints related to the structure of the live are the same 
as those given by Marcu (1996). For lhe sake of com- 
pleteness, in this paper we only enumerate then\] infor- 
mally. In contrast, the axioms that pertain to intentions 
and the relation between structure and intentions are dis- 
cussed in detail. 
Constraints that concern the objects over which the 
predicates that describe every segment \[1, hi of a text 
structure range (Mareu, 1996, pp. 1072-1073). 
,, For every siren \[/, h\], the set or objects over which 
predicate ,5' ranges is the set {NUC1A,~US, SNI'ELIJTI,\], 
NONE). 
• The status of any discourse segment is unique. 
• For every segment \[l, h\], the set of objects over 
which predicate 7' ranges is the set of rhetorical re- 
lations that are relevant to that span. 
• At most one rhetorical rdation can connect two ad- 
jacent discourse spans 
• The primary intention of a discourse segment is ei- 
ther NONE or is a function of the salient units that per- 
tain to that segment and of the rhetorical relation that 
holds between the immediate subordinated segments. 
Since we want to stay within the boundaries of Iirst-order 
logic, we express this (see formula (5) below) by means 
of a disjunction of at most N sulfformulas, which corre- 
spond to the cases in which the span has I, 2 .... , or N 
salient traits. 4 
4Formula (5) reflects no preference concerning lhe order in which 
rhetorical relalions and intentions should be computed (Asher and Las- 
carides, 1998). It only asserts a consh'ailll on the two. 
\[(1 < h < N) A (l < I < h.)\] 
{ I(I, h, i'n.t~t.io~u,) --, 
i.n.leT~ionzb = NONF, V 
(~,', .,)\[7'(I, h, ,') A ,' ¢ NONI:.A 
PU, h., ..;) A (V,/)(\]'(~, :,, y) -~ ,; = y)A 
i,,.te,,gio,,4h = fz(,', ,;)\]V 
(~'r, ~c,,-2)\[{1'(/, h, r) A 'r   NONF, A 
P(I, h,..,.,) A P(~, h,.,:2) A.;, ¢ .:_~A 
(Vv)(\]'(1, h, v) ~ (v = .';, v :j = ~2))A 
i~,~,tio~,., = f .(,., ....,, :,:~)\]v (5) 
(~'r, a:,, a:2 .... , :,:N)\[S\]~(/, h, r) A r y:- NONEA 
a;1 7~ a:~ A a:l # a::~ A ... A :cl ¢ ~;NA 
• ~;2 -7 k a'3 A ... A :C# ~ :;';NA 
,~:N--I :~ XNA 
P(/, h, ,:, ) A e(t, h, ~) A... A PU, h,, ,;,)A 
(V~)(P(t, h, y) -+ (:/= ~, v... v y = ,;,)> 
inl.c,,.lio,tu, --- fz(r, :c,, a;u,. •. , ,;,)\]} 
• The primary intention of any discourse segment is 
unique. 
(6) \[(i < 1,. < N) A (1 5_ t < 1,.)\] 
\[(1(~,/,, i, ) A J(I, h, <)) -- .i, = <4 
• For every segmeut \[l, hi, the set of objects over 
which predicate P ranges is the set of units that make 
up that segment 
Constraints that concern /lie strnctmm of the dis- 
course trees 
• The status, type, and promotion set that are associ- 
ated with a discourse segment reflect the COmlmsition - 
ality criterion. That is, whenever a rhetorical relation 
holds between two spans, either a simihu" relation holds 
between Ihe mosl salicnl units of those spans or an ex- 
tended rhetorical relation holds between those spans. 
• Discourse segments do not overlap. 
• A discourse segment with status NONE does not par- 
ticipate in the tree at all. 
• There exists a discourse segment, the root, that 
sirens over the entire text. 
~,S'(1, N, nonl') A ~P(\], N, NONF,)A 
(7) ~"(1, N, NONIi) A -71(1, N, NONE) 
• The dominance relations described by Grosz and 
Sidner hold Between the DSP of a discoorse seg- 
ment and the DSP of'its most immediate subordinated 
satellite. This constraint is consistent with Moser and 
Moore's (1996) discussion of RST and GST. In fact, this 
is not surprising if we examine the definitions of dom- 
inance relation given by Grosz and Sidner and satellite 
given by Mann and Thompson: a discourse segment 
purpose D,5't½ dominates a discourse segment purpose 
D,5'1"1 if I),5'P\] contributes to the satisfaction el' the 
I),5'1½. But this is exactly the role that satellites play in 
P, ST: they do not express what is most essential for the 
writer's purpose, but rather, provide supporting informa- 
lion that contributes to the understanding of the nucleus. 
527 
The relationship between Grosz and Sidner's domi- 
nance relations and Mann and Thompson's distinction 
between nuclei and satellites is formalized by axioms (8) 
and (9). 
\[(1 ~ hl _< N) A(1 ~ 11 ~ I,.I)A 
(1 ~ h,9 ~ N) A (1. < 19 < h,2)\] "~> 
{\["~,5'(11, hi, NONE) A ,~'(/2, h2, SATEI,L1TF,)A 
11 <l~ <h~ <hlA (s) ~(~-+'+, ,',,+)(,',+ < 6 < z~ < 
h~ _< h~ < \],,~A 
(13 ¢ 12 V h,,3 ¢ h,2)A 
s(/+, ha, SATI+LLm,:))\] 
dom(ll, hq, 12, h2)} 
\[(+ < h, < N) ,X (+ _< h _< h,) /, (l _< h+ < N)A 
(9) (1 ~ 1.9 .~ 11.2) A do?l+(l,, lt, l, 12, //.2)\] "--+ 
\[-~,5'(h, hi, NON.:) A S(6, h_~, SATJILUTE)\] 
Axiom (8) specities that if segment \[12, h.2\] is the imme- 
diate satellite el'segment \[lt, lq\], then there exists a dom- 
inance relation between the DSP of segment \[/1,/q\] and 
the DSP of segment \[12, h2\]. Hence, axiom (8) explicates 
the relationship between the structure of discourse and 
intentional dominance. In contrast, axiom (9) explicates 
the relationship between intentional dominance and dis- 
course structure. That is, if we know that the intention 
associated with span \[lj, 1,1\] dominates the intention as- 
sociated with span \[12, h,2\], then both those spans play an 
active role in the representation and, moreover, the seg- 
ment \[12,11,2\] plays a SATELLITE role. 
• The satisfaction-precedence rdations described by 
Grosz and Sidner are parataetie relations that hold 
between arlfitrarily large textual spans. Neverthe- 
less, as we have seen in the examples discussed in this 
paper, the fact that a paratactic relation holds between 
spans does not imply that there exists a satisfaction- 
precedence relation at the intentional level between those 
spans. Therefore, for satisfaction-precedence relations, 
we will have only OnE axiom, that shown in (I0), below. 
\[(t 5 hJ ~ N) A (1 ~ 11 ~ hl) A (\] <" h,2 ~ N)A 
(1 o) 0 <- z~ _< ,'+2) A .,+,~,tv.,'~4.'~, h,~, z~, \],.,_,)\] -+ 
\[S(11, h,1, NUCI+EUS) A ,5'(12, h,2, NUCI,EUS)\] 
This specifiES that the spans that are arguments of a 
satisfaction-precedence relation have a NUCLEUS status 
in the linal representation. 
4 A computational view of the 
axiomatization 
Given the formulation discussed abovE, tinding the dis- 
course trees and the primary intentions lkw a text such as 
that given in (1) amounts to finding a model for a first- 
order theory that consists of formulas (2), (4), and the 
axioms enumerated in section 3. 
There are a number of ways in which one can pro- 
ceed with an implementation: for cxalnple, a smtight- 
forward choice is one that applies constraint-satisl'action 
techniques, an approach that extends that discussed 
in (Marcu, 1996). Given a sequence U of N textual units, 
one can take advantage of the structure of the domain and 
associate with each of the N(N-F 1)/2 possible text spans 
a status and a type variable whose domains consist in the 
set of objects over which the corresponding predicates 
,5 + and T, range. For each of the N(N + 1)/2 possible 
text spans \[l, h.\], one can also associate h, - l + \] promo- 
lion variables. These are boolean variables that specify 
whether units l, 1 + \],... , h belong to the promotion set 
of span \[/, hi. For each of the N(N + 1)/2 possible text 
spans \[l, hi, one can also associate h - 1 + 2 intentional 
variables: one of these wtriables has as domain the set 
of rhetorical relations that are relevant for the span \[1, hi. 
The rest of the h -/+ 1 wwiables are boolean and specify 
whether unit l, l-t- \] .... , or h are arguments of the oracle 
function f~ that intentionally characterizes that span. 
Hence, each text of N units yields a constraint- 
satisfaction prohlem with N(N + I)(2N + \]3)/6 vari- 
ables (NCN q- \])(2N -}- 13)/(J = 2NCN q- \])/~ -}- 
V,2<=N V,h<----N I<=N W,h<=N(h_l_F2))). (h'--l+l)+~l-1 Z-,h.=l Z-,I=1 Z~,h=l 
The constl+aints associated with these wtriables arc a one- 
to-onE mapping o1' the axioms in section 3. Finding the 
set of RS-trees and the intentions that are associated with 
a given discourse reduces then to/inding all the solutions 
for a traditional constraint-satisfaction problem. 
5 Applications 
Reasoning from text structures to intentions. Con- 
sider again the example text (1), which was usEd 
throughout this paper. As we discussed in section 1, il' 
we assume that an analyst (or a program) determines that 
the rhetorical relations given in (2) hold between the el- 
ementary units of the text, there arc live valid trees that 
correspond to text (1) (see figure 1). If we consider now 
the axioms that dEscribE the relationship bEtwEen text 
structures and intentions, we can infer, for example, thai, 
for the tree I.a, the DSP of span \[A1,131\] dominates the 
DSP of span \[cj, l)j\] and that the primary intention of 
the whole text depends on unit B1 and on the rhetori- 
cal relation of EVID\]\]NCF,. Ill such a casE, the axiomati- 
zation provides the means for drawing intentional infer- 
ences on the basis of the discourse structure. Also, al- 
though there are live discourse structures that are consis- 
tent with the rhetorical judgments in (I), they yield only 
three intentional interpretations, i.e., there arc only three 
primary intentions that one can associate to the whole 
text. One intention is that discussed above, which is as- 
sociated with analysis I.a. Another intention depends on 
unit Bz and the JUSTIFICATION relation that holds be- 
tween units A1 and lh; this intention is associated with 
the analyses shown in ligure 1.c and l.e. And another in- 
tention depends on trait Bj and the JUSTIFICATION rela- 
tion that holds between units l)j and Bj ; this intention is 
associated with the analyses shown in figure 1.b and 1.d. 
Reasoning fronl text structures to intentions can be 
also beneficial hi a context such as that described by 
Lochbaum (1998) because the rhetorical constraints can 
help prune the space of shared phms that woukl charac- 
terize an intEn tional interpretati o n of a d iscou rse. 
528 
Using intentions lbr nmnaging rhetorical amlfiguities. 
Assume now that besides providing.ivdgments concern- 
ing the rhetorical rehttions that hold between various 
units, an analyst (ot" a progran0 provides judglnents of 
intentions as well. If, lk+t" cxaml+le, besides the relations 
given in (2) a program determines that the DSP of span 
tAt, 1)1\] dominates 111o DSP of unit I/i, the theory that 
corresponds to these judgments and 111e axioms given 
in section 3 yields only two wdid text structures, those 
presented in \[igure l.b and I.d. In this ease, the axiom- 
atization provides the means of using intentional judg- 
ments for reducing the ambiguity that characterizes the 
discourse parsing process. 
hwestigating the relationship between semantic and 
intentional relations. In their seminal paper, Moore 
and Polhtck (1992) showed lhat a text may be charac- 
terized by intentional and rhetorical analyses that are not 
isomorphic. For example, for the text shown in (1 I) be- 
low, which is taken from (Moore and Pollack, 1992), one 
may argue from an informational perspective that A3 is 
a CONI)ITION \['or B3. However, l}'otll an intentional per- 
spective, one may argue thai 1',3 can be used to MOTI- 
VATI'; A3. Similal + judgments can be made with respect 
to units 1{3 and c3. Hence, lhe set of relations that COln- 
pletely characterizes text (11) is thal shown in (12) be- 
low. 
(11) \[Come home by 5:00. ^a\] \[Then we can go to the hard- 
ware store before it closes)':'\] \[That way we can linish 
Ihe bookshelves tonightY:' \] 
.rhct_.rcl(CONl)lTlON, A:~ 1',.+.) 
'rhcI_.rcl(MOTIVATION, B;:, A:: ) 
(12) rh(t_rcl({;ONI)lrlON, 1~:.., C':,. ) 
'r/t.CI_,"cl(MOTIVATION, C::, B:;) 
When given this discourse problenl, our imple- 
mentation produces the four discourse trees shown 
iu figure 4, each el + them having a different primary 
intention (./"/(CONI)ITION, C3), f!(MOTIVATION, a3), 
.ft(MOTWATION, B3), and ./) (CONl)rrtoN, I+:~)). 
Hence, our approach enables one to derive automatically 
and enumerate all possible rhetorical interpretations of 
a text and to study the rehttionshil~ between structure 
and intentions. Our approach does not provide yet the 
mechanisms for choosing between different interpreta- 
tions, but it provides the foundations for such a study. In 
contrast, Moore and Pollaek's informal approach could 
neither derive nor enumerate all possible interpretations: 
in fact, their discttssion refers only to the two trees 
shown in ligure 4.a and .b. 
Unlike Moore and Polhtck's approach, where it is sug- 
gested that a discourse representation should reflect si- 
multaneously both its informational and intentional inter- 
pretations, the approach presented here is capable of only 
enumerating these interpretations. The formal model we 
proposed is not rich enough to accotlllllodate conctlrretH, 
non-isomorphic interpretations. 
......... j I ................................ ,u++ _. /- - ~,\] ??.c~ 
+ ................. + + ,+ 
A3 B3 93 C3 A3 B3 83 C2 
a) b> c) a: 
Figure 4: The set of all RS-trces that can be built for 
text (I 1). 
6 Conclusion 
Crucial to tile develolmlent of syntactic theories was the 
ability to provide mechanisms capable of deriving all 
valid syntactic interpretations of a given sentence. Se- 
mantic or corpus-specific information was then used to 
manage the usually large number of interpretations. 
The work described in this paper sets theoretical foun- 
dations that enable a similar approach to the study of dis- 
course. The way a syntactic theory enables all wtlid syn- 
tactic trees of a sentence be derived, the same way the 
axiomatization presented here enables all valid discourse 
trees of a text be derived. But the same way a sylltac- 
tic theory may produce trues that arc incorrect ftonl a 
semantic perspective for example, the same way the ax- 
iomalization described here may produce trees that are 
incorrect when, for example, focus and cohesion are fac- 
tored in. 
A ntmlber o1' researchers have ah'eady shown how in- 
dividual rhetorical and intentional judgments can be de- 
rived automatically l'mm linguistic constructs such as 
tense and aspect, certain patterns of pronominalization 
and anaphoric usages, it-clefts, and discourse markers or 
cue phrases. But once lhese.iudgmcnts arc made, we still 
need to determine all discourse interpretations that are 
not only consistent with these judgments but also wtlid. 
This paper provides mechanisms for deriving and enu- 
merating all valid structure of a discourse and enables a 
quantitative study el' the relation between text structures 
atld intentions. 

References 

Nicholas Asher and Alex Lascarides. 1998. Questions in dia- 
logue. Linguistics' and l'hilosophy, 21 (3):237-309. 

Barbara J. Grosz and Candace L. Sktner. 1986. Attention, in- 
tentions, and lhe structure of discourse. Co,qmlational Lin- 
guLvlics, 12(3): 175-204. 

Karcn IL Lochbaum. 1998. A collabonttive planning 
model of intentional structtlre. Computational Linguistics, 
24(4):525-572. 

William C. Marm and Sandra A. Thompson. 1988. Rhetorical 
structure lheou: Toward a functional theory of text organi- 
zation. 7Eft, 8(3):243-281. 

l)aniel Marcu. 1996. Bt, ilding up flmtorical structure trees. 111 
Proceedings of AAA 1-96, rages 1069-1074. 

Johanna 1). Moore and Mart ut E. Polhtck. 1992. A problem 
for RST: The need for multi-level discourse analysis. Com- 
pulalional LinguLvlics, 18(4):537-544. 

Megan Moser and Johanna 1). Moore. 1996. 'lbward a synthe- 
sis of two accotttlls of discot, rse structure. Computational 
Lingttistics, 22(3):409-419. 
