Domain Structure, Rhetorical Structure, and Text Structure 
Penelope Sibun 
Fuji Xerox Palo Alto Laboratory 
3400 Hillview Avenue 
Palo Alto, CA 94304 
(415) 813-7772 
sibun@pal.xerox.com 
Introduction 
It is generally agreed that text has structure (at least, coherent text does). Therefore, an under- 
standing and appreciation of text structure must play some role in building computational systems 
that are capable of using text as people do. What is less clear is what are necessary and sufficient 
sources of structure for a text-using system, and further, what such a system needs to know about 
and do with these structures in the process of using text. By using text, I mean understanding it 
or producing it; speaking it, writing it, or thinking about it. In this paper, I present a case for the 
importance of domain structure in structuring text, and discuss the role of rhetorical structure and 
intentionality. 
Intentionality and Structure 
The purpose of this workshop is to consider traditional and new approaches to identifying and 
representing structure for text. The workshop title suggests that intentionality, or rather, our 
representation of it, has a critical role in text structure. I have no intention of arguing against 
the existence of intentions; however, I believe this emphasis on intentions in accounting for text 
structure is misdirected, for two reasons. 
The first reason is similar to one that I present in the next section as an argument against 
rhetorical relations, namely, that a representation of intentional structure has no explanatory power 
in describing at least some texts. For instance, a speaker's intentions in answering the request, 
"Please describe for me the layout of this house," are not likely to be more complex than to provide 
a coherent answer. Knowing about the intention to provide a coherent answer, however, reveals 
nothing about what actually makes the resulting text coherent. (Vander Linden, this volume, 
makes a similar point with respect to written instructional text.) 
The second reason is that intentions are typically construed to be entities that are abstract, 
domain independent, and so forth, divorced from the particulars of how they are expressed. In 
all my work in understanding how text is put together, I have yet to find a use for intentions so 
construed. Rather, I suspect that intentionality is so completely diffused throughout the structure 
of a text that to consider it as a separate phenomenon is not instructive. Exploring this issue 
further is beyond the scope of this position paper; I refer readers interested in a fresh perspective 
on intentionality to Batali (1993). 
Structure: Domain, Rhetorical, and Text 
In this paper, I suggest that, irrespective of their relation to intentions, rhetorical relations, as 
they are typically construed 1 (e.g., in Mann and Thompson's Rhetorical Structure Theory (RST) 
(1987)), 
1. are not sufficient sources of structure for a text-using system; and 
2. are not necessary for at least some uses of text. 
Instead, I take the position that, 
1. in order not to be vacuous, rhetorical structure must be grounded in the domain of discourse, 
and thus cognizant of the structure of that domain; and 
2. in some types of text the structure of the domain is a sufficient source for the structure of 
the text, and rhetorical relations are superfluous. 
I will briefly sketch both parts of this position. They are discussed in more detail in Sibun (1992, 
1991), which also describe a working text generation system built according to these principles. 
To see what is missing from an account of text structure that is solely in terms of rhetorical 
structure, let us consider one of the best known and most thoroughly worked out theories, Mann 
and Thompson's RST. Seven of the 23 RST relations are presentational relations, while most (16) 
are subject matter relations, such as NON-VOLITIONAL CAUSE and SEQUENCE. Out of combinations 
of any of these relations can be built a representation of the structure of a text. However, while this 
rhetorical structure may be a domain independent way of representing how domain knowledge may 
be variously ordered, the question remains of what the domain structure is that is being reordered 
domain independently. In other words, since we assume that rhetorical relations are not assigned 
arbitrarily between clauses, there must be something about the domain itself that constrains the 
subject matter relations. 
Consider, for example, the text produced in response to a request to describe a particular 
house; suppose the house is a large one with a livingroom and a sunporch in the northeast corner 
and a diningroom and a kitchen in the southwest corner. If the text contains these two clauses (the 
second is reduced), 
there's a sunporch 
then a livingroom 
1 See Sidner, this volume, for a discussion of the semantic drift of the term "rhetorical" from its use with respect 
to the discipline of rhetoric to its current use in computational linguistics. 
then the relation SEQUENCE is licensed because the rooms are physically proximate (physical prox- 
imity is a relation in spatial domains such as houses). However, if the second clause mentions the 
diningroom on the other side of the house, no SEQUENCE may be involved, though at the rhetorical 
structure level, it is unclear what other relation could possibly be assigned. Indeed, the text may 
prove to be incoherent, but it is incoherent precisely in virtue of the relations in the domain, not 
in virtue of the SEQUENCE relations in the rhetorical structure. (See Sibun (1992) and Kittredge 
et al. (1991) for further examples of the need for taking domain structure into account.) 
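The dependence of SEQUENCE on the domain can be illustrated with a minimal sketch. It is not taken from any of the systems cited here. The adjacency table and the names ADJACENT, sequence_licensed, and check_text are assumptions introduced only for illustration; in particular, treating rooms that share a corner as physically proximate is a simplification of the example house.

```python
# Hypothetical domain model of the example house. The text only says which
# rooms share a corner; treating "same corner" as "physically proximate"
# is an assumption made for this sketch.
ADJACENT = {
    frozenset({"sunporch", "livingroom"}),  # northeast corner
    frozenset({"diningroom", "kitchen"}),   # southwest corner
}


def sequence_licensed(first: str, second: str) -> bool:
    """Return True if the domain licenses SEQUENCE between two room mentions."""
    return frozenset({first, second}) in ADJACENT


def check_text(rooms):
    """Pair each consecutive mention with whether SEQUENCE is licensed."""
    return [(a, b, sequence_licensed(a, b)) for a, b in zip(rooms, rooms[1:])]


if __name__ == "__main__":
    # The second transition crosses the house, so no SEQUENCE is licensed.
    for a, b, ok in check_text(["sunporch", "livingroom", "diningroom"]):
        print(f"{a} -> {b}: SEQUENCE licensed = {ok}")
```

Note that the check consults only the domain model: nothing at the level of rhetorical structure distinguishes the licensed transition from the unlicensed one.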
I have shown that domain structure is crucial to ascertaining rhetorical structure. Any system 
of rhetorical relations, therefore, can only lay claim to sufficiently accounting for text structure 
if it is grounded in any domain of discourse to which it is applied, and it knows about how the 
domain structure relates to the rhetorical structure. (Rambow (1990) calls knowledge about this 
relationship domain communication knowledge.) 
If rhetorical structure is not sufficient for accounting for text structure, a logical question 
to ask is whether it is necessary. I am certain that it is necessary--some of the time. However, 
some of the time rhetorical structure does not appear to contribute anything to the structure of 
a text. This can be proved by demonstration: Salix (Sibun 1991, 1992) can generate multiclausal 
coherent texts in which the sole source of text structure is domain structure. Salix has generated 
domain-structured texts in domains of houses, families, text style, and airports. 
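To make the idea of a domain-structured text concrete, here is a minimal sketch of a generator whose only source of ordering is the structure of the domain. It is not Salix's algorithm, which is described in Sibun (1991, 1992). The layout, including the "hallway" room, and the names LAYOUT and describe are hypothetical, and the greedy choice of the first unvisited neighbor is an assumption made for brevity.

```python
# Hypothetical house layout: each room maps to its physically adjacent rooms.
# The "hallway" is invented for this sketch so the two corners are connected.
LAYOUT = {
    "sunporch": ["livingroom"],
    "livingroom": ["sunporch", "hallway"],
    "hallway": ["livingroom", "diningroom"],
    "diningroom": ["hallway", "kitchen"],
    "kitchen": ["diningroom"],
}


def describe(layout, start):
    """Emit one clause per room by walking to an unvisited adjacent room.

    The walk is purely local: it stops when the current room has no
    unvisited neighbor, even if other rooms remain undescribed.
    """
    clauses, visited, current = [], set(), start
    while current is not None:
        visited.add(current)
        opener = "there's a" if not clauses else "then a"
        clauses.append(f"{opener} {current}")
        # Next room: the first adjacent room not yet mentioned, if any.
        current = next((r for r in layout[current] if r not in visited), None)
    return clauses


if __name__ == "__main__":
    print("\n".join(describe(LAYOUT, "sunporch")))
```

Every transition in the resulting text follows an adjacency in the domain, and no rhetorical relation is consulted at any point in the walk.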
One might be tempted to argue that it is important that rhetorical structure be available for the 
extent of a text. Thus, for example, a system generating a domain-structured text would maintain 
a rhetorical structure composed of a series of SEQUENCE relations. This rhetorical structure clearly 
doesn't shed any light on the text structure, and it has no impact on the generator's decisions; 
indeed its validity depends directly on the domain structure, bringing us back to the discussion 
above. 
Conclusion 
Because, in the general case, we cannot count on rhetorical structure to be either necessary or suffi- 
cient to account for text structure, I believe that rhetorical structure should not be the cornerstone 
of a computational theory of text structure. Rather, I think a fresh perspective is needed, in which 
rhetorical structure, domain structure, intentional structure, conversational structure, grammatical 
structure, and so forth, are all seen to play a role, none of them privileged, in text structure. I 
think that if we take all of these sources of structure into account, and emphasize one or more of 
them as seems warranted for the job at hand, we will find our task becomes easier: rather than 
appealing to a single abstract theory in discussing all texts, we will be availing ourselves of a rich 
set of tools for understanding each text to the fullest. 

References 
Batali, J. (1993), "Trails as Archetypes of Intentionality." To appear in Proceedings of the Fifteenth 
Annual Conference of the Cognitive Science Society, University of Colorado at Boulder. 
Kittredge, R., T. Korelsky, and O. Rambow (1991), "On the Need for Domain Communication 
Knowledge," Computational Intelligence: Special Issue on Natural Language Generation, 
Volume 7(4). 
Mann, W. and S. Thompson (1987), Rhetorical Structure Theory: A Theory of Text Organization. 
Report ISI/RS-87-190, University of Southern California, Information Sciences Institute. 
Rambow, O. (1990), "Domain Communication Knowledge." In Proceedings of the Fifth 
International Workshop on Natural Language Generation, pp 87-94, Linden Hall, Dawson, PA. 
Sibun, P. (1992), "Generating Text without Trees." Computational Intelligence: Special Issue 
on Natural Language Generation, Volume 8(1), pp 102-122. 
Sibun, P. (1991), Locally Organized Text Generation. Ph.D. Thesis (COINS Technical Report 
91-73), Department of Computer and Information Science, University of Massachusetts. 
Sidner, C. (1993), "On Discourse Relations, Rhetorical Relations, and Rhetoric." To appear in 
Proceedings of the ACL Workshop on Intentionality and Structure in Discourse Relations (this 
volume). 
Vander Linden, K. (1993), "Rhetorical Relations in Instructional Text Generation." To appear 
in Proceedings of the ACL Workshop on Intentionality and Structure in Discourse Relations (this 
volume). 
