THE EXPRESSION OF LOCAL RHETORICAL RELATIONS IN 
INSTRUCTIONAL TEXT* 
Keith Vander Linden 
Department of Computer Science 
University of Colorado 
Boulder, CO 80309-0430 
Internet: linden@cs.colorado.edu 
INTRODUCTION 
Given the prevalence of the use of rhetorical rela- 
tions in the generation of text (Itovy, 1989; Moore 
and Paris, 1988; Scott and Souza, 1990), it is 
surprising how little work has actually been done 
on the grammatical realization of these relations. 
Most systems, based on Mann and Thompson's 
formulation of Rhetorical Structure Theory (Mann 
and Thompson, 1988), have adopted simplified so- 
lutions to their expression. If, for example, an ac- 
tion, X, and a purpose for that action, Y, must be 
expressed, a standard form such as "Do X in or- 
der to Y" will be generated. In reality, the purpose 
relation can be and is expressed in a myriad of dif- 
ferent ways depending upon numerous functional 
considerations. Consider the following examples: 
(la) Follow the steps in the illustration below, 
for desk installation. (code 1) 
(lb) To install the phone on a desk, follow the 
steps in the illustration below. 
(le) Follow the steps in the illustration below, 
for installing the phone on a desk. 
(ld) For the desk, follow the steps in the 
illustration below. 
These examples of purpose expressions illus- 
trate two issues of choice at the rhetorical level. 
First, the purpose clauses/phrases can occur ei- 
ther before or after the actions which they moti- 
vate. Second, there are four grammatical forms to 
choose from (all found in our corpus). In (la), we 
see a "for" prepositional phrase with a nominaliza- 
tion ("installation") as the complement, in (lb), a 
"to" infinitive form (tnf), in (lc), a "for" preposi- 
tion with a gerund phrase as a complement, and 
*This work was supported in part by NSF Grant 
IRI-9109859. 
1 My convention will be to add a reference to the end 
of all examples that have come from our corpus, indi- 
cating which manual they came from. (code) and (exc) 
will stand for examples from the Code-a-Phone and 
Excursion manuals respectively (Code-a-phone, 1989; 
Excursion, 1989). All other examples are contrived. 
318 
in (ld), a "for" preposition with a simple object 
as the complement. Although all these forms are 
grammatical and communicate the same basic in- 
formation, the form in (la) was used in the corpus. 
I am interested in the functional reasons for this 
choice. 
Another aspect of this analysis to notice is 
that, contrary to the way rhetorical structure the- 
ory has been used in the past, I have allowed 
phrases, as well as clauses, to enter into rhetor- 
ical relations. This enables me to address the use 
of phrases, such as those in (la), (lc), and (ld), 
which hold rhetorical relations with other spans of 
text. 
The proper treatment of alternations such as 
these is crucial in the generation of understandable 
text. In the following sections, I will discuss a 
methodology for identifying such alternations and 
include samples of those I have found in a corpus 
of instructional text. I will then discuss how to 
formalize and implement them. 
IDENTIFYING 
ALTERNATIONS 
I identified alternations by studying the linguistic 
forms taken on by various rhetorical relations in a 
corpus of instructional text. The corpus, currently 
around 1700 words of procedural text from two 
cordless telephone manuals, was large enough to 
expose consistent patterns of instructional writing. 
I plan to expand the corpus, but at this point, the 
extent to which my observations are valid for other 
types of instructions is unclear. 
To manage this corpus, a text database sys- 
tem was developed which employs three inter- 
connected tables: the clause table, which repre- 
sents all the relevant information concerning each 
clause (tense, aspect, etc.), the argument table, 
which represents all the relevant information con- 
cerning each argument to each clause (subjects, 
objects, etc.), and the rhetorical relation table, 
which represents all the rhetorical relations be- 
tween text spans using Mann and Thompson's for- 
malism. I used this tool to retrieve all the clauses 
and phrases in the corpus that encode a particular 
local rhetorical relation. I then hypothesized func- 
tional reasons for alternations in form and tested 
them with the data. I considered a hypothesis 
successful if it correctly predicted the form of a 
high percentage of the examples in the corpus and 
was based on a functional distinction that could 
be derived from the generation environment 2. 
I have analyzed a number of local rhetorical 
relations and have identified regularities in their 
expression. We will now look at some representa- 
tive examples of these alternations which illustrate 
the various contextual factors that affect the form 
of expression of rhetorical relations. A full anal- 
ysis of these examples and a presentation of the 
statistical evidence for each result can be found in 
Vander Linden (1992a). 
PURPOSES 
One important factor in the choice of form is the 
availability of the lexicogrammatical tools from 
which to build the various forms. The purpose re- 
lation, for example, is expressed whenever possible 
as a "for" prepositional phrase with a nominaliza- 
tion as the complement. This can only be done, 
however, if a nominalization exists for the action 
being expressed. Consider the following examples 
from the corpus: 
(2a) Follow the steps in the illustration below, 
for desk installation. (code) 
(2b) End the second call, and tap FLASH to 
return to the first call (code) 
(2e) The OFF position is primarily used for 
charging the batteries. (code) 
Example (2a) is a typical purpose clause 
stated as a "for" prepositional phrase. Example 
(2b) would have been expressed as a prepositional 
phrase had a nominalization for "return" been 
available. Because of this lexicogrammatical gap 
in English, a "to" infinitive form is used. There 
are reasons that a nominalization will not be used 
even if it exists, one of which is shown in (2e). 
Here, the action is not the only action required 
to accomplish the purpose, so an "-ing" gerund is 
used. This preference for the use of less prominent 
grammatical forms (in this case, phrases rather 
2In the process of hypothesis generation, I have 
frequently made informal psycholinguistic tests such 
as judging how "natural" alternate forms seem in the 
context in which a particular form was used, and have 
gone so far as to document this process in more com- 
plete discussions of this work (Vander Linden et al., 
1992a), but these tests do not constitute the basis of 
my criteria for a successful hypothesis. 
than clauses) marks the purposes as less impor- 
tant than the actions themselves and is common 
in instructions and elsewhere (Cumming, 1991). 
PRECONDITIONS 
Another issue that affects form is the textual con- 
text. Preconditions, for example, change form de- 
pending upon whether or not the action the pre- 
condition refers to has been previously discussed. 
Consider the following examples: 
(3a) When you hear dial tone, dial the number 
on the Dialpad \[4\]. (code) 
(3b) When the 7010 is installed and the battery 
has charged for twelve hours, move the 
OFF/STBY/TALK \[8\] switch to STBY. (code) 
Preconditions typically are expressed as in 
(3a), in present tense as material actions. If, 
however, they are repeat mentions of actions pre- 
scribed earlier in the text, as is the case in (3b), 
they are expressed in present tense as conditions 
that exist upon completion of the action. I call 
this the terminating condition form. In this case, 
the use of this form marks the fact that the readers 
don't have to redo the action. 
RESULTS 
Obviously, the content of process being described 
affects the form of expression. Consider the fol- 
lowing examples: 
(4a) When the 7010 is installed and the battery 
has charged for twelve hours, move the 
OFF/STBY/TALK \[8\] switch to STBY. The 
7010 is now ready to use. (code) 
(4b) 3. Place the handset in the base. The 
BATTERY CHARGE INDICATOR will light. 
(exc) 
Here, the agent that performs the action de- 
termines, in part, the form of the expression. In 
(4a), the action is being performed by the reader 
which leads to the use of a present tense, relational 
clause. In (4b), on the other hand, the action is 
performed by the device itself which leads to the 
use of a future tense, action clause. This use of fu- 
ture tense reflects the fact that the action is some- 
thing that the reader isn't expected to perform. 
CLAUSE COMBINING 
User modeling factors affect the expression of in- 
structions, including the way clauses are com- 
bined. In the following examples we see actions 
being combined and ordered in different ways: 
(5a) Remove the handset from the base and lay 
it on its side. (exc) 
319 
(5b) Listen for dial tone, then make your next 
call (code) 
(5c) Return the OFF/STBY/TALK switch to 
STBY after your call. (code) 
Two sequential actions are typically expressed 
as separate clauses conjoined with "and" as in 
(5a), or, if they could possibly be performed si- 
multaneously, with "then" as in (5b). If, on the 
other hand, one of the actions is considered obvi- 
ous to the reader, it will be rhetorically demoted 
as in (5c), that is stated in precondition form as 
a phrase following the next action. The manual 
writer, in this example, is emphasizing the actions 
peculiar to the cordless phone and paying rela- 
tively little attention to the general skills involved 
in using a standard telephone, of which making a 
call is one. 
IMPLEMENTING 
ALTERNATIONS 
This analysis of local rhetorical relations has re- 
sulted in a set of interrelated alternations, such 
as those just discussed, which I have formalized in 
terms of system networks from systemic-functional 
grammar (Halliday, 1976) 3. 
I am currently implementing these networks 
as an extension to the Penman text generation ar- 
chitecture (Mann, 1985), using the existing Pen- 
man system network tools. My system, called 
IMAGENE, takes a non-linguistic process structure 
such as that produced by a typical planner and 
uses the networks just discussed to determine the 
form of the rhetorical relations based on functional 
factors. It then uses the existing Penman networks 
for lower level clause'generation. 
IMAGENE starts by building a structure based 
on the actions in the process structure that are to 
be expressed and then passes over it a number of 
times making changes as dictated by the system 
networks for rhetorical structure. These changes, 
including various rhetorical demotions, marking 
nodes with their appropriate forms, ordering of 
clauses/phrases, and clause combining, are im- 
plemented as systemic-type realization statements 
for text. IMAGENE finally traverses the completed 
structure, calling Penman once for each group of 
nodes that constitute a sentence. A detailed dis- 
cussion of this design can be found in Vander Lin- 
den (1992b). IMAGENE is capable, consequently, 
of producing instructional text that conforms to 
a formal, corpus-based notion of how realistic in- 
structional text is constructed. 
3System networks are decision structures in the 
form of directed acyclic graphs, where each decision 
point represents a system that addresses one of the 
alternations. 
320 

REFERENCES 
Code-a-phone (1989). Code-A-Phone Owner's 
Guide. Code-A-Phone Corporation, P.O. Box 
5678, Portland, OR 97228. 
Cumming, Susanna (1991). Nominalization in 
English and the organization of grammars. 
In Proceedings of the IJCAI-91 Workshop on 
Decision Making Throughout the Generation 
Process, August 24-25, Darling Harbor, Syd- 
ney, Australia. 
Excursion (1989). Excursion 8100. Northwestern 
Bell Phones, A USWest Company. 
Halliday, M. A. K. (1976). System and Function in 
Language. Oxford University Press, London. 
Ed. G. R. Kress. 
Hovy, Eduard H. (1989). Approaches to the 
planning of coherent text. Technical Report 
ISI\]RR-89-245, USC Information Sciences In- 
stitute. 
Mann, William C. (1985). An introduction to 
the Nigel text generation grammar. In Ben- 
son, James D., Freedle, Roy O., and Greaves, 
William S., editors, Systemic Perspectives on 
Discourse, volume 1, pages 84-95. Ablex. 
Mann, William C. and Thompson, Sandra A. 
(1988). Rhetorical structure theory: A the- 
ory of text organization. In Polanyi, Livia, 
editor, The Structure of Discourse. Ablex. 
Moore, Johanna D. and Paris, Cdcile L. (1988). 
Constructing coherent text using rhetorical 
relations. Submitted to the Tenth Annual 
Conference of the Cognitive Science Society, 
August 17-19, Montreal, Quebec. 
Scott, Donia R. and Souza, Clarisse Sieckenius de 
(1990). Getting the message across in RST- 
based text generation. In Dale, Robert, Mel- 
lish, Chris, and Zock, Michael, editors, Cur- 
rent Research in Natural Language Genera- 
lion, chapter 3. Academic Press. 
Vander Linden, Keith, Cumming, Susanna, and 
Martin, James (1992a). The expression of lo- 
cal rhetorical relations in instructional text. 
Technical Report CU-CS-585-92, the Univer- 
sity of Colorado. 
Vander Linden, Keith, Cumming, Susanna, and 
Martin, James (1992b). Using system net- 
works to build rhetorical structures. In Dale, 
R., Hovy, E., RSesner, D., and Stock, O., edi- 
tors, Aspects of Automated Natural Language 
Generation. Springer Verlag. 
