RSTTool 2.4 - A Markup Tool for Rhetorical Structure Theory 
Michael O'Donnell (micko@dai.ed.ac.uk) 
Division of Informatics, University of Edinburgh. 
Abstract 
RSTTool is a graphical tool for annotating a text in 
terms of its rhetorical structure. The demonstration 
will show the various interfaces of the tool, focusing 
on its ease of use. 
1 Introduction 
This paper describes the RSTTool, a graphical in- 
terface for marking up the structure of text. While 
primarily intended to be used for marking up Rhet- 
orical Structure (cf. Rhetorical Structure Thegry 
(RST): Mann and Thompson (1988)), the tool also 
allows the mark-up of constituency-style analysis, as 
in Hasan's Generic Structure Potential (GSP - cf. 
Hasan (1996)). 
The tool is written in the platform-independent 
scripting , Tcl/Tk, and thus works under 
Windows, Macintosh, UNIX and LINUX operating 
systems. 
RSTTool is easy to use, one creates an RST dia- 
gram from a text by dragging from segment to seg- 
ment, indicating rhetorical dependency. There is a 
separate interface for text segmentation. The tool 
can automatically segment at sentence boundaries 
(with reasonable accuracy), and the user clicks on 
the text to add boundaries missed by the automatic 
segmenter (or click on superfluous boundaries to re- 
move them). 
The tool was previously described in O'Donnell 
(1997). However, since then tile tool has been sub- 
stantially revised and extended,(the current version 
being 2.4). This version is also far more robust due 
to extensive debugging by one of RST's inventor's, 
Bill Mann. Particular improvements in the tool in- 
clude: 
1. GUI for defining relation: ability to add, re- 
name, delete, etc. tile relations used using a 
graphical user interface. 
2. St.atistical Analysis: a new interface was added. 
which allows users to be presented with statist- 
its regarding the proportional use of relations 
in a text. 
3. Output Options: the new tool allows saving 
of RST analyses in postscript (for inclusion in 
Latex documents), or sending diagrams directly 
to the printer. Files are now saved in an XML 
format, to facilitate importation in other sys- 
tems. 
4. Improved Structuring: the possibilities for 
structuring have improved, allowing the in- 
sertion of spans, multinuclear elements and 
schemas within existing structure. 
The Tool consists of four interfaces, which will be 
described in following sections: 
1. Text Segmentation: for marking the boundaries 
between text segments; 
2. Text Structuring: for marking the structural re- 
lations between these segments; 
3. Relation Editor: for maintaining the set of dis- 
course relations, and schemas; 
4. Statistics: for deriving simple descriptive stat- 
istics based on the analysis. 
2 What is RSTTool For? 
The RSTTool is an analysis tool, but most users of 
the tool are researchers in the text generation field. 
For this reason, we present the tool at this confer- 
ence. 
Several reasons for using the tool are: 
. Corpus Studies: before one can generate text, 
one nmst understand the rhetorical patterns of 
. By performing analyses of texts sin> 
ilar to which one wishes to generate, one can 
identify the recurrent structures in tile text-type 
and work towards understanding their context 
of use. 
o Results Verification: often, a particular study 
may be challenged by other researchers. If the 
study was performed using RSTTool, the cor- 
pus supporting the study can be released for 
analysis by others. Previously, most RST ana- 
lysis was done bv hand, malting distribution of 
253 
.~.L ..~..£.~~~.~,~~ ~,. &~...-...,,~15~5~;,~,~..~2~9~:.~., ~$ ..g-=.'. , ,.~ ~* ." • .:, z~ ..... • a 
~_with the ~oftware~ 
~dividualf'~es.\] ... 
..... he author hereby gra, t permission to use, copy, modify, 
~distribute, and license this software and £t6 documentation for 
~any purpose, ~ provided that existing copyright notices are 
~'~retained in all copies; and that this notice is iscluded verbatim 
~|in any distributions.~ No written agree~aent, license, or royalty   
fee is required for any of the authorized use~s.i 
~.~,~,~,~,~IModtflcations .to this software may be copyrighted by their 
~|authors< mnd need not follow the licensing terms described here,: 
(~h~9~provided that~ the new terms are clearly indicated on the first ~N.:'t&~NI~ge of e~h file ~here ~h~y ~pp~y.:! 
Figure h The Segmentation Interface 
corpora difficult. RSTTool thus not only sim- 
plifies the production of the corpus, but also 
allows ease of distribution and verification. 
. Diagram Preparation: the RSTTool can also be 
used for diagram preparation, for inclusion in 
papers. The tool allows diagrams to be expor- 
ted as EPS files, ready for inclusion in LaTeX 
documents (as demonstrated in this paper). For 
PCs and Mac, screen-dumps of diagrams are 
possible (Tcl/Tk does not yet fully support ex- 
port of GIF or JPG formats, and conversion 
from EPS to other formats is primitive). Some 
versions of MS Word allow the inclusion of EPS 
diagrams. 
o A Teaching Tool: by getting students to analyse 
texts with the RSTTool, teachers of discourse 
theory can increase the student's understanding 
of the theory. 
To allow RSTTool analyses to be more generally 
usable, the tool now saves its analyses in an XML 
format, making loading into other systems for pro- 
cessing much simpler. 
3 Segmentation Interface 
The first step in RST analysis is to deternline seg- 
ment boundaries. RSTTool provides an interface to 
facilitate this task. The user starts by "importing" a 
plain text file. The user can then automatically seg- 
ment at sentence boundaries by pressing the "Sen- 
tences" button. This segmentation is not 100% re- 
liable, but is reasonably intelligent. The user carl 
then correct any mistakes made by the automatic 
segnrentation, and also add in segment boundaries 
within sentences. 
To add a segment boundary, the user simply clicks 
at the t)oint of the text where the boundary is de- 
sired. A boundary marker is inserted. To temove 
a boundary, the user simply clicks on the boundary 
marker. Figure 1 shows the Segmentation interface 
after clausal segmentation. 
The user can also edit the text, correcting mis- 
takes, etc.,.by switching to Edit mode. 
The user then moves to the Structuring interface 
by clicking on the "Structurer" button at the top of 
the window. Note that the user can return at any 
point to the Segmentation interface, to change seg- 
ment boundaries, or edit text. These changes are 
automatically accounted for in the structuring com- 
ponent. 
4 Structuring Interface 
The next step involves structuring the text. The 
second interface of the RSTTool allows the user to 
connect the segments into a rhetorical structure tree. 
as shown in figure 2. We have followed the graphical 
style presented in Mann and Thompson (1988). 
The tool supports not only RST structuring, but 
also constituency structuring. I believe that texts 
cannot always be analysed totally in ternrs of rhet- 
orical relations, and that some level of schematic 
analysis complements the rhetorical analysis. For 
instance, a typical conference paper (such as this 
one) can be assigned a top level schematic structure 
of 
Title - Author ^ Institution - Abstract 
" Section* - Bibliography 
The R.STTool allows intermixing of such schema 
with RST analysis. 
Initially, all segments are unconnected, ordered at 
the top of the window. The user can then drag tile 
mouse from one segment (tile satelite) to another 
(the nucleus) to link them. 
The system allows both plain RST relations and 
also multi-nuclear relations (e.g., Joint, Sequence, 
254 
Figure 2: The Structuring Interface 
1-2 3-4 
When he took it It was as heavy and he was because he 
up, as lead, going to throw it thought a trick 
away, had been played 
on him. 
Figure 3: RST Structuring 
Orie_ = 
TWO old men 2-3 
sitting talking In 
a reUrement ______~-'~e~oenc>.~__ 
home. One asks, The other 
"How's your replies, "No 
memory?" problem at all, 
touch wood", as 
he knocked on 
the oak table. 
Pun;line 
Two minutes go 
by, and he says 
"isn't anyone 
going to get that 
door!" 
Figure 4: Schema-based Structuring 
etc.). Scoping is also possible, whereby tile user in- 
dicates that the nucleus of a relation is not a seg- 
ment itself, but rather a segment and all of its satel- 
lites. See figure 3 for an example combining normal 
RST relations (Circumstance, Motivation); nmlti- 
nuclear structure (Conjunction), and scoping (the 
nodes marked 1-2 and 3-4). In addition, schemas 
can be used to represent constituency-type struc- 
tures. See figure 4. 
Because RST-structures can become very elabor- 
ate, the RSTTool allows the user to collapse sub- 
trees - hiding the substructure under a node. This 
makes it easier, for instance, to comtect two nodes 
which normally would not appear on the same page 
of the editor. 
5 Editing Relations 
The tool provides an interface for editing relation 
sets. The user can add, delete or rename relations. 
If the relation is in use in the current analysis, the 
changes are propagated throughout the analysis. 
6 Statistical Analysis 
Discussions on the RST mail list have demonstrated 
that there is a community concern with frequency 
of different relations in specific text-types. The 
RSTTool, by providing counts of relations within a 
text, supports this research goal. See figure 5. 
The interface shows not only the frequency of re- 
lations, but also the ratio of Nuc Sat orderings to 
Sat lquc orderings for the relation (valuable data for 
both generation and automatic discourse structure 
recognition). 
7 Summary 
RSTTool is a robust tool which facilitates manual 
analysis of a text's rhetorical structure. These ana- 
lyses can be used for a number of purposes, including 
i) to improve understanding of discourse structure, 
to aid in either text generation or analysis; ii) dia- 
gram preparation, and iii) as a teaching tool. 
The main improvement in the latest version of the 
tool is the statistical analysis interface. Later ver- 
sions of the.tool will extend oll this aspect, increas- 
ing the range of analyses which can be performed on 
each text, or collection of texts. 
Future versions will also add code for automatic 
structure recognition, using such work ms Marcu's 
RST recognition tool (Marcu, 1997). While tile au- 
thor believes that automatic recognition is not yet 
reliable, integrating such a tool into an R ST Markup 
255 
Figure 5: The Statistics Interface 
tool allows the recognition software to provide a first 
draft, which the human editor can correct to their 
liking. At present, such a mixture of automatic and 
human-directed mark-up is the best way of achieving 
accurate mark-up of text structure. 

References 
Ruqaiya Hasan. 1996. The nursery tale as a genre. 
In Carmel Cloran, David Butt, and Geoff Willi- 
ams, editors, Ways of Saying: Ways of Meaning. 
Cassell, London. Previously published in Notting- 
ham Linguistics Circular 13, 1984. 
W.C. Mann and S. Thompson. 1988. Rhetorical 
structure theory: Toward a functional theory of 
text organization. Text, 8(3):243-281. 
Daniel Marcu. 1997. The rhetorical parsing of 
natural  texts. In The Proceedings of 
the 35th Annual Meeting of the Association for 
Computational Linguistics, (A CL '97lEA CL '97), 
pages 96-103, Madrid, Spain, July 7-10. 
Michael O'Donnell. 1997. Rst-tool: An rst analysis 
tool. In Proceedings off the 6th European Work- 
shop on Natural Language Generation, pages 92 - 
96, Gerhard-Mercator University, Duisburg, Ger- 
many, March 24 - 26. 
