Machine-Assisted Rhetorical Structure Annotation
Manfred Stede and Silvan Heintze
University of Potsdam
Dept. of Linguistics
Applied Computational Linguistics
D-14415 Potsdam
Germany
stede|heintze@ling.uni-potsdam.de
Abstract
Manually annotating the rhetorical struc-
ture of texts is very labour-intensive. At the
same time, high-quality automatic analysis
is currently out of reach. We thus propose to
split the manual annotation in two phases:
the simpler marking of lexical connectives
and their relations, and the more difficult
decisions on overall tree structure. To this
end, we developed an environment of two
analysis tools and XML-based declarative
resources. Our ConAno tool allows for
efficient, interactive annotation of connectives,
scopes and relations. This intermediate re-
sult is exported to O’Donnell’s ‘RST Tool’,
which facilitates completing the tree struc-
ture.
1 Introduction
A number of approaches tackling the difficult
problem of automatic discourse parsing have
been proposed in recent years (e.g., (Sumita
et al., 1992), (Marcu, 1997), (Schilder, 2002)).
They differ in their orientation toward symbolic
or statistical information, but they all,
quite naturally, share the assumption that
the lexical connectives or discourse markers are
the primary source of information for construct-
ing a rhetorical tree automatically. The den-
sity of discourse markers in a text depends on
its genre (e.g., commentaries tend to have more
than narratives), but in general, it is clear that
only a portion of the relations holding in a text
is lexically signalled.¹ Furthermore, it is well-known
that discourse markers are often ambiguous;
for example, the English but can, in terms
of (Mann, Thompson, 1988), signal any of the
relations Antithesis, Contrast, and Concession.
Accordingly, automatic discourse parsing focus-
ing on connectives is bound to have its limita-
tions.
¹ In our corpus of newspaper commentaries (Stede,
2004), we found that 35% of the coherence relations are
signalled by a connective.
Our position is that progress in discourse
parsing relies on the one hand on a more thor-
ough understanding of the underlying issues,
and on the other hand on the availability of
human-annotated corpora, which can serve as
a resource for in-depth studies of discourse-
structural phenomena, and also for training
statistical analysis programs. Two examples
of such corpora are the RST Tree Corpus by
(Marcu et al., 1999) for English and the Pots-
dam Commentary Corpus (Stede, 2004) for
German. Producing such resources is a labour-intensive
task that requires time, trained annotators,
and clearly specified guidelines on what
relation to choose under which circumstances.
Nonetheless, rhetorical analysis remains
in part a rather subjective process (see Section
2). In order to eventually arrive at more objective,
comparable results, our proposal is to split
the annotation process into two parts:
1. Annotation of connectives, their scopes
(the two related textual units), and,
optionally, the signalled relation
2. Annotation of the remaining (unsignalled)
relations between larger segments
Step 1 is inspired by work done for English
in the Penn Discourse TreeBank² (Miltsakaki
et al., 2004). In our two-step scenario, it is the
easier part of the whole task in that connectives
can be quite clearly identified, their scopes are
often (but not always, see below) transparent,
and the coherence relation is often clear. We see
the result of step 1 as a corpus resource in its
own right (it can be used for training statistical
classifiers, for instance) and at the same time
as the input for step 2, which “fills the gaps”:
now annotators have to decide how the set of
small trees produced in step 1 is best arranged
in one complete tree, which involves assigning
relations to instances without any lexical signals
and also making more complicated scope
judgements across larger spans of text. This is the
more subjective and also more time-consuming
step.³
² http://www.cis.upenn.edu/~pdtb/
Our approach is as follows. To speed up the
annotation process in step 1, we have devel-
oped an XML format and a dedicated analysis
tool called ConAno, which will be introduced
in Section 4. ConAno can export the anno-
tated text in the ‘rs3’ format that serves as in-
put to O’Donnell’s RST Tool (O’Donnell, 1997).
His original idea was that manual annotation be
done completely with his tool; we opted however
to use it only for step 2, and will motivate the
reasons for this overall architecture in Section
5.
The net result is a modular, XML-based
annotation environment for machine-assisted
rhetorical analysis, which we see as on the one
hand less ambitious than fully-automatic discourse
parsing and on the other hand as more
efficient than completely ‘manual’ analysis.
2 Approaches to rhetorical analysis
There are two different perspectives on the task
of discourse parsing: an “ideal” one that aims
at modelling a systematic, incremental process;
and an “empirical” one that takes the experiences
of human annotators into account. “Ideally”,
discourse analysis proceeds incrementally
from left to right, where for each new segment,
an attachment point and a relation (or more
than one of each, cf. SDRT) are computed and
the discourse structure grows step by step. This
view is taken for instance in SDRT (Asher, Las-
carides, 2003), which places emphasis on the no-
tion of ‘right frontier’ (also discussed recently by
(Webber et al., 2003)).
However, when we trained two (experienced)
students to annotate the 171 newspaper com-
mentaries of the Potsdam Commentary Corpus
(Stede, 2004) and upon completion of the task
asked them about their experiences, a very dif-
ferent picture emerged. Both annotators agreed
that a strict left-to-right approach is highly im-
practical, because the intended argumentative
structure of the text often becomes clear only
in retrospect, after reflecting on the possible
contributions of the segments to the larger scheme.
Thus they very soon settled on a bottom-up approach:
First, mark the transparent cases, in
which a connective undoubtedly signals a relation
between two segments.⁴ Then, see how the
resulting pieces fit together into a structure that
mirrors the argument presented.
³ This assessment of relative difficulty does not carry
over to PDTB, where the annotations are more complex
than in our step 1 but do not go as far as building
rhetorical structures.
The annotators used RST Tool (O’Donnell,
1997), which worked reasonably well for the pur-
pose. However, since we also have in our group
an XML-based lexicon of German connectives
at our disposal (Berger et al., 2002), why not
use this resource to speed up the first phase of
the annotation?
3 Annotating connectives and their
scopes
In our definition of ‘connective’, we largely follow
(Pasch et al., 2003) (a substantial catalogue
and classification of German connectives), who
require connectives to take two arguments that can
potentially be full clauses (but need not always
be spelled out as clauses) and to denote,
semantically, two-place relations between
eventualities. From the syntactic viewpoint, they are
a rather inhomogeneous group consisting of subordinating
and coordinating conjunctions, some
prepositions, and a number of sentence adverbials.
We refer to the two related units as
an ‘internal’ and an ‘external’ one, where the
‘internal’ one is the unit of which the connec-
tive is actually a part. For example, in Despite
the heavy rain we had a great time, the noun
phrase the heavy rain is the internal unit, since
it forms a syntactic phrase together with the
preposition. Notice that this is a case where
the eventuality (a state of weather) is not made
explicit by a verb.
As indicated, this step of annotating connec-
tives and units is closely related to the idea
of the PDTB project, which seeks to develop
a large corpus annotated with information on
discourse structure for English texts. For this
purpose, annotators are provided with detailed
annotation guidelines, which point out various
challenges in the annotation process for explicit
as well as empty connectives and their respective
arguments. They include, among others,
• words/phrases that look like connectives,
but prove not to take two propositional arguments
• words/phrases as preposed predicate complements
• pre- and post-modified connectives
• co-occurring connectives
• single and multiple clauses/sentences as
arguments of connectives
• annotation of discontinuous connective arguments
⁴ The clearest cases are subjunctors, which always
mark a relation between matrix clause and embedded
clause.
PDTB annotators also have to make syntactic judgements,
which is not the case in our approach
(where syntax would be done on a different
annotation layer, see (Stede, 2004)).
In the following, we briefly explain the most
important problematic issues with annotating
German connectives and the way we deal with
them, using our annotation scheme for Con-
Ano.
3.1 Issues with German connectives
Connective or not: Some words can be used
as connective or in other functions, such as und
(‘and’), which can for example conjoin clauses
(connective) or items in a list (no connective).
Which relation: Some connectives can sig-
nal more than one relation, as the above-
mentioned but and its German counterpart aber.
Complex connectives: Connectives can be
phrasal (e.g., aus diesem Grund, ‘for this reason’)
or even discontinuous (e.g., entweder ... oder,
‘either ... or’). Moreover, some may
be used in more than one order (wenn A, dann
B / dann B, wenn A / dann, wenn A, B; ‘if
... then ...’).
Multiple connectives/relations: Some
connectives can be joined to form a complex
one, which might then signal more than one
relation (e.g., combinations with und and aber,
such as aber dennoch, ‘but still’).
Modified connectives: Some but not all
connectives are subject to modification (e.g.,
nur dann, wenn, ‘only then, if’; besonders weil,
‘especially because’).
Embedded segments: The minimal units
linked by the connective may be embedded
rather than adjacent: Wir müssen, weil die Zeit
drängt, uns Montag treffen (‘We have to, because
time is short, meet on Monday’).
3.2 A DTD and an Example
As the first step toward an annotation tool, we
defined an XML format for texts with connectives
and their scopes. Figure 1 shows the DTD,
and Figure 2 a short sample annotation of a
single, yet complex, sentence: Auch Berlin
koennte, jedenfalls dann, wenn der Bund sich
erkenntlich zeigt, um die Notlage seiner Hauptstadt
zu lindern, davon profitieren. (‘Berlin,
too, could, at least if the federation shows
some gratitude in order to alleviate the emergency
of its capital, profit from it.’) The DTD
introduces XML tags for each of the connectives
(<connective>), their possible modifiers
(<modifier>) and respective discourse units
(<unit>, where the type can be filled by int or
ext), as well as the entire text (<discourse>).
Henceforth, we will refer to the text unit con-
taining the connective as the internal, ‘int-unit’
and to the other, external, one as ‘ext-unit’. Us-
ing this DTD, it is possible to represent the
range of problematic phenomena discussed in
the previous section.
Connective or not: Only those words actu-
ally used as connectives will be marked with the
<connective> tag, while others such as the fre-
quently occurring und (‘and’) or oder (‘or’) will
remain unmarked, if they merely conjoin items
in a list.
Which relation: The <connective> tag includes
a rel attribute for optional specification
of the rhetorical relation that holds between the
connected clauses.
Complex connectives: Using an XML-based
annotation scheme, we can easily mark
phrasal connectives such as aus diesem Grund
(‘for this reason’) using the <connective> tag.
In order for discontinuous connectives to be annotated
correctly, we introduce an id attribute
that provides every connective with a distinct
reference number. This way, connectives such as
entweder ... oder (‘either ... or’) can be represented
as belonging together (see the <connective
id="4" rel="condition"> tags in Figure 2).
Multiple connectives/relations: In our
annotation scheme, complex connectives such
as aber dennoch (‘but still’) are treated as two
distinct connectives that indicate different relations
holding between the same units.
Modified connectives: Connective modifiers
are marked with a special <modifier> tag,
which is embedded inside the <connective>
tag, as shown with jedenfalls modifying dann in
our example. Hence an additional id attribute
for this tag is not necessary.
Embedded segments: Discourse units are
marked using the <unit> tag, which also provides
an id attribute. On the one hand, this
is used for assigning discourse units to their respective
connectives; on the other hand, it provides
a way of dealing with discontinuous discourse
units, as the example shows.
<?xml version='1.0' encoding='UTF-8'?>
<!ELEMENT modifier (#PCDATA)>
<!ELEMENT connective (#PCDATA|modifier)*>
<!ATTLIST connective
id CDATA #IMPLIED
rel CDATA #IMPLIED>
<!ELEMENT unit
(#PCDATA|connective|unit)*>
<!ATTLIST unit
id CDATA #IMPLIED
type CDATA #IMPLIED>
<!ELEMENT discourse (#PCDATA|unit)*>
Figure 1: The DTD for texts-with-connectives
<?xml version="1.0"?>
<!DOCTYPE discourse SYSTEM "discourse.dtd">
<discourse>
<unit type="ext" id="4">
Auch Berlin koennte,
<connective id="4"
rel="condition">
<modifier>jedenfalls</modifier>
dann
</connective>
,
</unit>
<unit type="int" id="4">
<connective id="4"
rel="condition">
wenn
</connective>
der Bund sich erkenntlich zeigt, um
die Notlage seiner Hauptstadt zu
lindern,
</unit>
<unit type="ext" id="4">
davon profitieren.
</unit>
</discourse>
Figure 2: Sample annotation in XML-format
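To illustrate how an annotation in this format might be processed downstream, the following is a minimal sketch in Python (not part of ConAno, which is written in Java). It assumes the rel attribute declared in the DTD; the function names collect and normalise and the inlined SAMPLE string are hypothetical. The sketch groups connective parts, modifiers and unit texts by their shared id, which is how discontinuous connectives and discontinuous units are tied together.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The sentence of Figure 2, inlined so that the sketch is self-contained.
SAMPLE = """<discourse>
<unit type="ext" id="4">Auch Berlin koennte,
<connective id="4" rel="condition"><modifier>jedenfalls</modifier> dann</connective>,
</unit>
<unit type="int" id="4">
<connective id="4" rel="condition">wenn</connective>
der Bund sich erkenntlich zeigt, um die Notlage seiner Hauptstadt zu lindern,
</unit>
<unit type="ext" id="4">davon profitieren.</unit>
</discourse>"""


def normalise(text):
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def collect(xml_text):
    """Group connective parts, modifiers, relation and unit text by shared id."""
    root = ET.fromstring(xml_text)
    result = defaultdict(lambda: {"parts": [], "modifiers": [], "rel": None,
                                  "int": [], "ext": []})
    for unit in root.iter("unit"):
        kind = unit.get("type")
        if kind in ("int", "ext"):
            # itertext() also yields the text of embedded connectives.
            result[unit.get("id")][kind].append(normalise("".join(unit.itertext())))
    for conn in root.iter("connective"):
        entry = result[conn.get("id")]
        entry["rel"] = conn.get("rel")
        entry["modifiers"] += [normalise(m.text or "") for m in conn.findall("modifier")]
        # Connective word(s): own text plus whatever follows a modifier child.
        words = (conn.text or "") + " ".join(m.tail or "" for m in conn)
        entry["parts"].append(normalise(words))
    return dict(result)


if __name__ == "__main__":
    for ident, info in collect(SAMPLE).items():
        print(ident, info)
```

For the sample sentence, the two ext entries together form the discontinuous external unit, while the single int entry holds the embedded conditional clause.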
4 The ConAno annotation tool
A range of relatively generic linguistic annota-
tion tools are available today, but none of them
turned out suitable for our purposes: We seek a
very easy-to-use, platform-independent tool for
mouse-marking ranges of text and having them
associated with one another. Consequently, we
decided to implement our own Java-based tool,
ConAno, which is geared especially to connec-
tive/scope annotation and thus can have a very
intuitive user interface.
Just like discourse parsers do, ConAno exploits
the fact that connectives are the most reliable
source of information. Rather than attempting
an automatic disambiguation, however,
ConAno merely makes suggestions to the
human analyst, which she might follow or discard.
In particular, ConAno loads a list of
(potential) connectives, and when annotation
of a text begins, highlights each candidate so
that the user can either confirm it (by marking its
scope) or discard it (by mouse-click) if it is not
used as a connective. Furthermore, the connective
list optionally may contain associated coherence
relations, which are then also offered to
the user for selection. This annotation phase
is thus purely data-driven: Attention is paid
only to the connectives and their specific relation
candidates.
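The candidate highlighting described above essentially amounts to string matching against the connective list. A minimal sketch in Python may make this concrete; it is not the Java implementation of ConAno, and the list entries, their relation labels and the function name find_candidates are hypothetical choices made for illustration only.

```python
import re

# Hypothetical connective list with illustrative relation candidates;
# in ConAno the list is read from an external lexicon file.
CONNECTIVES = {
    "wenn": ["condition"],
    "dann": ["condition"],
    "aber": ["contrast", "concession"],
    "aus diesem Grund": ["cause"],
}


def find_candidates(text, connectives):
    """Return (start, end, matched string, relations) for every candidate."""
    # Longer entries first, so that phrasal connectives win over their parts.
    ordered = sorted(connectives, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(c) for c in ordered) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for m in pattern.finditer(text):
        # Map the (possibly capitalised) match back to its list entry.
        key = next(c for c in connectives if c.lower() == m.group(1).lower())
        hits.append((m.start(), m.end(), m.group(1), connectives[key]))
    return hits


if __name__ == "__main__":
    sentence = ("Wenn der Bund zahlt, dann profitiert Berlin. "
                "Aus diesem Grund hofft die Stadt, aber sicher ist nichts.")
    for hit in find_candidates(sentence, CONNECTIVES):
        print(hit)
```

Every hit is only a candidate: whether the word is actually used as a connective, and which of the listed relations holds, is left to the annotator.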
To elaborate a little, the annotation process
proceeds as follows. The text is loaded into the
annotation window, and the first potential connective
is automatically highlighted. Potential
preposed or postposed modifiers, if any, of the
connective are also highlighted (in a different
color). The user moves with the mouse from
one connective to the next and
• can with a mouse-click discard a highlighted
item (it is not a connective or not a modifier),
• can call up a help window explaining the
syntactic behavior and the relations of this
connective,
• can call up a suggestion for the int-unit
(i.e., text portion is highlighted),
• can analogously call up a suggestion for the
ext-unit,
• can choose from a menu of the relations
associated with this connective.
A screenshot is given in Figure 3. The suggestions
for int-unit and ext-unit are made by ConAno
on the basis of the syntactic category of
the connective; we use simple rules like “search
up to the next comma” to find the likely int-unit
for a subjunctor, or “search the preceding
two full-stops” to find the ext-unit for an adverbial
(the preceding sentence). The suggestions
may be wrong; then the user discards them and
marks the units with the mouse herself. The result
of this annotation phase is an XML file like the
(very short) one shown in Figure 2.
Figure 3: Screenshot of ConAno
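The two scope rules quoted above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the ConAno code: the function names suggest_int_unit and suggest_ext_unit and the example text are hypothetical, and sentence boundaries are taken to be plain full stops.

```python
def suggest_int_unit(text, conn_start):
    """Subjunctor rule: span from the connective up to the next comma."""
    comma = text.find(",", conn_start)
    end = len(text) if comma == -1 else comma + 1
    return conn_start, end


def suggest_ext_unit(text, conn_start):
    """Adverbial rule: the sentence between the two preceding full stops."""
    last = text.rfind(".", 0, conn_start)
    if last == -1:
        return None  # no preceding sentence to suggest
    prev = text.rfind(".", 0, last)  # -1 if that sentence opens the text
    start = prev + 1
    while start < last and text[start].isspace():
        start += 1
    return start, last + 1


TEXT = ("Es regnete stark. Die Strasse war nass. Deshalb blieben wir "
        "zu Hause, weil wir nicht nass werden wollten, und lasen.")

if __name__ == "__main__":
    s, e = suggest_ext_unit(TEXT, TEXT.index("Deshalb"))
    print(TEXT[s:e])
    s, e = suggest_int_unit(TEXT, TEXT.index("weil"))
    print(TEXT[s:e])
```

Such rules are deliberately crude. On the sentence of Figure 2, for instance, the comma rule would end the int-unit of wenn after zeigt, and thus cut off the infinitival clause, which is exactly the kind of suggestion the user would discard and correct by hand.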
5 Overall annotation environment
A central design objective is to keep the envi-
ronment neutral with respect to the languages
of the text, the connectives to be annotated, and
the coherence relations associated with them.
Accordingly, the list of connectives is external
and read into ConAno upon startup. In our
case, we use an XSLT sheet to map our ‘Dis-
course Marker Lexicon’ (see below) to the input
format of ConAno. The text to be annotated
is expected in plain ASCII. When annotation is
complete, the result (or an intermediate result)
can be saved in our XML format introduced in
Section 3.2. Optionally, it can be exported to
the ‘rs3’ format developed by (O’Donnell, 1997)
for his RSTTool. This allows for a smooth transition
to a tool for constructing complete rhetorical
trees. Rather than starting from scratch,
the RSTTool user can now open the file produced
by ConAno, which amounts to a partial
rhetorical analysis of the text, and which the
user can now complete to a full tree.
[Figure 4 diagram: DiMLex is mapped via XSLT to the
connective list read by ConAno; ConAno takes the raw
text and produces text with connectives, scopes (and
relations); the rs3 export is passed to RSTTool, which
yields text with a full rhetorical tree.]
Figure 4: Overview of annotation environment
Our Discourse Marker Lexicon ‘DiMLex’
(Berger et al., 2002) assembles information on
140 German connectives, giving a range of syntactic,
semantic, and pragmatic features, including
the coherence relations along the lines of
(Mann, Thompson, 1988). The entries are encoded in
an application-neutral XML format (see Figure
5), which is mapped with XSLT sheets to various
NLP applications. Our new proposal here is
to use it also for interactive connective annotation.
Hence, we wrote an XSLT sheet that maps
DiMLex to a reduced list, where each connective
is associated with one of the syntactic labels
coordination, subordination or adverb and with
<coh-relation> entries for its potential relations;
see Figure 6 for the DTD and Figure 7 for an
example. The syn value, which is quite simple for
these purposes, has been
mapped from the more complex classification in
DiMLex under kat (German for category). This
format is the input to ConAno.
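The mapping itself is done with an XSLT sheet, which is not reproduced here. Purely as an illustration of what such a reduction involves, the following Python sketch converts a non-phrasal DiMLex entry (as in Figure 5) into the reduced format (as in Figures 6 and 7). The table KAT_TO_SYN is an assumption: only the value konj appears in Figure 5, and both the other keys and the correspondence to the three syntactic labels are hypothetical. The function name to_conano is likewise made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from DiMLex 'kat' values to the reduced labels;
# only 'konj' is attested in Figure 5, the rest is assumed.
KAT_TO_SYN = {"konj": "coordination", "subj": "subordination", "adv": "adverb"}

DIMLEX_ENTRY = """<entry id="41">
<orth phrasal="0">denn</orth>
<syn><kat>konj</kat><position>vorvorfeld</position></syn>
<semprag><relation>cause</relation><relation>explanation</relation></semprag>
<example>Das Konzert muss ausfallen, *denn* die Saengerin ist erkrankt.</example>
</entry>"""


def to_conano(dimlex_entry):
    """Reduce one non-phrasal DiMLex entry to the ConAno input format."""
    src = ET.fromstring(dimlex_entry)
    out = ET.Element("entry", id=src.get("id"))
    # A single-word connective becomes one continuous <orth> with one <part>.
    orth = ET.SubElement(out, "orth", type="cont")
    ET.SubElement(orth, "part", type="single").text = src.findtext("orth")
    # The detailed 'kat' classification is collapsed to a simple label.
    syn = ET.SubElement(out, "syn", type=KAT_TO_SYN[src.findtext("syn/kat")])
    sem = ET.SubElement(syn, "sem")
    for rel in src.findall("semprag/relation"):
        ET.SubElement(sem, "coh-relation").text = rel.text
    for example in src.findall("example"):
        ET.SubElement(sem, "example").text = " ".join(example.text.split())
    return out


if __name__ == "__main__":
    print(ET.tostring(to_conano(DIMLEX_ENTRY), encoding="unicode"))
```

Features that ConAno does not need, such as position or presupp, are simply dropped in this reduction; phrasal and discontinuous connectives would require additional handling that the sketch omits.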
As indicated above, we do not see the tran-
sition to RSTTool as a necessary step. Rather,
the intermediate result of connective/scope an-
notation is useful in its own right, as it encodes
those aspects of rhetorical structure that are in-
dependent of the chosen set of coherence rela-
tions and the conditions of assigning them.
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl"
href="short_dictionary.xsl" ?>
<!DOCTYPE dictionary SYSTEM "dimlex.dtd">
<dictionary>
<entry id="41">
<orth phrasal="0">denn</orth>
<syn>
<kat>konj</kat>
<position>vorvorfeld</position>
<!-- . . . -->
</syn>
<semprag>
<relation>cause</relation>
<relation>explanation</relation>
<presupp>int-unit</presupp>
<!-- . . . -->
</semprag>
<example>
Das Konzert muss ausfallen,
*denn* die Saengerin ist erkrankt.
</example>
<example>
Die Blumen auf dem Balkon sind
erfroren, *denn* es hat heute
nacht Frost gegeben.
</example>
</entry>
</dictionary>
Figure 5: DiMLex extract
<?xml version='1.0' encoding='UTF-8'?>
<!ELEMENT example (#PCDATA)>
<!ELEMENT coh-relation (#PCDATA)>
<!ELEMENT sem (example|coh-relation)*>
<!ELEMENT syn (sem)*>
<!ATTLIST syn
type CDATA #IMPLIED>
<!ELEMENT part (#PCDATA)>
<!ATTLIST part
type CDATA #IMPLIED>
<!ELEMENT orth (part)*>
<!ATTLIST orth
type CDATA #IMPLIED>
<!ELEMENT entry (syn|orth)*>
<!ATTLIST entry
id CDATA #IMPLIED>
<!ELEMENT conanolex (entry)*>
Figure 6: DTD for connectives in ConAno input format
<entry id="116">
<orth type="cont">
<part type="single">wenn</part>
</orth>
<orth type="discont">
<part type="single">wenn</part>
<part type="single">dann</part>
</orth>
<syn type="subordination">
<sem>
<coh-relation>condition
</coh-relation>
<example>*Wenn* man auf den Knopf
drueckt, oeffnet sich die Tuer
von selbst.
</example>
<example>*Wenn* du mich fragst,
*dann* wuerde ich die Finger
davon lassen.
</example>
</sem>
</syn>
</entry>
Figure 7: Connective information in ConAno input format
6 Summary
With our work on German discourse connectives,
the structure of their argument units,
and the indicated rhetorical relations, we seek
a better understanding of underlying linguistic
issues on the one hand, and an easier way of
developing rhetorical structure-annotated corpora
for German texts on the other hand. For
this purpose, we present an annotation environment,
including our ConAno tool, which
helps human annotators to mark discourse connectives
and their argument units by finding
possible connectives and making suggestions on
their estimated argument structure. We pointed
out several challenges in the connective annotation
process of German texts and introduced an
XML-based annotation scheme to handle the
difficulties. For one thing, the results of this
step provide elaborate information about the
structure of German texts with respect to discourse
connectives, but furthermore they can be
used as input to O’Donnell’s RST Tool, in order
to complete the annotation of the rhetorical
tree structure. The overall scenario is then one
of machine-assisted rhetorical structure annotation.
Since ConAno is based on an external
list of connectives (with associated syntactic labels
and relations), the tool is not dedicated to
one particular theory of discourse structure, let
alone to a specific set of relations. Furthermore,
it can in principle deal with texts in various languages
(it just relies on string matching between
connectives in the list and in the text), but we
have so far used it only for German.
Acknowledgements
We thank the anonymous reviewers for their
constructive comments and suggestions for im-
proving the paper.

References

Asher, N. and Lascarides, A. 2003. Logics of
Conversation. Cambridge University Press.

Berger, D.; Reitter, D. and Stede, M. 2002.
XML/XSL in the Dictionary: The Case of
Discourse Markers. In: Proc. of the Coling
Workshop ‘NLPXML-2002’, Taipei.

O’Donnell, M. 1997. RST-Tool: An RST Analy-
sis Tool. Proc. of the 6th European Workshop
on Natural Language Generation, Duisburg.

Mann, W. and Thompson, S. 1988. Rhetorical
Structure Theory: A Theory of Text Organi-
zation. TEXT 8(3), 243-281.

Marcu, D. 1997. The rhetorical parsing of nat-
ural language texts. Proc. of the 35th Annual
Conference of the ACL, 96-103.

Marcu, D.; Amorrortu, E. and Romera, M.
1999. Experiments in Constructing a Corpus
of Discourse Trees. In: Proc. of ACL Work-
shop ‘Towards Standards and Tools for Dis-
course Tagging’, University of Maryland.

Miltsakaki, E.; Prasad, R.; Joshi, A. and Web-
ber, B. 2004. Annotating Discourse Connec-
tives and their Arguments. In: Proc. of the
HLT/NAACL Workshop ‘Frontiers in Corpus
Annotation’, Boston.

Pasch, R.; Brausse, U.; Breindl, E. and Wassner,
H. 2003. Handbuch der deutschen Konnektoren.
Berlin: de Gruyter.

Schilder, F. 2002. Robust Discourse Parsing via
Discourse Markers, Topicality and Position.
Natural Language Engineering 8 (2/3).

Stede, M. 2004. The Potsdam Commentary
Corpus. In: Proc. of the ACL Workshop ‘Dis-
course Annotation’, Barcelona.

Sumita, K.; Ono, K.; Chino, T.; Ukita, T.;
Amano, S. 1992. A discourse structure ana-
lyzer for Japanese text. Proc. of the Interna-
tional Conference on Fifth Generation Com-
puter Systems, 1133-1140.

Webber, B.; Knott, A.; Stone, M. and Joshi, A.
2003. Anaphora and Discourse Structure.
Computational Linguistics 29(4), 545-588.
