Proceedings of the Workshop on Frontiers in Corpus Annotation II: Pie in the Sky, pages 13–20,
Ann Arbor, June 2005. c©2005 Association for Computational Linguistics
A Unified Representation for Morphological, Syntactic, Semantic, and
Referential Annotations
Erhard W. Hinrichs, Sandra Kübler, Karin Naumann
SfS-CL, University of Tübingen
Wilhelmstr. 19
72074 Tübingen, Germany
{eh,kuebler,knaumann}@sfs.uni-tuebingen.de
Abstract
This paper reports on the SYN-RA
(SYNtax-based Reference Annotation)
project, an on-going project of annotating
German newspaper texts with referential
relations. The project has developed an in-
ventory of anaphoric and coreference rela-
tions for German in the context of a uni-
fied, XML-based annotation scheme for
combining morphological, syntactic, se-
mantic, and anaphoric information. The
paper discusses how this unified annota-
tion scheme relates to other formats cur-
rently discussed in the literature, in par-
ticular the annotation graph model of Bird
and Liberman (2001) and the pie-in-the-
sky scheme for semantic annotation.
1 Introduction
The purpose of this paper is threefold: (i) it dis-
cusses an annotation scheme for referential relations
for German that is significantly broader in scope
than existing schemes for the same task and lan-
guage and that also goes beyond the inventory of
anaphoric relations included in the pie-in-the-sky
sample feature structures1, (ii) it presents a unified,
XML-based annotation scheme for combining mor-
phological, syntactic, semantic, and anaphoric infor-
mation, and (iii) it discusses how this unified anno-
tation scheme relates to other formats currently dis-
cussed in the literature, in particular the annotation
1See e.g. nlp.cs.nyu.edu/meyers/pie-in-the-sky/
analysis5.
graph model of Bird and Liberman (2001) and the
pie-in-the-sky scheme for semantic annotation2.
2 Referential Relations
This section introduces the inventory of referential
relations adopted in the SYN-RA project. We define
referential relations as a cover-term for all contex-
tually dependent reference relations. The inventory
of such relations adopted for SYN-RA is inspired by
the annotation scheme first developed in the MATE
project (Davies et al., 1998). However, it takes a
cautious approach in that it only adopts those refer-
ential relations from MATE for which the develop-
ers of MATE report a sufficiently high level of inter-
annotator agreement (Poesio et al., 1999).
SYN-RA currently uses the following subset
of relations: coreferential, anaphoric, cataphoric,
bound, split antecedent, instance, and expletive. The
potential markables are definite NPs, personal pro-
nouns, relative, reflexive, and reciprocal pronouns,
demonstrative, indefinite and possessive pronouns.
There is a second research effort under way at the
European Media Laboratory Heidelberg, which also
annotates German text corpora and dialog data with
referential relations. Since their corpora are not pub-
licly available, it is difficult to verify their inventory
of referential relations. Kouchnir (2003) has used
their data and describes the relations anaphoric,
coreferential, bridging, and none.
Following van Deemter and Kibble (2000), we
define a coreference relation to hold between two
2See nlp.cs.nyu.edu/meyers/pie-in-the-sky/
pie-in-the-sky-descript.html.
13
NPs just in case they refer to the same extra-
linguistic referent in the real world. In the following
example, a coreference relation exists between the
noun phrases [1] and [2], and an anaphoric relation
between the noun phrase [2] and the personal pro-
noun [3]. Since noun phrases [1] and [2] are corefer-
ential, all three NPs belong to the same coreference
chain. In keeping with the MUC-6 annotation stan-
dard3, we establish the anaphoric relations of a pro-
noun only to its most recently mentioned antecedent.
(1) [1 Der
The
neue
new
Vorsitzende
chairman
der
of the
Gewerkschaft
union
Erziehung
Education
und
and
Wissenschaft]
Science
heißt
is called
[2 Ulli
Ulli
Thöne].
Thöne.
[3 Er]
He
wurde
was
gestern
yesterday
mit
with
217
217
von
out of
355
355
Stimmen
votes
gewählt.
elected.
’The new chairman of the union of educators
and scholars is called Ulli Thöne. He was
elected yesterday with 217 of 355 votes.’
Cataphoric relations hold between a preceding
pronoun and a following antecedent within the same
sentence, even if this antecedent has already been
mentioned within the preceding text. An example
for a cataphoric relation is shown in (2).
(2) Vier
Four
Wochen
weeks
sind
are
[sie]
they
nun
now
schon
already
in
in
Berlin,
Berlin,
[die
the
220
220
Albaner
Albanians
aus
from
dem
the
Kosovo].
Kosovo.
’They have already been in Berlin for four
weeks, the 200 Albanians from Kosovo.’
The relation bound holds between anaphoric ex-
pressions and quantified noun phrases as their an-
tecedents (see example (3)).
(3) [Niemandem]
To nobody
fällt
is
es
it
schwer,
difficult,
das
the
Bild
picture
vor
in front of
[sich]
himself
zu
to
sehen.
see.
’Nobody has trouble imagining the picture.’
3See www.cs.nyu.edu/cs/faculty/grishman/
COtask21.book_1.html.
The split antecedent relation holds between co-
ordinate NPs/plural pronouns and pronouns/definite
NPs referring to one member of the plural expres-
sion. In example (4), the indefinite pronoun beide
enters into two split antecedent relations, with noun
phrases 1 and 2.
(4) Aber
But
plötzlich
suddenly
gibt
gives
es
it
da
there
einen
a
völlig
completely
unglaubwürdig
implausible
und
and
grotesk
grotesque
wirkenden
seeming
Anruf
phone call
[1 des
of the
Detektiven]
detective
bei
to
[2 der
the
Mutter
mother
des
of the
Opfers]
victim
,
,
[beide]
both
weinen
cry
sich
themselves
minutenlang
for some minutes
etwas
something
vor
verb part
,
,
...
...
’But suddenly, there is a completely implausi-
ble and grotesque phone call from the detective
to the mother of the victim, they both cry at
each other for several minutes, ...’
An instance relation exists between a preced-
ing/following pronoun and its NP antecedent when
the pronoun refers to a particular instantiation of the
class identified by the NP.
(5) Die
The
konservativen
conservative
Kräfte
powers
warten
wait
ja
just
nur
only
darauf,
for that,
ihm
him
[Sätze]
sentences
um
around
die
the
Ohren
ears
zu
to
hauen
hit
wie
like
[jenen
the one
von
about
den
the
16
16
Mittelstrecklern],
middle-distance runners,
denen
to whom
er
he
in
in
vier
four
Wochen
weeks
die
the
Viererkette
double full-back formation
beibringe.
teaches.
’The conservative powers are just waiting to
bombard him with sentences like the one about
the 16 middle-distance runners who he is teach-
ing the double full-back formation in four
weeks.’
14
In sentence (5), the relation between the two
bracketed NPs is an example of such an instance re-
lation since the second NP is a particular instantia-
tion of the referent denoted by the first NP.
A third person singular neuter pronoun es is
marked as expletive if it has no proper antecedent.
This is the case for presentational es in example (6),
impersonal passive as in example (7), or es as sub-
ject for verbs without an agent as in example (8).
(6) [1 Es]
It
zeichnet sich
emerges
die
the
konkrete
concrete
Möglichkeit
possibility
ab.
verb part.
’The concrete possibility emerges.’
(7) [Es]
There
wird
is
bis zum
until the
Morgen
morning
getanzt.
danced.
’People are dancing until morning.’
(8) [Es]
It
steht
stands
schlecht
bad
um
for
ihn.
him.
’He is in a bad way.’
Apart from expletive uses of es and anaphoric
uses with an NP antecedent, the pronoun es can also
be used in cases of event anaphora as in sentence
(9). Here es refers to the event of Jochen’s win-
ning the lottery. Currently, the annotation in SYN-
RA is restricted to NP anaphora and therefore event
anaphors such as in sentence (9) remain unannotated
for anaphora.
(9) Jochen
Jochen
hat
has
im
in the
Lotto
lottery
gewonnen.
won.
Aber
But
er
he
weiss
knows
es
it
noch
yet
nicht.
not.
’Jochen has won the lottery. But he does not
know it yet.’
The annotation of such relations is performed
manually with the annotation tool MMAX (Müller
and Strube, 2003). Its graphical user interface al-
lows for easy selection of the relevant markables and
the accompanying relation between the contextually
dependent expression and its antecedent.
3 Automatic Extraction of Markables and
of Semantic Information
Annotation of referential relations involves two
main tasks: the identification of markables, i.e.,
identifying the class of expressions that can enter
into referential relations, and the identification of the
particular referential relations that two or more ex-
pressions enter into. Identification of markables re-
quires at least partial syntactic annotation of the text.
If referential relations need to be annotated from
plain text, then markables must be identified semi-
automatically from the output of a chunker or full
parser, if available, or otherwise completely man-
ually. However, in each of these two scenarios,
identification of markables is a time-consuming pro-
cess. In case of semi-automatic annotation, the ef-
fort required depends on the quality of the parser, but
will require at least some amount of manual post-
correction of the parser output.
Identification of markables is considerably easier
for treebank data since treebanks already provide the
necessary syntactic information. For German, there
are currently two large-scale treebanks available: the
NEGRA/TIGER (Brants et al., 2002) treebank and
the Tübingen treebanks for spoken and written Ger-
man (Stegmann et al., 2000; Telljohann et al., 2003).
All the treebanks were annotated with the help of the
annotation tool Annotate (Plaehn, 1998). The tree-
bank annotations are available in the Annotate ex-
port format (Brants, 1997) and in an XML format.
The SYN-RA project is based on the Tübingen
treebank of written German (TüBa-D/Z). This tree-
bank uses as its data source a collection of articles of
the German daily newspaper taz (die tageszeitung).
The treebank currently comprises appr. 15 000 sen-
tences, with a new release of 7 000 additional sen-
tences scheduled for June of this year.
Due to its fine grained syntactic annotation, the
TüBa-D/Z treebank data are ideally suited as a basis
for the identification of markables and for extract-
ing relevant syntactic and semantic properties for
each markable. The TüBa-D/Z annotation scheme
distinguishes four levels of syntactic constituency:
the lexical level, the phrasal level, the level of topo-
logical fields, and the clausal level. The primary
ordering principle of a clause is the inventory of
topological fields, which characterize the word or-
15
Ihre
PPOSAT
asf
Schulkameradin
NN
asf
Cassie
NE
asf
Bernall
NE
asf
fragten
VVFIN
3pit
sie
PPER
np*3
,
$,
−−
ob
KOUS
−−
sie
PPER
nsf3
an
APPR
a
Gott
NE
asm
glaube
VVFIN
3sks
.
$.
−−
− HD − − HD HD − HD HD HD
NX
−
VXFIN
HD
NX
ON −
NX
HD
VXFIN
HD
NX
APP
EN−ADD
APP
NX
ON
PX
OPP
NX
OA
C
−
MF
−
VC
−
SIMPX
OS
VF
−
LK
−
MF
−
NF
−
0 1 2 3 4 5 6 7 8 9 10 11 12
500 501 502 503 504 505 506 507
508 509 510 511 512
513 514
515 516
517
518
SIMPX
Figure 1: A sample tree from the TüBa/D-Z treebank.
der regularities among different clause types of Ger-
man and which are widely accepted among descrip-
tive linguists of German (cf. e.g. (Drach, 1937;
Höhle, 1986)). The TüBa-D/Z annotation relies
on a context-free backbone (i.e. proper trees with-
out crossing branches) of phrase structure combined
with edge labels that specify the grammatical func-
tion of the phrase in question.
Figure 1 shows an example tree from the TüBa-
D/Z treebank for sentence (10). The sentence is di-
vided into two clauses (SIMPX), and each clause is
subdivided into topological fields. The main clause
is made up of the following fields: VF (mnemonic
for: Vorfeld – ’initial field’) contains the sentence-
initial, topicalized constituent. LK (for: linke Satz-
klammer – ’left sentence bracket’) is occupied by the
finite verb. MF (for: Mittelfeld – ’middle field’) con-
tains adjuncts and complements of the main verb.
NF (for: Nachfeld – ’final field’) contains extra-
posed material – in this case an indirect yes/no ques-
tion. The subordinate clause is again divided into
three topological fields: C (for: Komplementierer –
’complementizer’), MF, and VC (for: Verbalkomp-
lex – verbal complex). Edge labels are rendered
in boxes and indicate grammatical functions. The
sentence-initial NX (for: noun phrase) is marked as
OA (for: accusative complement), the pronouns sie
in the main and subordinate clause as ON (for: nom-
inative complement).
(10) Ihre
Their
Schulkameradin
fellow student
Cassie
Cassie
Bernall
Bernall
fragten
asked
sie
they[subj]
,
,
ob
whether
sie
she[subj]
an
in
Gott
God
glaube.
believes.
’They asked their fellow student Cassie Bernall
whether she believes in God.’
Topological field information and grammatical
function information is crucial for anaphora resolu-
tion since binding-theory constraints crucially rely
on sentence-structure (if the binding theory princi-
ples are stated configurationally (Chomsky, 1981))
or on argument-obliqueness (if the binding theory
principles are stated in terms of argument structure,
as in (Pollard and Sag, 1994)). In the case at hand,
the subject pronoun of the main clause, sie, can-
not be anaphorically related to the object NP Ihre
Schulkameradin Cassie Bernall since they are co-
arguments of the same verb. However, the posses-
sive pronoun ihre and the subject pronoun sie of the
subordinate clause, can be and, in fact, are anaphor-
ically related, since they are not co-arguments of the
same verb. This can be directly inferred from the
treebank annotation, specifically from the sentence
structure and the grammatical function information
16
encoded on the edge labels. Most published compu-
tational algorithms of anaphora resolution, including
(Hobbs, 1978; Lappin and Leass, 1994; Ingria and
Stallard, 1989), rely on such binding-constraint fil-
ters to minimize the set of potential antecedents for
pronouns and reflexives.
As already pointed out, the sample sentence con-
tains four markables: one possessive pronoun Ihre,
two occurrences of the pronoun sie and one complex
NP Ihre Schulkameradin Cassie Bernall. The latter
NP is a good example of SYN-RA’s longest-match
principle for identifying markables. In case of com-
plex NPs, the entire NP counts as a markable, but
so do its subconstituents – in the case at hand, par-
ticularly the possessive pronoun ihre. All of this in-
formation can be directly derived from the treebank
account. Compared to other annotation efforts for
German where markables have to be chosen manu-
ally (Müller and Strube, 2003), manual annotation
in the SYN-RA project can, thus, be restricted to the
selection of the appropriate referential relations be-
tween referentially dependent expressions and their
nominal antecedents.
4 The Unified, XML-based Annotation
Scheme
The annotation of referential expressions is em-
bedded in a unified format which also contains
morphological, syntactic, and semantic information.
The annotation scheme is represented in XML, the
widely acknowledged standard for exchanging data,
which guarantees portability and re-usability of the
data. Each sentence, as well as all words and
all nodes in the syntactic structure, are assigned a
unique ID. These IDs are used in the annotation of
referential relations. The annotation of the treebank
sentence 11976 (cf. example (10)) is shown in Fig-
ure 2.
The sentence number is encoded as the ID of the
sentence. The first word, Ihre, has an anaphoric rela-
tion to a noun phrase in the previous sentence. This
relation is marked in the element anaphora, which
gives the antecedent as node 517 of sentence 11975,
i.e. the previous sentence. The other two anaphoric
relations are sentence-internal, the first personal pro-
noun sie having Ihre (id: s11976w0) as antecedent,
the second one the noun phrase Ihre Schulfreundin
Cassie Bernall (id: s11976n513). The annotation of
the first personal pronoun is an example for the an-
notation of an anaphoric chain. Ihre and sie belong
to the same chain. However, in order to facilitate the
extraction of direct relations, such chains are repre-
sented in a way that each anaphoric expression refers
to the last occurrence of an antecedent.
The SYN-RA scheme is very similar to the
MUC-6 coreference annotation scheme4 but it is
more powerful in two respects: As described above,
the inventory is not restricted to coreference and
anaphoric relations, it also covers e.g. instance rela-
tions or split antecedent relations. The latter relation
is also the reason for encoding the relational infor-
mation as XML elements, and not as attributes of
a word or a node. If an anaphor enters into a split
antecedent relation, it has more than one distinct an-
tecedent. In this case, the element anaphora has two
(or more) relations. Such an example is graphically
displayed for sentence (4) in Figure 3. The rele-
vant XML representation of the complex entry for
the word beide is shown in Figure 4.
5 Related Work
This section discusses how the unified SYN-RA an-
notation scheme relates to other formats currently
discussed in the literature, in particular the pie-in-
the-sky scheme for semantic annotation5 and the
annotation graph model of (Bird and Liberman,
2001). While these two annotation schemes are by
no means the only contenders for corpus annotation
standards in the literature, they are certainly among
the most ambitious and promising.
While the pie-in-the-sky scheme is clearly still
under development, the following characteristics
and goals can already be gleaned from its web-
page and the annotation examples presented there:
The annotation is feature-structure-based and incor-
porates various levels of linguistic annotation, in
particular a PROPBANK style predicate-argument
structure, dependency style syntactic information,
as well as morpho-syntactic and word class infor-
mation. All this information is rooted in the at-
tributes needed for predicate-argument assignment,
4See www.cs.nyu.edu/cs/faculty/grishman/
COtask21.book_1.html.
5See nlp.cs.nyu.edu/meyers/pie-in-the-sky/
pie-in-the-sky-descript.html.
17
<sentence id="s11976">
<node id="s11976n518" cat="SIMPX" func="--" parent="0">
<node id="s11976n515" cat="VF" func="-">
<node id="s11976n513" cat="NX" func="OA">
<node id="s11976n500" cat="NX" func="APP">
<word id="s11976w0" form="Ihre" pos="PPOSAT" morph="asf" func="-">
< anaphora>
< relation type="ana" antecedent="s11975n517"/>
< /anaphora> </word>
<word id="s11976w1" form="Schulkameradin" pos="NN" morph="asf" func="HD"/>
</node>
<node id="s11976n508" cat="EN-ADD" func="APP">
<node id="s11976n501" cat="NX" func="-">
<word id="s11976w2" form="Cassie" pos="NE" morph="asf" func="-"/>
<word id="s11976w3" form="Bernall" pos="NE" morph="asf" func="-"/>
</node> </node> </node> </node>
<node id="s11976n509" cat="LK" func="-">
<node id="s11976n502" cat="VXFIN" func="HD">
<word id="s11976w4" form="fragten" pos="VVFIN" morph="3pit" func="HD"/>
</node> </node>
<node id="s11976n510" cat="MF" func="-">
<node id="s11976n503" cat="NX" func="ON">
<word id="s11976w5" form="sie" pos="PPER" morph="np*3" func="HD">
< anaphora>
< relation type="ana" antecedent="s11976w1"/>
< /anaphora> </word> </node> </node>
<word id="s11976w6" form="," pos="$," morph="--" func="--" parent="0"/>
<node id="s11976n517" cat="NF" func="-">
<node id="s11976n516" cat="SIMPX" func="OS">
<node id="s11976n504" cat="C" func="-">
<word id="s11976w7" form="ob" pos="KOUS" morph="--" func="-"/>
</node>
<node id="s11976n514" cat="MF" func="-">
<node id="s11976n505" cat="NX" func="ON">
<word id="s11976w8" form="sie" pos="PPER" morph="nsf3" func="HD">
< anaphora>
< relation type="ana" antecedent="s11976n513"/>
< /anophora> </word> </node>
<node id="s11976n511" cat="PX" func="OPP" comment="">
<word id="s11976w9" form="an" pos="APPR" morph="a" func="-"/>
<node id="s11976n506" cat="NX" func="HD">
<word id="s11976w10" form="Gott" pos="NE" morph="asm" func="HD"/>
</node> </node> </node>
<node id="s11976n512" cat="VC" func="-">
<node id="s11976n507" cat="VXFIN" func="HD">
<word id="s11976w11" form="glaube" pos="VVFIN" morph="3sks" func="HD"/>
</node> </node> </node> </node> </node>
<word form="." pos="$." morph="--" func="--" parent="0"/>
</sentence>
Figure 2: The XML format represents information on all levels of annotation. The words of the sentence
and the anaphoric annotation are shown in bold.
18
NP NP
Aber plötzlich gibt es da einen ... Anruf des Detektiven bei der Mutter ..., beide weinen sich
minutenlang etwas vor ...
split
split
Figure 3: The annotation of the split antecedent relation in sentence (4). For representational reasons, the
sentence is shortened and only relevant information is displayed. Syntactic boundaries are shown as dotted
lines, anaphoric relations as black lines.
<word id="s3426w20" form="beide" pos="PIS" morph="np*" func="HD">
< anaphora>
<relation type="split" antecedent="s3426n507"/>
<relation type="split" antecedent="s3426n526"/>
< /anaphora>
</word>
Figure 4: The XML representation of the encoding of split antecedents for the word beide in sentence (4).
A graphical representation of the relation is shown in Figure 3. The antecedent "s3426n507" refers to the
first NP, "s3426n526" to the second one in Figure 3.
with syntactic and morpho-syntactic information
distributed among the corresponding elements in
the predicate-argument structure representation. Ac-
cordingly, semantic representations provide the or-
ganizing principle while morpho-syntactic and syn-
tactic information play a subordinated role.
The SYN-RA annotation scheme resembles the
pie-in-the-sky scheme in that it also uses one level
of representation, in this case hierarchical syntac-
tic structure, as the organizing principle and treats
referential relations, grammatical function informa-
tion, and morpho-syntactic annotation as subordi-
nated types of information. More generally, the pie-
in-the-sky and the SYN-RA representations offer a
particular view of the annotation, each with its own
“perspective”: semantics-based (pie-in-the-sky) and
syntax-based (SYN-RA).
By contrast, Bird and Liberman’s (2001) anno-
tation graphs are intended as a graph-based, multi-
layered annotation scheme where each level of lin-
guistic annotation is treated equally, as an indepen-
dent layer. The graph-based annotation model is
powerful enough to also allow groupings of discon-
tinuous constituents and other non-adjacent linguis-
tic phenomena, without having to rearrange the lin-
ear order of the input. In both respects, their annota-
tion model is maximally general.
6 Future Directions
In the previous section we have compared two
perspective-dependent annotation schemes that use
a particular level of linguistic annotation as their pri-
mary organizing principle and have contrasted them
with the perspective-independent annotation-graph
model. We believe that both types of represen-
tation models have their independent justification.
Perspective-based representations, such as SYN-
RA and pie-in-the-sky, are well-justified for partic-
ular application scenarios. For example, for text
summarization and other semantic tasks, the pie-
in-the-sky model seems particularly well-motivated
since the pertinent semantic information can be eas-
ily extracted from its predicate-argument-structure-
rooted feature structures. For other tasks, such as
anaphora resolution, for which syntactic informa-
tion is more relevant, the syntax-based representa-
tion of SYN-RA allows for an easier extraction of
the relevant information for rule-based, statistical,
19
and machine-learning approaches to computational
anaphora resolution. More generally, perspective-
based representations are highly task-dependent. It
would be misguided to consider them as ideal, task-
independent annotation standards. If one wants
to establish a task-independent annotation standard,
then a perspective-independent annotation scheme
such as the annotation graph model looks like a
promising direction for future research. In particu-
lar, such research should focus on techniques that al-
low for easy conversion of perspective-independent
representations to task-dependent views of the rele-
vant linguistic information.

References
Steven Bird and Mark Liberman. 2001. A formal frame-
work for linguistic annotation. Speech Communica-
tion, 33(1,2):23–60.
Sabine Brants, Stefanie Dipper, Silvia Hansen, Wolf-
gang Lezius, and George Smith. 2002. The TIGER
treebank. In Erhard Hinrichs and Kiril Simov, edi-
tors, Proceedings of the First Workshop on Treebanks
and Linguistic Theories (TLT 2002), pages 24–41, So-
zopol, Bulgaria.
Thorsten Brants, 1997. The NeGra Export Format for
Annotated Corpora. Universität des Saarlandes, Com-
putational Linguistics, Saarbrücken, Germany.
Noam Chomsky. 1981. Lectures on Government and
Binding. Foris, Dordrecht.
Sarah Davies, Massimo Poesio, Florence Bruneseaux,
and Laurent Romary, 1998. Annotating Coreference in
Dialogues: Proposal for a Scheme for MATE. MATE.
Kees van Deemter and Rodger Kibble. 2000. On core-
ferring: Coreference in MUC and related annotation
schemes. Computational Linguistics, 26(2):629–637.
Erich Drach. 1937. Grundgedanken der Deutschen Satz-
lehre. Diesterweg, Frankfurt/M.
Jerry R. Hobbs. 1978. Resolving pronoun references.
Lingua, 44:311–338.
Tilman Höhle. 1986. Der Begriff "Mittelfeld", An-
merkungen über die Theorie der topologischen Felder.
In Akten des Siebten Internationalen Germanistenkon-
gresses 1985, pages 329–340, Göttingen, Germany.
Robert J. P. Ingria and David Stallard. 1989. A compu-
tational mechanism for pronominal reference. In Pro-
ceedings of the 27th Conference of the Association for
Computational Linguistics, pages 262–271, Vancou-
ver, Canada.
Beata Kouchnir. 2003. A machine learning approach to
German pronoun resolution. Master’s thesis, School
of Informatics, University of Edinburgh.
Shalom Lappin and Herbert Leass. 1994. An algorithm
for pronominal anaphora resolution. Computational
Linguistics, 20(4):535–561.
Christoph Müller and Michael Strube. 2003. Multi-level
annotation in MMAX. In Proceedings of the 4th SIG-
dial Workshop on Discourse and Dialogue, Sapporo,
Japan.
Oliver Plaehn, 1998. Annotate Bedienungsanleitung.
Universität des Saarlandes, Sonderforschungsbereich
378, Projekt C3, Saarbrücken, Germany, April.
Massimo Poesio, Florence Bruneseaux, and Laurent Ro-
mary. 1999. The MATE meta-scheme for coreference
in dialogues in multiple languages. In Proceedings of
the ACL Workshop on Standards for Discourse Tag-
ging, pages 65–74.
Carl Pollard and Ivan Sag. 1994. Head-Driven Phrase
Structure Grammar. Studies in Contemporary Lin-
guistics. University of Chicago Press, Chicago, IL.
Rosmary Stegmann, Heike Telljohann, and Erhard W.
Hinrichs. 2000. Stylebook for the German Treebank
in VERBMOBIL. Technical Report 239, Verbmobil.
Heike Telljohann, Erhard W. Hinrichs, and Sandra
Kübler, 2003. Stylebook for the Tübingen Treebank of
Written German (TüBa-D/Z). Seminar für Sprachwis-
senschaft, Universität Tübingen, Tübingen, Germany.
