Representing and Accessing Multi-Level Annotations in MMAX2
Christoph Müller
EML Research gGmbH
Villa Bosch
Schloß-Wolfsbrunnenweg 33
69118 Heidelberg, Germany
christoph.mueller@eml-research.de
1 Introduction
MMAX2[1] is a versatile, XML-based annotation
tool which has already been used in a variety of an-
notation projects. It is also the tool of choice in the
ongoing project DIANA-Summ, which deals with
anaphora resolution and its application to spoken
dialog summarization. The project uses the ICSI
Meeting Corpus (Janin et al., 2003), a corpus of
multi-party dialogs which contains a considerable
amount of simultaneous speech. It features a semi-
automatically generated segmentation in which
the corpus developers tried to track the flow of the
dialog by inserting segment starts approximately
whenever a person started talking. As a result, the
corpus has some interesting structural properties,
most notably overlap, that are challenging for an
XML-based representation format. The following
brief overview of MMAX2 focuses on this aspect,
using examples from the ICSI Meeting Corpus.
2 The MMAX2 Data Model
Like most current annotation tools, MMAX2
makes use of stand-off annotation. The base data
is represented as a sequence of <word> elements
with exactly one PCDATA child each. Normally,
these PCDATA children represent orthographical
words, but larger or smaller units (e.g. characters)
are also possible, depending on the required
granularity of the representation.[2]
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE words SYSTEM "words.dtd">
<words>
...
<word id="word_3710">That</word>
<word id="word_3711">'s</word>
<word id="word_3712">just</word>
<word id="word_3713">a</word>
<word id="word_3714" meta="true">Pause</word>
<word id="word_3715">specification</word>
<word id="word_3716">for</word>
<word id="word_3717">the</word>
<word id="word_3718">X_M_L</word>
<word id="word_3719">Yep</word>
<word id="word_3720">.</word>
<word id="word_3721">format</word>
<word id="word_3722">.</word>
...
</words>
[1] http://mmax.eml-research.de
[2] The meta attribute in the example serves to distinguish
meta information from actual spoken words.
The order of the elements in the base data file is
determined by the order of the segments that they
belong to.
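As a rough illustration, base data in this format can be read with a few lines of Python. The sketch below is not part of MMAX2: the helper name load_words is hypothetical, the XML declaration and DOCTYPE are left out for brevity, and only the first six words of the sample are included.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample base data (declaration and DOCTYPE omitted).
SAMPLE = """<words>
<word id="word_3710">That</word>
<word id="word_3711">'s</word>
<word id="word_3712">just</word>
<word id="word_3713">a</word>
<word id="word_3714" meta="true">Pause</word>
<word id="word_3715">specification</word>
</words>"""

def load_words(xml_text):
    """Return (id, text, is_meta) triples in document order (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [(w.get("id"), w.text, w.get("meta") == "true")
            for w in root.iter("word")]

words = load_words(SAMPLE)
# Keep only actual spoken words, i.e. drop elements flagged with meta="true".
spoken = [text for _, text, is_meta in words if not is_meta]
print(" ".join(spoken))
```

Filtering on the meta attribute is one way to recover the spoken text; whether meta elements should be dropped depends on the task.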
Annotations are represented in the form of
<markable> elements which reference one or
more base data elements by means of a span at-
tribute. Each markable is associated with exactly
one markable level which has a unique, descrip-
tive name and which groups markables that be-
long to the same category or annotation dimen-
sion. Each markable level is stored in a sepa-
rate XML file which bears the level name as an
XML name space. The ID of a markable must be
unique for the level that it belongs to. Markables
can carry arbitrarily many features in the common
attribute=value format. It is by means of
these features that the actual annotation informa-
tion is represented. For each markable level, ad-
missible attributes and possible values are defined
in an annotation scheme XML file (not shown).
These annotation schemes are much more power-
ful for expressing attribute-related constraints than
e.g. DTDs. The following first example shows the
result of the segmentation of the sample base data.
The participant attribute contains the identi-
fier of the speaker that is associated with the re-
spective segment.
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE markables SYSTEM "markables.dtd">
<markables xmlns="www.eml.org/NameSpaces/segment">
...
<markable id="markable_468"
span="word_3710..word_3714"
participant="me012"/>
<markable id="markable_469"
span="word_3715..word_3718"
participant="me012"/>
<markable id="markable_470"
span="word_3719..word_3720"
participant="mn015"/>
<markable id="markable_471"
span="word_3721..word_3722"
participant="me012"/>
...
</markables>
The next example contains markables representing
the nominal and verbal chunks in the sample base
data.
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE markables SYSTEM "markables.dtd">
<markables xmlns="www.eml.org/NameSpaces/chunks">
...
<markable id="markable_7834"
span="word_3710"
type="demonstrative"/>
<markable id="markable_7835"
span="word_3711"
type="copula"/>
<markable id="markable_7836"
span="word_3713..word_3715"
type="nn"/>
<markable id="markable_7837"
span="word_3711..word_3715"
type="predication"
subject="markable_7834"/>
<markable id="markable_7838"
span="word_3717..word_3718,word_3721"
type="nn"/>
...
</markables>
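The span values in the two examples above can be resolved to base data elements with a small helper. The following Python sketch is only an illustration under two assumptions that the text does not state explicitly: word IDs have the form prefix_N, and consecutive numbers correspond to consecutive base data elements. The function name expand_span is hypothetical.

```python
def expand_span(span):
    """Expand a span string into a list of contiguous pieces,
    each piece being a list of consecutive word IDs."""
    pieces = []
    for part in span.split(","):
        # A part is either a single ID or a range "first..last".
        if ".." in part:
            first, last = part.split("..")
        else:
            first = last = part
        prefix, start = first.rsplit("_", 1)
        end = int(last.rsplit("_", 1)[1])
        pieces.append([f"{prefix}_{n}" for n in range(int(start), end + 1)])
    return pieces

print(expand_span("word_3711..word_3715"))
print(expand_span("word_3717..word_3718,word_3721"))
```

The second call yields two pieces, which reflects that the span of markable_7838 skips the words in between.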
The following basic facts about markables in
MMAX2 are worth noting:
1. Every markable is defined with reference to
the base data. Markables on the same or different
levels are independent and ignorant of each other,
and only related indirectly, i.e. by means of base
data elements that they have in common.[3] Structural
relations like embedding ([[’s] just [a specification]])
can only be determined with recourse
to the base data elements that each markable
spans. This lazy representation makes it simple
and straightforward to add markables and entire
markable levels to existing annotations. It is also a
natural way to represent non-hierarchical relations
like overlap between markables. For example, a
segment break runs through the nominal chunk
represented by markable markable_7836 ([a
specification]) in the example above. If the segment
markables were defined in terms of the markables
contained in them, this would be a problem
because the nominal chunk crosses a segment
boundary. The downside of this lazy representa-
tion is that more processing is required for e.g.
querying, when the structural relations between
markables have to be determined.
2. Markables can be discontinuous. A markable
normally spans a sequence of base data elements.
Each connected subsequence of these is called a
markable fragment. A discontinuous markable is
one that contains more than one fragment, like
markable markable_7838 ([the XML format])
above. Actually, this markable exemplifies what
could be called discontinuous overlap because it
not only crosses a segment boundary, but also
has to omit elements from an intervening segment
by another speaker.
[3] Note that this merely means that markables are not
defined in terms of other markables, while they can indeed
reference each other: In the above example, markable
markable_7837 ([’s just a specification]) uses an associative
relation (in this case named subject) to represent a reference
to markable markable_7834 ([That]) on the same
level. References to markables on other levels can be represented
by prefixing the markable ID with the level name.
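The two points above can be made concrete with a short Python sketch that derives structural relations purely from the base data elements that markables span. It is an illustration, not MMAX2 code: all function names are hypothetical, and word IDs are assumed to end in a number that reflects base data order.

```python
def word_numbers(span):
    """Set of word numbers covered by a span string such as
    'word_3717..word_3718,word_3721'."""
    covered = set()
    for part in span.split(","):
        ends = [int(word_id.rsplit("_", 1)[1]) for word_id in part.split("..")]
        covered.update(range(ends[0], ends[-1] + 1))
    return covered

def is_embedded(inner, outer):
    """True if every word of inner is also spanned by outer."""
    return word_numbers(inner) <= word_numbers(outer)

def overlaps(a, b):
    """True if a and b share words but neither contains the other."""
    wa, wb = word_numbers(a), word_numbers(b)
    return bool(wa & wb) and not wa <= wb and not wb <= wa

def is_discontinuous(span):
    """True if the span consists of more than one fragment."""
    numbers = word_numbers(span)
    return len(numbers) != max(numbers) - min(numbers) + 1

# Spans copied from the segment and chunks examples above.
print(is_embedded("word_3713..word_3715", "word_3711..word_3715"))  # chunk 7836 in 7837
print(overlaps("word_3713..word_3715", "word_3710..word_3714"))     # chunk 7836 vs. segment 468
print(is_discontinuous("word_3717..word_3718,word_3721"))           # chunk 7838
```

Because nothing is stored about the relations themselves, each of these checks has to be recomputed from the spans, which is the extra processing cost mentioned above.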
3 Accessing Data From Within MMAX2
3.1 Visualization
When a MMAX2 document is loaded,
the main display contains the base data text plus
annotation-related information. This information
can comprise
• line breaks (e.g. one after each segment),
• markable feature values (e.g. the
participant value at the beginning
of each segment),
• literal text (e.g. a tab character after the
participant value),
• markable customizations, and
• markable handles.
The so-called markable customizations are in
charge of displaying text in different colors, fonts,
font styles, or font sizes depending on a mark-
able’s features. The order in which they are ap-
plied to the text is determined by the order of
the currently available markable levels. Mark-
able customizations are processed bottom-up, so
markable levels should be ordered in such a way
that levels containing smaller elements (e.g. POS
tags) are on top of those levels containing
larger elements (chunks, segments etc.). This
way, smaller elements will not be hidden by larger
ones.
When it comes to visualizing several potentially
embedded or overlapping markables, the so-called
markable handles are of particular importance.
In their simplest form, markable handles
are pairs of short strings (most often pairs
of brackets) that are displayed directly before and
after each fragment of a markable. When two
or more markables from different levels start at
the same base data element, the nesting order of
the markables (and their handles) is determined
on the basis of the order of the currently available
markable levels. The color of markable handles
can also be customized depending on a markable’s
features. Figure 1 gives an idea of what the
MMAX2 main window can look like. Both handles
and text background for chunk markables
with type=predication are rendered in light
gray. Other handles are rendered in a darker color.
Markable handles are sensitive to mouse events:
resting the mouse pointer over a markable handle
will highlight all handles of the pertaining markable.
Reasonable use of markable customizations
and handles allows for convenient visualization of
even rather complex annotations.
Figure 1: Highlighted markable handles on a discontinuous (left) and an overlapping (right) markable.
3.2 Querying
MMAX2 includes a query console which can be
used to formulate simple queries using a special
multi-level query language called MMAXQL. A
query in MMAXQL consists of a sequence of
query tokens which describe elements (i.e. either
base data elements or markables) to be matched,
and relation operators which specify which rela-
tion should hold between the elements matched
by two adjacent query tokens. A single markable
query token has the form
string/conditions
where string is an optional regular expression
and conditions specifies which feature(s) the
markable should match. The simplest condition
is just the name of a markable level, which
will match all markables on that level. If a regular
expression is also supplied, the query will return
only the matching markables. The query
[Aa]n?\s.+/chunks
will return all markables from the chunks level
that begin with the indefinite article.[4] Markables
with particular features can be queried by specifying
the desired attribute-value combinations. The
following query, for example, will return all markables from
the chunks level with a type value of either nn
or demonstrative:
/chunks.type={nn,demonstrative}
[4] The space character in the regular expression must be
masked as \s because otherwise it will be interpreted as a
query token separator.
If a particular value is defined for exactly one at-
tribute on exactly one markable level only, both
the level and the attribute name can be left out in
a query, rendering queries very concise (cf. the ac-
cess to the meta level below).
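To make the effect of the two single-token queries above more tangible, the following Python sketch emulates them on a hand-written list of chunk markables. It is not how MMAXQL is implemented. The select function and the text field are hypothetical, the markable strings are simplified by hand, and it is an assumption that the regular expression has to match the entire markable string.

```python
import re

# Simplified, hand-written stand-in for the markables on the chunks level.
CHUNKS = [
    {"id": "markable_7834", "type": "demonstrative", "text": "That"},
    {"id": "markable_7835", "type": "copula", "text": "'s"},
    {"id": "markable_7836", "type": "nn", "text": "a specification"},
    {"id": "markable_7837", "type": "predication", "text": "'s just a specification"},
    {"id": "markable_7838", "type": "nn", "text": "the X_M_L format"},
]

def select(markables, pattern=None, **features):
    """Keep markables whose text matches pattern (if given) and whose
    attributes take one of the allowed values given in features."""
    result = []
    for m in markables:
        if pattern is not None and not re.fullmatch(pattern, m["text"]):
            continue
        if all(m.get(name) in allowed for name, allowed in features.items()):
            result.append(m)
    return result

# Rough counterpart of: [Aa]n?\s.+/chunks
indefinite = [m["id"] for m in select(CHUNKS, pattern=r"[Aa]n?\s.+")]
# Rough counterpart of: /chunks.type={nn,demonstrative}
nominal = [m["id"] for m in select(CHUNKS, type={"nn", "demonstrative"})]
print(indefinite)
print(nominal)
```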
Relation operators can be used to connect
two query tokens to form a complex query.
The set of supported sequential and hierarchical
relation operators[5] includes meets (default),
starts, starts_with, in, dom, equals,
ends, ends_with, overlaps_right, and
overlaps_left. Whether two markables stand
in a certain relation is determined with respect
to the base data elements that they span. In the
current early implementation, for all markables
(including discontinuous ones), only the first and
last base data element is considered. The re-
sult of a query can directly be used as the input
to another query. The following example gives
an idea of what a more complex query can look
like. The query combines the segment level, the
meta level (which contains markables represent-
ing e.g. pauses, emphases, or sounds like breath-
ing or mike noise), and the base data level to re-
trieve those instances of you know from the ICSI
Meeting Corpus that occur in segments spoken by
female speakers[6] which also contain a pause or an
emphasis (represented on the meta level):
'[Yy]ou know' in (/participant={f.*} dom /{pause,emphasis})
The next query shows how overlap can be handled.
It retrieves all chunk markables along with
their pertaining segments by getting two partial
lists and merging them using the operator or.
(/chunks in /segment) or (/chunks overlaps_right /segment)
[5] Associative relations are not discussed here; see Müller
(2005).
[6] The first letter of the participant value encodes the
speaker’s gender.
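The idea of evaluating relations on the first and last base data element only, and of merging two partial result lists, can be sketched in Python as follows. This is one possible reading, not the MMAXQL implementation: within and dominates are meant to approximate in and dom, while runs_past_end is a hypothetical overlap test whose correspondence to overlaps_right or overlaps_left is an assumption, since the operators are not defined formally here.

```python
# Spans copied from the segment and chunks examples in Section 2.
SEGMENTS = {
    "markable_468": "word_3710..word_3714",
    "markable_469": "word_3715..word_3718",
    "markable_470": "word_3719..word_3720",
    "markable_471": "word_3721..word_3722",
}
CHUNKS = {
    "markable_7834": "word_3710",
    "markable_7835": "word_3711",
    "markable_7836": "word_3713..word_3715",
    "markable_7837": "word_3711..word_3715",
    "markable_7838": "word_3717..word_3718,word_3721",
}

def extent(span):
    """(first, last) word number of a span; gaps inside it are ignored."""
    numbers = [int(word_id.rsplit("_", 1)[1])
               for part in span.split(",")
               for word_id in part.split("..")]
    return min(numbers), max(numbers)

def within(a, b):
    """a starts and ends inside b (assumed reading of 'in')."""
    (a_first, a_last), (b_first, b_last) = extent(a), extent(b)
    return b_first <= a_first and a_last <= b_last

def dominates(a, b):
    """b lies within a (assumed reading of 'dom')."""
    return within(b, a)

def runs_past_end(a, b):
    """a starts inside b but ends after it (hypothetical overlap test)."""
    (a_first, a_last), (b_first, b_last) = extent(a), extent(b)
    return b_first <= a_first <= b_last < a_last

def pairs(relation):
    """All (chunk, segment) ID pairs for which the relation holds."""
    return [(c, s) for c, c_span in CHUNKS.items()
            for s, s_span in SEGMENTS.items() if relation(c_span, s_span)]

# Two partial lists merged into one, as with the operator 'or'.
merged = pairs(within) + pairs(runs_past_end)
for chunk, segment in merged:
    print(chunk, segment)
```

Under this reading, dominates(CHUNKS["markable_7838"], SEGMENTS["markable_470"]) is true even though the two markables share no words, which illustrates the limitation of considering only the first and last element of a discontinuous markable.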
4 Accessing Data by Means of XSL
MMAX2 has a built-in XSL style sheet processor[7]
that can be used from the console to read
a MMAX2 document and process it with a user-defined
XSL style sheet. The XSL processor provides
some special extensions for handling stand-off
annotation as it is realized in MMAX2. In the
current beta implementation, only some basic ex-
tensions exist. The style sheet processor can be
called from the console like this:
org.eml.MMAX2.Process -in INFILE.mmax -style STYLEFILE.xsl
The root element of each MMAX2 document is
the words element, i.e. the root of the base data
file, which will be matched by the supplied XSL
style sheet’s default template. The actual process-
ing starts in the XSL template for the word ele-
ments, i.e. the root element’s children. A minimal
template looks like this:
<xsl:template match="word">
<xsl:text> </xsl:text>
<xsl:apply-templates
select="mmax:getStartedMarkables(@id)"
mode="opening"/>
<xsl:value-of select="text()"/>
<xsl:apply-templates
select="mmax:getEndedMarkables(@id)"
mode="closing"/>
</xsl:template>
The above template inserts a white space before
the current word and then calls an extension func-
tion that returns a NodeList containing all mark-
ables starting at the word. The template then
inserts the word’s text and calls another exten-
sion function that returns a NodeList of markables
ending at the word. The markables returned by
the two extension function calls are themselves
matched by XSL templates. A minimal template
pair for matching starting and ending markables
from the chunks level and enclosing them in
bold brackets (using HTML) looks like this:
<xsl:template match="chunks:markable" mode="opening">
<b>[</b>
</xsl:template>
<xsl:template match="chunks:markable" mode="closing">
<b>]</b>
</xsl:template>
Note how the markable level name (here:
chunks) is used as a markable name space
to control which markables the above templates
should match. The following templates wrap a
pair of <p> tags around each markable from
the segment level and add the value of the
participant attribute to the beginning of each.
<xsl:template match="segment:markable" mode="opening">
<xsl:text disable-output-escaping="yes">&lt;p&gt;</xsl:text>
<b>
<xsl:value-of select="@participant"/>
</b>
</xsl:template>
<xsl:template match="segment:markable" mode="closing">
<xsl:text disable-output-escaping="yes">&lt;/p&gt;</xsl:text>
</xsl:template>
[7] Based on Apache’s Xalan.
Creating HTML in this way can be useful for
converting a MMAX2 document with multiple
levels of annotation to a lean version for distrib-
ution and (online) viewing.
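For readers less familiar with XSL, the word-by-word logic of the templates in this section can be mirrored in a few lines of Python. The sketch below is only an analogy under simplifying assumptions: markables are given as hand-written (level, first word, last word, attributes) tuples, the list order stands in for the nesting order, and the functions opening, closing and render are hypothetical names, not MMAX2 extension functions.

```python
from html import escape

WORDS = [("word_3710", "That"), ("word_3711", "'s"), ("word_3712", "just"),
         ("word_3713", "a"), ("word_3714", "Pause")]

# (level, first word, last word, attributes); outer markables are listed first.
MARKABLES = [
    ("segment", "word_3710", "word_3714", {"participant": "me012"}),
    ("chunks", "word_3710", "word_3710", {"type": "demonstrative"}),
    ("chunks", "word_3711", "word_3711", {"type": "copula"}),
]

def opening(level, attrs):
    """Markup emitted before the first word of a markable."""
    if level == "segment":
        return "<p><b>" + escape(attrs["participant"]) + "</b>"
    return "<b>[</b>"

def closing(level):
    """Markup emitted after the last word of a markable."""
    return "</p>" if level == "segment" else "<b>]</b>"

def render(words, markables):
    out = []
    for word_id, text in words:
        out.append(" ")  # white space before each word
        out.extend(opening(level, attrs)
                   for level, first, _, attrs in markables if first == word_id)
        out.append(escape(text, quote=False))
        # Close in reverse order so that inner markables end first.
        out.extend(closing(level)
                   for level, _, last, _ in reversed(markables) if last == word_id)
    return "".join(out)

page = render(WORDS, MARKABLES)
print(page)
```

As in the style sheet, each word triggers a lookup of the markables that start and end at it, so the output nests correctly only as long as the markables themselves do not overlap.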
5 Conclusion
MMAX2 is a practically usable tool for multi-
level annotation. Its main field of application is
the manual creation of annotated corpora, which
is supported by flexible and powerful means of vi-
sualizing both simple and complex (incl. overlap-
ping) annotations. MMAX2 also features a sim-
ple query language and a way of accessing anno-
tated corpora by means of XSL style sheets. While
these two data access methods are somewhat lim-
ited in scope, they are still useful in practice. If
a query or processing task is beyond the scope of
what MMAX2 can do, its simple and open XML
data format allows for easy conversion into other
XML-based formats, incl. those of other annota-
tion and query tools.
Acknowledgements
This work has partly been funded by the Klaus
Tschira Foundation (KTF), Heidelberg, Germany,
and by the Deutsche Forschungsgemeinschaft
(DFG) in the context of the project DIANA-Summ
(STR 545/2-1).

References
Janin, A., D. Baron, J. Edwards, D. Ellis, D. Gelbart,
N. Morgan, B. Peskin, T. Pfau, E. Shriberg, A. Stol-
cke & C. Wooters (2003). The ICSI Meeting Cor-
pus. In Proceedings of the IEEE International Con-
ference on Acoustics, Speech and Signal Processing,
Hong Kong, pp. 364–367.
Müller, C. (2005). A flexible stand-off data model with
query language for multi-level annotation. In Pro-
ceedings of the Interactive Posters/Demonstrations
session at the 43rd Annual Meeting of the Associa-
tion for Computational Linguistics, Ann Arbor, Mi.,
25-30 June 2005, pp. 109–112.
