A Development Environment for an MTT-Based Sentence 
Generator 
Bernd Bohnet, Andreas Langjahr and Leo Wanner 
Computer Science Department 
University of Stuttgart 
Breitwiesenstr. 20-22 
-70565:Stuttgart, Germany .......... " ........ 
{bohnet \[langjahr\[wanner}©informatik.uni-stuttgart.de 
1 Introduction 
With the rising standard of the state of the art in 
text generation and the increase of the number 
of practical generation applications, it becomes 
more and more important to provide means for 
the maintenance of the generator, i.e. its ex- 
tension, modification, and monitoring by gram- 
marians who are not familiar with its internals. 
However, only a few sentence and text gener- 
ators developed to date actually provide these 
means. One of these generators is KPML (Bate- 
man, 1997). I~PML comes with a Development 
Environment and there is no doubt about the 
contribution of this environment to the popular- 
ity of the systemic approach in generation. 
In the generation project at Stuttgart, the 
realization of a high quality development en- 
vironment (henceforth, DE) has been a central 
topic from the beginning. The De provides sup- 
port to the user with respect to writing, mod- 
ifying, testing, and debugging of (i) grammar 
rules. (ii) lexical information, and (iii) linguis- 
tic structures at different levels of abstraction. 
Furthermore, it automatically generalizes tile or- 
ganization of the lexica and the grammar. In 
what follows, we briefly describe oF,'s main fea- 
tures. The theoretical linguistic background of 
the DE is the Meaning-Text Theory (Mel'euk, 
1988: Polgu~re, 1998). However. its introduc- 
tion is beyond tile scope of this note: tile inter- 
ested reader is asked to consuh the above reg 
erences as well as further literature on the use 
Of MTT ill text generation---for illSlallCe, (Ior- 
danskaja ctal., 1992: I,avoie £- Rainbow. 1997: 
(.'och. 1997). 
2 Global View on the DE 
In MTT, seven levels (or strata) of linguis- 
tic description are distinguished, of which 
five are relevant for generation: semantic 
(Sem), deep-syntactic (DSynt), surface-syntactic 
(SSynt), deep-morphologicM (DMorph) and 
surface-morphological (SMorph). In order to be 
able to generate starting from the data in a data 
base, we introduce an additional, the conceptual 
(Con) stratum. The input structure to DE is thus 
a conceptual structure (ConStr) derived from the 
data in the DB. The generation process consists 
of a series of structure mappings between adja- 
cent strata until the SMorph stratum is reached. 
At the SMorph stratum, the structure is a string 
of linearized word forms. 
The central module of the DE iS a compiler 
that maps a structure specified at one of tile five 
first of the above strata on a structure at the 
adjacent stratum. To support the user in the ex- 
amination of the internal information gathered 
during the processing of a structure, a debug- 
ger and an inspector are available. The user can 
interact with the compiler either via a graphic 
interface or via a text command interface. For 
the maintenance of the grammar, of the lexica 
and of the linguistic structures, the DE possesses 
separate editors: a rule editor, a lexicon editor, 
and a structure editor. 
2.1 The Rule Editor 
The Rules. Most of the grammatical rules 
in an MTT-based generator are two-level rules. 
.'\ two-level rule establishes a correspomlence 
260 
between minimal structures of two adjacent 
strata. Given that in generation five of 
MTT'S strata are used, four sets of two-level 
rules are available: (1) Sem=vDSynt-rules, (2) 
DSynt~SSynt-rules, (3) SSynt=vDMorph rules, 
and (4) DMorph~SMorph-rules. 
Formally, a two-level rule is defined by the 
optimize the organization of the grammar by a u - 
tomatic detection Of common parts in several 
rules and their extraction into abstract 'class' 
rules. The theoretical background and the proce- 
dure of rule generalization is described in detail 
in (Wanner & Bohnet, submitted) and will hence 
not be discussed in this note. 
quintuple (/2, Ctxt, Conds, 7~, Corr). £ specifies While editing a rule, the developer has the 
the lefthand side :of the r.,ule-~a,~minimal~so~rce~a stc~nd~r.d,c.ommands:,:~t,,his/'her~ disposal. Rules 
substructure that is mapped by the rule onto its can be edited either in a text rule editor or via 
destination structure specified in 7~, the right- 
hand side of the rule. Ctxt specifies the wider 
context of the lefthand side in the input structure 
(note that by far not all rules contain context in- 
formation). Conds specifies the conditions that 
must be satisfied for the rule to be applicable to 
an input substructure matched by £. Corr spec- 
ifies the correspondence between the individual 
nodes of the lefthand side and the righthand side 
structures. 
Consider a typical Sem=~,DSynt-rule, 
which maps the semantic relation '1' that 
holds between a property and an entity 
that possesses this property onto the deep- 
syntactic relation ATTR. The names begin- 
ning with a '7' are variables. The condition 
'Lex::(Sem::(?Xsem.sem).lex).cat = adj' 
requires that the lexicalization of the property 
is an adjective. '?Xsem ~ ?Xdsynt' and '?Ysem 
¢:~ ?Ydsynt' mean that the semantic node ?Xsem 
is expressed at the deep-syntactic stratum by 
?Xdsynt, and ?Ysem by ?Ydsynt. 
property (Sem_DSynt) { 
leftside : 
?Xsem -i-+ ?Ysem 
condit ions : 
Sem: :?Xsem.sem,1:ype = property 
Lex: :(Sem::(?Xsem.sem).lex).cat = adj 
rightside: 
?Xds 
?Yds 
?Yds -ATTR-+?Xds 
correspondence : 
?Xsem ~ ?Xds 
?Ysem ~ ?Yds} 
The rule editor (l~t-~) has two main \['unctions: 
(i) to support the mai)~tenance (i.e. editing and 
examination) of grammatical rules, and (ii) to 
a graphic interface. Obviously incorrect rules 
can be detected during the syntax and the se- 
mantic rule checks. The syntax check exam- 
ines the correctness of the notation of the state- 
ments in a rule (i.e. of variables, relations, con- 
ditions, etc.)--in the same way as a conventional 
compiler does. The semantic check examines 
the consistency of the conditions, relations, and 
attribute-feature pairs in a rule, the presence of 
an attribute's value in the set of values that are 
available to this attribute, etc. If, for instance 
in the above rule 'adj' is misspelled as 'adk' or 
erroneously a multiple correspondence between 
?gds and ?Xsem and ?Ysem is introduced, the 
rule editor draws the developer's attention to tile 
respective error (see Figure 1). 
Rule Testing. Rule testing is usually a very 
time consuming procedure, this is so partly be- 
cause tile generator needs to be started as a 
whole again and again, partly because tlle re- 
suiting structure and the trace must be carefully 
inspected in order to find out whether tile rule 
in question fired and if it did not fire why it 
did not. The DE attempts to minimize this el'- 
fort. With "drag and drop' the developer can 
select one or several rules and apply them onto 
an input structure (which can be presented ei- 
ther graphically or in a textual format; see be- 
low). When a rule dropped onto the structure 
fires, the affected parts of the input structure are 
made visually prominent, and the resulting out- 
put (sub)structure appears in the corresponding 
window of the slructure editor. If a rule did not 
fire. the inspector indicates which conditions of 
tim rule in question were not satisfied. See also 
I)elow lhe description of the features of lhe in- 
spect or. 
261 
Figure 1: Error messages of the rule editor 
2.2 The Structure Editor 
The structure editor manages two types of win- 
dows: windows in which the input structures are 
presented and edited, and windows in which the 
resulting structures are presented. Both types of 
windows can be run in a text and in a graphic 
mode. The input structures can be edited in 
both modes, i.e., new nodes and new relations 
can be introduced, attribute-value pairs associ- 
ated with the nodes can be changed, etc. 
In the same way as rules, structures can be 
checked with respect to their syntax and se- 
mantics. Each structure can be exported into 
postscript files and thus conveniently be printed. 
2.3 The Lexicon Editor 
The main function of the lexicon editor is to sup- 
port the maintenance of the lexica. Several types 
of lexica are distinguished: conceptual lexica, se- 
mantic lexica, and lexico-syntactic lexica.. 
Besides tile standard editor functions, the 
lexicon editor provides the following options: (i) 
sorting of tile entries (either alphabetically or ac- 
cording to such criteria as 'category'); (ii) syntax 
check; (iii) finding information that. is common 
to several entries and extracting it into abstract 
entries (the result is a hierarchical organization 
of the resource). During the demonstration, each 
of these options will be shown ilt action. 
2.4 The Inspector 
The inspector fulfils mainly three functions. 
First. it presents information collected (luring 
the application of the rules selected by the de- 
veloper Io :-ill inpul structure. This illfornia- 
tion is especially useful for generation experts 
who are familiar with the internal processing. It 
concerns (i) the correspondences established be- 
tween nodes of the input structure and nodes of 
the resulting structure, (ii) the instantiation of 
the variables of those rules that are applied to- 
gether to the input structure in question, and 
(iii) the trace of all operations performed by the 
compiler during the application of the rules. 
Second, it indicates to which part of the input 
structure a specific rule is applicable and what 
its result at the destination side is. Third, it in- 
dicates which rules failed and why. The second 
and third kind of information is useful not only 
for generation experts, but also for grammarians 
with a pure linguistic background. 
Figure 2 shows a snapshot of the inspecLor 
editor interface. Sets of rules that can simulta: 
neously be applied together to an input struc- 
ture without causing conflicts are grouped dur- 
ing processing into so-called clusters. At the 
left side of the picture, we see two such clus- 
ters (Cluster 13 and Cluster 22). Tile instances 
of the rules of Cluster 13 are shown to the righl 
of the cluster pane. The cluster pane also con- 
tains sets of rules that failed (in the picture, the 
corresponding icon is not expanded). The left 
graph in Figure 2 is the input structure to which 
the rules are applied. For illustration, one of 
the rules, namely date, has been selected for ap- 
plication: tile highlighted arcs and nodes of tile 
input structure are the part to which date is ap- 
plicable. 'Pile result of its application is tile tree 
at the right. Beneath the graphical structures, 
we see tile correspondence between input nodes 
262 
SOUlCe s~'ucture:( 1 ) 
Evaluation F allecl 
Clu$1er 22 ~3 • 
+ ..... ,\\! 
IX.... " +, .... \ \ 
• ,.j ~m .+ 2 
taq tm,.: a,,:;,t,i 
ii.i.i.~i+_ 
~Jt"l 
RI~¢ ~-,~k--,~+--Bg~tIl* \ ' 
T A rr~ 
~4 
+ ....... \]li. 
N}~Y?P:'-..4J .~ ~ ,. -7"--=-- -- 
!p!e(?) ........ Io¢in Ume ,~.~!e(?} 
• ~ :o9(.__~ ............ 
ocalJoln(5) ....... :locln .spaco(6) 
171 (g) .371 (9) 
;~yerltB.~__) ........... .VVOrI(I 1 ) _ __ . 
+.~(L0). ...... i.~7.(!_al ................. 
:CO(B) ,CO(141 
........... ,,,, ,, ,,,,, 
Figure 2: The inspector interface of the DE. 
and result nodes. The numbers in parentheses 
are for system use. 
2.5 The Debugger 
In the rule editor, break points within individual 
rules can be set. When the compiler reaches a 
break point it stops and enters the debugger. In 
the debugger, the developer can execute the rules 
statement by statement. As in the inspector, the 
execution trace, the variable instantiation and 
node correspondences can be examined. During 
the demonstration, the function of the debugger 
will be shown in action. 
3 Current Work 
DE is written in ,Java 1.2 and has been tested on 
a SUN workstation and on a PC pentium with 
300 MHz and 128 MB of RAM. 
Currently, the described functions of the DE 
are consolidated and extended bv new features. 
The most important of these features aa'e the im- 
port and the export feature. The import feature 
allows for a transformation of grammatical rules 
and lexical information encoded in a different 
format into the format used bv our generator. 
Tests are being carried out with the import of 
RealPro (Lavoie ,~,: Rainbow. 1997) grammati- 
cal rules and lexical information (in particular 
subcategorization and diathesis information) en- 
coded in the DATR-formalism. The export fea- 
ture allows for a transformation of the rules and 
lexical information encoded in our format into 
external formats. 

References
Bateman, J.A. 1997. Enabling technology for mul- 
tilingual natural  generation: the KPblL 
development environment. Natural Language Engi- 
neering. 3.2:15-55. 
Coch, J. 1997. Quand l'ordinateur prend la plume : 
la gdn4ration de textes. Document Numdrique. 1.:~. 
Iordanskaja, L.N., M. Kim, R. Kittredge, B. Lavoie 
~: A. Polgu6re. 1992. Generation of Extended Bilin- 
gual Statistical Reports. COLING-92. 1019 1022. 
Nantes. 
Lavoie, B. & O. Rainbow. 1997. A fast and portable 
realizer for text generation systems. Proceedings of 
the Fifth Conference on Applied Natural Language 
Processing. Washington, DC. 
Mel'euk, I.A. 1988. Dependenc!l Syntaz:: Theory and 
Prac&ce. Albany: State University of New York 
Press. 
Polgu6re, A. 1998. La th~orie sens+textre. I)t- 
alangue,. 8-9:9-30. 
Wanner. L. ~ B. Bohnet. submitted. Inheritance in 
the MTT-grammar. 
