Multilinguality in a Text Generation System 
For Three Slavic Languages 
Geert-Jan Kruijff a, Elke Teich t', John Bateman ~, Ivana Kruijit;Korbayovfi", 
Hana Skoumalovg ~, Serge Sharoff 'l, Lena Sokolova d, Tony Hartley ~, 
Kamenka Staykova/, Ji~'~ Hana" 
°Charles University, Prague; ~University of the Saarland; ~University of Bremen; 
aRRIAI, Moscow; °University of Brighton; /IIT, BAS, Sofia 
http://www.itri.brighton.ac.uk/projects/agile/ 
Abstract 
This paper describes a lnultilingual text generation 
system in the domain of CAD/CAM software in-- 
structions tbr Bulgarian, Czech and l:\[ussian. Start- 
ing from a -independent semantic represen- 
tation, the system drafts natural, continuous text 
as typically found in software inammls. The core 
modules for strategic and tactical gene,'ation are im- 
plemented using the KPML platform for linguistic 
resource development and generation. Prominent 
characteristics of the approach implemented a.re a 
treatment of multilinguality that makes maximal use 
of the cominonalities between s while also 
accounting for their differences and a common repre- 
sentational strategy for both text planning and sen- 
tence generation. 
1 Introduction 
This paper describes the Agile system I tbr the 
multilingual generation of instructional texts as 
found in soft;ware user-manuals in Bulgarian, 
Czech and Russian. The current prototype fo- 
cuses on the automatic drafting of CAD/CAM 
software documentation; routine passages as 
found in the AutoCAD user-manual have been 
taken as target texts. The application sce- 
nario of the Agile system is as follows. First, 
a user constructs, with the help of a GUI, 
-independent task models that spec- 
ify the contents of the documentation to be 
generated. The user additionally specifies the 
 (currently Bulgarian, Czech or Rus- 
sian) and the register of the text to be gen- 
erated. The Agile system then produces con- 
tinuous instructional texts realizing the speci- 
fied content and conforming to the style of soft- 
ware user-manuals. The texts produced are 
1EU Inco-Copernicus project PL961004: 'Automatic 
Generation of Instructional Texts in the Languages of 
Eastern Europe' 
intended to serve as drafts for final revision; 
this ~drafting' scenario is therefbre analogous to 
that first explored within the Drafter project. 
Within the Agile project, however, we have ex- 
plored a more thoroughly nmltilingual architec- 
ture, making substantial use of existing linguis- 
tic resources and components. 
The design of the Agile system overall re, sts 
on the following three assumI)tions. 
First, the input of the system should be spec- 
ified irrespective of any particular output lan- 
guage. This means that the user must be able to 
express the content that she wants the texts to 
convey, irrespective of what natural (s) 
she masters and in what (s) the out- 
put text shouM be realized. Such - 
independent content specification can take the 
form of some knowledge representation pertain- 
ing to the application domain. 
Second, the texts generated as the outtmt of 
the system should be well-formulated with re- 
spect to the expectations of natiw. • speakers of 
each particular  covered by the system. 
Since differences among s may appear 
at any level, -sensitive decisions about 
the realization of the specified content must be 
possible throughout the generation process. 
And third, the notion of multilinguality em- 
ployed in the system should be recursive, in 
the sense that the modules responsible tbr the 
generation should themselves be multilingual. 
The text generation tasks which are common 
to the s under consideration should be 
pertbrmed only once. Ideally, there should be 
one process of generation yielding output in 
multiple s rather than a sequence of 
monolingual processes. This view of 'intrin- 
sic multilinguality' builds on the approach set 
out in Bateman et al. (1999). Each module of 
the system is fnlly multilingual in that it simul- 
474 
taneously enables both integration of linguistic 
resources, defining commonalities bel;ween lan- 
guages, and resource integrity, in |;bat the in- 
dividuality of each of the -speeitic re- 
sources of a multilingnM ensemble is always pre- 
served. 
We consider these assuml)l;ions an(l the view 
of multilinguality entailed by |;hem to be cru- 
cial for the design of efli;ctive multilingual text 
generation systems. The results so far a(:hicved 
by the Agile system SUl)port this and also ofl'er 
a ~soli(l experiential basis tbr the develot)mcnt of 
fllrther multilingnal generation systems. 
The overall operation of 1;t1(; Agile sysl;em is 
as tbllows. Al/tcr the us(u' has Sl)ecilied some 
inl;en(led text (;OlltenI; (described in Section 2) 
via the Agile GUI, the system i)ro(:eeds to gen- 
eral;e the texts required. To do this, a text 
t)lammr (Section 3) first assigns parts of the, 
task model to text elements and arranges l;h(;m 
in a hierarchical fashion a text t)lan. Then, a 
sentence plammr organizes I;he content of the 
text elements into sentence-sized elml~ks and 
ere~,tes the corresponding input fin' l;he tacti- 
ca,1 generator, expressed in standard sentence 
l)lamfing  (SPI,) lbrmulae. Finally, 1;11(; 
tactical g(meral;or generates t;he linguistic real- 
izations corresponding 1;o these Sl)l~s the text 
(Sect;ion 4). In the stage of the l)rojccI; rt}l)orte(l 
here, we, conceal;rated i)arl;icularly on \])roccdu- 
ral texts. These otlhr sl;el)-by-st;e t) des(:rit)t;ions 
of how to perlbrm domain tasks using the given 
software tools. A simplified version of one such 
procedural text is given (tbr English) in Fig- 
ure 1. This architectm:e mirrors the reference 
architecture for generation diseusse(t in I/,eiter 
8z Dale (1.997). The modules of the system are 
1)ipelined so that a continuous text is generated 
realizing the intended meaning of the inlmt se- 
mantic representation without backtracking or 
revision. 
Several important properties have ('haracter- 
ized the method of development leading to the 
Agile system. These are to a large extent re- 
sponsible for the eflhetiveness of the system. 
These include: 
Re-use and adaptation of available re- 
sources. We have re-used snt)stantial bodics 
of e, xisting linguistic resources at all levels rel- 
evant for the system; this t)laye(l a (:rueial role 
in achieving the Sol)histieatcd generation capa- 
7b d~nw a polylinc 
First start the PLINE command using one of these meth- 
ods: 
Windows From the Polylinc tlyout on the, l)raw tool~ 
lmr, choose Polylinc. 
DOS and UNIX lqom the Draw menu, choose Poly- 
line. 
1.. Spccit~y the start point of the polyline. 
2. S1)ecil~y tim next point of the 1)olylinc. 
3. Press ll,cturn t;o end the polyline. 
Figure l: Example "To draw a polyline" 
bilities now displayed by the system in each of 
its s of expertise prior to the project 
t\]m'l.'e were 11o substantial ~mtomatic generation 
systenls fi)r any of the s covered. The 
core modules for strategic and ta(:ti('al gener- 
ation were all imt)lemcnted using the Kernel- 
Penman Multilingual system (KPML: ef Bate- 
man et al., \]999) a Common l,isp base(t gram- 
mar development environment, in addition, 
we adopted the Pemnan Upt)er Model as used 
within Pemnan/KPMl~ as the basis tbr our 
linguistic semantics; a more rcstri(:ted domain 
model (DM) rclewmt to the CAD/CAM-domain 
was &',lined as a st)e('ialization of l;he UM con- 
(:epts. The I)M was iuspired by the domain 
me(tel of the Drafter l)rojet:t, but l)res(ml;s a 
g(m(',ralizati()n ()f the latter in that it allows for 
eml)(;d(ling t:asks and illsLrut'|;ions t:o any arlfi- 
l;rm:y re(:ursive depth (i.e., more complex l;cxt; 
plans). Ah'eady existing lexical resom:ces and 
morphological modules availabh; to the 1)ro.j(',ct 
were re-used tbr Bulgarian, Czech and l~.ussian: 
the Czech and Bulgarian components were mo(t- 
ules written in C (e.g., IIaji(: L; Hla(lk~, 1997, 
tbr Czech) that were interfimed with KPMI, us- 
ing a standard set of API-methods (of. Bate- 
man & Sharoff, 1998). Finally, because no 
grammars suitable for generation in Bulgarian, 
Czech and l/.ussia,n existed, a grammar tbr En- 
glish (NIGEL: Mann & Matthiessen, 1985) was 
re-used to lmild them; tbr the theoretical basis 
of this technique see Teich (1995). 
Combination of two methods of resources 
development. Two methods were com- 
bined to enable us to develop basic general- 
 grammars and sub grammars 
fin: CAD/CAM instructional texts at; |;11(; same 
time. One nmthod is the system-oriented one 
aimed at lmildiug a computational resource 
475 
with a view of the whole  system: this 
is a method strongly supported by the KPML 
development environment. The other method 
is instance-oriented, and is guided by a detailed 
register analysis. The latter method was partic- 
ularly important given the Agile goal of being 
able to generate texts belonging to rather di- 
verse text types-- e.g., impersonal vs. personal; 
procedural, flmetional descriptions, overviews 
etc. 
Cross-linguistic resource-sharing. A cross- 
linguistic approach to linguistic specifications 
and implementation was taken by maximizing 
resource sharing, i.e. taking into account sim- 
ilarities and differences among the treated lan- 
guages so that development tasks have been dis- 
tributed across different s and re-used 
wherever possible. 
2 Language-independent Content 
Specifications 
The content constructed by a user via the Ag- 
ile GUI is specified in terms of Assertion-bozes 
or A-boxes. These A-boxes are considered to 
be entirely neutral with respect to the  
that will be used to express the A-box's con- 
tent. Thus individual A-boxes can be used for 
generating multiple s. A-boxes spec- 
i(y content by instantiating concepts from ~,he 
DM or UM, and placing these concepts in rela- 
tion to one another by means of configurational 
concepts. The configurational concepts define 
adnfissible ways in which content can be struc- 
tured. Figure 2 gives the configurational con- 
cepts distinguished within Agile. 
Procedure A procedure has three slots: 
(i) GOAL (obligatory,filled by a USER-AcTION), 
(ii) METIIODS (optional, filled by a METHOD-LIsT), 
(iii) SIDE-EPFECT (optional, filled by a USER- 
EVENT). 
Method A method has three slots: 
(i) CONSTRAINT (optionM, filled by an OPERATING- 
SYSTEM), 
(ii) PaEeONDITION (optional, filled by a PROCE- 
DURE), 
(iii) SUUSTEPS (obligatory, filled by a PI~OCEDUI/E- 
LIST). 
Method-List A METIIOD-LIST is a list of h/IETIIOD'S. 
Procedure-List A PROCEI)URE-LIST is a list of 
PROCEDURE:S. 
Figure 2: Configurational concepts 
Configurational concepts are devoid of actual 
content. Tile content is provided by inst, antia- 
tions of concepts that represent various user ac- 
tions, interface events, and interface modalities 
and functions. Taken together, these instanti- 
ations provide the basic propositional content 
tbr instructional texts and are taken as input 
tbr the text planning process. 
3 Strategic Generation: From 
Content Specifications to Sentence 
Plans 
To realize an A-box as a text, we go through suc- 
cessive stages of text planning, sentence plan- 
ning, and lexico-grammatical generation (cf 
also Reiter & Dale, 1997). At each stage there 
is an increase in sensitivity to, or dependency 
on, the target  in which output will 
be generated. Although the text planner itself 
is -independent, the text; plamfing re- 
sources may (lifter fl'om  to  
as much as is required. This is exactly analo- 
gous to the situation we find within the individ- 
ual  grammars as represented within 
KPML: we therefore represent the text planning 
resources in the same fashion. For the text type 
and s of concern here, however, w~ria- 
lion across s at the text planning stage 
turned out to be minimal. 
The organization of an A-box is used to guide 
the text planning process. Itere, we draw a dis- 
tinction between text structure elements (TSEs) 
as the elements from which a (task-oriented) 
text, is built ut), and text templates', which con- 
dition the way TSEs are to be realized linguis- 
tically. We locus on the relation between con- 
cepts on the one hand, and TSEs on the other. 
We are specifically interested in the configura- 
tional concepts that are used to configure the 
content specified in an A-box because we want 
to maintain a close connection between how the 
content can be defined in an A-box and how 
that content is to be spelled out in text. 
3.1 Structuring and Styling 
A text structure element is a predefined com- 
ponent that needs to be filled by one or more 
specific parts of the user's content definition. 
Using the reader-oriented terminology common 
in technical authoring guides, we distinguish 
a small (recursively defined) set of text TSEs; 
these are listed in Figure 3. 
476 
Task-Docmnent A TASK-\])OCUMFNT has tWO slots: 
(i) TASK-TFI'I,E (ol)ligatory), 
(ii) TASK-INSTI{U(ITIONS (obligatory), being a list 
of" at least one ~\[NSTRUCTION. 
Instruction An INSTRUCTION has three slots: 
(i) TASKS (obligatory), being a list of at least one 
TASK~ 
(ii) CONSTRAINT (optional), 
(iii) Pm,ZCONDITION (optional). 
Task A TASK has two slots: 
(i) INSTRUCTIONS (ol)tional), 
(ii) SII)I';-EI,'I"I,:C'I' (ol)tional). 
Figure 3: Text Structure Elements (TSEs) 
The TSEs are placed in correspondence with 
the configurational concet)ts of the DM (cf. Fig- 
ure 2); this enat)les us to lmild a text stru('ture 
l;hat folh)ws the structuring of the content in an 
A-1)ox (cf. Figure 4). 
Orthogonal to the notion of text structure el- 
ement is the notion of text temt)late. Whereas 
TSEs capture what needs to be realized, the 
text template (:al)tures how that content is to 
1)e realized. Thus, a feint)late defines a style 
for expressing the content. Am we discuss be- 
low, we define text templates in terms of con- 
straints on the realization of si)e(:iti(" (in(tivid- 
ual) TSEs. D)r examt)le, whereas in Bulgarian 
and Czech headings (to which the '\]'ASK-TITLE 
element corresponds: of. Figure 4) are usually 
realized as nominal groups, in the Russian Au- 
toCAD ulallnal headings are realized as nonii- 
nile purpose clauses as they are ill English. 
3.2 Tex~ Planning g~ Sentence Planning 
The major component of the text pbmner is 
fi)rnmd by a systemic network fi)r text struc- 
turing; this network, called the text structur- 
ing region, defines an additional level of linguis- 
tic resources for the level of genre. This region 
constructs text structures in a way that is very 
similar to the way in which the systemic net- 
works of the grammars of the tactical genera- 
|or build up grammatical structures. In fact, 
by using KPML to implement this means for 
text structuring, the interaction between global 
level text generation (strategic generation) and 
lexico-grammatical expression (tactical genera- 
tion) is greatly facilitated. Moreover, this al)- 
t)roach has the advantage |;tint constraints on 
output realization can 1)e easily accmnulated 
and propagated: for example, the text plan- 
ner can iml)ose constraints on the output lexico- 
grammatical realization of particular text t)lan 
elements, such am the realization of text head- 
ings by a nominalization ill Czech and Bulgar- 
|an or by an infinite purpose clause in Rus- 
sian. This is one contribution to overcoming the 
notorious generation gap prol)leln caused when 
a text planning module lacks control over the 
line-grained distinctions that m'e available in a 
grmmnar. Ill our case, both text plamfing and 
sentence planning are integrated into one and 
the same system and are distinguished by strat- 
ification. 
TASK-TITLE ~-} GOAl, of topmost PROCEDURE 
TASK-INSTRUCTIONS ~-} METIIODS of PROCEDUI/E 
Sll)E-EIq,'ECT ~ SIDhl-EFFI~CT of PROCEDUII.I\] 
TASK /-~ GOAL of PROCEI)IHtI,; 
(-JONSTRA1NT <-} CONSTRAINT of ~41,VI'IIO1) 
PRECONI)ITION ~ PIH¢COND1TION of ~,4ETI1OI) 
1NSTIIUCTI(IN-TAsKS 1--} SUBSTH)S of a METIIOD 
INST1HJCTION +5 MI,TI'IIOD 
Figure 4: Mapping TSEs and configurational 
concepts defined in the DM 
Following on from the orthogomflity of text 
t/;mplates and text structure elements, the text 
structuring region consists of two parts. One 
1)arl; deals wil;h interpreting the A-box in terms 
of TSEs: traversing l;he network of this part of 
the region produces a text structure for the A- 
b/lx contbrufing to the definitions above. The 
second part of the region imposes constraints 
on the realization of the TSEs introduced by 
the first part. Divers(; constraints can be ira- 
posed depending on the user's choice of style, 
e.g., personal (featuring ppredominantly imper- 
atives) vs. impersonal (tbaturing indicatives). 
Tile result of text plmming is a text plan. 
This can be thought of as a hierarchical struc- 
ture (built by TSEs) with lilts of A-box content 
at; its leaves together with additional constraints 
imposed by the text planning process: e.g., that 
the Title segment of the document should not be 
realized as a full (:lause but; rather as a nominal 
phrase or a lmrt)osive det)endent clause. The 
text plan may also include constraints on pre- 
ferred layout of the docmnent elements: this 
ilflbrmation is passed on via HTML annotations. 
The sentence plmmer then takes this text plan 
as intmt, and creates SPL tbrmulae to express 
477 
the content identified by the text plan's leaves. 
The resulting SPLs can also group one or more 
leaves together (aggregation) det)ending on de- 
cisions taken by the text planner concerning dis- 
course relations. Furthennore, constraints on 
realization that were introduced by the text- 
planner are also included into the SPLs at this 
stage. 
Of particular interest multilingually is the 
way concepts may require different kinds of re- 
alizations ill different s. For example, 
s need not of course realize concepts 
as single words: in Czech the concept Mcn,t 
gets realized as "menu" but the interface modal- 
ity Dialogboz is realized as a multiword expres- 
sion "dialogovd okno" (whose compofients i.e., 
an adjective and a nominal head may undergo 
various grammatical operations independently). 
The Agile system sentence plammr handles such 
cases by inserting SPL fbrms corresponding to 
the literal semantics of the complex expressions 
required; these are then expressed via the tac- 
tical generator in the usual way. The result- 
ing SPL formulas thus represent the - 
specitic semantics of the sentences to be gener- 
ated. Otherwise, if a concept maps to a single 
word, the sentence planner leaves the fnrther 
specification of how the concept should be re- 
alized to the lexico-grammar and its concept- 
to-word mapI)ings. More extensive diflb.rences 
between s are handled by conditional- 
izing the text and sentence planner resources 
fltrther according to . 
4 Tactical Generation: From 
Sentence Plans to Sentences 
The tactical generation component that colt- 
structs sentences (and other grammatical units) 
fl'om the SPL tbrmulae specified in the text 
plan relies on linguistic resources tbr Bulgarian, 
Czech and Russian. The necessary grammars 
and lexicons have been constrncted employing 
the methods described in Section 1. As ,toted 
there, the crucial characteristic of this model 
of nmltilingual representation is that it allows 
tbr the representation of both, commonalities and 
differences between s, as required to 
cover the observable eontrastive-linguistic phe- 
nomena. This can be applied even among typo- 
logically rather distant s. 
We first illustrate this with respect to some 
of the contrastive-linguistic t)henomena that are 
covered by this model employing exami)les ti'om 
English, Bulgarian, Czech and Russian. We 
then show the organization of the lexicons and 
briefly describe lexical dloice. 
4.1 Semantic and grammatical 
cross-linguistic variation 
One. of the tenets of our model of cross-linguistic 
variation is that s have a rather high 
degree of similarity semantically attd tend to 
differ syntactically. We can thus expect to have 
identical SPL expressions for Bulgarian, Czech 
and Russian in many cases, although these may 
be realized by diverging syntactic structures. 
However, we also allow for the case in which 
there is no commonality at; this level and even 
the SPL expressions diverge. 2 Example 1 illus- 
trates the latter case (high semantic divergence, 
plus grammatical divergence), and example 2 
the former (semantic commonality, plus gram- 
matical divergence). 
Example 1: English and Russian spa- 
tial PPs. The major lexico-grammatical dif 
ference l)etween English and Russian preposi- 
tional phrases is that the relation expressed by 
the PP is realized by the choice of the prepo- 
sition in English, whereas in Russian, it; is in 
addition realized by case-government. In the 
are.a of spatial PPs, the choice of a particular 
preI)osition in English corresl)onds to a distinc- 
tion in the dimensionality of the object that re- 
alizes the range of the relation expressed by the 
PP. For both PPs expressing a location and PPs 
expressing movement, English distinguishes be- 
tween three-dimensional objects (in, into), one- 
or-two-dimensional objects (on, onto) and zero- 
dimensional objects (at, to). 
In Russian, in contrast, zero-or-three dimen- 
sional objects (preposition: v) are opposed 
to one-or-two-dimensional objects (preposition: 
ha). A fnrther difference between the expres- 
sion of static location vs. movement is expressed 
by case selection: na/v+locative case expresses 
static location, v/na+accusative case expresses 
inovement (entering or reaching an object) and 
the preposition k+dative case expresses move- 
inent towards an object (,lot quite reaching or 
2This distinguishes our approach fl'om interlingua- 
based systems, which typically require a common seman- 
tic (or conceptual) input. 
478 
entering it). In the {-onverse relation, motion 
away from an object, s is sele, eted tbr move- 
ment from within an oh.joel;, and ot fbr move- 
men| away from the vicinity of an ot).jeet. Her(;, 
both prel)ositions govern genitive case. The di- 
mensionality of the object is only relevant for 
the distinction between v/na and s/ot, 1)ut not 
for h. Since the concel)tualizations of spatial re- 
lations are ditf'erent across \]'3nglish and Russian, 
the input SPL expressions diverge, as shown in 
Figure 5); rather than using domain model con- 
cepts, these SPL ext)ressions restrict themselves 
to Ut)pe, r Model concepts in order to highlight 
the cross-linguistic contrast. This examl)le illus- 
trates well how it is (}ften ne{:e, ssary t{} 'semanti- 
{:ize,' eve, nts differently in (tilt'ere|d; s in 
order 1;o achieve the most natural results. Not;{; 
that Cze, ch is here very similar to l/nssian. 
a. SPL Russian 
(example 
:name DO-Textl-Ku 
:targetform "Pomestite fragment v bufer." 
:logicalform 
(s / dispositive-material-action 
:lex pomestitj 
:speech-act-id command 
:actee (a / object :lex fragment) 
:destination (d / THREE-D-0BJECT 
:lex bufer))) 
1}. SPL Rn: English 
(example 
:name D0-Textl-En 
:targetform "Put the selection on the clipboard." 
:iogicalform 
(s / dispositive-material-action 
:lex put 
:speech-act-id command 
:actee (a / object :lex selection) 
:destination (d / ONE-0R-TW0-D-0BJECT 
:lex clipboard))) 
Figure 5: SPI, ext}ressions 
Example 2: English, Bulgarian and Czech 
headers in CAD/CAM texts. Grammatical 
ullits (1) (4) below show all ex~£tIllt, le, of ,:r,,ss- 
linguistic commonality at the level of sen|anti{: 
int}ut and divergence at the le, vel of grammar. 
These units all time|ion as selfsutficient Task- 
titles tbr the deseril}tions of particular actions 
that can be t)erformed with the given s{}t'tware. 
(1) En: T{} draw a polyline 
(2) BU: qepTaene na IlOJII4MI4IIFIH 
Drawing- of polylineqNDEF 
NOMINAL 
(3) Cz: Kreslenl kf'ivky 
drawing-NOMINAL \]ine-GEN 
(d) \]/,ll: LIwo6I,I Hal)I4COBaTI, IIO,KHJIIIIIHIO 
in-order draw-INF l)olyline-AcC 
There are two major dit  re,,,ces (:,) (4) 
that need to 1)e accounte, d for: (i) they exhibit 
divergent grammatieal ranks in that (1) and 
(4) are clauses (uontinite), while (2) and (3) are 
nomil,al groul,s (nominalizations); and (ii)they 
show divergent syntactic realizations: (2) 
and (3) ditl'er in that in Bulgarian, wlfich does 
not have (:as(',, the relation 1)etween the syntactic 
head Met)q_'aelte (ch, crtacnc) and the modifier lie- 
:mamma (polilinia) is (;xt)ressed by a t)re, position 
na (ha), whereas in Cze, ch, which has cast, this 
relation is expressed by genitive case, (k¢ivky). 
\])espite these (litferen(:es, only the first diver- 
gen(:e has any (;onsequen(:{;s for the S\])L ext)res- 
sions rcquir(;d; I;hc l)asie semantic commona\]- 
ity among (1)(4) is 1)reserve, d. This is shown 
in Figm:e 6 t)y me, ans of the standard linguis- 
tic conditionalization 1)rovided 1)y KPML l'or all 
levels of linguistic des(:ription. The COll(tition- 
alization shows that both the English (1.) and 
the Russian (4) ar(' nontinite clauses while, the 
\]hdgarian (2) and the Czech (3) are nominM- 
izations. These S\])l, ext)ressions also show the 
use of (lom~dn ('onc(;1)ts as i)rodu('e(l by the text 
tfl~mner rathe, r than Ut)lmr model concepts as in 
the SPLs in Figure, 5. 
(example 
:name DO-Textl 
:logicalform 
(s / DM::draw 
:en :ru :PROPOSAL-Q & PROPOSAL 
:bu :cz :EXIST-SPEECHACT-Q & NOSPEECHACT 
:actee (d / DM::polyline))) 
Figure 6: Multilingual SPL e, xpression tbr the 
header examlfles 
The second differen('e is handled by the gener- 
ation grmmnars internally. Here, Bulgarian and 
Czech share the basic tractional-grammatical 
description of t)ostmotlifie, rs tbr nomilmlizati(ms 
(Figm:e 7). The ditl'erence in structure only 
479 
shows in syntagmatic realization and is separate 
from the functional description: For Bulgarian, 
the postmodifier marker Ha (ha: %f') is inserted, 
and tbr Czech, the nominal group realizing the 
Postmodifier is attributed genitive ease. a 
(gate 
:name MEDIUM-QUALIFIER 
:inputs processual-mediated 
:outputs 
((i.0 medium-qualifier 
(:bu :cz preselect Medium nominal-group) 
(:cz preselect Medium noun-gen-case) 
(:bu insert Mediumqualifiermarker) 
(:bu lexify Mediumqualifiermarker na))) 
:region QUALIFICATION) 
Figure 7: Shared system tbr Bulgarian and 
Czech 
4.2 Lexical choice and lexicons 
The lexical items tbr each  are selected 
from the lexicon via the domain model. A DM 
concept is annotated with one or more lexical 
items from each . If there is more than 
one item per , the choice is constrained 
by features imposed by the gralnmar. 
For example the concept DN::draw is anno- 
tated with two lexical items which are the im- 
perfective and perfective forms of the verb draw 
in Czech, Bulgarian and Russian. If the gram- 
mar selects imperfective aspect, tim first is cho- 
sen; if the grammar selects perfective aspect, 
the second is chosen. This mechanism is used 
also fbr the choice between a verb and its nom- 
inalization, among others. With the help of the 
lexicon, the inflectional properties collected tbr 
a particular lexical item during generation are 
translated into a format suitable tbr external 
morphological modules, which are then called. 
The result of the external module, the inflected 
tbrm, is passed back to the KPML system and 
inserted into the grammatical structure. 
5 Evaluation and Conclusions 
A first round of evaluation has been carried 
out on the Agile prototype. This directly as- 
sessed the ability of users to control multilin- 
3This description is also valid for Russian, which has a 
nominal group structure similar to Czech. The 13ulgarian 
one is more like English. 
gual generation in tile three s, as well 
as the design and robustness of the system eom- 
1}onents. Groups of users were given a brief 
training period and then asked to construct 
A-boxes expressing given content. Texts were 
cross-generated: i.e., the s were w~ried 
across the A-boxes independently of the native 
s of the subjects who created them. Er- 
rors were then classified and recommendations 
for the next and final Agile prototype collected. 
The generated texts were then evaluated by ex- 
pert technical authors. They were generally 
judged to be of a broadly similar quality to 
the texts originating from manuals, and both 
kinds of texts received similar criticism. The 
main source of criticism and errors was the de- 
sign of the GUI which is now being improved 
for the final prototype. The overall design of 
the system has theretbre shown itself to offer an 
etfective approach tbr multilingual generation. 
We are now extending the system to cover a 
broader range of text types as well as the further 
grammatical and semantic variation required by 
the evaluators as well as by the additional text 
types. 

References 

Bateman, J. A., Matthiessen, C. M. I. M., & Zeng, L. 
(1999). Multilingual natural  generation 
for multilingual software: a flmctional linguistic 
approach. Applied Artificial hdelligencc, 13(6), 
607-639. 

Bateman, J. A. & Sharoff, S. (1998). Mult, ilingual 
grammars and multilingual lexicons for nmltilin- 
gual text; generation. In Mnltilinguality in the Icz- 
icon II, ECAI'98 Workshop 13, (pp. 1-8). 

Hajie, J. 8; Hladk£, B. (1997). Probabilistic and 
rule-based tagger of an inflective  a 
comparison. In Proceedings of ANLP'97, (pp. 
111-118). 

Mmm, W. C. & Matthiessen, C. M. I. M. (1985). 
Demonstration of the Nigel text generation com- 
puter progrmn. In J. D. Benson 8: W. S. Greaves 
(Eds.), Systemic Perspectives on Discourse, Vol- 
ume 1 (pp. 50-83). Ablex. 

Reiter, E. & Dale, R. (1997). Building applied natu- 
ral  generation systems. Journal of Nat- 
ural Language Engineering, 3, 57-87. 

Tcich, E. (1995). Towards a methodology for the 
construction of multilingual resources tbr multi- 
lingual text generation. In Proceedings of the I J- 
CAI'95 workshop on multilingual generation, (pp. 
136-148). 
