Generating Multilingual Documents from a Knowledge Base:
The TECHDOC Project
Dietmar Rösner
FAW Ulm
P.O. Box 2060, 89010 Ulm, Germany
roesner@faw.uni-ulm.de
Manfred Stede
(University of Toronto and) FAW Ulm
P.O. Box 2060, 89010 Ulm, Germany
stede@faw.uni-ulm.de
Abstract 
TECHDOC is an implemented system demonstrating the
feasibility of generating multilingual technical documents
on the basis of a language-independent knowledge base. Its
application domain is user and maintenance instructions,
which are produced from underlying plan structures
representing the activities, the participating objects with
their properties, relations, and so on. This paper gives a
brief outline of the system architecture and discusses some
recent developments in the project: the addition of actual
event simulation in the KB, steps towards a document
authoring tool, and a multimodal user interface.
1 Overview
1.1 Project idea 
The availability of technical documents in multiple
languages is a problem of increasing significance. Not only
do consumers demand adequate documentation in their mother
tongue; there are also legal requirements, e.g., with
respect to the upcoming European common market: the product
reliability act forces merchants to offer complete technical
documentation in the consumer's native language. The need to
provide such a massive amount of multilingual material is
likely to exceed the capacities of both human translators
and the machine translation technology currently available.
Our work in the TECHDOC project is motivated by the feeling
that this situation calls for investigating a potential
alternative: to exploit natural language generation
technology in order to help overcome the documentation
problem.
TECHDOC operates in the domain of technical manuals, which
was selected for two principal reasons. On the one hand,
they represent "real-world" texts that are actually useful:
the domain is practical instead of a "toy world". On the
other hand, the language that is used in such manuals tends
to be relatively simple; one mostly finds straightforward
instructions that have been written with the intention to
produce text that can be readily understood by a person who
is executing some maintenance activity. Moreover, as our
initial analyses in the first phase of TECHDOC had shown,
the structure of manual sections is largely uniform and
amenable to formalization.
1.2 Outline of the generation process
TECHDOC produces maintenance instructions in English, German
and French. The system is based on a KB encoding technical
domain knowledge as well as schematic text structure in
LOOM, a KL-ONE dialect [LOOM, 1991]. The macrostructure of a
manual section is captured by schemas saying that (if
appropriate) one first talks about the location of the
object to be repaired/maintained, then about possible
replacement parts/substances; next, the activities are
described, which fall into the three general categories of
checking some attribute (e.g., a fluid level), adding a
substance and replacing a part/substance. These actions are
represented as plans in the traditional AI sense, i.e. with
pre- and postconditions, and with recursive structure (steps
can be elaborated through complete refinement plans). These
representations are mapped onto a language-independent
document representation that also captures its
microstructure by means of RST relations [Mann and Thompson,
1987] with a number of specific annotations (e.g., a
proposition is to be expressed as an instruction, giving
rise to imperative mood). This document representation is
successively transformed into a sequence of sentence plans
(together with formatting instructions in a selectable
target format; SGML, LaTeX, Zmacs and, for screen output,
slightly formatted ASCII are currently supported), which are
handed over to sentence generators. For English, we use
'Penman' and its sentence planning language (SPL) as input
terms. To produce German and French text, we have
implemented a German version of Penman's grammar (NIGEL),
which is enhanced by a morphology module, and a fragment of
a French grammar in the same way.
For a more detailed description of the system architecture
see [Rösner and Stede, 1992b].
2 The Knowledge Base 
The Knowledge Base is encoded in LOOM. In addition to the
standard KL-ONE functionality (structured inheritance,
separation of terminological and assertional knowledge),
LOOM supports object-oriented and also rule-based
programming.
In addition to the 'Upper Model' of the Penman generator (a
basic ontology that reflects semantic distinctions made by
language, [Bateman, 1990]), more than 1000 concepts and
instances constitute the TECHDOC KB. They encode the
technical knowledge as well as the plan structures that
serve as input to the generation process. The domains
currently modeled are end consumer activities in car
maintenance and some technical procedures from an aircraft
maintenance manual.
One of the central aims in the design philosophy of the
TECHDOC knowledge base is the separation of
domain-independent technical knowledge and specific concepts
pertaining to the particular domain: the portability of
general technical knowledge has been a concern from the
beginning. For instance, knowledge about various types of
tanks (with or without imprinted scales, dipsticks, drain
bolts) is encoded on an abstract level in the inheritance
network (the 'middle model'), and the particular tanks found
in the engine domain are attached at the lower end.
Similarly, we have an abstract model of connections (plugs,
bolts, etc.), their properties, and the actions pertaining
to them (plug-in connections can be merely connected or
disconnected, screw connections can be tightly or loosely
connected, or disconnected). Objects with the functionality
of connections (e.g., spark plugs) appear at the bottom of
the hierarchy. Thus, when the system is transferred to a
different technical domain, as experienced recently when we
moved to aircraft manuals, large parts of the abstract
representation levels are re-usable.
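
The layering described above can be illustrated with a small
sketch. It is not the TECHDOC knowledge base, which is written
in LOOM; it only mirrors the idea in Python. All class and
method names are hypothetical, and treating a spark plug as a
screw connection is an assumption made for the example.

```python
# Hypothetical names throughout; the real KB is a LOOM concept network.

class Connection:
    """Abstract 'middle model' concept: anything that joins two parts."""
    states = ("connected", "disconnected")

    def __init__(self, name):
        self.name = name
        self.state = self.states[0]

    def set_state(self, new_state):
        # Only the states licensed by the concept are accepted.
        if new_state not in self.states:
            raise ValueError(f"{self.name}: {new_state!r} is not a valid state")
        self.state = new_state


class PlugInConnection(Connection):
    """Plug-in connections are merely connected or disconnected."""


class ScrewConnection(Connection):
    """Screw connections distinguish tight from loose."""
    states = ("tightly-connected", "loosely-connected", "disconnected")


class SparkPlug(ScrewConnection):
    """Domain-specific object attached at the bottom of the hierarchy
    (assumed here to be a screw connection)."""
```

Moving to another domain would then mean adding new leaf
classes while the abstract connection classes stay unchanged.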
3 Document Representation Using RST
The first task undertaken in TECHDOC was a thorough analysis
of a corpus of pages from multilingual manuals in terms of
content as well as structure of the sections. A text
representation level was sought that captured the
commonalities of the corresponding sections of the German,
English and French texts, i.e. that was not tailored towards
one of the specific languages (for a discussion of
representation levels in multilingual generation, see [Grote
et al., 1993]). Rhetorical Structure Theory (RST) turned out
to be a useful formalism: for almost every section we
investigated, the RST trees for the different language
versions were identical.
Our work with RST gave rise to a number of new discourse
relations that we found useful in analyzing our texts. Also,
we discovered several general problems with the theory,
regarding the status of minimal units for the analysis and
the requirement that the text representation be a tree
structure all the time (instead of a general graph). These
and other experiences with RST are reported in [Rösner and
Stede, 1992a].
4 Recent Developments 
4.1 Event simulation in the knowledge base
We developed a detailed representation of knowledge about
actions. Together with an action concept, preconditions and
postconditions can be defined in a declarative way. The
preconditions can be checked against the current state of
the knowledge base (via LOOM's ASK queries). If the
preconditions hold, the action can be performed and the
postconditions are communicated to the knowledge base (with
the TELL facility of LOOM). This typically leads to
re-classification of certain technical objects involved.
With the help of LOOM's production rule mechanism,
additional actions either in the knowledge base or on an
output medium (e.g., for visualization) can be triggered. In
this mode, instruction generation is a by-product of
simulating the actions that the instructions pertain to.
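
The check-then-assert cycle described above can be sketched
as follows. This is an illustration in Python, not LOOM code:
the knowledge base is reduced to a set of fact strings, the
subset test stands in for an ASK query and the set updates for
TELL. The Action fields, the fact names and the rule that an
action with unmet preconditions is simply skipped are all
assumptions of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset  # facts that must hold beforehand
    add: frozenset            # postconditions asserted afterwards
    delete: frozenset         # facts retracted afterwards
    instruction: str          # plain text standing in for a sentence plan


def simulate(actions, state):
    """Run the actions in order and collect instructions as a by-product."""
    state = set(state)
    instructions = []
    for action in actions:
        if action.preconditions <= state:   # stands in for an ASK query
            state -= action.delete          # stands in for TELL
            state |= action.add
            instructions.append(action.instruction)
    return state, instructions


# Hypothetical actions for the oil-check example.
SWITCH_OFF = Action(
    "switch-off-engine",
    frozenset({"engine-running"}),
    frozenset({"engine-off"}),
    frozenset({"engine-running"}),
    "Switch the engine off.",
)
REMOVE_DIPSTICK = Action(
    "remove-dipstick",
    frozenset({"engine-off", "dipstick-inserted"}),
    frozenset({"dipstick-removed"}),
    frozenset({"dipstick-inserted"}),
    "Remove the dipstick.",
)
```

If the initial state already contains "engine-off", the first
instruction is not produced, which is the sense in which the
state of the device drives the generated text.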
Being able to take the current state of a technical device
into account, as in this simulation mode, is a prerequisite
for upcoming interactive applications of instruction
generation: devices equipped with adequate sensory
instruments produce raw data that can be fed directly into
the knowledge base. Thereby, the specific situation of the
device, e.g., the car, drives the instruction generation
process, so that only the truly relevant information is
given to the user.
4.2 Towards a document authoring tool
A first version of an authoring tool has been designed,
implemented and tested with a number of users. The authoring
tool allows users to interactively build up knowledge base
instances of maintenance plans, including the actions and
objects involved, and to convert them immediately into
documents in the selected languages. At any time, the tool
takes the current state of the knowledge base into account:
all menus offering selections dynamically construct their
selection lists, so that only options of applicable types
are offered.
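
A minimal sketch of such a dynamically built selection list,
assuming a toy type hierarchy: the concept names, the PARENT
table and both function names are hypothetical, and the real
tool queries the LOOM knowledge base instead.

```python
# Hypothetical miniature type hierarchy: concept -> parent concept.
PARENT = {
    "tank": "object",
    "oil-tank": "tank",
    "coolant-tank": "tank",
    "connection": "object",
    "spark-plug": "connection",
}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or inherits from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


def menu_options(instances, required_type):
    """Build the selection list from the instances that exist right now.

    `instances` maps instance names to their concept.
    """
    return sorted(
        name for name, concept in instances.items()
        if is_a(concept, required_type)
    )
```

Because the list is recomputed on every call, an instance
created a moment ago appears in the next menu without any
further bookkeeping.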
4.3 From text generation to a multimodal information system
The generated texts are now displayed with words, groups and
phrases and whole sentences being mouse-sensitive and, when
selected, offering menus with applicable queries to be
directed to the underlying knowledge base instances. This
allows for a number of tasks to be performed on the
generated surface texts, for example:
• pronouns can be asked about their antecedent referent,
• linguistic items in the output for one language can be
asked about their corresponding items in the other
languages' output,
• objects can be asked about their location, answered by a
suitable graphic illustration,
• actions can be asked for more detailed instructions on how
to perform them, answered by a short video sequence.
In essence, these facilities have paved the way to move from
static, inactive strings as output to an active and dynamic
interface for the associated knowledge sources and their
various presentation modalities. The key is that all
information types (lexemes in various languages, images and
the objects' locations therein, and video sequences) are
associated with the underlying KB instances, which are in
turn linked to their referents in the mouse-sensitive output
text. Figure 1 shows a sample screen, where the user has
just asked for additional "location" information about the
dipstick, by clicking on the word in one of the text output
windows.
[Screenshot of the TECHDOC multilingual text generator: an
SPL sentence plan and the German, English and French output
windows for checking the engine oil level.]
Figure 1: Trilingual output and interactive graphic support
4 RHETORICAL STRUCTURE EXTRACTION
The rhetorical structure represents logical relations
between sentences or blocks of sentences of each section of
the document. A rhetorical structure analysis determines
logical relations between sentences based on linguistic
clues, such as connectives, anaphoric expressions, and
idiomatic expressions in the input text, and then recognizes
an argumentative chunk of sentences.
Rhetorical structure extraction consists of six 
major sub-processes: 
(1) Sentence analysis accomplishes morphological 
and syntactic analysis for each sentence. 
(2) Rhetorical relation extraction detects rhetorical
relations and constructs the sequence of sentence
identifiers and relations.
(3) Segmentation detects rhetorical expressions between
distant sentences which define rhetorical structure. They
are added onto the sequence produced in step 2, and form
restrictions for generating structures in step 4. For
example, expressions like "...3 reasons. First, ... Second,
... Third, ...", and "... Of course, ... But, ..." are
extracted and the structural constraint is added onto the
sequence so as to form a chunk between the expressions.
(4) Candidate generation generates all possible rhetorical
structures described by binary trees which do not violate
segmentation restrictions.
(5) Preference judgement selects the structure candidate
with the lowest penalty score, a value determined based on
preference rules on every two neighboring relations in the
candidate. A preference rule used in this process represents
a heuristic local preference on consecutive rhetorical
relations between sentences. Consider the sequence
[P <EG> Q <SR> R], where P, Q, R are arbitrary (blocks of)
sentences. The premise of R is obviously not only Q but both
P and Q. Since the discussion in P and Q is considered to
close locally, structure [[P <EG> Q] <SR> R] is preferable
to [P <EG> [Q <SR> R]]. Penalty scores are imposed on the
structure candidates violating the preference rules. For
example, for the text in Fig. 1, the structure candidates
which contain the substructure
[3 <EG> [[4 <EX> 5] <SR> 6]], which says sentence six is the
entailment of sentence four and five only, are penalized.
The authors have investigated all pairs of rhetorical
relations and derived those preference rules.
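
Steps (4) and (5) can be sketched as follows, using the
[P <EG> Q <SR> R] example. This is a simplified illustration
and not the authors' implementation: segmentation
restrictions are left out, the rule table holds only the one
rule stated in the text, and each violation is assumed to cost
one penalty point. All function names are hypothetical.

```python
def candidates(units, relations):
    """Yield every binary tree over the sequence.

    Leaves are unit labels; inner nodes are (left, relation, right).
    relations[k] holds between units[k] and units[k + 1].
    """
    if len(units) == 1:
        yield units[0]
        return
    for k in range(len(relations)):
        for left in candidates(units[:k + 1], relations[:k]):
            for right in candidates(units[k + 1:], relations[k + 1:]):
                yield (left, relations[k], right)


def relation_depths(tree, depth=0):
    """Depth of each relation node, listed in left-to-right order."""
    if not isinstance(tree, tuple):
        return []
    left, _, right = tree
    return (relation_depths(left, depth + 1) + [depth]
            + relation_depths(right, depth + 1))


# Assumed rule table: for a neighboring pair (left relation, right
# relation), which of the two should dominate the other in the tree.
PREFER_ROOT = {("EG", "SR"): "right"}


def penalty(tree, relations):
    """One point (an assumed weight) per violated preference rule."""
    depths = relation_depths(tree)
    score = 0
    for i in range(len(relations) - 1):
        wanted = PREFER_ROOT.get((relations[i], relations[i + 1]))
        if wanted is None:
            continue
        actual = "left" if depths[i] < depths[i + 1] else "right"
        if actual != wanted:
            score += 1
    return score


def best_structure(units, relations):
    """Return a candidate with the lowest penalty score."""
    return min(candidates(units, relations),
               key=lambda tree: penalty(tree, relations))
```

For the units P, Q, R with relations EG and SR, the sketch
generates both bracketings and selects [[P <EG> Q] <SR> R].
The number of candidates grows quickly with the length of the
sequence, which is one reason the segmentation restrictions
of step (3) matter in practice.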
The system analyzes inter-paragraph structures after the
analysis of intra-paragraph structures. While the system
uses the rhetorical relations of the first sentence of each
paragraph for this analysis, it executes the same steps as
it does for the intra-paragraph analysis.
5 ABSTRACT GENERATION 
The system generates the abstract of each section of 
the document by examining its rhetorical structure. 
The process consists of the following 2 stages.
(1) Sentence evaluation
(2) Structure reduction 
In the sentence evaluation stage, the system calculates the
importance of each sentence in the original text based on
the relative importance of rhetorical relations. They are
categorized into three types as shown in Table 2. For the
relations categorized into RightNucleus, the right node is
more important, from the point of view of abstract
generation, than the left node. In the case of the
LeftNucleus relations, the situation is vice versa. And both
nodes of the BothNucleus relations are equivalent in their
importance. For example, since the right node of the serial
relation (e.g., yotte (thus)) is the conclusion of the left
node, the relation is categorized into RightNucleus, and the
right node is more important than the left node.
The actual sentence evaluation is carried out in a demerit
marking way. In order to determine important text segments,
the system imposes penalties on both nodes for each
rhetorical relation according to its relative importance.
The system imposes a penalty on the left node for the
RightNucleus relation, and also on the right node for the
LeftNucleus relation. It adds penalties from the root node
to the terminal nodes in turn, to calculate the penalties of
all nodes.
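
The top-down accumulation of penalties can be sketched as
below. The text only states that the serial relation is
RightNucleus; placing EG under LeftNucleus and EX under
BothNucleus, the penalty of one point per demotion, and the
names NUCLEUS and leaf_penalties are assumptions made for
illustration.

```python
# Relation -> which side is the nucleus. Only "SR" (serial) is taken
# from the text; the other two entries are assumptions.
NUCLEUS = {"SR": "right", "EG": "left", "EX": "both"}


def leaf_penalties(tree, inherited=0, out=None):
    """Push penalties from the root down to the terminal nodes.

    Inner nodes are (left, relation, right); leaves are sentence ids.
    Returns a dict mapping each sentence id to its total penalty.
    """
    if out is None:
        out = {}
    if not isinstance(tree, tuple):
        out[tree] = inherited
        return out
    left, relation, right = tree
    kind = NUCLEUS[relation]
    # The less important side receives one extra point (assumed weight).
    leaf_penalties(left, inherited + (1 if kind == "right" else 0), out)
    leaf_penalties(right, inherited + (1 if kind == "left" else 0), out)
    return out
```

For the tree [[1 <EG> 2] <SR> 3] this gives sentence 3 a
penalty of 0, sentence 1 a penalty of 1 and sentence 2 a
penalty of 2, so sentence 2 would be the first candidate for
removal.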
Then, in the structure reduction stage, the system
recursively cuts out, starting from the terminal nodes, the
nodes on which the highest penalty is imposed. The list of
terminal nodes of the final structure becomes an abstract
for the original document. Suppose that the abstract is
longer than the expected length. In
