SYSTEM DEMONSTRATION 
FLAUBERT: AN USER FRIENDLY SYSTEM FOR MULTILINGUAL TEXT 
GENERATION 
FREDI~RIC MEUNIER 
meunier @ linguist.jussieu.fr 
TALANAUFRLinguistique 
Case 7003-2, Place Jussieu 
75251 Paris 
France 
LAURENCE DANLOS 
danlos @ linguist.jussieu.fr 
1 Introduction . . . 
FLAUBERT is an engine for text generation. Its first applications has been for instructional texts, both in 
French and in English, in software and aeronautics domains. It is an implementation of G-TAG, a 
- formalism for generation inspired from TAG (\[Danlos & Meunier 96\], \[Meunier 97\]). This formalism is 
a lexicalized text generation system (\[Danl0s 98a\], \[Danlos 98b\]). 
All linguistic data are outside of the engine code program. They are maintained directly by linguists 
under a Simple text editor. The syntactic TAG grammar we use for French is that-written by (\[Abeill6 
91\]). Moreover, the French families of elementary trees are automatically generated thanks to the 
hierarchical representation of LTAGS (\[Candito 96\]). The TAG grammar we use for English is home 
made. 
This engine runs on Sun Solaris with 32 M• RAM (generator and interface), and is written in Ada 95 
(generator) and C (interface). It is compiled by the GNU compilers, and uses GNU Scripts (bash, perl, 
sed, awk). 
2 Description 
AS in DRAFTER (\[Paris et al. 95\]), FLAUBERT takes as input a conceptual representation provided by 
the user who fills a questionnaire through an interface that proposes cascading menus based on a domain 
model (see below). The emphasis is put on linguistic issues such as lexical choices (including choices of 
connectives), parallelism issues, stylistic issues (e.g. length and content of clauses and sentences), etc. 
FLAUBERT uses three databases: 
0 A domain model describing an ontology of concepts in a typed feature formalism. In a 
standard way, the concepts include objects, actions, states and relations between them; 
• A set of lexical data bases associated withconcepts; the lexical database for a given concept 
describes its semantico-lexical realizations (lexical heads + argument structures) accompanied 
with tests of applicability for right semantics and well formdness; 
• A TAG grammar whose syntactic informations allow a derived tree to be computed from a 
derivation tree (see the data flow below). 
3 Data flow 
The dataflow of FLAUBERT is given in Figure 1. The system is sequential: 
• compiling the input data; 
• building a lexicalized tree structure called a "g-derivation tree"; 
• building a derived tree; 
284 
! 
I 
! 
I 
I 
I 
i word re-ordering, typographic considerations, etc.). 
I ~-q~ user interface 
III 
• post-processing. 
The first step deals with concepts of the domain and their instances provided by the user. It leads to a 
conceptual representation. Afterwards, the system search in the lexical data bases to make lexical choices 
and builds a g-derivation tree. During this step, it uses also other linguistic resources (lexical entry as 
well as syntactic functions) to optimize lexical choices (parallelism, aggregation, etc.). Next the system 
builds a derived tree (syntactic representation), using standard algorithms ([Schabes & Shieber 94]) and 
an existing TAG grammar designed for syntactic analysis. Finally, the text is post-processed (flexion, 
bulding 
lexicalized tree 
structure 
~-q~ TAG grammar 
3.1 User interface 
post-processing 
Figure 1: Data flow 
Figure 2: Cascading menus 
285 
Since the user may encounter difficulties to give input to FLAUBERT, we have developed a friendly user 
interface which proposes him/her to instanciate concepts with cascading menus as it is shown in Figure 2. 
This interface is under X, and can be displayed on most X servers. It •invokes the generator• in a Xterm 
which is automatically opened. 
3.2 .Conceptual representation 
Below an example of conceptual representation for an instructional text (in software application domain): 
E3 := Ot,~ \[ 
E1 opened=> TOK4 \] 
E2 
E3\] 
EO. : = ~a.,-ax:x.tcm \[ 
goal => 
body > ~--- . 
effect => 
El :=~ \[ 
- Creator => 
• Created => 
E2:= ~tx~sszcn \[ 
ist-event => 
2DJfl-event => 
E4 :-- O~m \[ " 
opener 
opened 
E5 := CLmC~ \[. 
clicker • 
clicked 
HI 
TOKZ\] 
FA 
E5 \] 
HI :=Usm \[ \] 
TOKI := U~ZR_XD \[ \] 
name => "User ID" \] 
=> HI TOK3 := BOT~m \[ 
=> ~3K2 \] name => ,'Add..." \] 
=> HI TOK4 := ~ \[ 
=> .TOK3 \] name . =>"User name" \] 
3.3 Semantic representation 
From E0, the system computes for French the g-derivation tree shown in Figtlre 3. In this tree, each •node 
written in bold (possibly accompanied with a \[T__Feature\], e.g. \[T._R4duc\]) points • to a TAG 
lexicalized elementary tree, except newS, a special tree which adds a new sentence to a text. 
B creer 
agent- " ~bjet 
HI TOKI 
• pour 
P1 
HI 
ouvrir 
° \[T R~uit \] agent---. ,,~r 
• " obj e~ 
TOK2 avant 
\[ T_R~duc \] 
subordonn4e 
cliquer 
\[T_R~duit) 
agent- - ob3 e t 
• HI " TOK3 
news 
2nd ' 
ouvrir 
\[T_Moyen\] 
objet: 
TOK4 
Figure 3 " G-derivation tree 
286 
3.4 Syntactic representation 
From the g-derivation tree in Figure 3 and with a French TAG grammar, the derived tree schematically 
• resumed in Figure 4 is composed. 
S 
PP 
Prep. S 
pour NO V N1 C ouvrir I m°dTinf 
E cr4er 
S S 
NO V N1 S m°dl=imp ~ PP PP 
" PERIOD 
NO 
A 
V 
C 1 Vs 
I I 
se ouvrir 
Figure 4: Derived tree 
3.5 French and English Texts 
French: Pour crder un identificateur d'utilisateur, ouvrez la fen~tre "User ID" avant de cliquer sur le 
bouton "Add... ". La fen~tre "User name" s'ouvre. 
English: In order to create an user ID, open the "User ID" window. Afterwards, click on the "Add..." 
button. The "User name" window is opened. 

References 
\[Abeill6 91\] Abeillr, A. 199!. Une grammaire lexicalis~e d'arbres adjoints pour lefran~ais. Ph.D., Universit6 de 
Paris 7. 
\[Candito 96\] Candito, M.-H. 1996. A principle-based hierarchical representatio n of LTAGS. Proceedings of 
COLING'96, Copenhagen. 
\[Danlos&Meunier 96\] Danlos, L., and F. Meunier. 1996. G-TAG, un formalisme pour la grn4ration de textes : 
prrsentation et applications industrielles. Actes de ILN'96, Nantes. 
\[Danlos 98a\] Danlos, L. 1998. Linguistic way for expressing a discourse relation in a lexicalized text generation 
system. Proceedings of COLING-A CL'98, Montrral. 
\[Danlos 98b\] Danlos, L. 1998. G-TAG: A formalism for text generation inspired from TAG. In A. Abeill6 and O. 
Rambow (eds). Tree Adjoining Grammars, CSLI, Stanford. 
\[Meunier 97\] Meunier, F. 1997. Implantation duformalisme de ggnEration G-TAG. Ph.D., Universit6 de Paris 7. 
\[Paris et al. 95\] Paris, C., K. Vander Linden, M. Fischer, A. Hartley, L. Pemberton, R. Power, and D. Scott. 1995: A 
support Tool for Writing Multilingual Instructions. Proceedings oflJCAI-95, 1398-1404, Montreal. 
\[Schabes&Shieber 94\] Schabes, Y., and S. Shieber. 1994. An alternative Conception of Tree-Adjoining Derivation, 
Computational Linguistics, 20:1. 
