The General Architecture of Generation in ACORD* 
Dieter Kohl 
Universit£t Stuttgart 
Keplerstrafle 17 
D-7000 Stuttgart 1 (West Germany) 
Agnes Plainfoss6 
Laboratoires de Marcoussis 
Route de Nozay 
91460 Marcoussis (France) 
Claire Gardent 
University of Edinburgh 
2 Buccleuch Place 
Edinburgh EH8 9LW (Scotland) 
Abstract 
This paper describes the general architecture of genera- 
tion in the ACORD project. The central module of this ar- 
chitecture is a planning component, which allows to plan 
single sentences as an answer to a KB query. The planner 
works for three different languages (English, French and 
German) and for sentence generators based on two dif- 
ferent grammar formalisms (ucG for English and French, 
Lr'G for German) independent of the particular grammar 
or grammar formalism. It uses several knowledge sources 
of the ACORD system to make its decisions. The output 
of the planner is used for the language specific genera- 
tors as well as for the update of information needed for 
pronoun resolution. 
1 Introduction 
'I~he ACOI~D prototype offers an interactive update and 
query of a knowledge-base (Kn). In the query situation 
the user asks the KB using natural language and possi- 
bly graphical pointing. The final response composed of 
natural language and also if appropriate, graphical high- 
lighting, is generated with a language specific generator, 
in the three ACORD languages (English, French and Ger- 
man) using the same grammar formalisms and grammars 
employed in parsing: ucc for English and French, and 
LFG for German. The generators are fully described in 
\[9\] for the UCG framework, and in \[3\] and \[5\] for the LFG 
framework. 
The objective of this paper is to describe the modules 
common to the three languages, which build the seman- 
tics of the answer to be generated using the semantics of 
the question posed to the system, the dialogue history, 
and the KB answer. 
2 The Semantic Representation 
)dost components in the ACORD system share a seman- 
tic representation language called \[nL (Indexed Language 
(see \[8\])). InL is based upon Ramp's Discourse Repre- 
sentation Theory (see \[1\] and \[2\]). The generators work 
on a derived representation called SynInL, which was 
designed during the project. 
2.1 Resolution within InL 
The parsers produce information which allows a central 
component, the resolver, to determine the possibilities of 
coreference between anaphoric expressions and their an- 
tecedents (see \[7\]). This additional information is incor- 
porated into an InL expression in the form of occurrence 
information or lists, stating for every element which may 
be coreferential with some other element properties rele- 
vant for determining coreference. We refer to InL expres- 
sions which incorporate such information as unresolved 
InLs and to lnL expressions where this information has 
been used to determine coreference (and thereafter re- 
moved) as resolved InLs. 
*The work reported here has been carried out as part of 
the ESPRIT project P393 ACORD on "The Construction and In- 
terrogation of Knowledge-Bases using Natural Language Text 
and Graphics". 
2.2 The problems encountered using InL ibr 
generation 
Planning and generation operate on a different but 
derivated semantics formalism called SynInL. Several 
reasons brought us to design and use SynInL as opposed 
to InL: 
First, to work efficiently the ucG generators require that 
their input be canonical with respect to the respective 
grammars. Canonicity means that only those InL formu- 
las are treated, which could be produced by the parser, 
but not all formulas, which are logically equivalent 1. In 
the context of InL, the notion of canonicity cannot be for- 
malized outside the grammar definition. We then needed 
a semantic language where canonicity could always be 
provided, even though an expression was produced with- 
out any grammar dependent information. 
Second, the generator needs NP planning to control the 
generation of referring expressions (see \[6\]). In order to 
specify information about the type of NP to be gener- 
ated, a representation is required which allows the en- 
coding of syntactic information in addition to semantic 
information. Furthermore, the individual bits of seman- 
tics must be related to the syntactic structure. More gen- 
erally speaking, we need a mechanism for modifying or 
structuring the semantic representation to be generated 
prior to generation. Standard InL, being a purely seman- 
tic representation language, is inadequate for encoding 
this syntactic information. 
Third, and most importantly, all of this has to be 
achieved in a language-, grammar- and formalism- 
independent way 2. 
3 Designing SynInL 
3.1 State of the art 
There is a main difficulty in the concept of planning- 
based generation systems which explains the monolithic 
nature of many systems described in the relevant lit- 
erature. If a planner plans a particular type of syntac- 
tic structure in the absence of grammatical intbrmation, 
there is no guarantee that the structure specified will ac- 
tually be accepted by the grammar as being well-formed. 
There are basicMly two solutions to this problem. One is 
to simply assume that the planner only specifies struc- 
tures from which it will be always possible to gener- 
ate. This works perfectly when there are no interac- 
tions between structures specified locally. An example 
of a grammar formalism with this "locality" property 
is the context free languages. However, for most mod- 
ern approaches to grammar (including Government and 
Binding theory (GB) and all unification-based grammar 
formalisms), the locality property does not hold. In this 
case, we have to assume that the grammar is "loose 
enough" that anything we might plan can in fact be gen- 
erated despite any interactions. Such a planning could 
1To determine whether two syntactically distinct InL ex- 
pressions are logically equivalent under laws such as commu- 
tativity and associativity is factorial in complexity. 
2Language independence must be viewed as language in- 
dependence regarding French, English and German. 
388 
be done deterministically. Itowever, using ~his al,proach 
such a planner would always run the risk that it would 
fail to geuerMe due to inconsistencies with the grammar. 
'\]'he second solution is to interle~rve planning and gener 
ation and allow the possibility that failure to generMe, 
results in different planning choices. Snch systems also 
exist, although they seem to be comparatively recent 
in the literature. We (lid not investigate this possibility 
since it requires a fairly tight integration of planner and 
(grammar and formalism specific) generator which scems 
inconsistent with our requirement th~tt we generate with 
three languages and two grammar formalisms. 
3.2 Description of our aI)I)roach 
Our solution is to attempt an independent level of syn- 
tactic representation which abstracts away from the pe- 
culiarities of the surface syntactic structures of particular 
languages and deals directly with syntactic notions which 
are language~indcpendcnt. Whether one thinks that this 
is possible, depends to a large degree on one's particular 
theoretical perspective. 
What might sucl, an "abstract" syntactic representa- 
tion look like'." There are several related concepls in di- 
terse linguistic theories which salisfy the criteria. The 
most directly related concept is lhat of l)-struelure in 
(~1~. l)-strncture is a lew~l of syntactic struct.ure which 
mirrors semantic funcl.or-argun~e~lt strnctnre directly 
(via the 0.-eril.erion and lhe l)rojeclion Principle) and 
which is also relaled io surface syntactic structure 1)y 
the rule of M,,,,e-a, a lransformation that nloves COll- 
!:liluenls fronl oue position Io anolher, l~elaled IlOiiOlls 
of sirltctnl'o which Captllre the relation belwc(,n Selllal/- 
l ic funtl.or-argllillellt slrncture (or predicate-arglmlent 
sirueture) and "abstract" or "deep" syntactic sirnclnre 
are tile f-s|rllcl/Ircs of LI.'C, and the. grammalical funclion 
hierarchy-based accounls of subcategorisalion in ltPSG 
and t'CG. All of lhese have the desiraMe pfoperiy that 
i.}ley express a level of representation which relate sub- 
eategorisatJ(qt, s(:ilianties and snrfac(! slruelllr('.. 
Ply using such represenlations whic'h arc hypothesized lo 
be 'qinguislically .niversat" t()ass()ciate parlial seman- 
!ic representations wilh abstract syntaclic constituents, 
we also solve t}|e ol, ller requirements mentioned above. 
\[:'irst, most instances of noncanonicity are elimina.ted be- 
<ause sul)-formulas are associated direetly with syntactic 
constiiuents..Second, quantifier scope readings are elim- 
inated fi'om consMeration at this level of representation. 
'Fhird, since the level of representation is taken to be 
ltniversal, the,'e are language-dependent maps from the 
represerttation to surface syntactic structure. 
3.3 SyIllnI, description 
'\['lie al)|)l'Oac\]l Lakell \]lcre is to encode synla<:tic strnc- 
| ure in ierm.,; of sc:hematie X theory familiar fl'om mosl 
luodern generative gra.lnlllar fOI'H|&\]iSIlIS. As mentioned 
above, this is most similar to D-structure i, cm t\]mory. 
~:;ynlnL expresses both syntactic and semantic inibrma- 
I ion. 
Idealizing considerably, SynlnL formulas consist of four 
types: heads, complements, modifiers and specifiers. This 
corresponds directly to the stamtard constituent types in 
theory. (We follow LI.'¢; f-structure and UCG subcate- 
gorisation structnre in treMing subjects as ordinary com- 
I,lements raLher l\],an spe(:ifiers of clauses). These four 
(alegories are Meal for attaining a level of language- 
i,del)cndence in liiIgnistie description and are general 
(:,tough lhat it is reasonable to expect that such X repre-- 
s(mtations cant be mapped onto lallgl;age-depcn(lent sux'- 
face syllla(:Iic slrllCl.llres. 
The idea then is Io encode this )( struct,lre in Synlttl, 
formulas. SpeciJiers in Synlnl, are of tile general Ibrm: 
specifier (Semantics, tfead) 
\]'hat is, they specify their own semantics and the prop- 
erties of their head. 
Heads are of the general form: 
head(Semantics, /trgList, hdjunc'tList) 
That is, they specify their own head semantics and a lisI 
of arguments and adjuncts which are also either specifier 
or head structures. 
All of these struclures also allow the encoding of syntac- 
tic requirements on arguments and adjnncts. IIowever, 
there is no indication of either surface syntactic order of 
the complements and adjuncts or of the relative scope 
of quantitiers occurring in either complements or mod- 
ifiers. Tile language generators are free to realize both 
scope and surface syntactic structure in any way which 
is consistent with the SynlnL specification. 
ttow is this augmented representation built ? The parsers 
produce nnresolved lnL. This InL contains enm~gh syn- 
tactic infm'mation for a uniqne mapping into an equiv- 
alent SynlnL expression. This mapping is done by the 
InL -~ SynlnL module. 
C, iven av Inl, expression, it distinguishes between struc- 
lural and prime predicales. For prime predicates there is 
ahva.ys a real)ping into a SynlnI, formula with a unique 
category. The structural predicates then determine how 
to merge the Synlnl, formnlas which replace the origil!:d 
parlial InL expfession. 
4 The Phmning Component 
'Fhe role of the planning component is to produce SynInL 
expressions from which phrases can be generated by lhe 
langnage specilic generators and lo decide whether any 
objects on the screen have to be highlighled. 
\Vithin ACOIlD, the planner gets as input lhe Synlnb ex- 
pression corresponding to the user question (yes/no rifles - 
tion, wh-queslion or 'how mnch'/'how many'-question) 
and the KB answer, q'he planner output consists of an 
optional canned texl marker and the Synl. L of the an- 
swer Io be generated. 
The planner uses three snb-planners i'or planning verb 
phrases, NPs and modificalions. 
4.1 Architecture of the generator 
The answer process consists of the following steps: 
e The question is parsed. The output is the InL rep- 
resenta.tion of the question with informalion for 
resol, tion. 
* This InL expressiof is transformed into SynlnL by 
the Ill l, --~ SynlnL module a.nd also resoh, cd using 
the occurrence inh)rmation by the resolver. The 
resolver provides the generator with information 
which encodes the user's qnestion as vnderstood 
by Ihe system. 
® The resolved lnL is passed on to tile KB which 
provides the KB answer. 
, The planner module takes as input the SynlnL ex- 
pression of the query and the KB answer. Depend- 
ing on the. type of questions asked, the planner 
makes decisions such as: what kind of canned text 
prefix is needed, what type of NP planning is nec- 
essary, what ldnd of answer is expected and what 
type of processing ca.n be done on lids answer. It 
calls the NP sub-planner in order to process all the 
NPs appearing i~ the queslion, as well as the \[~I~ 
answer which is trans\[brmed into an appropriate 
Syn\[n L cx pression (generally an N 1'). 'l'he ou \[pHt 
389 
of the planner is a SynlnL representation of the 
answer. 
• The SynInL answer is the input to the language 
specific generator of the current language. The se- 
lected generator produces the final answer. 
4,2 Planning the Structure of Verb Phrases 
Within the ACOIID lexicon, verbal predicates may only 
take arguments which refer to objects. This means that 
we do not do any planning for arguments which denote 
events or states, i.e., verbal or sentential complements. 
Consequently we only distinguish between two types of 
predicates: the copula, which only takes a subject and a 
noun phrase or PPS as complement, and all other verbs. 
Other active verb forms take either one, two, or three 
arguments. The first argument always corresponds to the 
subject (in an active clause), the second to the object or 
a prepositional complement, and the third to the second 
object or a prepositional complement. 
Given ~ list of arguments, the verb planner calls the NP 
planner on each argument, providing information relative 
to the function of the argument position under scrutiny, 
its posilion in the argument list, and the subject of the 
sentence in which the argument occurs. 
'\]?he list of modifications of the original query (if any) is 
processed last. For each element of this list a call to the 
modification sub-planner is made. 
4.a Planning Noun Phrases 
The planning component is responsible for providing 
the best expression for Nes. It nses the dialogue history 
as well as I,:B knowledge to decide whether to adopt a 
pronominalization strategy, or find a non-pronominal de- 
scription for the NP under analysis. 
The NP planner must be provided with enough informa- 
tion to decide whether and which kind of pronominal- 
ization is allowed, and whether a name coukl be used 
instead of a pronoun where such an option is available. 
It mnst also decide when to use demonstratives, definite 
or indefinite articles, and whether a complex description 
shonh\[ include relative clauses and adjuncts. In addition, 
our planner has to decide which objects should be high- 
lighted on the screen. 
'l~o do so, the NP planner needs a fully specified discourse 
referent and information about the syntactic environ- 
ment of the NP to be produced. 
The output of the NP planner is a fully specitied SynInL 
expression, a possible extension of the list of objects to 
highlight on the screen, a possible extension of the list of 
local antecedents, and a possible change of the informa- 
tion corresponding to the answer in the event that the 
NP planner has produced the NP for the answer. 
4.4 Planning modifications 
Modiftcations appear either in the context of a verb or 
in the context of an NP. They express negation, Pps, rel- 
ative clauses, adjectives and adverbs. The modification 
planner is currently handling relatives and PPS. 
In the case of a relative clause, the identifier of the object 
of the verb is set to the NP discourse referent, and the 
verb planner is called. 
In case of a Pp with exactly one argument, if this argu- 
ment is in the focus of a wh-question, the I,:B answer has 
to give both the internM name and the new argument 
of the preposition. If the answer is no, the planner fails, 
since we currently don't have a semantic definition for 
the various Pp negations like 'nowhere' or 'never'. The 
overall result is then the canned text I don't know. Oth- 
erwise there is in generM a list of adjunct-argument pairs. 
For each pair a Y';ynInl, expression for the preposition is 
generated, calling the planner recursively on the argu- 
ment (pronominalization is not allowed in the context of 
a PP). If there is more than one pair in the list, a pP co- 
ordination is initialized and reduced as will be explained 
below. 
Coordinated PPS are allowed to appear in a.nswers. A list 
of SyninL expressions for l'ps can be reduced, if the same 
preposition is used more than once, and the prepositional 
arguments are not demonstrative pronouns. The result- 
ing ,CjynfnL expression contains the common preposition, 
and art NP coordination corresponding to the arguments 
of the tbrmer SynInf, expressions. The NP coordination 
then can also be reduced as described in \[4\]. 
5 Conclusion 
Generation in ACORD demonstrates how planning can be 
done for several languages with a minimum of language- 
specific information. The basis of our approach is the 
concept of SynInL which encodes language-independent 
syntactic information in addition to semantic informa- 
tion. A SynlnL expression can be deriwtted from an InL 
expression using a deterministic process. 
Language-specific dependencies are still necessary con- 
cerning gender and the syntactic function of NPs. q'hey 
could be reduced further by adopting a slightly different 
architecture concerning the interelation of the generator 
and the resolver. 

References 

\[1\] Kamp, H. \[1981\] A Thcorg o.f Truth and Semantic 
Repcesc~tation, In: Groenendijk, J.A. ct. al. reds.), 
Formal Semantics in the Studg of :Vctluc~d Language, 
Vol. I, Amsterdam \[98t. 

\[2\] Kamp, H. and Reyle, U. \[1990\] From Discourse 
to Logic. Reidel Dordrecht, to appear. 

\[3\] Kohl, D. \[1988\] Gcne,'ierung .fu~ktionalcr Strnk- 
turcn aus einer .~emct~tischc~ \[~cprd'sentatio~. I)\[- 
plomarbeit, Institut fiir \[nformatik. Universitiit 
Stuttgart. 

\[4\] Kohl D., Plainfossd A,, Reape M., Gardent 
C. \[1989\] Text Generation from ._qcma~tie t~cprc- 
sentatiort. Acord deliverable T2.1 (I 

\[5\] Momma, S. and Dbrre, J. \[1987\] Generation 
from f-structures. In: E. Klein and J. van Benthem 
reds.) Categories, Polgrnorphism and Unification, 
(?entre for Cognitive Science, University of Edin- 
burgh. 

\[6\] Reape, M. and Zeevat, H. \[1988\] Generation 
and Anaphora Resolution. Manuscript. Centre for 
Cognitive Science, University of I~dinburgh. In: hr- 
stitnt ffir Maschinelle Sprachverarbeitung reds.) Ez- 
tensioT~ of the At~aphora P, esohdion. ACORN (P393) 
Report 'I'l.7'b, Universit/it Stuttgart, March, 1989. 

\[7\] Zeevat, It. red) \[1988\] Specification of the Cen- 
tral Pronoun I~esolver, .a.CORD Deliverable T1.7'(a). 
Stuttgart 1988. 

\[8\] Zeevat, H. \[1986\] A ,S'pccification 4 hal,. Internal 
ACORD Report. Centre for Cognitive Science, Edin- 
burgh 1986. 

\[9\] Zeevat It., Klein, E. and CaMer, a. \[1987\] An 
Introduction to Unification Categorial Grammar. 
In: Haddock, N..J., Klein, E. and Morril, G. reds.) 
Edinbm'qh Wor~qn9 Papers in Cognitive 5'ciel~cc, 
Vol, l: Categorird Grammar, Unificcttion Grammar 
and Pc~rsin 9. 
