GENERATION FROM UNDER- AND 
OVERSPECIFIED STRUCTURES* 
DIETER KOHL 
Universitlit Stuttgart 
Institut fiir ma.~chiltelle Sl)rachverarbeitung 
Computerlinguistik 
Azenbcrgstrmge 12 
D-7000 Stuttgart 1 
Germany 
EMAIL: dieter t~adler.i)hilosol)hie.uni-st ut tgar t.d(, 
Abstract 
This paper describes informally an algorithm for the gen- 
eration frolll un(|er- all(| overspceified feature structures. 
The generator require~ a grammar, :t goal category m~et a 
feature structure mq input, and derives all strings whose 
corl'eSl)ondillg feature strlltCtllre is llot ill Colltrlu|iction to 
the input structure. 
1 Introduction 
In this paper I will present all algorithut for genera- 
tion fronl under- alld overspccitied feattlrc struetltres 
in the Lr'(; fi'amework 1. Tile algorithm makes use of 
the concept of generation as slructuT~-driven deriva- 
tzon as it is described ill 114, 15, 16\]. Most of tile time 
the algorithm works top-down breadth-first, similar 
to the gcncrator described ill \[7\] and \[61. Only for thc 
creation of the final structure tile algorithm works 
bottonl-Ill), 
2 Motivation 
The algorithm given ill \[14\] allows to generate fi'om a 
fltlly specified feature structure, e.g. tile input struc- 
ture is equal to a structure that would be derived 
during parsing. For ai)plications otlter than testing a 
granllnar for overgeneration the equality-condition is 
too restrictive. 
The algorithm given in \[15\] and \[16\] then Mlows to 
generate frolu all uuderspceified structure, if there 
is a fully specified (semantic) predicatc-argontent- 
structure which is nnt ~dlowed to be extended dur- 
ing generation, e.g. tile l)redicate-argunlent structure 
must be conqllete and coherent with respect to the 
target grammar, One of the disadvantages of this al- 
gorithm is, that it must be marked for tile genera- 
tor, which substructure is not allowed to be changed 
during generation. Further, in certain applications, 
the condition that there is a partiM feature structure 
which is complete and coherent with respect to the 
target grammar might be ,also too restrictive. 
The generator described in this paper had been de- 
ycleped for projects whielt are involved in machine 
translation. While one of the projects makes use only 
of syntactic information encoded in a feature struc- 
ture the other in'eject uses semantic information ~s 
well. In I)oth cases the inI)ut feature structure for tile 
generator is at least undersl)eeified with respect to 
*The work reported here is part of the Sonder- 
forschungsbereich 340 Sp~chtheo,'etische G~ltndlagen der ('omputerlingu£~tik 
l For details of the LFe, formalism see (1 b 
the target grammar, not only for al;omic attribute 
value pail's but also fro' complex pairs. This means 
tile gencrator has to introduce information into the 
given feature structure to get a structure which is 
valid with l-espect to tile target grtunmm~r. 
In both projects a similar architecture is used: 2 
1. parse a sentellCe and return the feature structure 
Fp 
2. extrat:t tile inforlnation for the translation from 
Fp and build F,j 
3. generate fronl F 9 a sentence 
In such an architecture the creation of Fg is usually 
independent of the target grammar, in the sense that 
the creation is not automatically coutroUed by tile 
target gralnular. 
In machiuc traaslation the grammars used for parsing 
and for generation are basically spccilic for tile two 
single languages one wants to translate between. It is 
usually desirable to sl)eeify F~ only in ,~s rudimentary 
and ms general lnauller ;L~; possible. This lueans tile de- 
tails of how to generate a wdid surface string of tim 
target language are only known in the target gram- 
mar, rather than spelled out ill th" translation rela- 
tion. Ill other words, a single grammar G describes 
only the relation of a surface string of a language L 
and a feature structure valid for tile grammar G of L. 
~trther, a valid feature structure for G will represent 
only information necessary for L, but not neeessarily 
information necessary for the lauguage to translate 
into. For example, a gramlaar fro' German will de- 
scribe a fl~atttre structure which h,'us information for 
the tenses past, present, and future, but no informa- 
tion about progressive ms it is required for English. 
Therefore, ill tile translation German to English the 
generator has to generate froln a feature structure 
which might be underspecified with respect to tense 
information, while ill the translation Englislt to Ger- 
man the generator has to generate from a feature 
structure which might be overspecified with respect 
to tense information. 
ht general, in describing the translation relation be- 
tween two languages one lta.s to face tile probleuts of 
interfaces: 
• Infornmtion is missing and must be derived froin 
tim target gralnmar, e.g. tile input structure is 
uuder,~pecified. 
2For the re~ons of this architecture see for example \[4\]. 
There are also other MT projects like GRADE (see \[9\], \[10\] 
and \[8\]) which nl~tke use of a similar architecture. 
ACRES DE COLING-92. NANTES. 23-28 AOt3T 1992 6 8 6 Prec. oF COLING-92. NANTES, AUG. 23-28, 1992 
• There is more information than defined by tile 
target grammar, e.g. there is no string of the tar- 
get language for which the grammar describes 
a feature structure which contains all attribute- 
vahle pairs given ill the iuput structure FS 9. The 
input structure is overspccifled and the overspce- 
if)cation could be ignored duriug geueration. 
• There is informatiou which is incousisteut with 
the target grammar, e.g. the input structure is 
illforrned with respect to the target gramnlar. 
This requires some error treatment. 
All algorithm for generation then h~s to provide 
uwchanisms which allow geueration from underspeci- 
fled structures as well as from overspecilicd ones. This 
will allow to deal with certain types of trauslation 
mismatches as they are described for example in \[2\]. 
Further, the treatmcut of fllformed structures shouhl 
be such. that the invldid elements of the input struc- 
ture could he made visible for debugging purposes, ill- 
stead of just failing to generate anything. As it turned 
ont, even for ulediuul sized grallllnars it Call beconle 
quite dill)cult for a linguist to debug the grammar 
if there is only a debugger available which had been 
develolled for the generM l)urpnse programming lan- 
guage the system is inq)lemented ill, e.g. prolog. 
3 Terminology 
The alger)tirol has been devehlped for grammars 
written in the Ll.'c;-formalism. This uleal!s, it works 
on a eoutext-frec grammar G with annotated fcatm'e 
descriptions. Given a feature structure FSi, as in- 
put the algorithm has to generate all those surface 
strings, for which G ;Lssociates a feature structure 
FS,j, with FSI~ coutpatihle to FS,~. 
V~rhat co're, pal)hie means depends on tile kind of ap- 
plication the generator is used iu: 
• If the application is to test a grammar for over- 
geu(,ration, FSin lnust lie equal to FSu, e.g. lie 
iuformation is introduced into or deleted from 
FSi,, during geueration, and \]i~Si,, unifies ill 
terms of feature unification with FS,j. 
• If the alll)licatiou is to test whether a structure of 
a certain attribute might be sufficient for genera- 
lieu, i.e. whether the senlautic structure does not 
m'ergenerate, FSI,~ must I)e subsumed by FS,~, 
e.g. all information of FSI,, nlust be required for 
generation, and it is only allowed to introduce 
iMonnation lute FSin. 
• If the application is machiue trauslation, FSi,, 
and FSI~ must unify, e.g. FSI,, might contain 
nlore inlorulation and ,also less iuforluatiou th~t.u 
FS u . 
Del)endiug on tile al)l)licati(m the algorithm is 
i)arametrized as to whether it allows the introduc- 
tion of information into FSi, and whether it allows 
FSI, to be overspecified. 
For those not familiar with LFG I will give a short 
overview of tile elements of the feature descriptious 
as I will use them afterwards. In general a feature 
deseril)tiou consists of a coujuuction of equations or 
a disjunction of feature descriptions. In this paper I 
will only cousider feature descriptions without dis- 
junctious. The equations are distinguished into 
• defining equations indicated by tile operator = 
• inequatimts indicated by the operator # 
• constraining equations indicated by the operator 
=e 
All equation consists of a reference to a structure, tile 
el)era)or , and ,'L,~ second argulueut of the operatiou 
oue of 
• all atomic v~due like raas 
• a semantic form, indicated by double quotes, 
with an atou)ic uaule aud all optional arguuleut 
list,, i.e. "man", "give (SuuJ,ot~J}" 
• a referellee to a structure 
A reference to a structure is either a mete-variable 
or a path applied to a mete-variable. Examl)les are 
• the meta-wtriable 1, which stands for the struc- 
ture assnciated with tile nlother llode, e.g. the 
category given on tile left hand side of a rule. 
• ttw meta-variMilc 1, which stands fur tile struc- 
ture a.ssociate(1 with a (laughter uode of a rule, 
e.g. the nolle on the right hand side of a rule 
where tile feature description is an annotation 
of. 
• (~ GENI)ER), which refers to a structure under 
the attrillute (;\[.;NDI.~R ill tile feature structure 
associated with tile mother node. 
Equations, which have references on both sides of a 
equatiou arc called ree~ttr(trtey equations. 
Semantic forms describe unique vMues, e.g. while two 
atoufic values unify if they are described by the same 
fern), two semantic forms will not. The arguments of 
a semantic form of at) attribute A are paths which 
are members of the governable f~mctions of A. This 
set will be named as gf (A). %) alh)w semantic forms 
)~s possil)le values tilt ally attribute is a generaliza- 
tion of the Ilse of sltnlantic forlllS a,s they are given 
in \[1\] where semantic forms are only values of the at- 
trihute PRED. Semantic forms contain all information 
ueeessary to test the conditiolm of COml)leteness and 
coherence. 
3.1 Coherence and Completeness 
Using the generalization tile conditious of complete- 
ness and coherence ms given in \[3, pp. 211/212\] are 
reformulated ~s 
• A feature structure 5' is locMly complete iff for 
each attribute A in S where gf(A) is non-empty 
tile governable functions defined by tile vMue of 
A exist ill S with a value for the attribute A, and 
if all values required are defined. A structure is 
conq)lcte if all of its substructures are locally 
complete. 
• A feature structure S is loeMly coherent, iff for 
each attribute G of S which is member of gf (A) 
G is governed by the value of A, e.g. the argu- 
lueut list of the vMue of A contains G, and if 
,all attributes of S are given by tile grammar. A 
structure is coherent if ,all of its substructures 
are locally coherent. 
Ac~rEs DE COLING-92. NnivrI,kS. 23-28 ^olyr 1992 6 8 7 PRec. OF COLING-92. NANTES. AUG. 23-28. 1992 
The struettn'e FS derived in the generation process 
must at least fttllfiqll these contlitions of completeness 
and coherence, e.g. ally violation of one of these con- 
ditions is treated as an error. Since the input struc- 
ture FSi,, should be part of the derived structure, 
the conditions for attribate-valae pairs of the input 
structure are modified to be able to use the input 
structure to control the generation process and to bc 
able to allow overspecification, a 
• If an attribute A of FSi, is licensed by a defin- 
ing equation or inequation in the rules of tile 
grammar which are not explicitly excluded by 
FSi,, it shouhl be checked that A is actually con- 
stnned daring generation. This condition extends 
the condition of coml)leteness. 
• If an attribute A of FSi, does not occur in any 
equation of the graulmar, tim input structure 
is ovcrspecified. It depends on the application, 
whether this type of overspeeification is allowed, 
e.g. whethcr it should be considercd a.s a vio- 
lation of the coherence condition or shoultl be 
ignored. 
• If an attribute A of FSi, is not lieeased by a 
defining eqnation or an inequation in the rules 
of the granunar which are not explicitly excluded 
by FSi, the input structurc is overspecified. It 
depcnds on tbc allplication whether this type of 
overspecifieatiml is allowed. In ease overspecifi- 
cation is allowed, A and its value are ignored, 
otherwise it is treated ,as a violation of the co- 
herence condition. 
As indicated by tile last extension to the coherence 
and completeness conditions, it depends on the ap- 
plication what kind of input structure is considered 
to be a valid one for the target gralonlar. Ill case a 
grammar should he tested for overgeneration a valid 
input structure is not allowed to be extended tlnriug 
generation and is not anowed to be ow~rspecifictl. 
In the case of machine translation the input structure 
can be considered as a valid one, even it is underspec- 
ified. Del)ending on the language pair it might be also 
apl)ropriate to consider an overspeeified input struc- 
ture ms valid. 
4 The Algorithm 
The algorithm works on a granmmr tlescription and 
an input feature structure. The grammar description 
cuasists of context free rules with annotated feature 
descriptions. 
For siml)licity it is assumed that the annotated fea- 
ture descriptions do not contain disjunctions. A dis- 
junction in a feature description can always be trans- 
formed into a disjunction of nodes on the c-structure 
level. Furthernmre, a siugle ode is a concatenation of 
terminal and uon-termiual nodes, and for each cate- 
gory C of a grammar the rules for C are treated as 
one disjunction. 
aThis mealm, it is not sufficient to require, that the inptlt 
structure has to Ilnify with a structure derived from the gram- 
mar to get a generatim~, since this would allow to produce 
sentences which do not contain all of the semantics given in 
the inptll structure as well ms to produce sentences with any 
kind of possible modifiers the grammar could derive, that is 
infinile many. 
Tim algorithm starts witb a current category C~, ini- 
tialized with the gual category, and a current fea- 
ture structure FS~, initialized with the input feature 
structure FSin. 
The algorithm proceeds as follows: 
• Match the current feature strncture FS~ with 
the current category C~ by matehiug FS~ with 
the feature descriptions FDi of the nodes Ni on 
the right hand side of tile rule for Cc, where FSc 
is bound to the mata variable T which deaotates 
the structure associated with the nlother node 
C,, on the left hand side. The matching works 
top-down I)readth-first. During tile match FS~ 
will lint be nmdified. 
• Eztend FS,. by the application of a feature de- 
scription FD. 
4.1 Matching 
The matching of the current feature structure FSe 
with the current category C~ will always te,'minate. 
During the matching a structure which is used as a 
chart and an agenda is built which keeps track of 
• which structures are already matched with which 
categories. 
• whether there occurs a trivial recursion, e.g. 
given a structure and a category there is a re- 
cursion on tile c-structure level which uses the 
salne structure. 
• tim use of whicb nodes can be constrained by 
tim input strncture, and what is tile result, e.g. 
is the usage of the node excluded or licened by 
tile input structure. 
• which nodes are lmrely eontroUed on tile 
e-structure level, e.g. there it~ no equation for 
a node which dcnotates the structure of the 
mother node. Such nodes bare to produce only 
finite many snhstrings. 
For each category C ~fll its rules arc considered in 
parallel, which avoids ally dependency almut the or- 
dering of the single rules for C. 
For each node N on the right hand side of C~ the 
input feature structure is matched with its feature 
description FD. This match results ill at least one of 
the following descriptions: 
Exclusion: FSc is not coml)atil)le with FD. There- 
fore the node N will be excluded. Other results 
of the matching are of no relevance. The exclu- 
sion of N excludes those nodes which are part of 
the same rule as N. 
Activation: FD defines a path-value-pair which is 
already part of FS~, or FD defines a reentrency 
which already exists ill FSc. 
Examination: In FD occurs a reentranee equation 
where only one of the paths exists ill FS~. The 
result ezamination contains the category CN 
named by the node N and tile associated sub- 
structure FS.. 
Tile folh)wing cases are distinguished: 
Amw~s DE COL1NG-92. NANTES, 23-28 AOt~q" 1992 6 8 8 PROC. Or COLING-92, NANTES, AUO. 23-28, 1992 
trivial equation: N is a non-terminal node. 
The catgories C,: and CN are associated 
with tile same (sub)structure. Beside 1" - .\[ 
equations uf the form (1 X) = (1 X) are 
also considered ,as triviM equations. 
(1 X) = l: N is a non-terminal node. The cate- 
gory CA" will be matched with the structure 
denotated by (~ X). 
(~ X) - (~ Y): N is a iron-terminal (lode. Tile 
category CN will be matched for (.\[ Y) 
with the structure denotated by (1 X). This 
ease covers the treatment of multiple ro~)ted 
structnres a-s they nlight occur in gralnnlars 
written in all IIPSt; style 4. 
(T X) = (1 V): C'~ will be mat,:hed for (1 Y) 
with tile structure denotated by (1 X). 
Uncontrolled: FD does not contain any equation 
which can be applied oil FSc. In this case FS~ 
does not eontroll the oceurcnce of tile substring 
associated with the node N, and it depends on 
tile partial c-structure alone given I W the cat- 
egory C~, whether there are tinite ninny sub- 
strings described. 
Suspension: FD contains equations which allow 
controll of generation by FS,., but FS,, does 
not contain enough information to make a (teci- 
sion al)out exc\[usiolL activation or exatninatiolt. 
Therefore, the matching of N with FS~ has to 
he decided later. In case the application forbids 
introduction of infornmtion into FS~. during gem 
eration the conditions of suspension will lead to 
immediate exclusion. 
Only tile results activation and examination may 
occure in parallel The result examination causes a 
further exanfination of the category CN with tile 
selected (sub)-structure, if they have (lot Mready 
()(!eli eXalllined and are not already under exaluilla- 
lion. Thus tho matching of a category with a (sub)- 
structure is performed only once during the matching 
of the input feature structure with the goal category. 
This guarantuecs the termination of the matching 
and is efficient. 
Since the matching works top-down breadth-first it is 
llOSSible to detect inconsistencies between the iupttt 
feature structure and parts of the rules fairly early. 
From the complete match it is possible to deter: 
mine the set of these attribute=value pairs, which 
are part of tile original input structure and which 
could I)e used either by a defining equation or all 
incquation. These attribute-value pairs are marked 
that they have to be used which is an equivalent 
of adding temporarely constraining equations to the 
grammar, which guarantee that a maxinmm of ill- 
formation from the input structure is used for gen- 
eration. It should be noted, that this step is only 
necessary, if overspecification of the input structure 
is allowed. Otherwise all attribute value pairs of the 
input structure could be marked at star(up that they 
have to be used during generation. 
The matching produces a set of IIossible solutions. 
This makes it possible to distinguish a failure caused 
by an illegal input structure from the generate-and- 
test Iiehaviour of the backtracking *nechanism. Since 
4 For a description of Ites(\] se~ 11 l\]. 
there is enough illfornlation of the current goal in tile 
generation process, it is possil)le to produce an error 
message which descril)es 
* the c-structure build so far 
* the node and its ammtated feature description 
which is inconsistent with the input structure 
* the part of tile input structure which caused the 
failure 
~l(ch all error luessage v¢onld lie in tern(s of the gram- 
mar rather than in terms of the iinplenlention lan= 
guage of the algorithm. An error message eouhl be 
I couldn't yenemtc aTt NP for the structure \[ 
PRH) (ua((\] spt:c idef J because SPEC' 
: idef is ille.rlal 
for the grammar. 
Since it is distinguishc<I which parts of the struc- 
ture are intruduccd during generation it is possible 
to show tufty those faihu'es which are caused by the 
original input structtu'e. This would also allow one to 
ignore illegal parts of the inliut structure emnpletely 
alld t\[) ev~211 ~Cllcl';ttc fr()lll illformcd structures. In 
con(flint to the cmue of overspccification this would 
require repairing either tile input structure or extend- 
ing tile target gr~.(nlllar. 
4.2 Extension 
Tile extension of FS~ by a feature description FD 
means, that all information fi'om FD is incorporated 
into FS,,. Since only non-disjuuctiw~, feature descrip- 
tions are cmtsideretl it is not necessary to describe 
tile treatment of disjunctive information. The only 
source of alternatives are the rules. These alterna- 
tives are treated by backtracking. The selection of 
alternatives starts with those disjuncts, which do not 
lead to reeursion. This guarantees that recurs(on is 
applied oaiy in those ca.ses, where it could be part of 
tile c-structure to generate. 
The extension h~t~ several aspects. First, it is made 
explicit in tile feature structure which attrilmte-value 
pairs are defined by the grammar, and how often a 
definition h~u oceured during tile generation. The lat- 
ter information is used to stop the generation from in- 
finite loolis I)y giving a maximum amonnt of repeated 
definitions of the same l)ieee of information. Reason- 
able limits are values between 10 and 20. It should 
be noted that the semantic foT~ns of LFG reduce this 
linfit to 1 for attributes which take a semantic for((( 
as value 5. 
Second, a partial representation of the e-strncture 
is built in parallel to the feature structure, which 
allows at the end of the generation process to ex- 
tract the surface string by a traversal of the complete 
c-structure. 
Third, it can be deternfined which attribute-value 
pairs have been introduced into the original struc- 
ture. Only these attrilmte-value pairs are relevant to 
reexamine suspended nodes. 
SFor LFG grammars this aspect of semantic forms is the 
main reason that tile generation will terminate without the 
superficial limltati!m of repeated definitions. 
ACRES DE COLING-92, NANTES. 23-28 AOI3T 1992 6 8 9 PROC. oF COLING-92, NANTEs, Auo. 23-28, 1992 
4.3 The main loop 
1. For each node Nj of the right hand side of the 
rule of the current category Cc match tlle anno- 
tated feature description FDj with the current 
feature structure FS~. The matching ternfiuates 
always, and during the matching no new infor- 
mation is introduced into FSc. The match deter- 
miues, whether the node Nj might be excluded, 
activated, suspeuded, and whether the category 
N should be examined for some part of FSc. 
2. If there are uo nodes left which can be activated, 
nodes which are still suspended axe excluded attd 
tile filial coherence and completeness tests are 
performed on the input structure FSI,. In case 
of success the surface string can be extracted 
from the c-structure which is built in parallel 
to the derivation of the input feature structure. 
Ill case of failure, other solutions are tried by 
backtracking. 
3. Select only these nodes which can be activated 
which will not lead to a recursion. Extend the 
partial feature structures associated with these 
nodes by applying the annotated feature descrip- 
tions. 
4. Compaxe those nodes again which have been sus- 
pended ms in step 1. 
5. Repeat the steps 3 aud 4 until there are no nodes 
left which can be activated aud which do not lead 
to it recursion. 
6. Nodes which could be activated but lead to re- 
cursiou axe activated only in case there is ltO in- 
dication that the recursion conld be applied in- 
finite many times s. 
7. Contimte with step 2. 
5 Example 
In order to illustrate how tbe algorithm works, I will 
oaly give a very simple and somewhat superficial ex- 
ample. For more detailed examples especially on the 
treatment of recursion see \[5\]. 7 
The exantple makes nse of the grammax in figure 1 to 
generate a German sentence with a simple NP and all 
intransitive verb. The grammar is written ill a usual 
LFG notation. The input feature structure for genera- 
tiun is given in figure 2. For the example it is assumed 
that the feature stucture contains the semantic rep- 
resentation of the analysis of the Englisb sentence 
the man is running which should be translated into 
German, The goal category for generation is S. 
The generation starts with the matching of S with 
FSo. The NP node of the right haud side of the S 
rule is suspended, since there is no attribute SUBJ 
in the input structure. The trivial equation of the 
V1 a node immediately leads to the matching of FSo 
with the category VP. The trivial equation on the V 
node leads in turn to the matching of the category V 
with FSo. The existence of (SEM REL) = r~n in FSo 
6In this paper infinite loops are only assumed in case the 
limit of repeated definitions is reached. A more detailed treat- 
ment of the detection of iaflnite loops is given in \[51 
~There would be not etlollgh space to show a more compli- 
cated example in this paper. 
lUalUU 
der: 
rennt: 
rannte: 
S ~ NP VP 
(T SUBJ) = I T = l 
NP ~ D N 
T=lT=l 
NP ~ N T=I 
VP ~ V 
l=l 
N, (T PRED) = "mmm" 
(1 NUM) = sg 
(T GENDER) = mas 
(T CASE) # gen 
(\]" SEM REL) = "man" 
(T SEM NUM) = sg 
D, (T SPEC) = def 
(j" GENDER) = mas 
(1 CASE) = nom 
(T NUM) = sg 
(T SEM SPEC) =def 
V, (T PRED) : "rennen (SUBJ)" 
(1" TENSE) = present 
(T SUBJ CASE) :- nora 
(I SUBJ NUM) : sg 
(T SEM REL) = "run" 
(\]" SEM TIME START) = now 
(1" SEM ARG1) = (T SUBJ SEM) 
V, (T PRED) = "rennen (SUBJ)" 
(T TENSE) = past 
(\]" SUBJ CASE) = nora 
(T SUBJ NUM) = sg 
(\[ SEM REL) -- "run" 
(T SEM TIME START) = \])ast 
(I SEM TIME END) = past 
(T SEM ARG1) = (1" SUBJ SEM) 
Figure 1: Example grammar 
would allow to activate botb verbs of the example lex- 
icon, but the equation (T SEM TIME END) = past 
excludes the eutry for rannte. 
The resulting partial c-structure of the match is 
S--NP ...suspended ... 
VP--V--"rennt" 
Tile following attribute value l)alrs of FSo must be 
used during generation: 
(SF, M REL) 
(SEM ARG1) 
(SE1vl TIME START) 
Since tile solution set of the match does not require to 
use (SEM TIME END) tiffs information can be ignored 
for the further generation, although it had been used 
to exclude an entry. This shows a case of overspeci- 
fieatiou, where an attribute is in the set of possible 
attributes of a gramntax but is not always determined 
by the grammax. 
The extension of FSo then leads to the structure 
in figure 3. It should be noted that the algorithm 
autontatically selected the semantic head, although 
ACTES DE COLING-92. NANTES, 23-28 AOt~q" 1992 6 9 0 PROC. OF COLING-92, NAh'TES, AUG. 23-28, 1992 
feature structure c-strnctnre \] "rau" I +L +,,L 
altG1 53/S,','¢ clef / / / \[\] s,+,\[\] sg '// 
Figure 2: Inl)ut structure for geucration 
the bead is eml)edded in at substructure. Tiffs means 
the algorithnl is implicit head-driven without any as- 
sunq)tions which part of an inj)ut structure the head 
should be. As it is shown ill \[5\], this allows to gen- 
erate in cases of head-switching, where syntactic att(l 
semantic head differ. 
\[\] 
5EM 
PRED 
'\['ENSI'~ 
\[ REL "ruu'l \] 
A,tGl \[\] \['s~::,~c '¢;n}an"l 
Ll'I M F \[~ \[E'~ADtT }::'t;: r e\] J 
"renneu (SUBI)" 
l)resent 
s+ •\[++:+'+l;llll 
_ t smM \[\] J 
Figure 3: First extension of the input structure 
Tit(" introduction of SUBJ leads to tim matching of 
the suspended NP imde with FSo. The equation 
(T SUB J) = J. leads to the nmtchiug of the category 
NP with FS4. 
For the NP rule there are three nodes to be 
matched with FS4. Siucc on all three nodes a triv- 
ial equation is atmotated, the categories D and 
N have to be matched with FS4. The equations 
(l SEM REL) = man and (T SEM NUM) = sg acti- 
vates tile noun curry, and requires that (SEM ltEL) 
and (SEM NUM) of FS4 nlust be nsed for geueratiou. 
The equation (1" SEM SPEC) = dcf activates the de- 
terminer entry and requires to use (SEM SPEC) of 
FS 4 . 
The two alternatives of the NP rule "allow to consider 
two lmssible extension shown in table 1. 
Since (SEM SPEC) of FS4 must be used, the second al- 
ternative will be rejected by tile final constraint test. 
Therefore, the only solution is tile first alternative. 
This results in tile e-structnrc 
S--NP--D--"der" 
N Illnanllll 
VP--V--"rennt" 
from which the string der mann rennt is generated. 
I. 
2. 
:?n t 
/SEM \[\] / ~IP~. ",nauu" / 
/GENDEIt in;~s | 
Lsvl.:c def J 
\[\] I m'+M X\] I PRE1) "tltanu n 
\[G I~\]NDI~R Illas \] 
NP--D -"der" 
N---"nlalul" 
NP--N )'nlann" 
Table 1: Possible exteusious of the NP rule 
6 Comparison with Shiebers approach 
The semantic-head driven ,algorithm giveu in \[13\] 
also starts with a tol)-down initalizatiou with a 
I)ottom-u l) generation. In Shieber ct al the nodes 
whicll eoutam the semantic head arc determined dur- 
ing tile couq)ilation of the grammar. This seems to 
be a bit problenmtic fur gramluars which describe 
head-switching t)henomcnons, ~ in 100 l~tres of wine, 
where a possibh~ ananlysis is that 100 litres syntacti- 
cally governs ultn.e, but semantically is a moditicr of 
wine. The algorithm llreseuted here does not require 
to llrecomlmte tile nodes which contain tile semantic 
head, but finds the head relewmt for the giveu input 
structure automatically. 
Tile problem with free variables for the coherence 
constraint given in Slficbcr ct al does not occur for 
the alguritbm l/reseuted in this paper, since it "always 
distinguishes between the struetnre and the descril)- 
tiun of the structurc, and keeps track of which parts 
of the structure are already derived during genera- 
tiun. Since the a\[gorithln I)resented here always hmq 
infurmatiml at)out wlfi(:h parts are from the original 
input structure and which ones have been added, it 
is possible to check the coherence couditiuu at any 
step of the generation process. In addition, the so 
lution in Slfieber et al with binding variables seems 
somewhat llroblematic, since it requires to know for 
sure, that the variable part of the semantics shouht 
uot lie exteuded. 
The augmentation of the generator described ill 
Shiet)er et al with a chart to avoid rccomputation 
att(l elinfinate redtmdaucies is an integral part of the 
algorithut presented here. 
7 Summary 
Ill tiffs l)aper an algoritlun had t)een described which 
can be used to generate from filly specified feature 
structures a.s well as front variants of under- or over- 
specified feature structures in the LFG framework. 
The algorithm covers the cases given it, 114\] and \[151 
&s a subset. The treatment of recursion allows even 
for infinite many possible generations that the soht- 
tions can I)e presented one by one, e.g. the generator 
will not go into an infinite loop between two solutions. 
The generator is implicit head-driven, e.g. it selects 
the head automatically for a given input structure 
with respect to the target grammar. As it is shown in 
ACTES DE COLING-92, NAMES, 23-28 ^O~rt ' 1992 6 9 1 PROC. OF COLING-92. N^N'rEs, AUG. 23-28. 1992 
\[5\] this behaviour of the algorithm allows the efficient 
trcatment of head-switching phenomenons. 
It has been shown, that the algorithm provides infor- 
mation which allows in ease of failure to produce de- 
bugging information in terms of the target grammar, 
rather than in terms of the programming language 
the algorithm is iml)lemented in. 
The algorithm is implemented in PROLOG in the ed- 
inburgh syntax. Currently the implemention of the 
delmgging meehmfisms is incomplete. 
Although it is not shown in tiffs paper, the technique 
used for the generator could be easily adopted for 
parsing, where the input string takes tile part of the 
iuput feature structure. Ill this sense tile c-structure 
is only considered as an auxiliary structure where the 
gramntar describes basically a relation between a sur- 
face string and a feature structure. To adopt tile tech- 
nique for parsing would have the advantages 
• to use basically the same maclfinery for parsing 
and generation where the nmehinery is optimized 
for each task, 
• to have the same improved possibilities for de- 
bugging, and 
e to allow to start the parsing of striugs while they 
are typed in, and not only after the complete 
string to be parsed is known. 
One of the major goals for the fi~ture development of 
the algorithm is to reduce the use of backtracking ,as 
much as possible by using disjunctions as part of the 
feature strncture. 
The algorithm should be also applicable to other 
grammar formalisms like PATR-II (see \[12\]) which 
make use of a context-fl'ee backbone and anotated 
descriptions. It is also intended to nse tlte algoritlnn 
for formalisms like ItPSC. 

References 

\[11 Joan Bresnan, editor. The Mental Represen- 
tation of Grammatical Relations. MIT Press, 
Cambridge, Massachusetts, first edition, 1982. 

\[2\] Megumi Kameyama, Ryo Ochitani, and Stanley 
Peters. Resolving translation mismatches with 
information flow. Iu Proceedings of the 29th An- 
nual Meeting of the Association for Computa. 
tional Linguistics, pages 193 200, Berkley, Cali- 
fornia, USA, 18 21 June 1989. University of Cal- 
ifornia, Association for Computational Linguis- 
tics. 

\[3\] Ronakl M. Kaplan and Joan Bresuan. Lexical- 
flmctional grammar: a formal system for gram- 
matical representation. In Joan Bresnan, editor, 
The Mental Representation of Grammatical Re- 
lation.s, chapter 4, pages 173 281. MIT Press, 
Cambridge, Massachusetts, 1982. 

\[4\] Ronald M. Kaplan, Klaus Netter, Jiirgen 
Wedekiud, and Annie Zaenen. Translation by 
structural correspondences. In Proceedings of 
the 4th Conference of the European Chapter of 
the Association for Computational Linguistics, 
Manchester, 1989. 

\[5\] Dieter Kohl. Generierung aus unter- und 
iiberspezifizierten Merkmalsstrukturen in LPG. 
Arbeitspapiere ties SFB 340 Sprachtheoretische 
Grundlagen flit die Computerlinguistik Berieht 
Nr.9, Institut fiir nlaschinelle Sprachverar- 
beitung, Universit£t Stuttgart, July 1991. 

\[6\] Dieter Kohl and Stefan Momma. LFG based 
generation in ACORD. In Gabriel Bes, edi- 
tor, The Construction of a Natural Language 
and Graphic Interface Results and perspectives 
from the ACORD project. Part Generation ill 
ACORD, Chapter 5. Springer, (to appear) 1992. 

\[7\] Stefan Momma and Jochen DiJrre. Genera- 
lion from f-structures. In Ewan Klein and Jo- 
hun van Benthem, editors, Categories, Polymor. 
phism and Unification. Cognitive Scieuee Cen- 
tre, University of Edinburgh and Institute for 
Language, Logic and Information, University of 
Amsterdam, Edinburgh and Amsterdam, 1987. 

\[8\] Makoto Nagao. The transfer l)hase of tim nm 
machine translation system. In Proceedings of 
the 11th International Conference on Computa- 
tional Linguistics, pages 97-103, 1986. 

\[9\] Makoto Nagao, Toyoaki Nishida, and Jun-ichi 
Tsujii. Dealing with incompleteness of linguis- 
tic knowledge in language translation. In Pro- 
ceedings of the lOlh International Conference on 
Computational Linguistics, pages 420 427, 1984. 

\[10\] Jun-ichi Nakamura, JmMehi Tsujii, and Makoto 
Nagao. Grannnar writing system GRADE of 
rnn-maehine translation project and its elmrae- 
teristics. In Proceedings of the lOth International 
Conference on Computational Linguistics, pages 
338 343, 1984. 

Ill\] Carl J. Pollard and Ivan A. Sag. Information- 
Based Syntax and Semantics. Vol. 1 Fundamen- 
tals, volume 13 of CSLI Lecture Notes. Univ. 
Press, Chicago, 1987. 

\[12\] Stuart M. Shieber, Hans Uszkoreit, Fer- 
nando C.N. Pereira, J. Robinson, and M. Tyson. 
The formalism and implementation of PATR- 
Ii. In B. J. Grosz and M. E. Stiekel, editors, 
Research on Interactive Acquisition and Use of 
Knowledge. SRI report, 1983. PATR refereuee. 

\[13\] Stuart M. Shieber, Gertjan van Noord, Robert 
C. Moore, and Fernando C.N. Pereira. Semantic- 
head-driven generation. Computational Linguis- 
tics, 16(1):30 42, March 1990. Refs for bottom- 
Ul) geueration problems. 

\[14\] Jiirgen Wedekind. A concept of derivation 
for LFG. In Proceedings of the 11th Inter- 
national Conference on Computational Linguis- 
tics, pages 486-489, Bonn, West Germany, 25- 
26 August 1986. Institut fiir Kommunikations- 
forsclmng und Pbonetik, University of Bonn. 

\[15\] Jiirgen Wedekind. Generation as structure 
driven derivation. In Proceedings of the 12th In. 
ternational Cortference on Computational Lin- 
guistics, pages 732-737, Budapest, Hungary, Au- 
gust I988. 

\[16\] Jiirgen Wedekind. Uniflkationsgrammatiken 
und ihre Logik. Dissertatiou, Universit/i.t 
Stuttgart, Stuttgart, 1990. 
