Semantic-driven Generation with LFG- 
and PATR-style Grammars 
Jürgen Wedekind*
University of Stuttgart 
To find an appropriate utterance for a semantic representation is a problem normally 
treated in the domain of (tactical) natural language generation. For unification-based 
approaches, like LFG, PATR, or HPSG (Kaplan and Bresnan 1982; Shieber et al. 1983; 
Pollard and Sag 1994), this problem turns out to be a formal problem of the underlying 
grammar formalism, when the mapping between strings and semantic representations 
is defined by the grammar. Semantic representations are then encoded in a separate 
part of the feature structures (henceforth f-structures) that are assigned to the sentences
by the grammar. This is normally achieved by a distinct attribute SEM (or an
additional $\sigma$-projection that is formally reconstructable by such an attribute) whose
value is intended to represent the semantics of the sentence the f-structure is assigned 
to. The f-structure given in (1), which might be assigned to the sentence John arrives 
by a unification grammar for English, is a simple example. 
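(1)
$$\begin{bmatrix}
\mathrm{PRED} & \text{'ARRIVE}\langle(\mathrm{SUBJ})\rangle\text{'} \\
\mathrm{TENSE} & \mathrm{PRES} \\
\mathrm{SUBJ} & \begin{bmatrix} \mathrm{PRED} & \text{'JOHN'} \end{bmatrix} \\
\mathrm{SEM} & \begin{bmatrix} \mathrm{REL} & \mathit{arrive} \\ \mathrm{ARG1} & \mathit{john} \end{bmatrix}
\end{bmatrix}$$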
Since the f-structures assigned to the sentences are always subsumed by the semantic
representations they contain, a semantic-driven generator has to compute for a
given semantic representation $\Phi'$ a sentence with an f-structure $\Phi$ that is subsumed by
the input (in the following, notated by $\Phi' \sqsubseteq \Phi$). To state the underlying decidability
problem more formally, we need the fact that a unification grammar $G$ defines a binary
relation $\Delta_G$ between terminal strings $w$ and f-structures $\Phi$, as given in (2).
(2) $\Delta_G(w, \Phi)$ iff $G$ assigns $\Phi$ to $w$.
The problem of determining for a given semantic representation $\Phi'$ whether there is a
sentence with an f-structure $\Phi$ that is subsumed by the input turns out then to be an
instance of the problem of whether we can decide (3)
(3) $\exists w \exists \Phi (\Phi' \sqsubseteq \Phi \land \Delta_G(w, \Phi))$
for any given input $\Phi'$.
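The subsumption relation used in (3) can be illustrated with a minimal sketch. It assumes that f-structures are tree-shaped nested dictionaries without reentrancies, which is a simplification of the formalism; the function name `subsumes` and the value of SUBJ PRED are illustrative assumptions, not part of the original text.

```python
def subsumes(general, specific):
    """Return True if `general` subsumes `specific`.

    Assumption: f-structures are nested dicts (attribute -> value) with
    strings as atomic values and no reentrancies.
    """
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            attr in specific and subsumes(val, specific[attr])
            for attr, val in general.items()
        )
    return general == specific  # atomic values must match exactly


# The f-structure of example (1); the SUBJ value is assumed for illustration.
phi = {
    "PRED": "ARRIVE<(SUBJ)>",
    "TENSE": "PRES",
    "SUBJ": {"PRED": "JOHN"},
    "SEM": {"REL": "arrive", "ARG1": "john"},
}
# The semantic representation given as input to the generator.
phi_prime = {"SEM": {"REL": "arrive", "ARG1": "john"}}

print(subsumes(phi_prime, phi))  # True: the input subsumes the f-structure
print(subsumes(phi, phi_prime))  # False: the converse does not hold
```

Checking subsumption for a given pair is easy; the difficulty stated in (3) lies in the existential quantification over all strings and f-structures the grammar relates.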
* Institute for Natural Language Processing, University of Stuttgart, Azenbergstr. 12, D-70174 Stuttgart, Germany. E-mail: juergen@ims.uni-stuttgart.de
© 1999 Association for Computational Linguistics
Computational Linguistics Volume 25, Number 2
The undecidability of the generation problem in (3) was shown for definite clause
grammars by Dymetman (1991), who reduced the problem to Hilbert's Tenth Problem.
Van Noord (1993) provided a proof for PATR-style grammars using a reduction to
Post's Correspondence Problem. Moreover, a reduction to Hilbert's Tenth Problem was 
also used by Roach (1983) to show the undecidability of the emptiness problem of lexical- 
functional languages, a result that was later shown by Nishino (1991) using a reduction 
to Post's Correspondence Problem. In this brief note, we want to investigate the close 
relationship between the emptiness problem of lexical-functional and PATR languages 
and the generation problem in (3). We give a much simpler undecidability proof of the 
emptiness problem using a reduction to the emptiness problem of the intersection of 
arbitrary context-free languages, a reduction that Wedekind and Kaplan (1996) used 
to show the undecidability of ambiguity-preserving generation. The close connection 
of the problems--already indicated by the fact that their undecidability proofs were 
achieved by the same reductions--results, then, from the fact that the undecidability 
of the emptiness problem trivially implies the undecidability of semantic-driven gen- 
eration. This result also applies to other unification-based formalisms such as HPSG, 
since they are powerful enough to simulate context-free derivations. 
We begin our construction by defining for each context-free language L a unifica- 
tion grammar that generates L and that associates with each derivable terminal string 
an f-structure consisting of the string's difference list encoding (plus concatenation 
information). 1 For the association of the annotated information with the constituents
described by a context-free rule of the form $A \to w$, we use, similar to PATR, a set
of distinct metavariables $\{x_0, \ldots, x_{|w|}\}$; $x_0$ refers to the mother and $x_i$ ($i = 1, \ldots, |w|$) to
the $i$th daughter.
Definition 
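Let $G$ be a context-free grammar in Chomsky normal form whose nonterminal vocabulary,
terminal vocabulary, start symbol, and rules are given by $\langle V_N, V_T, S, R \rangle$. That is,
each rule has the form $A \to \epsilon$, $A \to a$, or $A \to BC$ with $A, B, C \in V_N$, $a \in V_T$, and $\epsilon$
denoting the empty string. A string grammar $\mathit{String}(G)$ for $G$ is a unification grammar
$\langle V_N, V_T, S, R_S \rangle$ whose rule set is determined as follows. In the first step we construct
for each context-free rule $r = A \to w$ a set of annotations $S_r$:
$$S_r = \begin{cases}
\{(x_0\ \mathrm{IN}) \approx (x_0\ \mathrm{OUT})\} & \text{if } w = \epsilon \\
\{(x_0\ \mathrm{IN}\ \mathrm{FIRST}) \approx a,\ (x_0\ \mathrm{IN}\ \mathrm{REST}) \approx (x_0\ \mathrm{OUT})\} & \text{if } w = a \\
\{(x_0\ \mathrm{IN}) \approx (x_1\ \mathrm{IN}),\ (x_1\ \mathrm{OUT}) \approx (x_2\ \mathrm{IN}),\ (x_0\ \mathrm{OUT}) \approx (x_2\ \mathrm{OUT})\} & \text{if } w = BC
\end{cases}$$
The set of rules is then given by $R_S = \{\langle r, S_r \rangle \mid r \in R\}$. 2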
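The construction of the annotation sets can be sketched as follows. The representation is an assumption made for illustration: rules are pairs of a left-hand side and a tuple of right-hand-side symbols, paths are tuples of strings, and a bare string on the right of an equation is an atomic value. The names `annotations`, `string_grammar`, and `RULES` are hypothetical, and the toy grammar is not taken from the text.

```python
def annotations(rhs):
    """Return the annotation set S_r for a CNF rule A -> rhs.

    rhs == ()          encodes A -> epsilon
    rhs == ("a",)      encodes A -> a   (a single terminal)
    rhs == ("B", "C")  encodes A -> B C (two nonterminals)
    Each equation is a pair (left path, right path or atomic value).
    """
    if len(rhs) == 0:
        return [(("x0", "IN"), ("x0", "OUT"))]
    if len(rhs) == 1:
        return [
            (("x0", "IN", "FIRST"), rhs[0]),
            (("x0", "IN", "REST"), ("x0", "OUT")),
        ]
    if len(rhs) == 2:
        return [
            (("x0", "IN"), ("x1", "IN")),
            (("x1", "OUT"), ("x2", "IN")),
            (("x0", "OUT"), ("x2", "OUT")),
        ]
    raise ValueError("rule is not in Chomsky normal form")


def string_grammar(rules):
    """Pair every context-free rule r with its annotation set S_r."""
    return [((lhs, rhs), annotations(rhs)) for lhs, rhs in rules]


# A toy CNF grammar (assumed for illustration) deriving the string b c d.
RULES = [
    ("S", ("A", "D")),
    ("A", ("B", "C")),
    ("B", ("b",)),
    ("C", ("c",)),
    ("D", ("d",)),
]

for rule, equations in string_grammar(RULES):
    print(rule, equations)
```

The three branches correspond one-to-one to the three cases of $S_r$; only the binary case needs to mention the daughters $x_1$ and $x_2$.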
Figure 1 illustrates the f-structure encoding of a terminal string generated by a 
simple string grammar. By induction on the depth of the derivation trees, it can eas- 
ily be shown that G and String(G) have the same language and that the f-structure 
assigned to a terminal string w encodes w, as stated more precisely in the following 
Lemma: 
1 We separated this construction out of the main proof, since it might be useful for analyzing other 
problems. 
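2 We used PATR-style notation, since it facilitates the construction of string grammars. For LFG
grammars, where we do not have the possibility to refer from one daughter to her sister (necessary for
$(x_1\ \mathrm{OUT}) \approx (x_2\ \mathrm{IN})$), we need a slightly more complex construction. If $w = BC$ then $B$ has to be
annotated by $(\uparrow \mathrm{B1}) \approx\ \downarrow$ and $(\uparrow \mathrm{IN}) \approx (\downarrow \mathrm{IN})$ and $C$ by $(\uparrow \mathrm{C2}) \approx\ \downarrow$, $(\uparrow \mathrm{OUT}) \approx (\downarrow \mathrm{OUT})$, and
$(\uparrow \mathrm{B1}\ \mathrm{OUT}) \approx (\downarrow \mathrm{IN})$. If $w = a$ we need $(\uparrow \mathrm{IN}\ \mathrm{FIRST}) \approx a$ and $(\uparrow \mathrm{IN}\ \mathrm{REST}) \approx (\uparrow \mathrm{OUT})$ and for $w = \epsilon$
the equation $(\uparrow \mathrm{IN}) \approx (\uparrow \mathrm{OUT})$. With this construction we get the same undecidability results for
classical LFG grammars. The only difference is that the constructed grammars are tree grammars rather
than string grammars.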
Wedekind Semantic-driven Generation 
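[Figure 1 graphic: a constituent structure with root $S_{x_0}$ and further nodes $A_{x_1}$, $D_{x_2}$, $B_{x_3}$, $C_{x_4}$ over the terminal string b c d, shown next to its f-structure, in which IN and OUT delimit a list whose FIRST values b, c, d are linked by REST.]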
Figure 1 
A sample constituent structure and the associated f-structure provided by a simple string 
grammar. The metavariables of the rules are instantiated by the variables attached to the 
nodes of the constituent structure. To each variable $x_i$ is assigned the f-structure element $a_i$.
Lemma 
Let $\mathit{String}(G)$ be a string grammar. Then $L(G) = L(\mathit{String}(G))$ and if there is a derivation
of a terminal string $w$ with root $S_{x_0}$ and f-structure $\Phi$ then the substructure of $\Phi$
which comprises the elements accessible from $a_0$ in $\Phi$ is a minimal solution of
$$\{(x_0\ \mathrm{IN}\ \mathrm{REST}^{i-1}\ \mathrm{FIRST}) \approx w_i \mid 1 \le i \le |w|\} \cup \{(x_0\ \mathrm{OUT}) \approx (x_0\ \mathrm{IN}\ \mathrm{REST}^{|w|})\}.$$ 3
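The Lemma can be checked on small examples with the following sketch. It solves the annotations of a string grammar over a given derivation tree with a simple union-find unifier and then reads the terminal string back from the IN and OUT values of the root. The tree encoding and all names (`Node`, `find`, `unify`, `get`, `solve`, `decode`) are assumptions made for illustration; the unifier is deliberately minimal and is not meant as a full implementation of either formalism.

```python
class Node:
    """One f-structure element; parent links model unified elements."""

    def __init__(self, atom=None):
        self.parent = self
        self.feats = {}    # attribute -> Node
        self.atom = atom   # atomic value, or None for a complex element


def find(n):
    """Return the representative of the class of unified elements of n."""
    while n.parent is not n:
        n = n.parent
    return n


def unify(a, b):
    """Destructively unify two elements; raise ValueError on a clash."""
    a, b = find(a), find(b)
    if a is b:
        return
    if a.atom != b.atom and None not in (a.atom, b.atom):
        raise ValueError("clash between distinct atomic values")
    if (a.atom is not None and b.feats) or (b.atom is not None and a.feats):
        raise ValueError("clash between atomic and complex value")
    b.parent = a
    if a.atom is None:
        a.atom = b.atom
    for attr, val in list(b.feats.items()):
        root = find(a)
        if attr in root.feats:
            unify(root.feats[attr], val)
        else:
            root.feats[attr] = val


def get(n, *path):
    """Follow `path` from n, creating missing elements on the way."""
    for attr in path:
        n = find(n)
        n = n.feats.setdefault(attr, Node())
    return find(n)


def solve(tree):
    """Apply the annotations bottom-up and return the root's element.

    Assumed encoding: (label, "a") for A -> a, (label, []) for
    A -> epsilon, and (label, [left, right]) for A -> B C.
    """
    x0 = Node()
    _, kids = tree
    if isinstance(kids, str):
        unify(get(x0, "IN", "FIRST"), Node(kids))
        unify(get(x0, "IN", "REST"), get(x0, "OUT"))
    elif not kids:
        unify(get(x0, "IN"), get(x0, "OUT"))
    else:
        x1, x2 = solve(kids[0]), solve(kids[1])
        unify(get(x0, "IN"), get(x1, "IN"))
        unify(get(x1, "OUT"), get(x2, "IN"))
        unify(get(x0, "OUT"), get(x2, "OUT"))
    return x0


def decode(x0):
    """Read the encoded string off the difference list IN - OUT."""
    cell, end = get(x0, "IN"), get(x0, "OUT")
    word = []
    while cell is not end:
        word.append(find(cell.feats["FIRST"]).atom)
        cell = find(cell.feats["REST"])
    return word


tree = ("S", [("A", [("B", "b"), ("C", "c")]), ("D", "d")])
print(decode(solve(tree)))  # ['b', 'c', 'd']
```

Following REST from the IN value reaches the OUT value after exactly as many steps as the string has symbols, which is the content of the second equation set in the Lemma.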
If we combine two arbitrary string grammars in such a way that the string encod- 
ings of the derived terminal strings get unified, we can show the undecidability of the 
emptiness problem by a simple reduction to the emptiness problem of the intersection 
of arbitrary context-free languages. 
Theorem 
It is undecidable for an arbitrary unification grammar $G$ whether $L(G) = \emptyset$.
Proof 
Let $G^1 = \langle V_N^1, V_T^1, S^1, R^1 \rangle$ and $G^2 = \langle V_N^2, V_T^2, S^2, R^2 \rangle$ be context-free grammars for two
arbitrary context-free languages. Without loss of generality, we can assume that
$V_N^1 \cap V_N^2 = \emptyset$ and that each rule in $R^i$ ($i = 1, 2$) is in Chomsky normal form. On the basis
of $\mathit{String}(G^1)$ and $\mathit{String}(G^2)$ we construct a unification grammar $G = \langle V_N, V_T, S, R \rangle$
with
$$V_N = V_N^1 \cup V_N^2 \cup \{S\} \text{ and } S \notin V_N^1 \cup V_N^2$$
$$V_T = V_T^1 \cup V_T^2$$
$$R = \{\langle S \to S^1 S^2, \{x_0 \approx x_1,\ x_0 \approx x_2,\ (x_0\ \mathrm{OUT}\ \mathrm{FIRST}) \approx \#\} \rangle\} \cup R_S^1 \cup R_S^2$$
such that $\#$ is a new atomic value not in $V_T$. If we assume for $G$ constant-consistency
(i.e., axioms of the form $\vdash a \not\approx b$ for all atomic values $a, b \in V_T \cup \{\#\}$ with $a \neq b$) then
the problem whether $L(G) = \emptyset$ reduces to the undecidable problem whether
$L(G^1) \cap L(G^2) = \emptyset$. In order to get a derivation of a well-formed terminal string $w^1 w^2$
from $S$ with $w^1$ derived from $S^1$ and $w^2$ from $S^2$, $w^1$ must be identical with $w^2$, since
both string encodings get unified by the S-rule and $(x_0\ \mathrm{OUT}\ \mathrm{FIRST}) \approx \#$ ensures that
one string is not a proper prefix of the other. 4 Thus, $L(G) = \{ww \mid w \in L(G^1) \cap L(G^2)\}$
and $L(G) = \emptyset$ iff $L(G^1) \cap L(G^2) = \emptyset$. $\blacksquare$
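The effect of the S-rule can be illustrated with a simplified model. The sketch below assumes tree-shaped feature structures (nested dictionaries, no reentrancies), so the shared OUT value is only approximated by giving both string encodings the same tail; the names `unify`, `encode`, and `s_rule_succeeds` are hypothetical. It shows why unification succeeds only for identical strings once the end marker is present.

```python
def unify(a, b):
    """Unify two tree-shaped feature structures; return None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for attr, val in b.items():
            if attr in out:
                merged = unify(out[attr], val)
                if merged is None:
                    return None
                out[attr] = merged
            else:
                out[attr] = val
        return out
    # Constant-consistency: distinct atoms never unify, and an atom
    # never unifies with a complex structure.
    return a if a == b else None


def encode(word, tail):
    """Encode `word` as FIRST/REST cells ending in `tail` (the OUT value)."""
    cell = tail
    for symbol in reversed(word):
        cell = {"FIRST": symbol, "REST": cell}
    return cell


def s_rule_succeeds(w1, w2, end_marker=True):
    """Model x0 = x1 and x0 = x2: both encodings must unify.

    With end_marker=True the tail carries FIRST = '#', mirroring the
    annotation (x0 OUT FIRST) = #.
    """
    tail = {"FIRST": "#"} if end_marker else {}
    return unify(encode(w1, tail), encode(w2, tail)) is not None


print(s_rule_succeeds("ab", "ab"))    # True: identical strings
print(s_rule_succeeds("ab", "ba"))    # False: FIRST values clash
print(s_rule_succeeds("ab", "abc"))   # False: '#' clashes with 'c'
print(s_rule_succeeds("ab", "abc", end_marker=False))  # True in this model
```

The last call succeeds only because the tree model cannot express that both lists share one OUT element; with genuine structure sharing the result would be a cyclic structure, which is why the marker can be dropped when acyclicity is assumed.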
3 The whole f-structure encodes the complete difference list derivation of wx - x, which is induced by 
the derivation tree by relabeling each (nonterminal) node dominating substring v of uvz = w by 
vzx - zx, since the annotations of each rule of the form $A \to BC$ encode the difference list of the
mother as the concatenation of the lists of its daughters (X - X2 = X - X1 + X1 - X2).
4 The annotation $(x_0\ \mathrm{OUT}\ \mathrm{FIRST}) \approx \#$ is not necessary if acyclicity is assumed.
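By taking the smallest f-structure $\bot$ as an input, the undecidability of our generation
problem reduces trivially to the undecidability of the emptiness problem, since
$$L(G) = \{w \mid \exists \Phi (\Delta_G(w, \Phi))\} = \{w \mid \exists \Phi (\bot \sqsubseteq \Phi \land \Delta_G(w, \Phi))\}.$$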
That is, if the emptiness problem of L(G) is undecidable for a unification grammar G 
then G's generation problem in (3) must be undecidable too. (The other direction does 
not hold, of course.) 
Corollary 
For an arbitrary unification grammar $G$ and an arbitrary f-structure $\Phi'$ it is undecidable
whether there is an f-structure $\Phi$ and a terminal string $w$ such that $\Phi' \sqsubseteq \Phi$ and
$\Delta_G(w, \Phi)$.
Although it might be argued that we show the undecidability on the basis of a 
rather special case, namely the smallest f-structure, the undecidability of the empti- 
ness problem is nevertheless sufficient, since we always get a (superficially) less triv- 
ial direct proof of the corollary by using any proof of the theorem and adding some 
(new) nontrivial input information to the S-rule. If we add, for example, the equation
$(x_0\ \mathrm{SEM}) \approx 1$ to the S-rule of our proof
$$\langle S \to S^1 S^2, \{x_0 \approx x_1,\ x_0 \approx x_2,\ (x_0\ \mathrm{OUT}\ \mathrm{FIRST}) \approx \#,\ (x_0\ \mathrm{SEM}) \approx 1\} \rangle$$
then the problem whether we can find for $[\mathrm{SEM}\ 1]$ ($= \Phi'$) an f-structure $\Phi$ and a
terminal string $w$ such that $[\mathrm{SEM}\ 1] \sqsubseteq \Phi$ and $\Delta_G(w, \Phi)$ reduces to the undecidable
problem whether $L(G) = \emptyset$ as well. 5
Our construction shows that an LFG or PATR grammar $G$ can simulate the valid
computations of an arbitrary Turing machine $M$, since these computations are known to be
specifiable by the intersection of two context-free languages. Since $L(M) = \emptyset$ is undecidable,
the emptiness problem of $L(G)$ must be undecidable too. By adding a bit of
semantic representation $\Phi'$ to the S-rule these properties are trivially carried over
from $L(G)$ to the set of possible realizations assigned to $\Phi'$ by $G$, given by the language
$\{w \mid \exists \Phi (\Phi' \sqsubseteq \Phi \land \Delta_G(w, \Phi))\}$. Our proof construction works, of course, even if
the grammatical formalisms satisfy the off-line parsability restriction. 6 Thus, the decidability
of the membership problem, as with context-sensitive grammars, does not
imply the decidability of the emptiness (and the semantic-driven generation) problem. 7
From a cognitive point of view it seems quite unrealistic that our language gen- 
eration capabilities require mathematical models of Turing machine power. Hence, 
natural language grammars (of the LFG and PATR formalisms) must satisfy condi- 
tions that do not allow us to show the undecidability of the problem. We assumed 
the semantic representations to be structurally unrelated to the f-structures they sub- 
sume. It seems more plausible that there is a proportion k that bounds the size of an 
5 Van Noord (1993) used the equation $(x_0\ \mathrm{SOLUTION}) \approx \mathit{yes}$ in his proof.
6 If the context-free grammars $G^1$ and $G^2$ are off-line parsable then the unification grammars $G$ used in
the undecidability proofs are off-line parsable as well. Since we can decide $\epsilon \in L(G')$ for any
context-free grammar $G'$ and can reduce $G'$ to an off-line parsable grammar $G''$ with
$L(G') - \{\epsilon\} = L(G'')$, $L(G^1) \cap L(G^2) = \emptyset$ and hence $L(G) = \emptyset$ must be undecidable even if the
grammars satisfy the off-line parsability restriction.
f-structure $\Phi$ assigned to a string by the size of its subsuming semantic representation
$\Phi'$: $|\Phi| < k|\Phi'|$. This would force the f-structures of the surface realizations of a
semantic representation $\Phi'$ given by $\{\Phi \mid \Phi' \sqsubseteq \Phi \land \exists w (\Delta_G(w, \Phi))\}$ to be included in a
finite and computable set of structurally related f-structures $\{\Phi \mid \Phi' \sqsubseteq \Phi \land |\Phi| < k|\Phi'|\}$.
Since the generation problem is decidable (Wedekind 1995), i.e., $\{w \mid \Delta_G(w, \Phi)\} = \emptyset$ is
decidable for any given f-structure $\Phi$, and only a finite number of structurally related
f-structures $\Phi$ has to be tested for $\{w \mid \Delta_G(w, \Phi)\} = \emptyset$, semantic-driven generation must
be decidable. But we must, of course, admit that it is far from being evident yet how
this structural relation is realized in natural language grammars.
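The decision procedure suggested by this argument can be sketched as a bounded search. Everything concrete here is an assumption made for illustration: f-structures are tree-shaped nested dictionaries, `size` counts attribute occurrences (the text does not fix a size measure), `candidates` is some finite collection of f-structures, and `generable` stands in for the decision procedure for ordinary generation (Wedekind 1995), which is not implemented.

```python
def size(fs):
    """Assumed size measure: number of attribute occurrences."""
    if not isinstance(fs, dict):
        return 0
    return sum(1 + size(val) for val in fs.values())


def subsumes(general, specific):
    """True if `general` subsumes `specific` (tree-shaped structures)."""
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            attr in specific and subsumes(val, specific[attr])
            for attr, val in general.items()
        )
    return general == specific


def has_realization(phi_prime, k, candidates, generable):
    """Test the finitely many size-bounded f-structures subsumed by the input.

    `generable(phi)` is a hypothetical oracle that decides whether some
    string is assigned the f-structure phi by the grammar.
    """
    bound = k * size(phi_prime)
    return any(
        generable(phi)
        for phi in candidates
        if subsumes(phi_prime, phi) and size(phi) < bound
    )


# Toy usage with a stand-in oracle.
phi_prime = {"SEM": {"REL": "arrive"}}
candidates = [
    {"SEM": {"REL": "arrive"}, "TENSE": "PRES"},
    {"TENSE": "PRES"},
]
print(has_realization(phi_prime, 3, candidates, lambda phi: "TENSE" in phi))
```

The search terminates because the candidate set is finite and each oracle call is assumed to terminate; how to enumerate the candidates for a real grammar is exactly the open question raised in the text.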
Acknowledgments 
Ron Kaplan had the idea that the proof 
construction which we used in Wedekind 
and Kaplan (1996) might be useful for other 
purposes. Thanks to him for valuable 
suggestions. 
References 
Dymetman, M. 1991. Inherently reversible 
grammars, logic programming and 
computability. In Proceedings of the ACL 
Workshop: Reversible Grammar in Natural 
Language Processing. Berkeley, CA, pages
20-30. 
Kaplan, R. M. and J. Bresnan. 1982. 
Lexical-Functional Grammar: A formal 
system for grammatical representation. In 
J. Bresnan, editor, The Mental 
Representation of Grammatical Relations. MIT 
Press, Cambridge, MA, pages 173-281. 
Nishino, T. 1991. Mathematical analysis of 
lexical-functional grammars. Language 
Research, 27(1): 119-141. 
Pollard, C. and I. Sag. 1994. Head-Driven 
Phrase Structure Grammar. The University 
of Chicago Press, Chicago, IL. 
Roach, K. 1983. LFG languages over a 
one-letter alphabet. Manuscript, Xerox 
PARC, Palo Alto, CA. 
Shieber, S., H. Uszkoreit, F. Pereira, J. 
Robinson, and M. Tyson. 1983. The 
formalism and implementation of 
PATR-II. In B. Grosz and M. Stickel, 
editors, Research on Interactive Acquisition 
and Use of Knowledge. SRI Final Report 
1894. SRI International, Menlo Park, CA,
pages 39-79. 
van Noord, G. 1993. Reversibility in Natural 
Language Processing. Ph.D. thesis, 
Rijksuniversiteit Utrecht. 
Wedekind, J. 1995. Some remarks on the 
decidability of the generation problem in 
LFG- and PATR-style unification 
grammars. In Proceedings of the 7th 
Conference of the European Chapter of the
Association for Computational Linguistics. 
Dublin, pages 45-52. 
Wedekind, J. and R. M. Kaplan. 1996. 
Ambiguity-preserving generation with 
LFG- and PATR-style grammars. 
Computational Linguistics, 22(4): 555-558. 
7 This fact is already illustrated by the languages of the grammars we used in the undecidability proofs;
they all have a decidable membership problem, since $w \in L(G^1) \cap L(G^2)$ is decidable for arbitrary
context-free grammars $G^1$ and $G^2$.

