<?xml version="1.0" standalone="yes"?> <Paper uid="J99-2006"> <Title>formalism and implementation of</Title> <Section position="2" start_page="0" end_page="278" type="metho"> <SectionTitle> * Institute for Natural Language Processing, University of Stuttgart, Azenbergstr. 12, D-70174 Stuttgart, Germany. E-mail: juergen@ims.uni-stuttgart.de </SectionTitle> <Paragraph position="0"> © 1999 Association for Computational Linguistics. Computational Linguistics, Volume 25, Number 2. lem. Van Noord (1993) provided a proof for PATR-style grammars using a reduction to Post's Correspondence Problem. Moreover, a reduction to Hilbert's Tenth Problem was also used by Roach (1983) to show the undecidability of the emptiness problem of lexical-functional languages, a result that was later also shown by Nishino (1991) using a reduction to Post's Correspondence Problem. In this brief note, we want to investigate the close relationship between the emptiness problem of lexical-functional and PATR languages and the generation problem in (3). We give a much simpler undecidability proof of the emptiness problem using a reduction to the emptiness problem of the intersection of arbitrary context-free languages, a reduction that Wedekind and Kaplan (1996) used to show the undecidability of ambiguity-preserving generation. The close connection between the problems, already indicated by the fact that their undecidability proofs were achieved by the same reductions, results from the fact that the undecidability of the emptiness problem trivially implies the undecidability of semantic-driven generation. 
This result also applies to other unification-based formalisms such as HPSG, since they are powerful enough to simulate context-free derivations.</Paragraph> <Paragraph position="1"> We begin our construction by defining for each context-free language $L$ a unification grammar that generates $L$ and that associates with each derivable terminal string an f-structure consisting of the string's difference list encoding (plus concatenation information). 1 For the association of the annotated information with the constituents described by a context-free rule of the form $A \to w$, we use, similar to PATR, a set of distinct metavariables $\{x_0, \ldots, x_{|w|}\}$; $x_0$ refers to the mother and $x_i$ ($i = 1, \ldots, |w|$) to the $i$th daughter.</Paragraph> <Paragraph position="2"> Definition Let $G$ be a context-free grammar in Chomsky normal form whose nonterminal vocabulary, terminal vocabulary, start symbol, and rules are given by $\langle V_N, V_T, S, R \rangle$. That is, each rule has the form $A \to \epsilon$, $A \to a$, or $A \to BC$ with $A, B, C \in V_N$, $a \in V_T$, and $\epsilon$ denoting the empty string. A string grammar $\mathit{String}(G)$ for $G$ is a unification grammar $\langle V_N, V_T, S, R_S \rangle$ whose rule set is determined as follows. In the first step we construct for each context-free rule $r = A \to w$ a set of annotations $S_r$:</Paragraph> <Paragraph position="4"> The set of rules is then given by $R_S = \{\langle r, S_r \rangle \mid r \in R\}$. 2 Figure 1 illustrates the f-structure encoding of a terminal string generated by a simple string grammar. By induction on the depth of the derivation trees, it can easily be shown that $G$ and $\mathit{String}(G)$ have the same language and that the f-structure assigned to a terminal string $w$ encodes $w$, as stated more precisely in the following Lemma: grammars where we do not have the possibility to refer from one daughter to her sister (necessary for</Paragraph> <Paragraph position="6"> the equation $(\uparrow \mathrm{IN}) \approx (\uparrow \mathrm{OUT})$. With this construction we get the same undecidability results for classical LFG grammars. 
The only difference is that the constructed grammars are tree grammars rather than string grammars.</Paragraph> <Paragraph position="7"> grammar. The metavariables of the rules are instantiated by the variables attached to the nodes of the constituent structure. To each variable $x_i$ is assigned the f-structure element $a_i$. Lemma Let $\mathit{String}(G)$ be a string grammar. Then $L(G) = L(\mathit{String}(G))$, and if there is a derivation of a terminal string $w$ with root $S_{x_0}$ and f-structure $\Phi$, then the substructure of $\Phi$ which comprises the elements accessible from $a_0$ in $\Phi$ is a minimal solution of $\{(x_0\ \mathrm{IN}\ \mathrm{REST}^{i-1}\ \mathrm{FIRST}) \approx w_i \mid 1 \le i \le |w|\} \cup \{(x_0\ \mathrm{OUT}) \approx (x_0\ \mathrm{IN}\ \mathrm{REST}^{|w|})\}$. 3 If we combine two arbitrary string grammars in such a way that the string encodings of the derived terminal strings get unified, we can show the undecidability of the emptiness problem by a simple reduction to the emptiness problem of the intersection of arbitrary context-free languages.</Paragraph> <Paragraph position="8"> Theorem It is undecidable for an arbitrary unification grammar $G$ whether $L(G) = \emptyset$. Let $G^1 = \langle V_N^1, V_T^1, S^1, R^1 \rangle$ and $G^2 = \langle V_N^2, V_T^2, S^2, R^2 \rangle$ be context-free grammars for two arbitrary context-free languages. Without loss of generality, we can assume that</Paragraph> <Paragraph position="10"> such that # is a new atomic value not in $V_T$. If we assume for $G$ constant-consistency (i.e., axioms of the form $\vdash a \not\approx b$ for all atomic values $a, b \in V_T \cup \{\#\}$ with $a \neq b$), then the problem whether $L(G) = \emptyset$ reduces to the undecidable problem whether $L(G^1) \cap L(G^2) = \emptyset$. In order to get a derivation of a well-formed terminal string $w^1 w^2$ from $S$ with $w^1$ derived from $S^1$ and $w^2$ from $S^2$, $w^1$ must be identical with $w^2$, since both string encodings get unified by the $S$-rule and $(x_0\ \mathrm{OUT}\ \mathrm{FIRST}) \approx \#$ ensures that one string is not a proper prefix of the other. 
4 Thus, $L(G) = \{ww \mid w \in L(G^1) \cap L(G^2)\}$</Paragraph> <Paragraph position="12"/> </Section> <Section position="3" start_page="278" end_page="278" type="metho"> <SectionTitle> 3 The whole f-structure encodes the complete difference list derivation of $wx - x$, which is induced by </SectionTitle> <Paragraph position="0"> the derivation tree by relabeling each (nonterminal) node dominating substring $v$ of $uvz = w$ by $vzx - zx$, since the annotations of each rule of the form $A \to BC$ encode the difference list of the mother as the concatenation of the lists of its daughters ($X - X_2 = X - X_1 + X_1 - X_2$).</Paragraph> </Section> <Section position="4" start_page="278" end_page="280" type="metho"> <SectionTitle> 4 The annotation $(x_0\ \mathrm{OUT}\ \mathrm{FIRST}) \approx \#$ is not necessary if acyclicity is assumed. </SectionTitle> <Paragraph position="0"> By taking the smallest f-structure $\bot$ as an input, the undecidability of our generation problem reduces trivially to the undecidability of the emptiness problem, since</Paragraph> <Paragraph position="2"> That is, if the emptiness problem of $L(G)$ is undecidable for a unification grammar $G$, then $G$'s generation problem in (3) must be undecidable too. (The other direction does not hold, of course.) Corollary For an arbitrary unification grammar $G$ and an arbitrary f-structure $\Phi'$ it is undecidable whether there is an f-structure $\Phi$ and a terminal string $w$ such that $\Phi' \sqsubseteq \Phi$ and $\Delta_G(w, \Phi)$. Although it might be argued that we show the undecidability on the basis of a rather special case, namely the smallest f-structure, the undecidability of the emptiness problem is nevertheless sufficient, since we always get a (superficially) less trivial direct proof of the corollary by using any proof of the theorem and adding some (new) nontrivial input information to the S-rule. 
If we add, for example, the equation</Paragraph> <Paragraph position="4"> then the problem whether we can find for $[\mathrm{SEM}\ 1]$ ($= \Phi'$) an f-structure $\Phi$ and a terminal string $w$ such that $[\mathrm{SEM}\ 1] \sqsubseteq \Phi$ and $\Delta_G(w, \Phi)$ reduces to the undecidable problem whether $L(G) = \emptyset$ as well. 5 Our construction shows that an LFG or PATR grammar $G$ can simulate the valid computations of an arbitrary Turing machine $M$, since these computations are known to be specifiable by the intersection of two context-free languages. Since $L(M) = \emptyset$ is undecidable, the emptiness problem of $L(G)$ must be undecidable too. By adding a bit of semantic representation $\Phi'$ to the S-rule these properties are trivially carried over from $L(G)$ to the set of possible realizations assigned to $\Phi'$ by $G$, given by the language $\{w \mid \exists\Phi(\Phi' \sqsubseteq \Phi \wedge \Delta_G(w, \Phi))\}$. Our proof construction works, of course, even if the grammatical formalisms satisfy the off-line parsability restriction. 6 Thus, the decidability of the membership problem, as with context-sensitive grammars, does not imply the decidability of the emptiness (and the semantic-driven generation) problem. 7 From a cognitive point of view it seems quite unrealistic that our language generation capabilities require mathematical models of Turing machine power. Hence, natural language grammars (of the LFG and PATR formalisms) must satisfy conditions that do not allow us to show the undecidability of the problem. We assumed the semantic representations to be structurally unrelated to the f-structures they subsume. It seems more plausible that there is a proportion $k$ that bounds the size of an 5 Van Noord (1993) used the equation $(x_0\ \mathrm{SOLUTION}) \approx \mathrm{yes}$ in his proof.</Paragraph> <Paragraph position="5"> 6 If the context-free grammars $G^1$ and $G^2$ are off-line parsable then the unification grammars $G$ used in the undecidability proofs are off-line parsable as well. 
Since we can decide $\epsilon \in L(G')$ for any context-free grammar $G'$ and can reduce $G'$ to an off-line parsable grammar $G''$ with $L(G') - \{\epsilon\} = L(G'')$, $L(G^1) \cap L(G^2) = \emptyset$ and hence $L(G) = \emptyset$ must be undecidable even if the grammars satisfy the off-line parsability restriction.</Paragraph> <Section position="1" start_page="280" end_page="280" type="sub_section"> <SectionTitle> Wedekind Semantic-driven Generation </SectionTitle> <Paragraph position="0"> f-structure $\Phi$ assigned to a string by the size of its subsuming semantic representation $\Phi'$: $|\Phi| &lt; k|\Phi'|$. This would force the f-structures of the surface realizations of a semantic representation $\Phi'$, given by $\{\Phi \mid \Phi' \sqsubseteq \Phi \wedge \exists w(\Delta_G(w, \Phi))\}$, to be included in a finite and computable set of structurally related f-structures $\{\Phi \mid \Phi' \sqsubseteq \Phi \wedge |\Phi| &lt; k|\Phi'|\}$. Since the generation problem is decidable (Wedekind 1995), i.e., $\{w \mid \Delta_G(w, \Phi)\} = \emptyset$ is decidable for any given f-structure $\Phi$, and only a finite number of structurally related f-structures $\Phi$ has to be tested for $\{w \mid \Delta_G(w, \Phi)\} = \emptyset$, semantic-driven generation must be decidable. But we must, of course, admit that it is far from evident how this structural relation is realized in natural language grammars.</Paragraph> </Section> </Section> </Paper>