<?xml version="1.0" standalone="yes"?> <Paper uid="P93-1018"> <Title>PARALLEL MULTIPLE CONTEXT-FREE GRAMMARS, FINITE-STATE TRANSLATION SYSTEMS, AND POLYNOMIAL-TIME RECOGNIZABLE SUBCLASSES OF LEXICAL-FUNCTIONAL GRAMMARS</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> PARALLEL MULTIPLE CONTEXT-FREE GRAMMARS, FINITE-STATE TRANSLATION SYSTEMS, AND POLYNOMIAL-TIME RECOGNIZABLE SUBCLASSES OF LEXICAL-FUNCTIONAL GRAMMARS </SectionTitle> <Paragraph position="0"> Hiroyuki Seki tt Ryuichi Nakanishi t Yuichi Kaji t Sachiko Ando t Tadao Kasami $t t Department of Information and Computer Sciences, Faculty of Engineering Science, Osaka University 1-1 Machikaneyama, Toyonaka, Osaka 560, Japan :~ Graduate School of Information Science, Advanced Institute of Science and Technology, Nara</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> A number of grammatical formalisms have been introduced to define the syntax of natural languages.</Paragraph> <Paragraph position="1"> Among them are parallel multiple context-free grammars (pmcfg's) and lexical-functional grammars (lfg's). Pmcfg's and their subclass called multiple context-free grammars (mcfg's) are natural extensions of cfg's, and pmcfg's are known to be recognizable in polynomial time. Some subclasses of lfg's have been proposed, but they were shown to generate an NP-complete language. Finite state translation systems (fts') were introduced as a computational model of transformational grammars. In this paper, three subclasses of lfg's called nc-lfg's, dc-lfg's and fc-lfg's are introduced and the generative capacities of the above-mentioned grammatical formalisms are investigated. First, we show that the generative capacity of fts' is equal to that of nc-lfg's. 
As relations among subclasses of those formalisms, it is shown that the generative capacities of deterministic fts', dc-lfg's, and pmcfg's are equal to each other, and the generative capacity of fc-lfg's is equal to that of mcfg's. It is also shown that at least one NP-complete language is generated by fts'. Consequently, deterministic fts', dc-lfg's and fc-lfg's can be recognized in polynomial time.</Paragraph> <Paragraph position="2"> However, fts' (and nc-lfg's) cannot, if P ≠ NP.</Paragraph> </Section> <Section position="3" start_page="0" end_page="130" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> A number of grammatical formalisms such as lexical-functional grammars (Kaplan 1982), head grammars (Pollard 1984) and tree adjoining grammars (Joshi 1975)(Vijay-Shanker 1987) were introduced to define the syntax of natural languages. On the other hand, there has been much effort to propose well-defined computational models of transformational grammars. One approach is to extend devices which operate on strings, such as generalized sequential machines (gsm's), to devices which operate on trees.</Paragraph> <Paragraph position="1"> It is fundamentally significant to clarify the generative capacities of such grammars and devices.</Paragraph> <Paragraph position="2"> Parallel multiple context-free grammars (pmcfg's) and multiple context-free grammars (mcfg's) were introduced in (Kasami 1988a)(Seki 1991) as natural extensions of cfg's. The subsystem of linear context-free rewriting systems (lcfrs') (Vijay-Shanker 1987) which deals with only strings is the same formalism as mcfg's. The class of cfl's is properly included in the class of languages generated by mcfg's, which in turn is properly included in the one generated by pmcfg's. The class of languages generated by pmcfg's is properly included in that of context-sensitive languages (Kasami 1988a). 
Pmcfg's have been shown to be recognizable in polynomial time (Kasami 1988b)(Seki 1991).</Paragraph> <Paragraph position="3"> A tree transducer (Rounds 1969) takes a tree as input and starts from the initial state with its head scanning the root node of the input. According to the current state and the label of the scanned node, it transforms the input tree into an output tree in a top-down way. A finite state translation system (fts) is a tree transducer with its input domain being the set of derivation trees of a cfg (Rounds 1969)(Thatcher 1967). A number of equivalence relations between the classes of yield languages generated by fts' and other computational models have been established (Engelfriet 1991)(Engelfriet 1980)(Weir 1992). In particular, it has been shown that the class of yield languages generated by finite-copying fts' is equal to the class of languages generated by lcfrs' (Weir 1992), hence by mcfg's.</Paragraph> <Paragraph position="4"> In lexical-functional grammars (lfg's) (Kaplan 1982), associated with each node v of a derivation tree is a finite set F of pairs of attribute names and their values. F is called the f-structure of v. An lfg G consists of a cfg G0 called the underlying cfg of G and a finite set P_fs of equations called functional schemata which specify constraints between the f-structures of nodes in a derivation tree. Functional schemata are attached to symbols in productions of G0. It has been shown in (Nakanishi 1992) that the class of languages generated by lfg's is equal to that of recursively enumerable languages even though the underlying cfg's are restricted to regular grammars. In (Gazdar 1985)(Kaplan 1982)(Nishino 1991), subclasses of lfg's were proposed in order to guarantee the recursiveness (and/or the efficient recognition) of languages generated by lfg's. 
However, these classes were shown to generate an NP-complete language (Nakanishi 1992).</Paragraph> <Paragraph position="5"> In this paper, three subclasses of lfg's called nc-lfg's, dc-lfg's and fc-lfg's are proposed, two of which can be recognized in polynomial time.</Paragraph> <Paragraph position="6"> Moreover, this paper clarifies the relations among the generative capacities of pmcfg's, fts' and these subclasses of lfg's.</Paragraph> <Paragraph position="7"> In nc-lfg's, a functional schema either specifies the value of a specific attribute, say atr, immediately (↑atr = val) or specifies that the value of a specific attribute of a node v is equal to the whole f-structure of a child node of v (↑atr = ↓).</Paragraph> <Paragraph position="8"> An nc-lfg is called a dc-lfg if each pair of rules p1 : A → α1 and p2 : A → α2 whose left-hand sides are the same is inconsistent in the sense that there exists no f-structure that locally satisfies both the functional schemata of p1 and those of p2. Intuitively, in a dc-lfg G, for each pair (t1, t2) of derivation trees in G, if the f-structure and nonterminal of the root of t1 are the same as those of t2, then t1 and t2 derive the same terminal string.</Paragraph> <Paragraph position="9"> Let G be an nc-lfg. A multiset M of nonterminals of G is called an SPN multiset in G if the following condition holds: Let M = {{A1, A2, ..., An}} be a multiset of nonterminals where the Ai's are not necessarily distinct. There exist a derivation tree t and a subset of nodes V = {v1, v2, ..., vn} of t such that the label of vi is Ai (1 ≤ i ≤ n) and the f-structures of the vi's are forced to be identical by the functional schemata of G.</Paragraph> <Paragraph position="10"> If the number of SPN multisets in G is finite, then G is called an fc-lfg.</Paragraph> <Paragraph position="11"> Our main result is that the generative capacity of nc-lfg's is equal to that of fts'. 
As relations among proper subclasses of the above-mentioned formalisms, it is shown that the generative capacities of dc-lfg's, deterministic fts' and pmcfg's are equal to each other, and the generative capacity of fc-lfg's is equal to that of mcfg's. It is also shown that a (nondeterministic) fts generates an NP-complete language.</Paragraph> </Section> <Section position="4" start_page="130" end_page="131" type="metho"> <SectionTitle> 2 Parallel Multiple Context-Free Grammars </SectionTitle> <Paragraph position="0"> A parallel multiple context-free grammar (pmcfg) is defined to be a 5-tuple G = (N, T, F, P, S) which satisfies the following conditions (G1) through (G5) (Kasami 1988a)(Seki 1991).</Paragraph> <Paragraph position="1"> (G1) N is a finite set of nonterminal symbols. A positive integer d(A) is given for each nonterminal symbol A ∈ N.</Paragraph> <Paragraph position="2"> (G2) T is a finite set of terminal symbols which is disjoint from N.</Paragraph> <Paragraph position="3"> (G3) F is a finite set of functions satisfying the following conditions. For a positive integer d, let (T*)^d denote the set of all the d-tuples of strings over T. For each f ∈ F with arity a(f), positive integers r(f) and d_i(f) (1 ≤</Paragraph> <Paragraph position="5"> denote the ith argument of f for 1 ≤ i ≤ a(f).</Paragraph> <Paragraph position="6"> (f1) For 1 ≤ h ≤ r(f), the hth component of f, denoted by f^[h], is defined as:</Paragraph> <Paragraph position="8"> the form A → f[A1, A2, ..., A_a(f)] where A, A1, A2, ..., A_a(f) ∈ N, f ∈ F, r(f) = d(A) and d_i(f) = d(Ai) (1 ≤ i ≤ a(f)). If a(f) = 0, i.e., f ∈ (T*)^r(f), the production is called a terminating production; otherwise it is called a nonterminating production.</Paragraph> <Paragraph position="9"> (G5) S ∈ N is the initial symbol, and d(S) = 1. If all the functions of a pmcfg G satisfy the following Right Linearity condition, then G is called a multiple context-free grammar (mcfg). 
[Right Linearity] For each x_ij, the total number of occurrences of x_ij in the right-hand sides of (2.1) from h = 1 through r(f) is at most one.</Paragraph> <Paragraph position="10"> The language generated by a pmcfg G = (N, T, F, P, S) is defined as follows. For A ∈ N, let us define L_G(A) as the smallest set satisfying the following two conditions: (L1) If a terminating production A → ᾱ is in P, then ᾱ ∈ L_G(A).</Paragraph> <Paragraph position="11"> (L2) If A → f[A1, A2, ..., A_a(f)] ∈ P and</Paragraph> <Paragraph position="13"> Define L(G) ≜ L_G(S). L(G) is called the parallel multiple context-free language (pmcfl) generated by G. If G is an mcfg, L(G) is called the multiple context-free language (mcfl) generated by</Paragraph> <Paragraph position="15"> a, f[(x)] = xx. G_EX1 is a pmcfg but is not an mcfg since the function f does not satisfy Right Linearity. The language generated by G_EX1 is {a^(2^n) | n ≥ 0}, which cannot be generated by any mcfg (see Lemma 6 of (Kasami 1988a)).</Paragraph> <Paragraph position="16"> The empty string is denoted by ε.</Paragraph> <Paragraph position="17"> Example 2.2: Let G_EX2 = (N, T, F, P, S) be a pmcfg, where N = {S, A}, T = {a, b}, F =</Paragraph> <Paragraph position="19"> be a pmcfg. For a given string w, it is decidable whether w ∈ L(G) or not in time polynomial in |w|, where |w| denotes the length of w.</Paragraph> </Section> <Section position="5" start_page="131" end_page="132" type="metho"> <SectionTitle> 3 Finite State Translation Systems </SectionTitle> <Paragraph position="0"> A set Σ of symbols is a ranked alphabet if, for each σ ∈ Σ, a unique non-negative integer ρ(σ) is associated; ρ(σ) is the rank of σ. For a set X, we define the free algebra T_Σ(X) as the smallest set</Paragraph> <Paragraph position="2"> T_Σ(X), then t = σ(t1, ..., tn) ∈ T_Σ(X). 
σ is called the root symbol, or simply the root, of t.</Paragraph> <Paragraph position="3"> Hereafter, a term in T_Σ(X) is also called a tree, and we use terminology of trees such as subtree, node and so on.</Paragraph> <Paragraph position="4"> Let G = (N, T, P, S) be a context-free grammar (cfg) where N, T, P and S are a set of nonterminal symbols, a set of terminal symbols, a set of productions and the initial symbol, respectively. A derivation tree in cfg G is a term defined as follows.</Paragraph> <Paragraph position="5"> (T1) Every a ∈ T is a derivation tree in G.</Paragraph> <Paragraph position="6"> (T2) Assume that there are a production p : A → X1 ... Xn (A ∈ N, X1, ..., Xn ∈ N ∪ T) in P and n derivation trees t1, ..., tn whose roots are labeled with p1, ..., pn, respectively, and * if Xi ∈ N, then pi is a production Xi → ... whose left-hand side is Xi, and * if Xi ∈ T, then pi = ti = Xi.</Paragraph> <Paragraph position="7"> Then p(t1, ..., tn) is a derivation tree in G. (T3) There are no other derivation trees.</Paragraph> <Paragraph position="8"> Let ℛ(G) be the set of derivation trees in G, and ℛ_S(G) ⊆ ℛ(G) be the set of derivation trees whose root is labeled with a production whose left-hand side is the initial symbol S. Clearly, ℛ_S(G) ⊆ T_Σ(∅) holds. Remark that ℛ_S(G) is a multi-sorted algebra, where the nonterminals are sorts, and the terminals and the labels of productions are operators.</Paragraph> <Paragraph position="9"> A tree transducer (Rounds 1969) defines a mapping from trees to trees. Since we are mainly interested in the string language generated by a tree transducer, a &quot;tree-to-string&quot; version of transducer defined in (Engelfriet 1980) is used in this paper. 
For sets Q and X, let Q[X] ≜ {q[x] | q ∈ Q, x ∈ X}.</Paragraph> <Paragraph position="10"> A tree-to-string transducer (yT-transducer or simply transducer) is defined to be a 5-tuple M = (Q, Σ, Δ, q0, R) where (1) Q is a finite set of states, (2) Σ is an input ranked alphabet, (3) Δ is an output alphabet, (4) q0 ∈ Q is the initial state, and (5) R is a finite set of rules of the form q[σ(x1, ..., xn)] → v where q ∈ Q, σ ∈ Σ, n = ρ(σ) and v ∈ (Δ ∪ Q[{x1, ..., xn}])*. If any different rules in R have different left-hand sides, then M is called deterministic (Engelfriet 1980).</Paragraph> <Paragraph position="11"> A configuration of a yT-transducer is an element in (Δ ∪ Q[T_Σ(∅)])*. Derivation of M is defined as follows. Let t = α1 q[σ(t1, ..., tn)] α2 be a configuration where α1, α2 ∈ (Δ ∪ Q[T_Σ(∅)])*, q ∈ Q, σ ∈ Σ, ρ(σ) = n and t1, ..., tn ∈ T_Σ(∅).</Paragraph> <Paragraph position="12"> Assume that there is a rule q[σ(x1, ..., xn)] → v in R. Let t' be obtained from v by substituting t1, ..., tn for x1, ..., xn, respectively; then we define t ⇒_M α1 t' α2. Let ⇒*_M be the reflexive and transitive closure of ⇒_M. If t ⇒*_M t', then we say t' is derived from t. If there is no w ∈ Δ* such that t ⇒*_M w, then we say no output is derived from t.</Paragraph> <Paragraph position="13"> A tree-to-string finite state translation system (yT-fts or fts) is defined by a yT-transducer M and a cfg G, written as (M, G) (Rounds 1969)(Thatcher 1967).</Paragraph> <Paragraph position="14"> We define yL(M, G), called the yield language generated by yT-fts (M, G), as yL(M, G) ≜ {w ∈ Δ* | ∃t ∈ ℛ_S(G), q0[t] ⇒*_M w} where Δ is an output alphabet and q0 is the initial state of M. An fts is called deterministic (Engelfriet 1980) if the transducer M is deterministic. Engelfriet introduced a subclass of fts' called finite-copying fts' as follows (Engelfriet 1980): Let (M, G) be an fts with output alphabet Δ and initial state q0, t be a derivation tree in G and t' be a subtree of t. 
Assume that there is a derivation α : q0[t] ⇒*_M w. Now, delete from this derivation α all the derivation steps which operate on t'. This leads to the following new derivation which keeps t' untouched:</Paragraph> <Paragraph position="16"> The state sequence of t' in derivation α is defined to be (q_i1, ..., q_in). Derivation α has copying-bound k if, for every subtree of t, the length of its state sequence is at most k. An fts (M, G) is finite-copying if there is a constant k such that for each w ∈ yL(M, G), there is a derivation tree t in G and a derivation q0[t] ⇒*_M w with copying-bound k. It is known that determinism does not weaken the generative capacity of finite-copying fts' (Engelfriet 1980).</Paragraph> <Paragraph position="17"> We note that an fts (M, G) can be considered to be a model of a transformational grammar: A deep-structure of a sentence is represented by a derivation tree of G, and M can be considered to transform the deep-structure into a sentence (or its surface structure).</Paragraph> </Section> <Section position="6" start_page="132" end_page="134" type="metho"> <SectionTitle> 4 Subclasses of Lexical-Functional Grammars </SectionTitle> <Paragraph position="0"> A simple subclass of lfg's, called r-lfg's, is introduced in (Nishino 1992), which is shown to generate all the recursively enumerable languages (Nakanishi 1992). Here, we define a nondeterministic copying lfg (nc-lfg) as a proper subclass of r-lfg's. An nc-lfg is defined to be a 6-tuple G = (N, T, P, S, N_atr, A_atm) where: (1) N is a finite set of nonterminal symbols, (2) T is a finite set of terminal symbols, and (3) P is a finite set of annotated productions. 
Sometimes, a nonterminal symbol, a terminal symbol and an annotated production are abbreviated as a nonterminal, a terminal and a production, respectively. (4) S ∈ N is the initial symbol, (5) N_atr is a finite set of attributes, and (6) A_atm is a finite set of atoms.</Paragraph> <Paragraph position="1"> An equation of the form ↑atr = ↓ (atr ∈ N_atr) is called an S (structure synthesizing) schema, and an equation of the form ↑atr = val (atr ∈ N_atr, val ∈ A_atm) is called a V (immediate value) schema. A functional schema is either an S schema or a V schema.</Paragraph> <Paragraph position="2"> Each production p ∈ P has the following form:</Paragraph> <Paragraph position="4"> where A ∈ N, B1, B2, ..., Bq ∈ N ∪ T. E_V is a finite set of V schemata and E_Sj (1 ≤ j ≤ q) is a singleton of an S schema. A → B1 B2 ... Bq in (4.2) is called the underlying production of p. Let P0 be the set of all the underlying productions of P. Cfg G0 = (N, T, P0, S) is called the underlying cfg of G.</Paragraph> <Paragraph position="5"> An f-structure of G is recursively defined as a set F = {(atr1, val1), (atr2, val2), ..., (atrk, valk)} where atr1, atr2, ..., and atrk are distinct attributes, and each of val1, val2, ..., and valk is an atom or an f-structure. We say that vali (1 ≤ i ≤ k) is the value of atri in F and write F.atri = vali. For a cfg G' = (N', T', P', S'), derivation relations in G', denoted by A ⇒_G' α and A ⇒*_G' α (A ∈ N', α ∈ (N' ∪ T')*) are defined in the usual way.</Paragraph> <Paragraph position="6"> Suppose G0 = (N, T, P0, S) is the underlying cfg of an nc-lfg G = (N, T, P, S, N_atr, A_atm). Let t be a derivation tree in G0. (In 4., 7. and 8., the label of a leaf of a derivation tree is allowed to be a nonterminal.) Every internal node v in t has an f-structure, which is called the f-structure of v and written as F_v. 
If an underlying production</Paragraph> <Paragraph position="8"> labeled with either p0 itself, or p (∈ P) of which p0 is the underlying production, if necessary. Let vi be the ith child of v (1 ≤ i ≤ q). We define the values of both sides of a functional schema attached to the symbol in p (on v) as follows: * the value of ↑atr (atr ∈ N_atr) is F_v.atr, * the value of ↓ in an S schema is F_vi if the S schema is attached to the ith (1 ≤ i ≤ q) symbol in the right-hand side of p, and * the value of atom atm in a V schema is atm itself.</Paragraph> <Paragraph position="9"> We say that v satisfies functional schemata if for each functional schema lhs = rhs of p, the values of lhs and rhs on v are defined and equal to each other. In this case, it is also said that F_v locally satisfies the functional schemata of p. NOTE : Because the meaning of a V schema is independent of the position where it is annotated, V schemata are attached to the left-hand side in this paper.</Paragraph> <Paragraph position="10"> For a nonterminal A ∈ N and a sentential form α ∈ (N ∪ T)*, let t be a derivation tree of a derivation A ⇒*_G0 α. If all internal nodes in t satisfy functional schemata, then α is said to be derived from A and written as A ⇒*_G α. In this case, the tree t is called a derivation tree of A ⇒*_G α. We also simply call t a derivation tree (of α) in G.</Paragraph> <Paragraph position="11"> The language generated by an nc-lfg G, denoted by L(G), is defined as L(G) = {w ∈ &quot;E_Sj (1 ≤ j ≤ q) is either a singleton of an S schema or an empty set&quot;, the generative capacity of nc-lfg is not changed. 
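The local-satisfaction condition defined above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not part of the formalism: f-structures are modelled as nested dicts, atoms as strings, and the function name locally_satisfies, its argument layout, and the sample production are all hypothetical. Following the remark above, a right-hand-side symbol is allowed to carry no S schema (entry None).

```python
def locally_satisfies(f_v, child_fs, v_schemata, s_schemata):
    """Check whether f-structure f_v locally satisfies a production's schemata.

    f_v        : dict mapping attribute names to atoms (str) or nested dicts
    child_fs   : list with the f-structure of each child (None if it has none)
    v_schemata : dict atr -> atom, one entry per V schema (up atr = val)
    s_schemata : list; entry i is the attribute atr of the S schema
                 (up atr = down) on the i-th right-hand-side symbol, or None
    """
    # V schemata: the value of 'up atr' must be defined and equal to the atom.
    for atr, val in v_schemata.items():
        if atr not in f_v or f_v[atr] != val:
            return False
    # S schemata: the value of 'up atr' must equal the whole child f-structure.
    for atr, f_child in zip(s_schemata, child_fs):
        if atr is None:
            continue
        if f_child is None or atr not in f_v or f_v[atr] != f_child:
            return False
    return True


# Hypothetical production S -> A B with (up count = down) on both A and B.
f_a = {'count': 'e'}
f_b = {'count': 'e'}
f_s = {'count': {'count': 'e'}}
print(locally_satisfies(f_s, [f_a, f_b], {}, ['count', 'count']))  # True
print(locally_satisfies(f_s, [f_a, {}], {}, ['count', 'count']))   # False
```

In the first call both children are forced to carry the same f-structure because the same attribute is equated with each of them; this sharing is the copying behaviour that gives nc-lfg's their name.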
Example 4.1: Let G_EX3 = (N, T, P, S, N_atr, A_atm) be an nc-lfg where N = {S, A, B}, T = {a, b, c, d}, N_atr = {count}, A_atm = {e}, and pro-</Paragraph> <Paragraph position="13"> The language generated by G_EX3 is L(G_EX3) =</Paragraph> <Paragraph position="15"> S → the woman and A drinks and B</Paragraph> <Paragraph position="17"> G_EX5 generates &quot;respectively&quot; sentences such as &quot;the woman and the men drinks and smoke respectively&quot;. For a set X of functional schemata, X is consistent iff neither the following (1) nor (2) holds. (1) {↑atr = val1, ↑atr = val2} ⊆ X for some atr ∈ N_atr and some val1, val2 ∈ A_atm such that val1 ≠ val2.</Paragraph> <Paragraph position="18"> (2) {↑atr = val, ↑atr = ↓} ⊆ X for some atr ∈ N_atr and some val ∈ A_atm.</Paragraph> <Paragraph position="19"> Productions p1, ..., pn are consistent iff ∪_{1≤i≤n} E^(i) is consistent where E^(i) is the set of functional schemata of pi. If productions are not consistent then they are called inconsistent.</Paragraph> <Paragraph position="20"> An nc-lfg G is called a deterministically copying lfg (dc-lfg), if any two productions A → α1 and A → α2 whose left-hand sides are the same are inconsistent.</Paragraph> <Paragraph position="21"> Suppose G = (N, T, P, S, N_atr, A_atm) is an nc-lfg. Let {{e1, e2, ..., en}} denote the multiset which consists of elements e1, e2, ..., en that are not necessarily distinct. An SPN (SubPhrase Nonterminal) multiset in G is recursively defined as the following 1 through 3: 1. {{S}} is an SPN multiset.</Paragraph> <Paragraph position="22"> 2. Suppose that {{A1, A2, ..., Ah}} (A1, A2, ..., Ah ∈ N) is an SPN multiset. Let A1 → α1, ..., Ah → αh be consistent productions. For each atr ∈ N_atr, let MS_atr be the multiset consisting of all the nonterminals which appear in α1, ..., αh and have an S schema ↑atr = ↓. If MS_atr is not empty, then MS_atr is also an SPN multiset.</Paragraph> <Paragraph position="23"> 3. 
There is no other SPN multiset.</Paragraph> <Paragraph position="24"> An nc-lfg such that the number of SPN multisets in G is finite is called a finite-copying lfg (fc-lfg). Example 4.4: Consider G_EX3 in Example 4.1.</Paragraph> <Paragraph position="25"> Productions p12 and p14 are inconsistent with each other and so are p13 and p15. SPN multisets in G_EX3 are {{S}} and {{A, B}}. Hence G_EX3 is a dc-lfg and is an fc-lfg. G_EX5 is also a dc-lfg and is an fc-lfg for a similar reason. Similarly, G_EX4 in Example 4.2 is a dc-lfg. SPN multisets in G_EX4 are {{S}}, {{S, S}}, {{S, S, S, S}}, .... Hence G_EX4 is not an fc-lfg.</Paragraph> <Paragraph position="26"> NOTE : L (GExs) is generated by a tree adjoining grammar. Suppose that a sentence has three or more phrases which have a co-occurrence relation like the one between the subject phrase and the verb phrase in the &quot;respectively&quot; sentence. Tree adjoining grammars cannot generate such syntax while fc-lfg's or dc-lfg's can, although the authors do not know of a natural language which has such syntax so far.</Paragraph> <Paragraph position="27"> By Lemma 2.1 and Theorem 8.1, fc-lfg's are polynomial-time recognizable. Hence, it is desirable that whether a given lfg G is an fc-lfg or not is decidable. Fortunately, it is decidable by the following lemma.</Paragraph> <Paragraph position="28"> Lemma 4.1: For a given nc-lfg G, it is decidable whether the number of SPN multisets in G is finite or infinite.</Paragraph> <Paragraph position="29"> Proof. 
The problem can be reduced to the boundedness problem of Petri nets, which is known to be decidable (Peterson 1981).</Paragraph> </Section> <Section position="7" start_page="134" end_page="134" type="metho"> <SectionTitle> 5 Overview of the Results </SectionTitle> <Paragraph position="0"> Let ℒ_nc-lfg, ℒ_dc-lfg and ℒ_fc-lfg denote the classes of languages generated by nc-lfg's, dc-lfg's and fc-lfg's, respectively, and let yℒ_fts, yℒ_d-fts and yℒ_fc-fts denote the classes of yield languages generated by fts', deterministic fts' and finite-copying fts', respectively. Let ℒ_pmcfg and ℒ_mcfg be the classes of languages generated by pmcfg's and mcfg's, respectively. Also let ℒ_tag be the class of languages generated by tree adjoining grammars.</Paragraph> <Paragraph position="1"> Inclusion relations among these classes of languages are summarized in Figure 2. An equivalence relation *1 is shown in (Weir 1992). Relations *2 are new results which we prove in this paper. We also note that all the inclusion relations are proper; indeed,</Paragraph> <Paragraph position="3"> A relation B ⊊ A is shown in (Engelfriet 1980). By Lemma 2.1, all languages in the region enclosed with the bold line are recognizable in polynomial time. On the other hand, it is shown in this paper that Unary-3SAT, which is known to be NP-complete (Nakanishi 1992), is in A. 
Hence, if P ≠ NP, then Unary-3SAT ∈ A − B and the languages generated by fts' (or equivalently, nc-lfg's) are not recognizable in polynomial time in general.</Paragraph> </Section> <Section position="8" start_page="134" end_page="137" type="metho"> <SectionTitle> 6 Generative Capacity of fts' </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="134" end_page="135" type="sub_section"> <SectionTitle> 6.1 Deterministic fts' </SectionTitle> <Paragraph position="0"> Here, the proof of an inclusion relation yℒ_d-fts ⊆ ℒ_pmcfg is sketched.</Paragraph> <Paragraph position="1"> Let (M, G) be a deterministic yT-fts where</Paragraph> <Paragraph position="3"> the set of derivation trees of G, we assume that Σ = {p1, ..., pm, a1, ..., an} without loss of generality. We will construct a pmcfg G' = (N', T', F', P', S') such that yL(M, G) = L(G') ∩ Δ*. Since ℒ_pmcfg is closed under the intersection with a regular set (Kasami 1988a)(Seki 1991), it follows that yL(M, G) ∈ ℒ_pmcfg. Let T' = Δ ∪ {b} where b is a newly introduced symbol and let</Paragraph> <Paragraph position="5"> be constructed to have the following property.</Paragraph> <Paragraph position="7"> by lcfrs' is equal to C. (2) : The class of languages generated by head grammars is equal to D.</Paragraph> <Paragraph position="8"> Property 6.1: There is (α1, ..., αℓ) ∈ L_G'(R_h) (resp. L_G'(A_h)) such that each of α_s1, ..., α_su does not contain b, and every remaining α_t1, ..., α_tv contains b if and only if there is a derivation tree t of G such that the root is p_h (resp. a_h), q_sj[t] ⇒*_M α_sj (1 ≤ j ≤ u), and no output is derived from q_tj[t] (1 ≤ j ≤ v). □ The basic idea is to simulate the move of tree transducer M which is scanning a symbol p_h (resp. a_h) with state q_i by the ith component of the nonterminal R_h (resp. A_h) of pmcfg G'. 
During the move of M, it may happen that no rule is defined for a current configuration and hence no output will be derived. The symbol b is introduced to represent such an undefined move explicitly. We define RS(X) (X ∈ N ∪ T) as follows.</Paragraph> <Paragraph position="9"> {R_h | the left-hand side of p_h is X}</Paragraph> <Paragraph position="11"> for every Z_u ∈ RS(Y_u) (1 ≤ u ≤ k), where f_ph is defined as follows: For 1 ≤ i ≤ ℓ, * if the transducer M has no rule whose left-hand side is q_i[p_h(x1, ..., xk)], then</Paragraph> <Paragraph position="13"> (Since M is deterministic, there exists at most one rule whose left-hand side is q_i[p_h(...)] and hence the above construction is well defined.) Step 2: For each a_h ∈ T, construct a terminating production A_h → f_ah where f_ah is defined as follows: For 1 ≤ i ≤ ℓ, * if M has no rule whose left-hand side is q_i[a_h], then f_ah^[i] ≜ b.</Paragraph> <Paragraph position="14"> * if M has a rule q_i[a_h] → α_i, then f_ah^[i] ≜ α_i.</Paragraph> <Paragraph position="15"> Step 3: For each R_h ∈ RS(S), construct S' → f_first[R_h] where f_first[(x1, ..., xℓ)] ≜ x1. Intuitively, the right-hand side of this production corresponds to the initial configuration, that is, M is in the initial state q1 and scanning the root symbol p_h of a derivation tree, where the left-hand side of p_h is the initial symbol S.</Paragraph> <Paragraph position="16"> The pmcfg G' constructed above satisfies Property 6.1. Its proof is found in (Kaji 1992) and omitted in this paper. By Property 6.1, we obtain the following lemma.</Paragraph> <Paragraph position="17"> Lemma 6.1: yℒ_d-fts ⊆ ℒ_pmcfg. □ 
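To illustrate the deterministic case treated in this subsection, the following Python sketch runs a deterministic yT-transducer on an input tree. It is a minimal sketch under stated assumptions rather than the construction itself: trees are (symbol, subtrees) pairs, the rule table is a dict keyed by (state, symbol), which is what makes the transducer deterministic, child indices are 0-based, and the names run, rules and the sample transducer are hypothetical. The result None plays the role of "no output is derived", the situation that the marker symbol b makes explicit in the pmcfg.

```python
def run(rules, state, tree):
    """Run a deterministic yT-transducer from `state` on `tree`.

    tree  : (symbol, list_of_subtrees)
    rules : dict mapping (state, symbol) to a right-hand side, i.e. a list
            whose items are output strings or (state, child_index) pairs;
            child_index is 0-based (an assumption of this sketch)
    Returns the output string, or None if no output is derived.
    """
    symbol, children = tree
    rhs = rules.get((state, symbol))
    if rhs is None:          # undefined move: no rule for this configuration
        return None
    pieces = []
    for item in rhs:
        if isinstance(item, str):
            pieces.append(item)
        else:
            next_state, i = item
            sub = run(rules, next_state, children[i])
            if sub is None:  # an undefined move below blocks the whole output
                return None
            pieces.append(sub)
    return ''.join(pieces)


# Hypothetical transducer that copies its single subtree twice.
rules = {
    ('q1', 'p'): [('q1', 0), ('q1', 0)],   # q1[p(x1)] -> q1[x1] q1[x1]
    ('q1', 'a'): ['a'],                    # q1[a] -> a
}
tree = ('p', [('p', [('a', [])])])         # the tree p(p(a))
print(run(rules, 'q1', tree))              # aaaa
print(run(rules, 'q2', tree))              # None (q2 has no rule)
```

Each additional p on top of the tree doubles the output, so this small transducer already produces strings of the form a^(2^n), which no mcfg can generate.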
The reverse inclusion relation ℒ_pmcfg ⊆ yℒ_d-fts can be shown in a similar way, and the following theorem holds. Theorem 6.2: yℒ_d-fts = ℒ_pmcfg. □</Paragraph> </Section> <Section position="2" start_page="135" end_page="137" type="sub_section"> <SectionTitle> 6.2 Nondeterministic fts' </SectionTitle> <Paragraph position="0"> In this section, the generative capacity of nondeterministic yT-fts' is investigated from the viewpoint of computational complexity. We have already shown that yℒ_d-fts = ℒ_pmcfg, and hence every language in this class can be recognized in time polynomial in the length of an input string. Our result here is: there is a nondeterministic fts that generates an NP-complete language. In the following, a language called Unary-3SAT, which is NP-complete (Nakanishi 1992), is considered, and then it is shown to belong to yℒ_fts.</Paragraph> <Paragraph position="1"> A Unary-3CNF is a (nonempty) 3CNF in which the subscripts of variables are represented in unary. A positive literal x_i in a 3CNF is represented by 1^i$ in a Unary-3CNF. Similarly, a negative literal ¬x_i is represented by 1^i#. For example, a 3CNF (x1 ∨ x2 ∨ ¬x3) ∧ (x3 ∨ ¬x1 ∨ ¬x2) is represented by a Unary-3CNF 1$11$111# ∧ 111$1#11#.</Paragraph> <Paragraph position="2"> Unary-3SAT is the set of all satisfiable Unary-3CNF's. Next, we construct a nondeterministic yT-fts (M, G) that generates Unary-3SAT. Define a cfg G = (N, T, P, S) where N = {S, T, F}, T = {e} and the productions in P are as follows:</Paragraph> <Paragraph position="4"> Let M = (Q, Σ, Δ, q0, R) where Q = {q0, q~, q_t, q_a}, Σ = {r_SS, ..., r_Fe}, Δ = {1, ∧, $, #}. Since there are many rules in R, we will use an abbreviated notation. For example, the following four rules q_a[r_Te(x)] → 1$, q_a[r_Te(x)] → 1#, q_a[r_Fe(x)] → 1$, q_a[r_Fe(x)] → 1# are abbreviated as &quot;q_a[r_Te(x)] = q_a[r_Fe(x)] → 1$ or 1#&quot;. 
By using this notation, the rules in R are defined as follows.</Paragraph> <Paragraph position="6"> The reader can easily verify that this yT-fts generates Unary-3SAT.</Paragraph> <Paragraph position="7"> 7 Equivalence of ℒ_nc-lfg and yℒ_fts First, we show ℒ_nc-lfg ⊆ yℒ_fts. For a given nc-lfg G = (N, T, P, S, N_atr, A_atm), an equivalent fts (M, G') is constructed in the following way.</Paragraph> <Paragraph position="8"> Let t be a derivation tree in lfg G and the f-structure of the root node of t be F = {(atr1, F1), ..., (atrn, Fn)}. F is represented by a derivation tree τ = p'_sp(τ1, ..., τn) in G', where τi (1 ≤ i ≤ n) is a derivation tree in G' which represents Fi recursively. And sp is a set of productions such that F locally satisfies the functional schemata of all productions in sp. M transforms τ into the yield of t, i.e., the terminal string obtained by concatenating the labels of leaves, in a top-down way.</Paragraph> <Paragraph position="9"> [TRANS 7.1] Let N = {A1, ..., Am}, S = A1 and N_atr = {atr1, ..., atrn}. Define SP as the set of all consistent subsets of P.</Paragraph> <Paragraph position="11"> For a derivation tree τ in G' and a node v where p'_sp is applied, the subtree rooted by the ith child of v represents the value of attribute atr_i.</Paragraph> <Paragraph position="12"> Step 2: M = (Q, Σ, T, q1, R) is defined as follows. Define Q = {q1, ..., qm}. A state qj (1 ≤ j ≤ m) corresponds to nonterminal Aj in N. Define E -{d} where p(p'.,) = p(p .... .~) = ' = and p(d) = O. And define R by the following (i) through (iii).</Paragraph> <Paragraph position="13"> (i) qj~ .... .,(x)\] -~ qj\[x\] (1 ≤ j ≤ m) belongs to R for each sp ∈ SP.</Paragraph> <Paragraph position="14"> (ii) Let τ be a derivation tree in G'. When p'_sp is the production applied at the root of τ and a state of M is q_μ0, M chooses a production p whose left-hand side is A_μ0, if one exists, in sp. 
NOTE : Since productions in sp are consistent, there is an f-structure which locally satisfies the functional schemata of all productions in sp.</Paragraph> <Paragraph position="15"> For each production p ∈ sp in SP</Paragraph> <Paragraph position="17"> where A_μl ∈ N and α_l ∈ T* (0 ≤ l ≤ L), the following rule belongs to R: q_μ0[p'_sp(x1, ..., xn)] → α0 q_μ1[x_ν1] α1 ... α_(L−1) q_μL[x_νL] α_L. (7.5) (iii) No other rule belongs to R.</Paragraph> <Paragraph position="18"> Next, yℒ_fts ⊆ ℒ_nc-lfg is shown. For a given fts (M, G), the following algorithm constructs an nc-lfg G' such that L(G') = yL(M, G).</Paragraph> <Paragraph position="19"> [TRANS 7.2] Suppose that a given fts (M, G) is G = (N, T, P, S) and M = (Q, Σ, Δ, q1, R) where Q = {q1, q2, ..., qm}. Let n be the maximum length of the right-hand side of a production in P. Define an nc-lfg G' = (N', Δ, P', S', N_atr, A_atm) as follows.</Paragraph> <Paragraph position="21"> a production in P}.</Paragraph> <Paragraph position="22"> A derivation tree t = p(t1, ..., th) in G is represented by an f-structure {(rule, p), (atr1, F1), ..., (atrh, Fh)} of G' where Fi (1 ≤ i ≤ h) is an f-structure which represents the subtree ti recursively.</Paragraph> <Paragraph position="23"> Each pair of a symbol (either nonterminal or terminal) X of G and a state qj of M is represented by a single nonterminal X[j] in G'. 
Step 2: A move when M at state qj reads a symbol p which is the label of a production p : C → ..., can be simulated by a production in G' whose left-hand side is C[j] with the V schema {↑rule = p}. Formally, the set P' of productions of G' is constructed as follows.</Paragraph> <Paragraph position="24"> (i) Let p : C → X1 ... Xh be a production in P where C ∈ N, Xi ∈ N ∪ T (1 ≤ i ≤ h), and let q_j[p(x1, ..., xh)] → α_j0 q_ηj1[x_νj1] α_j1 ... q_ηjLj[x_νjLj] α_jLj be a rule in R where α_jk ∈ Δ* (0 ≤ k ≤ L_j), q_ηjl ∈ Q, and x_νjl ∈ {x1, ..., xh} (1 ≤ l ≤ L_j).</Paragraph> <Paragraph position="25"> Then, the following production belongs to P': C[j] → α_j0 X_νj1[ηj1] α_j1 ... X_νjLj[ηjLj] α_jLj, where the V schema {↑rule = p} is attached to the left-hand side C[j] and the S schema {↑atr_νjl = ↓} is attached to each X_νjl[ηjl] (1 ≤ l ≤ L_j). (ii) Let q_j[a] → β_j be a rule in R where a ∈ T and β_j ∈ Δ*. Then the production a[j] → β_j belongs to P'.</Paragraph> <Paragraph position="26"> (iii) No other production belongs to P'.</Paragraph> <Paragraph position="27"> By TRANS 7.1 and TRANS 7.2, the following theorem is obtained. A formal proof is found in (Nakanishi 1993).</Paragraph> <Paragraph position="28"> Theorem 7.1: ℒ_nc-lfg = yℒ_fts.</Paragraph> <Paragraph position="29"> Corollary 7.2: ℒ_dc-lfg = yℒ_d-fts.</Paragraph> <Paragraph position="30"> Proof. In TRANS 7.1, if G is a dc-lfg, then no sp ∈ SP contains distinct productions whose left-hand sides are the same and hence the constructed transducer M becomes deterministic by the construction. Conversely, in TRANS 7.2, if M is deterministic, then there exist no consistent productions p'1 and p'2 in P' whose left-hand sides are the same and hence the constructed nc-lfg is a dc-lfg.</Paragraph> </Section> </Section> </Paper>
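As a closing illustration of the consistency condition of Section 4, on which the proof of Corollary 7.2 relies, the following Python sketch checks whether a set of functional schemata is consistent and whether a list of productions meets the dc-lfg requirement. It is a hedged sketch under stated assumptions: a V schema is encoded as the tuple ('V', atr, val), an S schema as ('S', atr), a production as a pair (left-hand side, schemata), and the names is_consistent, is_dc_lfg and the attribute names num and head are hypothetical.

```python
from itertools import combinations


def is_consistent(schemata):
    """True iff neither (1) two V schemata fix one attribute to different
    atoms nor (2) a V schema and an S schema share an attribute."""
    atoms = {}        # attribute -> atom fixed by some V schema
    s_attrs = set()   # attributes that occur in some S schema
    for schema in schemata:
        if schema[0] == 'V':
            _, atr, val = schema
            if atoms.get(atr, val) != val:   # condition (1) is violated
                return False
            atoms[atr] = val
        else:
            s_attrs.add(schema[1])
    return s_attrs.isdisjoint(atoms)         # condition (2)


def is_dc_lfg(productions):
    """True iff every two productions with the same left-hand side
    are inconsistent, i.e. the union of their schemata is inconsistent."""
    for (lhs1, e1), (lhs2, e2) in combinations(productions, 2):
        if lhs1 == lhs2 and is_consistent(set(e1) | set(e2)):
            return False
    return True


# Hypothetical productions; only their left-hand sides and schemata matter.
p1 = ('A', [('V', 'num', 'sg'), ('S', 'head')])
p2 = ('A', [('V', 'num', 'pl'), ('S', 'head')])
p3 = ('A', [('S', 'head')])
print(is_dc_lfg([p1, p2]))   # True: num = sg clashes with num = pl
print(is_dc_lfg([p1, p3]))   # False: p1 and p3 are consistent
```

Note that two S schemata on the same attribute do not make a set inconsistent; this is what allows several nonterminals to share one f-structure, as in the SPN multisets.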