<?xml version="1.0" standalone="yes"?> <Paper uid="C86-1047"> <Title>The Weak Generative Capacity of Parenthesis-Free Categorial Grammars</Title> <Section position="3" start_page="0" end_page="199" type="metho"> <SectionTitle> DEFINITIONS </SectionTitle> <Paragraph position="0"> Notation. We use A, B, C to denote atomic category symbols, and U, V, X, Y to denote arbitrary (complex) category symbols.</Paragraph> <Paragraph position="1"> The number of occurrences of atomic category symbols in X is |X|. Strings of category symbols are denoted by x, y. Morphemes are denoted by a, b; morpheme strings by u, v, w.</Paragraph> <Paragraph position="2"> A categorial grammar under certain reduction rules is a quadruple GR = (VT, VA, S, F), where: VT is a finite set of morphemes, VA a finite set of atomic categories, S ∈ VA a distinguished element, F a function from VT to 2^CA such that for every a ∈ VT, F(a) is finite, where CA is the category set and is defined as: i) if A ∈ VA, then A ∈ CA, ii) if X ∈ CA and A ∈ VA, then X/A ∈ CA, iii) nothing else is in CA.</Paragraph> <Paragraph position="3"> The set of reduction rules R can include any combination of the following: (1) (F Rule) If U/A ∈ CA, A ∈ VA, the string U/A A can be replaced by U. We write: U/A A → U; (2) (FP Rule) If U/A, A/V ∈ CA, where A ∈ VA, the string U/A A/V can be replaced by U/V. We write: U/A A/V → U/V; (3) (FP2 Rule) If U/A, A/B ∈ CA, where A, B ∈ VA, the string U/A A/B can be replaced by U/B. We write: U/A A/B → U/B; (4) (FPs Rule) Same as (2) except that U/A must be headed by S; (5) (B Rule) If U/A ∈ CA, A ∈ VA, the string A U/A can be replaced by U. 
We write: A U/A → U; (6) (Bs Rule) Same as (5) except that U/A must be headed by S.</Paragraph> <Paragraph position="4"> When it won't cause confusion, we write GR to denote a categorial grammar with rule set R, and specify a categorial grammar by just specifying its lexicon F.</Paragraph> <Paragraph position="5"> The reduce relation ⇒ on CA* × CA* is defined as: for all α, β ∈ CA* and all X, Y, Z ∈ CA, αXYβ ⇒ αZβ if XY → Z. Let ⇒* denote the reflexive and transitive closure of relation ⇒. A morpheme string w = w1w2...wn, where wi ∈ VT, i = 1, 2, ..., n, is accepted by GR = (VT, VA, S, F) if there is Xi ∈ F(wi) for i = 1, 2, ..., n, such that X1X2...Xn ⇒* S. The language accepted by GR = (VT, VA, S, F), L(GR), is the set of all morpheme strings that are accepted by GR. The categorial grammar recognition problem is: given a categorial grammar GR = (VT, VA, S, F) and a morpheme string w ∈ VT*, decide whether w ∈ L(GR).</Paragraph> <Paragraph position="6"> The derivable category set DA ⊆ CA under a set R of reduction rules is the set of categories including all the primary categories designated by F, and all the reachable categories under that set of reduction rules. It is formally defined as: i) X is in DA if there is an a ∈ VT such that X ∈ F(a), ii) for all X, Y ∈ DA and Z ∈ CA, if X Y → Z by some rule in R then Z ∈ DA, and iii) nothing else is in DA.</Paragraph> <Paragraph position="7"> GRAMMARS WITH FORWARD CANCELLATION ONLY We begin by looking at the most restricted form of the reduction rule set, R = {F}. The single cancellation rule is the forward combination rule. It is well known that traditional categorial grammars are equivalent to context-free grammars. 
We examine the proof to see that it still goes through for categorial grammars GR with R = {F}.</Paragraph> <Paragraph position="8"> Theorem The categorial grammars GR, R = {F}, generate exactly the context-free languages.</Paragraph> <Paragraph position="9"> Proof (1) Let GR be a categorial grammar with R = {F}. GR becomes a traditional categorial grammar once parentheses are restored by replacing them from left to right, so that, e.g., A/B/C becomes ((A/B)/C). Hence, its language is CF.</Paragraph> <Paragraph position="10"> (2) To show that every context-free language can be obtained, we begin with the observation that every context-free language has a grammar in Greibach 2-form, that is, with all rules of the three forms A → aBC, A → aB, and A → a, where A, B, C are in VN and a is in VT [6]. A corresponding classical categorial grammar can be immediately constructed: F(a) = {((A/C)/B), (A/B), A}. These are the categories A/C/B, A/B, and A of a parenthesis-free categorial grammar. The details of the proof can be easily carried out to show that the two languages generated are the same.</Paragraph> </Section> <Section position="4" start_page="199" end_page="200" type="metho"> <SectionTitle> GRAMMARS WITH BACKWARDS CANCELLATION </SectionTitle> <Paragraph position="0"> The theorem shows that with R = {F} exactly the context-free languages are obtained. What happens when the additional metarules are added? We now examine parenthesis-free categorial grammars with R = {F, B} and R = {F, Bs}. Rule Bs is the version adopted in [1]; B is an obvious generalization. In either case we are adding the equivalent of context-free rules to a grammar; the result must therefore still yield a context-free language. So one guess might be that categorial grammars of these types will still yield exactly the context-free languages, perhaps with more structures for each sentence. 
An alternative conjecture would be that fewer languages are obtained, for we have now added some &quot;involuntary&quot; context-free rules to every grammar.</Paragraph> <Paragraph position="1"> Example: Consider the standard context-free language L1 = {a^n b^n | n > 0}. The smallest grammar is S → aSb, S → ab. The Greibach 2-form grammar is S → aSB, B → b, S → aB. The constructed categorial grammar GR then has F(a) = {S/B, S/B/S} and F(b) = {B}. If R = {F}, this yields exactly L1. However, with R = {F, B} or R = {F, Bs}, here equivalent, GR yields a language L2 = {ab, ba, aabb, abab, bbaa, baba, baab, ...</Paragraph> <Paragraph position="2"> }, which contains L1 and other strings as well. It is the language of the context-free grammar with rule set {S → bC, S → Cb, C → aS, C → Sa, C → a}.</Paragraph> <Paragraph position="3"> Reversible languages. Let x^R be the reverse of string x.</Paragraph> <Paragraph position="4"> That is, if x = a1a2...an (ai ∈ VT), then x^R = an...a2a1. Call a language L reversible if x ∈ L iff x^R ∈ L.</Paragraph> <Paragraph position="5"> Examples: The set of all strings on {a, b} with equal numbers of a's and b's is a reversible CF language. {a^n b | n > 0} is not a reversible language.</Paragraph> <Paragraph position="6"> Theorem The languages of categorial grammars GR with R = {F, B} are reversible.</Paragraph> <Paragraph position="8"> the mirror image of the one for x in which rules F and B have been interchanged.</Paragraph> <Paragraph position="9"> Theorem Let GR be a categorial grammar whose rule set R contains {F, B} or {F, Bs}; R may or may not also contain some form of the FP rule. If L(GR) contains any sentence of length greater than one, then it contains at least one sentence w = uv such that vu is also in L(GR).</Paragraph> <Paragraph position="10"> Proof Let w be a sentence of L(GR) of length greater than one. Suppose the final step of the reduction to S uses rule F. 
Then w = uv where u ⇒* S/A and v ⇒* A. But then vu ⇒* A S/A ⇒ S by rule B or Bs. Symmetrically, if the final step uses rule B or Bs, then u ⇒* A and v ⇒* S/A, so vu ⇒* S/A A ⇒ S by rule F. No form of FP can be used as the final step of the reduction to S, so its presence does not affect the result.</Paragraph> <Paragraph position="11"> Corollary There are context-free languages that cannot be obtained by any categorial grammar GR, where R contains {F, B} or {F, Bs}.</Paragraph> </Section> <Section position="5" start_page="200" end_page="200" type="metho"> <SectionTitle> CATEGORIAL GRAMMAR IS CONTEXT-FREE IF THE FP RULE IS RESTRICTED </SectionTitle> <Paragraph position="0"> The method that had been used to construct a context-free grammar G equivalent to a classical categorial grammar can be formally described as follows: for each a ∈ VT, if X ∈ F(a), then put X → a in G; for each derivable category X/Y, put X → X/Y Y in G.</Paragraph> <Paragraph position="1"> This method remains valid when the Bs rule is added. We just need to put an additional rule X → Y X/Y in G whenever X is headed by S. But this doesn't work when the FP rule is allowed. We might put in the CF rule U/V → U/A A/V for each derivable category U/V and for each atomic category A, but in case there is a category like A/B/A, then any category symbol headed by A followed by B's and ended by A is a derivable category. There are infinitely many of them, so by using this construction method, we might have to put in an infinite number of CF rules. Therefore, this method does not always find a finite context-free grammar equivalent to a categorial grammar with the FP rule. As we shall see, there may be no such context-free grammar.</Paragraph> <Paragraph position="2"> Let's now enforce some restrictions on the FP rule so that it won't cause an infinite number of derivable categories. Actually, using the FP rule sometimes violates the parenthesis convention, e.g. applying FP on A/B B/C/D implies that B/C/D is interpreted as (B/(C/D)). 
However, by the parenthesis convention, B/C/D is the abbreviation of ((B/C)/D). Notice, however, that when the second category symbol has exactly two atomic symbols, i.e., is of the form A/B, the FP rule does not violate the convention. Coincidentally, if the FP rule is accordingly restricted to FP2, the derivable category set becomes finite.</Paragraph> <Paragraph position="3"> Lemma For a categorial grammar GR(VA, VT, S, F), let R1 = {F, FP2}, R2 = {F, FP2, Bs}, and R3 = {F, FP2, B}; then DA_R1 = DA_R2 = DA_R3.</Paragraph> <Paragraph position="4"> Proof From part ii) of the definition of DA, we can see that any new category Z added to DA by a form of the B rule can be added by the F rule. The lemma follows.</Paragraph> <Paragraph position="5"> □ Lemma The derivable category set DA of a categorial grammar GR with R = {F, FP2} is finite and constructible.</Paragraph> <Paragraph position="6"> Sketch of Proof We begin with the observation that none of the reduction rules in R produces a category symbol longer than the longer of its two inputs, and the initial lexical category symbols are all of finite length. This implies that the lengths of all the derivable category symbols are bounded. So there are only finitely many of them.</Paragraph> <Paragraph position="7"> We now give an algorithm for computing DA, to show that it is constructible.</Paragraph> </Section> <Section position="6" start_page="200" end_page="200" type="metho"> <SectionTitle> CATEGORIES </SectionTitle> <Paragraph position="0"> Now the next question is what happens if the FP rule is not restricted to U/A A/B → U/B. 
Intuitively, we can see that the application of the FP rule on a category which is not headed by S is not crucial, in the sense that it can be replaced by an application of the F rule: whenever U/A A/V appears in a valid derivation of a sentence, the V part must be cancelled out sooner or later, so we can make a new derivation that cancels the V part first and get U/A A, on which we can apply the F rule instead of the FP rule. But this doesn't hold if U/A is headed by S. For example, when we have A S/B B/A, we can't do backward combination on A and S/A if we don't combine S/B and B/A first. So, we expect that the weak generative power of categorial grammar would remain unchanged if the FP rule is restricted to be used only on categories which are headed by S. This in fact follows as our next theorem.</Paragraph> <Paragraph position="1"> Lemma Given a categorial grammar GR(VT, VA, S, F) with R = {F, FP, Bs}, for any w ∈ CA* and A ∈ VA, if there is a reduction w ⇒* A, then there is a reduction of w to A using the FP rule only on categories which are headed by S.</Paragraph> <Paragraph position="2"> Sketch of Proof Formalize the idea illustrated above. □ As an almost immediate consequence, we have: Theorem The language accepted by categorial grammar GR(VT, VA, S, F) with R = {F, FP, Bs} is the same as that accepted with R = {F, FPs, Bs}.</Paragraph> <Paragraph position="3"> Proof It follows trivially from the lemma. □ Corollary The FP rule is useless if there is no form of the B rule, i.e., any GR(VT, VA, S, F) with R = {F, FP} will generate the same language as that generated with R = {F}.</Paragraph> </Section> <Section position="7" start_page="200" end_page="200" type="metho"> <SectionTitle> A CONTEXT SENSITIVE LANGUAGE GENERATED USING UNRESTRICTED FP RULE </SectionTitle> <Paragraph position="0"> This section gives a categorial grammar with unrestricted FP rule that generates a language which is not context-free. 
Consider categorial grammar G1 = GR(VA, VT, S, F), where VT = {a, b, c}, VA = {A, C, S}, F(a) = {A}, F(b) = {S/A/C/S, S/A/C}, F(c) = {C}, and R = {F, Bs, FP}.</Paragraph> <Paragraph position="1"> Claim 1 a^i b^i c^i ∈ L(G1) for i > 0.</Paragraph> <Paragraph position="2"> Proof For any i > 0, we can find a corresponding categorial string for a^i b^i c^i: A^i (S/A/C/S)^(i-1) (S/A/C) C^i. A reduction to S is straightforward. □ Let φ_w(a) denote the number of occurrences of a in string w.</Paragraph> <Paragraph position="3"> Claim 2 For all w ∈ VT*, if w ∈ L(G1) then φ_w(a) = φ_w(b) = φ_w(c).</Paragraph> <Paragraph position="5"> Proof First, it is easy to see that from the lexical categories, we cannot get any complex category headed by either A or C, and we can get atomic category symbol A or C only directly from the lexicon.</Paragraph> <Paragraph position="6"> Second, each morpheme b introduces one A and one C within a complex category symbol, which must be cancelled out sooner or later in order to reduce the whole string to S. In general, there are two ways for such A and C to be cancelled: (1) with an A-headed or C-headed complex category by the FP rule, which is impossible in this example; (2) with a single atomic category A or C by either the F or Bs rule. We have seen that such single A and C can only be introduced by the morphemes a and c, respectively. So φ_w(a) = φ_w(b) = φ_w(c).</Paragraph> <Paragraph position="7"> □ To show that L(G1) is not context-free, we take its intersection with the regular language a*b*c*. By claims 1 and 2, the intersection is exactly the language {a^n b^n c^n | n > 0}, which is well known to be non-context-free. Since the intersection of a context-free language with a regular set must be context-free, L(G1) cannot be context-free.</Paragraph> <Paragraph position="8"> PROCESSORS A categorial grammar is certainly no worse than context sensitive. 
We can verify this by using a nondeterministic linear bounded automaton to model the reduction process. For even in the case of reduction by the unrestricted FP rule, the category symbol obtained by reduction is shorter than the combined length of the two inputs to the rule.</Paragraph> <Paragraph position="9"> Ades and Steedman [1] propose a processor that is a pushdown stack automaton, and pushdown stack automata are known to correspond to the context-free languages. How can we reconcile this with the context sensitive example above? The contradiction arises because the stack of their processor must be able to contain any derived category symbol of DA, and thus the size of the stack symbols is unlimited. The processor is thus not a pushdown automaton in the usual sense.</Paragraph> <Paragraph position="10"> Acknowledgment. We would like to thank Ramarathnam Venkatesan and Remko Scha for helpful discussions of the material. This work was supported in part by National Science Foundation Grant No. IST 8317736.</Paragraph> </Section> </Paper>