RECOGNITION OF 
LINEAR CONTEXT-FREE REWRITING SYSTEMS* 
Giorgio Satta 
Institute for Research in Cognitive Science 
University of Pennsylvania 
Philadelphia, PA 19104-6228, USA 
gsatta@linc.cis.upenn.edu 
ABSTRACT 
The class of linear context-free rewriting sys- 
tems has been introduced as a generalization of 
a class of grammar formalisms known as mildly 
context-sensitive. The recognition problem for lin- 
ear context-free rewriting languages is studied at 
length here, presenting evidence that, even in some 
restricted cases, it cannot be solved efficiently. This 
entails the existence of a gap between, for exam- 
ple, tree adjoining languages and the subclass of lin- 
ear context-free rewriting languages that generalizes 
the former class; such a gap is attributed to "crossing
configurations". A few other interesting consequences of
the main result, concerning the recognition problem for
linear context-free rewriting languages, are also discussed.
1 INTRODUCTION 
Since the late 1970s there has been considerable interest
within computational linguistics in rewriting systems that
extend the generative power of context-free grammars (CFG),
from both the weak and the strong perspective, while
remaining far below the power of the class of
context-sensitive grammars (CSG). The term mildly
context-sensitive (MCS) has been proposed for the class of
the studied systems (see [Joshi et al., 1991] for
discussion). The rather surprising fact that many of these
systems have been shown to be weakly equivalent has led
researchers to generalize
*I am indebted to Anuj Dawar, Shyam Kapur and Owen
Rambow for technical discussion on this work. I am also
grateful to Aravind Joshi for his support in this research. 
None of these people is responsible for any error in this work. 
This research was partially funded by the following grants: 
ARO grant DAAL 03-89-C-0031, DARPA grant N00014-90- 
J-1863, NSF grant IRI 90-16592 and Ben Franklin grant 
91S.3078C-1. 
the elementary operations involved in only apparently
different formalisms, with the aim of capturing the
underlying similarities. The most remarkable attempts in
this direction are found in [Vijay-Shanker et al., 1987]
and [Weir, 1988], with the introduction of linear
context-free rewriting systems (LCFRS), and in [Kasami et
al., 1987] and [Seki et al., 1989], with the definition of
multiple context-free grammars (MCFG); both classes have
been inspired by the much more powerful class of
generalized context-free grammars (GCFG; see [Pollard,
1984]). In the definition of these classes, the
generalization goal has been combined with a few
theoretically motivated constraints, among which is the
requirement of efficient parsability; this paper is
concerned with that requirement. We show that, from the
perspective of efficient parsability, a gap is still found
between MCS and some subclasses of LCFRS.
More precisely, the class of LCFRS is studied along two
dimensions, to be precisely defined in the following: a)
the fan-out of the grammar and b) the production length.
From previous work (see [Vijay-Shanker et al., 1987]) we
know that the recognition problem for LCFRS is in P when
both dimensions are bounded.¹ We complete the picture by
observing NP-hardness for all three remaining cases. If
P $\neq$ NP, our result reveals an undesired dissimilarity
between well known formalisms like TAG, HG, LIG and others,
for which the recognition problem is known to be in P (see
[Vijay-Shanker, 1987] and [Vijay-Shanker and Weir, 1992]),
and the subclass of LCFRS that is intended to generalize
these formalisms. We investigate the source of the
suspected additional complexity and derive some other
practical consequences from the obtained results.
¹ P is the class of all languages decidable in deterministic
polynomial time; NP is the class of all languages decidable in 
nondeterministic polynomial time. 
2 TECHNICAL RESULTS 
This section presents the two most important technical
results of this paper. A full discussion of some
interesting implications for recognition and parsing is
deferred to Section 3. Due to the scope of the paper,
proofs of Theorems 1 and 2 below are not carried out in all
their details: we only present formal specifications for
the studied reductions and discuss the intuitive ideas
behind them.
2.1 PRELIMINARIES 
Different formalisms in which rewriting is applied 
independently of the context have been proposed in 
computational linguistics for the treatment of Nat- 
ural Language, where the definition of elementary 
rewriting operation varies from system to system. 
The class of linear context-free rewriting systems
(LCFRS) has been defined in [Vijay-Shanker et al., 1987]
with the intention of capturing, through a generalization,
the common properties that are shared by all these
formalisms.
The basic idea underlying the definition of LCFRS 
is to impose two major restrictions on rewriting. 
First of all, rewriting operations are applied in the 
derivation of a string in a way that is independent of 
the context. As a second restriction, rewriting op- 
erations are generalized by means of abstract com- 
position operations that are linear and nonerasing. 
In an LCFR system, both restrictions are realized by
defining an underlying context-free grammar where each
production is associated with a function that encodes a
composition operation having the above properties. The
following definition is essentially the same as the one
proposed in [Vijay-Shanker et al., 1987].
Definition 1 A rewriting system $G = (V_N, V_T, P, S)$ is
a linear context-free rewriting system if:
(i) $V_N$ is a finite set of nonterminal symbols, $V_T$ is
a finite set of terminal symbols, $S \in V_N$ is the start
symbol; every symbol $A \in V_N$ is associated with an
integer $\varphi(A) > 0$, called the fan-out of $A$;
(ii) $P$ is a finite set of productions of the form
$A \rightarrow f(B_1, B_2, \ldots, B_r)$, $r \geq 0$,
$A, B_i \in V_N$, $1 \leq i \leq r$, with the following
restrictions:
(a) $f$ is a function in $C^D$, where
$D = (V_T^*)^{\phi}$, $\phi$ is the sum of the fan-out of
all $B_i$'s and $C = (V_T^*)^{\varphi(A)}$;
(b) $f(x_{1,1}, \ldots, x_{1,\varphi(B_1)}, \ldots, x_{r,\varphi(B_r)}) = (y_1, \ldots, y_{\varphi(A)})$
is defined by some grouping into $\varphi(A)$ sequences of
all and only the elements in the sequence
$x_{1,1}, \ldots, x_{r,\varphi(B_r)}, a_1, \ldots, a_\alpha$,
$\alpha \geq 0$, where $a_i \in V_T$, $1 \leq i \leq \alpha$.
The languages generated by LCFR systems are called LCFR
languages. We assume that the start symbol has unitary
fan-out. Every LCFR system $G$ is naturally associated with
an underlying context-free grammar $G_u$. The usual
context-free derivation relation, written
$\Rightarrow_{G_u}$, will be used in the following to
denote underlying derivations in $G$. We will also use the
reflexive and transitive closure of such a relation,
written $\Rightarrow^*_{G_u}$. As a convention, whenever
the evaluation of all functions involved in an underlying
derivation starting with $A$ results in a
$\varphi(A)$-tuple $w$ of terminal strings, we will say
that $A$ derives $w$ and write $A \Rightarrow^*_G w$. Given
a nonterminal $A \in V_N$, the language $L(A)$ is the set
of all $\varphi(A)$-tuples $w$ such that
$A \Rightarrow^*_G w$. The language generated by $G$,
$L(G)$, is the set $L(S)$. Finally, we will call
LCFRS($k$) the class of all LCFRS's with fan-out bounded by
$k$, $k > 0$, and $r$-LCFRS the class of all LCFRS's whose
productions have right-hand side length bounded by $r$,
$r > 0$.
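As an illustration of Definition 1, the following minimal Python sketch represents a composition function as a template and evaluates it on tuples of strings. The template encoding, the helper names and the toy grammar for $\{a^n b^n c^n d^n \mid n \geq 1\}$ are all assumptions made for this example; they do not appear in the cited definitions.

```python
from typing import List, Sequence, Tuple, Union

# An item is a terminal string or a variable (argument index, component index).
Item = Union[str, Tuple[int, int]]
Template = List[List[Item]]  # one list of items per output component


def is_linear_nonerasing(template: Template, fanouts: Sequence[int]) -> bool:
    """Check that every variable x_{i,j} occurs exactly once in the template."""
    used = [it for comp in template for it in comp if isinstance(it, tuple)]
    expected = [(i, j) for i, k in enumerate(fanouts) for j in range(k)]
    return sorted(used) == expected


def evaluate(template: Template, args: Sequence[Tuple[str, ...]]) -> Tuple[str, ...]:
    """Apply a composition function to one tuple of strings per argument."""
    return tuple(
        "".join(it if isinstance(it, str) else args[it[0]][it[1]] for it in comp)
        for comp in template
    )


# Hypothetical toy grammar (fan-out of A is two, fan-out of S is one):
h = [["a", "b"], ["c", "d"]]                   # A -> h()
g = [["a", (0, 0), "b"], ["c", (0, 1), "d"]]   # A -> g(A)
f = [[(0, 0), (0, 1)]]                         # S -> f(A)


def derive(n: int) -> str:
    """Evaluate the derivation that applies g exactly n - 1 times."""
    t = evaluate(h, [])
    for _ in range(n - 1):
        t = evaluate(g, [t])
    return evaluate(f, [t])[0]
```

Here `derive(2)` evaluates to `"aabbccdd"`; the two components of `A` grow in parallel, which is what a fan-out of two adds to an ordinary context-free production.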
2.2 HARDNESS FOR NP 
The membership problem for the class of linear
context-free rewriting systems is represented by means of
a formal language LRM as follows. Let $G$ be a grammar in
LCFRS and $w$ be a string in $V_T^*$, for some alphabet
$V_T$; the pair $(G, w)$ belongs to LRM if and only if
$w \in L(G)$. Set LRM naturally represents the problem of
the recognition of a linear context-free rewriting language
when we take into account both the grammar and the string
as input variables. In the following we will also study the
decision problems LRM($k$) and $r$-LRM, defined in the
obvious way. The next statement is a characterization of
$r$-LRM.
Theorem 1 3SAT $\leq_P$ 1-LRM.
Outline of the proof. Let $(U, C)$ be an arbitrary
instance of the 3SAT problem, where
$U = \{u_1, \ldots, u_p\}$ is a set of variables and
$C = \{c_1, \ldots, c_n\}$ is a set of clauses; each clause
in $C$ is represented by a string of length three over the
alphabet of all literals,
$L_U = \{u_1, \bar{u}_1, \ldots, u_p, \bar{u}_p\}$. The
main idea in the following reduction is to use the
derivations of the grammar to guess truth assignments for
$U$ and to use the fan-out of the nonterminal symbols to
work out the dependencies among different clauses in $C$.
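For every $1 \leq k \leq p$ let
$\mathcal{A}_k = \{c_j \mid u_k$ is a substring of $c_j\}$
and let
$\bar{\mathcal{A}}_k = \{c_j \mid \bar{u}_k$ is a substring of $c_j\}$;
let also $w = c_1 c_2 \cdots c_n$. We define a linear
context-free rewriting system $G = (V_N, C, P, S)$ such
that $V_N = \{T_i, F_i \mid 1 \leq i \leq p+1\} \cup \{S\}$,
every nonterminal (but $S$) has fan-out $n$ and $P$
contains the following productions ($f_I$ denotes the
identity function on $(C^*)^n$):
(i) $S \rightarrow f_0(T_1)$,
$S \rightarrow f_0(F_1)$,
where $f_0(x_1, \ldots, x_n) = x_1 \cdots x_n$;
(ii) for every $1 \leq k \leq p$ and for every
$c_j \in \mathcal{A}_k$:
$T_k \rightarrow \gamma^{(k,j)}(T_k)$,
$T_k \rightarrow f_I(T_{k+1})$,
$T_k \rightarrow f_I(F_{k+1})$,
where $\gamma^{(k,j)}(x_1, \ldots, x_n) = (x_1, \ldots, x_j c_j, \ldots, x_n)$;
(iii) for every $1 \leq k \leq p$ and for every
$c_j \in \bar{\mathcal{A}}_k$:
$F_k \rightarrow \bar{\gamma}^{(k,j)}(F_k)$,
$F_k \rightarrow f_I(T_{k+1})$,
$F_k \rightarrow f_I(F_{k+1})$,
where $\bar{\gamma}^{(k,j)}(x_1, \ldots, x_n) = (x_1, \ldots, x_j c_j, \ldots, x_n)$;
(iv) $T_{p+1} \rightarrow f_{p+1}()$,
$F_{p+1} \rightarrow f_{p+1}()$,
where $f_{p+1}() = (\varepsilon, \ldots, \varepsilon)$.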
From the definition of $G$ it directly follows that
$w \in L(G)$ implies the existence of a truth assignment
that satisfies $C$. The converse can be shown starting from
a truth assignment that satisfies $C$ and constructing a
derivation for $w$ using (finite) induction on the size of
$U$. The fact that $(G, w)$ can be constructed in
deterministic polynomial time is also straightforward
(note that each function $\gamma^{(k,j)}$ or
$\bar{\gamma}^{(k,j)}$ in $G$ can be specified by an
integer $j$, $1 \leq j \leq n$). $\Box$
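The reduction outlined above can be explored on small instances with the following Python sketch. It is only an illustration under stated assumptions: literals are encoded as integers (`+k` for $u_k$, `-k` for $\bar{u}_k$), the production names printed by `productions` are hypothetical labels, and `w_in_language` simulates the derivations of $G$ by exhaustive search rather than by any efficient method.

```python
from itertools import combinations
from typing import List, Tuple

Clause = Tuple[int, int, int]  # +k encodes u_k, -k encodes its negation


def productions(p: int, clauses: List[Clause]) -> List[str]:
    """List the productions of G in readable form (labels are illustrative)."""
    rules = ["S -> f0(T1)", "S -> f0(F1)"]
    for k in range(1, p + 1):
        for name, lit in (("T", k), ("F", -k)):
            for j, clause in enumerate(clauses, start=1):
                if lit in clause:  # append c_j to the j-th component
                    rules.append(f"{name}{k} -> append_c{j}({name}{k})")
            rules.append(f"{name}{k} -> id(T{k + 1})")
            rules.append(f"{name}{k} -> id(F{k + 1})")
    rules += [f"T{p + 1} -> empty()", f"F{p + 1} -> empty()"]
    return rules


def w_in_language(p: int, clauses: List[Clause]) -> bool:
    """Decide whether w = c_1 ... c_n is derivable, by brute-force search.

    A state is the set of components x_j that already hold c_j. Appending
    c_j twice can never yield w, so each clause is appended at most once.
    """
    states = {frozenset()}
    for k in range(1, p + 1):
        reached = set()
        for done in states:
            for lit in (k, -k):  # continue through T_k or through F_k
                free = [j for j, c in enumerate(clauses)
                        if lit in c and j not in done]
                for size in range(len(free) + 1):
                    for chosen in combinations(free, size):
                        reached.add(done | frozenset(chosen))
        states = reached
    return frozenset(range(len(clauses))) in states
```

For instance, the eight clauses over three variables that contain every sign pattern form an unsatisfiable instance, and `w_in_language` rejects it, whereas any satisfiable instance is accepted. The search is exponential, as expected for a direct simulation of the nondeterministic choices in $G$.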
The next result is a characterization of LRM($k$)
for every $k \geq 2$.
Theorem 2 3SAT $\leq_P$ LRM(2).
Outline of the proof. Let $(U, C)$ be a generic instance
of the 3SAT problem, $U = \{u_1, \ldots, u_p\}$ and
$C = \{c_1, \ldots, c_n\}$ being defined as in the proof of
Theorem 1. The idea in the studied reduction is the
following. We define a rather complex string
$w^{(1)} w^{(2)} \cdots w^{(p)} w_c$, where $w_c$ is a
representation of the set $C$ and $w^{(i)}$ controls the
truth assignment for the variable $u_i$,
$1 \leq i \leq p$. Then we construct a grammar $G$ such
that $w^{(i)}$ can be derived by $G$ only in two possible
ways and only by using the first string components of a set
of nonterminals $N^{(i)}$ of fan-out two. In this way the
derivation of the substring
$w^{(1)} w^{(2)} \cdots w^{(p)}$ by nonterminals
$N^{(1)}, \ldots, N^{(p)}$ corresponds to a guess of a
truth assignment for $U$. Most important, the right string
components of nonterminals in $N^{(i)}$ derive the symbols
within $w_c$ that are compatible with the truth assignment
chosen for $u_i$. In the following we specify the instance
$(G, w)$ of LRM(2) that is associated with $(U, C)$ by our
reduction.
For every $1 \leq i \leq p$, let
$\mathcal{A}_i = \{c_j \mid u_i$ is included in $c_j\}$ and
$\bar{\mathcal{A}}_i = \{c_j \mid \bar{u}_i$ is included in $c_j\}$;
let also $m_i = |\mathcal{A}_i| + |\bar{\mathcal{A}}_i|$.
Let $Q = \{a_i, b_i \mid 1 \leq i \leq p\}$ be an alphabet
of not already used symbols; for every $1 \leq i \leq p$,
let $w^{(i)}$ denote a sequence of $m_i + 1$ alternating
symbols $a_i$ and $b_i$, i.e.
$w^{(i)} \in (a_i b_i)^+ \cup (a_i b_i)^* a_i$. Let
$G = (V_N, Q \cup C, P, S)$; we define
$V_N = \{S\} \cup \{A_j^{(i)} \mid 1 \leq i \leq p, 1 \leq j \leq m_i\}$
and $w = w^{(1)} w^{(2)} \cdots w^{(p)} c_1 c_2 \cdots c_n$.
In order to specify the productions in $P$, we need to
introduce further notation. We define a function $\alpha$
such that, for every $1 \leq i \leq p$, the clauses
$c_{\alpha(i,1)}, c_{\alpha(i,2)}, \ldots, c_{\alpha(i,|\mathcal{A}_i|)}$
are all the clauses in $\mathcal{A}_i$ and the clauses
$c_{\alpha(i,|\mathcal{A}_i|+1)}, \ldots, c_{\alpha(i,m_i)}$
are all the clauses in $\bar{\mathcal{A}}_i$. For every
$1 \leq i \leq p$, let $\gamma(i, 1) = a_i b_i$ and let
$\gamma(i, h) = a_i$ (resp. $b_i$) if $h$ is even (resp.
odd), $2 \leq h \leq m_i$; let also
$\bar{\gamma}(i, h) = a_i$ (resp. $b_i$) if $h$ is odd
(resp. even), $1 \leq h \leq m_i - 1$, and let
$\bar{\gamma}(i, m_i) = a_i b_i$ (resp. $b_i a_i$) if $m_i$
is odd (resp. even). Finally, let
$z = \sum_{i=1}^{p} m_i$. The following productions define
set $P$ (the example in Figure 1 shows the two possible
ways of deriving by means of $P$ the substring $w^{(i)}$
and the corresponding part of $c_1 \cdots c_n$).
(i) for every $1 \leq i \leq p$:
(a) for $1 \leq h \leq |\mathcal{A}_i|$:
$A_h^{(i)} \rightarrow (\gamma(i, h), c_{\alpha(i,h)})$,
$A_h^{(i)} \rightarrow (\gamma(i, h), \varepsilon)$,
$A_h^{(i)} \rightarrow (\bar{\gamma}(i, h), \varepsilon)$;
(b) for $|\mathcal{A}_i| + 1 \leq h \leq m_i$:
$A_h^{(i)} \rightarrow (\gamma(i, h), \varepsilon)$,
$A_h^{(i)} \rightarrow (\bar{\gamma}(i, h), c_{\alpha(i,h)})$,
$A_h^{(i)} \rightarrow (\bar{\gamma}(i, h), \varepsilon)$;
(ii) $S \rightarrow f(A_1^{(1)}, \ldots, A_{m_1}^{(1)}, \ldots, A_{m_p}^{(p)})$,
where $f$ is a function of $2z$ string variables defined as
$f(x_1^{(1)}, y_1^{(1)}, \ldots, x_{m_1}^{(1)}, y_{m_1}^{(1)}, \ldots, x_{m_p}^{(p)}, y_{m_p}^{(p)}) = x_1^{(1)} x_2^{(1)} \cdots x_{m_1}^{(1)} \cdots x_{m_p}^{(p)} \, y_1 y_2 \cdots y_n$
and for every $1 \leq j \leq n$, $y_j$ is any sequence of
all variables $y_h^{(i)}$ such that $\alpha(i, h) = j$.

Figure 1: Let $\mathcal{A}_i = \{c_{j_2}, c_{j_4}\}$ and
$\bar{\mathcal{A}}_i = \{c_{j_1}, c_{j_3}\}$. String
$w^{(i)}$ can be derived in only two possible ways in $G$,
corresponding to the choice $u_i$ = true/false. This forces
the grammar to guess a subset of the clauses contained in
$\mathcal{A}_i$/$\bar{\mathcal{A}}_i$, in such a way that
all of the clauses in $C$ are derived only once if and only
if there exists a truth assignment that satisfies $C$.
It is easy to see that $|G|$ and $|w|$ are polynomially
related to $|U|$ and $|C|$. From a derivation of
$w \in L(G)$, we can exhibit a truth assignment that
satisfies $C$ simply by reading the derivation of the
prefix string $w^{(1)} w^{(2)} \cdots w^{(p)}$. Conversely,
starting from a truth assignment that satisfies $C$ we can
prove $w \in L(G)$ by means of (finite) induction on
$|U|$: this part requires a careful inspection of all items
in the definition of $G$. $\Box$
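The claim that each $w^{(i)}$ admits only two derivations can be checked mechanically for small $m_i$. The Python sketch below follows the definitions of $\gamma$ and $\bar{\gamma}$ given in the proof; the symbol encoding (`"a1"`, `"b1"`, ...) and the function names are assumptions of this example.

```python
from itertools import product
from typing import Iterable, List


def w_i(i: int, m: int) -> List[str]:
    """The string w^(i): m + 1 alternating symbols, starting with a_i."""
    return [f"a{i}" if pos % 2 == 0 else f"b{i}" for pos in range(m + 1)]


def gamma(i: int, h: int, m: int) -> List[str]:
    """First component of A_h^(i) when u_i is guessed true (1 <= h <= m)."""
    if h == 1:
        return [f"a{i}", f"b{i}"]
    return [f"a{i}"] if h % 2 == 0 else [f"b{i}"]


def gamma_bar(i: int, h: int, m: int) -> List[str]:
    """First component of A_h^(i) when u_i is guessed false (1 <= h <= m)."""
    if h == m:
        return [f"a{i}", f"b{i}"] if m % 2 == 1 else [f"b{i}", f"a{i}"]
    return [f"a{i}"] if h % 2 == 1 else [f"b{i}"]


def concat(parts: Iterable[List[str]]) -> List[str]:
    return [symbol for part in parts for symbol in part]


def count_ways(i: int, m: int) -> int:
    """Count the choices of gamma / gamma_bar per nonterminal that yield w^(i)."""
    ways = 0
    for choice in product((gamma, gamma_bar), repeat=m):
        parts = [fn(i, h, m) for h, fn in enumerate(choice, start=1)]
        ways += concat(parts) == w_i(i, m)
    return ways
```

For every $m \geq 2$ tried, `count_ways` finds exactly two successful choices: all $\gamma$ or all $\bar{\gamma}$. Any mixed choice either produces a string of the wrong length or breaks the alternation of $a_i$ and $b_i$.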
2.3 COMPLETENESS FOR NP 
The previous results entail NP-hardness for the decision
problem represented by language LRM; here we are concerned
with the issue of NP-completeness. Although in the general
case membership of LRM in NP remains an open question, we
discuss in the following a normal form for the class LCFRS
that enforces completeness for NP (i.e. the proposed normal
form does not affect the hardness result discussed above).
The result entails NP-completeness for problems $r$-LRM
($r \geq 1$) and LRM($k$) ($k \geq 2$).
We start with some definitions. In a linear context-free
rewriting system $G$, a derivation $A \Rightarrow^*_G w$
such that $w$ is a tuple of null strings is called a null
derivation. A cyclic derivation has the underlying form
$A \Rightarrow^*_{G_u} \alpha A \beta$, where both $\alpha$
and $\beta$ derive tuples of empty strings and the overall
effect of the evaluation of the functions involved in the
derivation is a bare permutation of the string components
of tuples in $L(A)$ (no recombination of components is
admitted). A cyclic derivation is minimal if it is not
composed of other cyclic derivations. Because of null
derivations in $G$, a derivation $A \Rightarrow^*_G w$ can
have length not bounded by any polynomial in $|G|$; this
peculiarity is inherited from context-free languages (see
for example [Sippu and Soisalon-Soininen, 1988]). The same
effect on the length of a derivation can be caused by the
use of cyclic subderivations: in fact there exist
permutations of $k$ elements whose period is not bounded by
any polynomial in $k$. Let $\mathcal{N}$ and $\mathcal{C}$
be the sets of all nonterminals that can start a null or a
cyclic derivation respectively; it can be shown that both
these sets can be constructed in deterministic polynomial
time by using standard algorithms for the computation of
graph closure.
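The remark on the period of permutations can be made concrete with a short Python sketch. The period of a permutation is the least common multiple of its cycle lengths, so choosing cycles of distinct prime lengths makes the period grow much faster than the number of elements. The function names are assumptions of this example.

```python
from functools import reduce
from math import gcd
from typing import List


def cycle_lengths(perm: List[int]) -> List[int]:
    """Lengths of the disjoint cycles of perm (perm[j] is the image of j)."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths


def period(perm: List[int]) -> int:
    """Smallest t >= 1 such that applying perm t times gives the identity."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(perm), 1)


def with_cycles(lengths: List[int]) -> List[int]:
    """Build a permutation made of disjoint cycles of the given lengths."""
    perm: List[int] = []
    base = 0
    for n in lengths:
        perm.extend(base + (t + 1) % n for t in range(n))
        base += n
    return perm
```

With cycle lengths 2, 3, 5, 7 and 11 the permutation acts on only 28 elements, yet its period is 2310; a cyclic derivation realizing it would have to be iterated that many times before the components return to their original order.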
For every $A \in \mathcal{C}$, let $C(A)$ be the set of all
permutations associated with minimal cyclic derivations
starting with $A$. We define a normal form for the class
LCFRS by imposing some bound on the length of minimal
cyclic derivations: this does not alter the weak generative
power of the formalism, the only consequence being that of
imposing some canonical base for (underlying) cyclic
derivations. On the basis of such a restriction,
representations for sets $C(A)$ can be constructed in
deterministic polynomial time, again by graph closure
computation.
Under the above assumption, we outline here a proof of
LRM $\in$ NP. Given an instance $(G, w)$ of the LRM
problem, a nondeterministic Turing machine $M$ can decide
whether $w \in L(G)$ in time polynomial in $|(G, w)|$ as
follows. $M$ guesses a "compressed" representation $\rho$
for a derivation $\rho'$ of the form $S \Rightarrow^*_G w$
such that:
(i) null subderivations within $\rho'$ are represented by
just one step in $\rho$, and
(ii) cyclic derivations within $\rho'$ are represented in
$\rho$ by just one step that is associated with a guessed
permutation of the string components of the involved tuple.
We can show that the size of $\rho$ is bounded by a
polynomial in $|(G, w)|$. Furthermore, we can verify in
deterministic polynomial time whether $\rho$ is a valid
derivation of $w$ in $G$. The non-obvious part is verifying
the permutation guessed in (ii) above. This requires a test
for membership in the group generated by the permutations
in $C(A)$: such a problem can be solved in deterministic
polynomial time (see [Furst et al., 1980]).
3 IMPLICATIONS 
In the previous section we have presented general results
regarding the membership problem for two subclasses of the
class LCFRS. Here we discuss the interesting status of
"crossing dependencies" within formal languages, on the
basis of the above results. Furthermore, we derive some
observations concerning the existence of highly efficient
algorithms for the recognition of fan-out and
production-length bounded LCFR languages, a problem which
is already known to be in the class P.
3.1 CROSSING CONFIGURATIONS
As seen in Section 2, LCFRS(2) is the class of all 
LCFRS of fan-out bounded by two, and the mem- 
bership problem for the corresponding class of lan- 
guages is NP-complete. Since LCFRS(1) = CFG 
and the membership problem for context-free lan- 
guages is in P, we want to know what is added to 
the definition of LCFRS(2) that accounts for the dif- 
ference (assuming that a difference exists between P 
and NP). We show in the following how a binary 
relation on (sub)strings derived by a grammar in 
LCFRS(2) is defined in a natural way and, by dis- 
cussing the previous result, we will argue that the 
additional complexity that is perhaps found within 
LCFRS(2) is due to the lack of constraints on the 
way pairs of strings in the defined relation can be 
composed within these systems. 
Let $G \in$ LCFRS(2); in the general case, any nonterminal
in $G$ having fan-out two derives a set of pairs of
strings; these sets define a binary relation that is called
here co-occurrence. Given two pairs $(w_1, w_1')$ and
$(w_2, w_2')$ of strings in the co-occurrence relation,
there are basically two ways of composing their string
components within a rule of $G$: either by nesting
(wrapping) one pair within the other, e.g.
$w_1 w_2 w_2' w_1'$, or by creating a crossing
configuration, e.g. $w_1 w_2 w_1' w_2'$; note how in a
crossing configuration the co-occurrence dependencies
between the substrings are "crossed". A close inspection of
the construction exhibited by Theorem 2 shows that grammars
containing an unbounded number of crossing configurations
can be computationally complex if no restriction is
provided on the way these configurations are mutually
composed. An intuitive idea of why such a lack of
restriction can lead to the definition of complex systems
is given in the following.
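The two ways of composing co-occurring pairs can be written down directly. The following Python fragment is a minimal sketch; the function names `nest` and `cross` are labels chosen for this example only.

```python
from typing import Tuple

Pair = Tuple[str, str]


def nest(outer: Pair, inner: Pair) -> str:
    """Wrap the inner pair inside the outer one: w1 w2 w2' w1'."""
    return outer[0] + inner[0] + inner[1] + outer[1]


def cross(first: Pair, second: Pair) -> str:
    """Interleave the two pairs so that their dependencies cross: w1 w2 w1' w2'."""
    return first[0] + second[0] + first[1] + second[1]
```

With the pairs `("a", "b")` and `("c", "d")`, nesting gives `"acdb"` and crossing gives `"acbd"`: in the second string the dependency between `a` and `b` overlaps the one between `c` and `d`.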
In [Seki et al., 1989] a tabular method has been presented
for the recognition of general LCFR languages as a
generalization of the well known CYK algorithm for the
recognition of CFGs (see for instance [Younger, 1967] and
[Aho and Ullman, 1972]). In the following we will apply
such a general method to the recognition of LCFRS(2), with
the aim of having an intuitive understanding of why it
might be difficult to parse unrestricted crossing
configurations.
Let $w$ be an input string of length $n$. In Figure 2, the
case of a production
$p_1 : A \rightarrow f(B_1, B_2, \ldots, B_r)$ is depicted
in which a number $r$ of crossing configurations are
composed in a way that is easy to recognize; in fact the
right-hand side of $p_1$ can be recognized step by step.
For a symbol $X$, assume that the sequence
$X, (i_1, i_2), \ldots, (i_{q-1}, i_q)$ means that $X$
derives the substrings of $w$ that match the positions
$(i_1, i_2), \ldots, (i_{q-1}, i_q)$ within $w$; assume
also that $A[t]$ denotes the result of the $t$-th step in
the recognition of $p_1$'s right-hand side,
$1 \leq t \leq r$. Then each elementary step in the
recognition of $p_1$ can be schematically represented as an
inference rule as follows:

$$\frac{A[t], (i_1, i_{t+1}), (j_1, j_{t+1}) \qquad B_{t+1}, (i_{t+1}, i_{t+2}), (j_{t+1}, j_{t+2})}{A[t+1], (i_1, i_{t+2}), (j_1, j_{t+2})} \qquad (1)$$

The computation in (1) involves six indices ranging over
$\{1..n\}$; therefore in the recognition process such a
step will be computed no more than $O(n^6)$ times.

Figure 2: Adjacent crossing configurations defining a
production $p_1 : A \rightarrow f(B_1, B_2, \ldots, B_r)$
where each of the right-hand side nonterminals has fan-out
two.
Figure 3: Sparse crossing configurations defining a
production $p_2 : A \rightarrow f(B_1, B_2, \ldots, B_r)$;
every nonterminal $B_i$ has fan-out two.

On the contrary, Figure 3 presents a production $p_2$
defined in such a way that its recognition is considerably
more complex. Note that the co-occurrence of the two
strings derived by $B_1$ is crossed once, the co-occurrence
of the two strings derived by $B_2$ is crossed twice, and
so on; in fact crossing dependencies in $p_2$ are sparse in
the sense that the adjacency property found in production
$p_1$ is lost. This forces a tabular method such as the one
discussed above to keep track of the distribution of the
co-occurrences recognized so far, by using an unbounded
number of index pairs. A few of the first steps in the
recognition of $p_2$'s right-hand side are as follows:

$$\frac{A[2], (i_1, i_4), (i_5, i_6) \qquad B_3, (i_4, i_5), (i_8, i_9)}{A[3], (i_1, i_6), (i_8, i_9)}$$

$$\frac{A[3], (i_1, i_6), (i_8, i_9) \qquad B_4, (i_6, i_7), (i_{11}, i_{12})}{A[4], (i_1, i_7), (i_8, i_9), (i_{11}, i_{12})}$$

$$\frac{A[4], (i_1, i_7), (i_8, i_9), (i_{11}, i_{12}) \qquad B_5, (i_7, i_8), (i_{13}, i_{14})}{A[5], (i_1, i_9), (i_{11}, i_{12}), (i_{13}, i_{14})} \qquad (2)$$

From Figure 3 we can see that a different order in the
recognition of $A$ by means of production $p_2$ will not
improve the computation.
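The bookkeeping behind steps such as (1) and (2) can be sketched in Python as the fusion of span lists. This is a simplified illustration, not the tabular method of the cited work: items are reduced to sorted lists of position pairs, positions $i_k$ are replaced by the integers $k$, and no check for overlapping spans is performed.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) positions within the input string


def combine(item: List[Span], new: List[Span]) -> List[Span]:
    """Join two span lists, fusing spans where one ends exactly where the next starts."""
    merged: List[Span] = []
    for start, end in sorted(item + new):
        if merged and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end)  # adjacent: extend the last span
        else:
            merged.append((start, end))        # gap: a new span must be stored
    return merged


# Steps (2), with position i_k written as the integer k (an assumption made
# here purely for illustration).
a2 = [(1, 4), (5, 6)]
a3 = combine(a2, [(4, 5), (8, 9)])      # adding B_3
a4 = combine(a3, [(6, 7), (11, 12)])    # adding B_4
a5 = combine(a4, [(7, 8), (13, 14)])    # adding B_5
```

In the adjacent case of rule (1) every combination leaves two spans, so four indices per item always suffice. In the sparse case the intermediate items `a4` and `a5` already need three spans each, and the number of stored indices keeps growing with the length of the right-hand side.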
Our argument about crossing configurations 
shows why it might be that recognition/parsing of 
LCFRS(2) cannot be done efficiently. If this is true, 
we have a gap between LCFR systems and well 
known mildly context-sensitive formalisms whose 
membership problem is known to have polynomial 
solutions. We conclude that, in the general case, the 
addition of restrictions on crossing configurations 
should be seriously considered for the class LCFRS. 
As a final remark, we derive from Theorem 2 a weak
generative result. An open question about LCFRS($k$) is the
existence of a canonical bilinear form: to our knowledge no
construction is known that, given a grammar
$G \in$ LCFRS($k$), returns a weakly equivalent grammar
$G' \in$ 2-LCFRS($k$). Since we know that the membership
problem for 2-LCFRS($k$) is in P, Theorem 2 entails that
the construction under investigation cannot take polynomial
time, unless P = NP. The reader can easily work out the
details.
3.2 RECOGNITION OF r-LCFRS(k) 
Recall from Section 2 that the class $r$-LCFRS($k$) is
defined by simultaneously imposing on the class LCFRS the
bounds $k$ and $r$ on the fan-out and on the length of the
productions' right-hand sides respectively. These classes
have been discussed in [Vijay-Shanker et al., 1987], where
the membership problem for the corresponding languages has
been shown to be in P, for every fixed $k$ and $r$. By
introducing the notion of degree of a grammar in LCFRS,
actual polynomial upper bounds have been derived in [Seki
et al., 1989]: this work entails the existence of an
integer function $u(r, k)$ such that the membership problem
for $r$-LCFRS($k$) can be solved in (deterministic) time
$O(|G| \, |w|^{u(r,k)})$. Since we know that the membership
problems for $r$-LCFRS and LCFRS($k$) are NP-hard, the fact
that $u(r, k)$ is a (strictly increasing) non-asymptotic
function is quite expected.
With the aim of finding efficient parsing algorithms, in
the following we want to know to what extent the polynomial
upper bounds mentioned above can be improved. Let us
consider for the moment the class 2-LCFRS($k$); if we
restrict ourselves to the normal form discussed in Section
2.3, we know that the recognition problem for this class is
NP-complete. Assume that we have found an optimal
recognizer for this class that runs in worst case time
$l(G, w, k)$; therefore function $l$ determines the best
lower bound for our problem. Two cases then arise. In the
first case we have that $l$ is not bounded by any
polynomial $p$ in $|G|$ and $|w|$: we can easily derive
that P $\neq$ NP. In fact if the converse is true, then
there exists a Turing machine $M$ that is able to recognize
2-LCFRS in deterministic time $|(G, w)|^q$, for some $q$.
For every $k > 0$, construct a Turing machine $M^{(k)}$ in
the following way. Given $(G, w)$ as input, $M^{(k)}$ tests
whether $G \in$ 2-LCFRS($k$) (which is trivial); if the
test fails, $M^{(k)}$ rejects, otherwise it simulates $M$
on input $(G, w)$. We see that $M^{(k)}$ is a recognizer
for the class 2-LCFRS($k$) that runs in deterministic time
$|(G, w)|^q$. Now select $k$ such that, for a worst case
input $w \in \Sigma^*$ and $G \in$ 2-LCFRS($k$), we have
$l(G, w, k) > |(G, w)|^q$: we have a contradiction, because
$M^{(k)}$ will be a recognizer for 2-LCFRS($k$) that runs
in less than the lower bound claimed for this class. In the
second case, on the other hand, we have that $l$ is bounded
by some polynomial $p$ in $|G|$ and $|w|$; a similar
argument applies, exhibiting a proof that P = NP.
From the previous argument we see that finding the "best"
recognizer for 2-LCFRS($k$) is as difficult as solving the
P vs. NP question, an extremely difficult problem. The
argument applies as well to $r$-LCFRS($k$) in general; we
then have evidence that a considerable improvement of the
known recognition techniques for $r$-LCFRS($k$) can be a
very difficult task.
4 CONCLUSIONS 
We have studied the class LCFRS along two dimen- 
sions: the fan-out and the maximum right-hand side 
length. The recognition (membership) problem for 
LCFRS has been investigated, showing NP-hardness 
in all three cases in which at least one of the two di- 
mensions above is unbounded. Some consequences 
of the main result have been discussed, among which 
the interesting relation between crossing configura- 
tions and parsing efficiency: it has been suggested 
that the addition of restrictions on these configu- 
rations should be seriously considered for the class 
LCFRS. Finally, the issue of the existence of effi- 
cient algorithms for the class r-LCFRS(k) has been 
addressed. 

References 
[Aho and Ullman, 1972] A. V. Aho and J. D. Ullman. The
Theory of Parsing, Translation and Compiling, volume 1.
Prentice-Hall, Englewood Cliffs, NJ, 1972.
[Furst et al., 1980] M. Furst, J. Hopcroft, and E. Luks.
Polynomial-time algorithms for permutation groups. In
Proceedings of the 21st IEEE Annual Symposium on the
Foundations of Computer Science, 1980.
[Joshi et al., 1991] A. Joshi, K. Vijay-Shanker, and
D. Weir. The convergence of mildly context-sensitive
grammatical formalisms. In P. Sells, S. Shieber, and
T. Wasow, editors, Foundational Issues in Natural Language
Processing. MIT Press, Cambridge MA, 1991.
[Kasami et al., 1987] T. Kasami, H. Seki, and M. Fujii.
Generalized context-free grammars, multiple context-free
grammars and head grammars. Technical report, Osaka
University, 1987.
[Pollard, 1984] C. Pollard. Generalized Phrase Structure
Grammars, Head Grammars and Natural Language. PhD thesis,
Stanford University, 1984.
[Seki et al., 1989] H. Seki, T. Matsumura, M. Fujii, and
T. Kasami. On multiple context-free grammars. Draft, 1989.
[Sippu and Soisalon-Soininen, 1988] S. Sippu and
E. Soisalon-Soininen. Parsing Theory: Languages and
Parsing, volume 1. Springer-Verlag, Berlin, Germany, 1988.
[Vijay-Shanker and Weir, 1992] K. Vijay-Shanker and
D. J. Weir. Parsing constrained grammar formalisms, 1992.
To appear in Computational Linguistics.
[Vijay-Shanker et al., 1987] K. Vijay-Shanker, D. J. Weir,
and A. K. Joshi. Characterizing structural descriptions
produced by various grammatical formalisms. In 25th Meeting
of the Association for Computational Linguistics (ACL '87),
1987.
[Vijay-Shanker, 1987] K. Vijay-Shanker. A Study of Tree
Adjoining Grammars. PhD thesis, Department of Computer and
Information Science, University of Pennsylvania, 1987.
[Weir, 1988] D. J. Weir. Characterizing Mildly
Context-Sensitive Grammar Formalisms. PhD thesis,
Department of Computer and Information Science, University
of Pennsylvania, 1988.
[Younger, 1967] D. H. Younger. Recognition and parsing of
context-free languages in time n^3. Information and
Control, 10:189-208, 1967.
