Uniform Recognition for Acyclic Context-Sensitive Grammars is
NP-complete
Erik Aarts*
Research Institute for Language & Speech
Trans 10
3512 JK Utrecht
The Netherlands
Abstract 
Context-sensitive grammars in which each rule is of the form
$\alpha Z \beta \to \alpha \gamma \beta$ are acyclic if the associated
context-free grammar with the rules $Z \to \gamma$ is acyclic. The
problem whether an input string is in the language generated by an
acyclic context-sensitive grammar is NP-complete.
Introduction 
One of the best-known classifications of rewrite grammars is the
Chomsky hierarchy. Grammars and languages are of type 3 (regular),
type 2 (context-free), type 1 (context-sensitive) or of type 0
(unrestricted). It is easy to decide whether a string is in the
language generated by a regular or context-free grammar. For
context-free grammars input strings can be recognized in a time that
is polynomial in the length of the input string as well as in the
length of the grammar. Earley [1970] has shown a bound of
$O(|G|^{2} n^{3})$ where $|G|$ is the size of the grammar and $n$ the
length of the input string. Recognition for context-sensitive grammars
is harder: it is PSPACE-complete [Garey and Johnson, 1979], referring
to [Kuroda, 1964] and [Karp, 1972]. Recognition of type 0 languages is
undecidable (see e.g. Lewis and Papadimitriou [1981]).
The area between context-free grammars and context-sensitive grammars
is interesting for two reasons. First, people have tried to describe
natural languages with rewrite grammars. Context-free grammars do not
seem powerful enough to describe natural languages. Context-free
grammars generate context-free languages. Natural languages are
probably not context-free. The counterexamples of sentences that can
not be described with a context-free grammar are always a bit
artificial. Very big subparts of natural languages are context-free.
A grammar for natural languages has to be only a bit stronger than
context-free. That is why we are interested in grammars that are
between context-free and context-sensitive.
*The author was sponsored by project NF 102/62-356 ('Structural and
Semantic Parallels in Natural Languages and Programming Languages'),
funded by the Netherlands Organization for the Advancement of
Research (NWO).
The second perspective is the one of efficient processability. In a
context-free model, sentences can be processed efficiently. In a
context-sensitive one, they can not. It is very interesting to know
where the border lies: in which models can sentences be processed
efficiently and in which ones can they not?
In the 60's and 70's, attempts were made to put restrictions on
context-sensitive grammars in order to generate context-free
languages. Examples are Book [1972], Hibbard [1974] and Ginsburg and
Greibach [1966]. Baker [1974] has shown that these methods come down
to more or less the same thing. They all block the use of context to
pass information through the string. Book [1973] gives an overview of
attempts to generate context-free languages with non-context-free
grammars. How to restrict permutative grammars in order to generate
context-free languages is described in Mäkkinen [1985].
Peters Jr. and Ritchie [1973] proposed a linguistically motivated
change in the definition of the notion grammar. Subsequent
replacements in a string are replaced by node admissibility
constraints in the parse trees of sentences in a context-free
grammar. However, this formalism leads to generation of context-free
languages too.
The approach of restricting grammars such that they generate
context-free languages does not seem interesting from the natural
language perspective nor from the efficiency perspective. The
Actes de COLING-92, Nantes, 23-28 août 1992 1157 Proc. of COLING-92, Nantes, Aug. 23-28, 1992
only advantages of this kind of restriction lie in the possibility
to describe a context-free language in a different way, which may be
easier for some purpose.
Another argument against blocking information [Baker, 1974] is the
problem of unbounded dependencies. Unbounded dependencies are
dependencies over an unbounded distance. Wh-movement is an example of
it. The number of unbounded dependencies in natural language is
(almost) always restricted. Models that restrict the amount of
information that can be sent seem to come closer to models of human
language than models that restrict the distance over which
information can be sent.
In the 70's and 80's attention has shifted to the perspective of
efficient processing. Context-sensitive grammars have been restricted
so that complexity of recognition lies somewhere between PSPACE and
P. Book [1978] has shown that for linear time context-sensitive
grammars recognition is NP-complete even for (some) fixed grammars.
Furthermore there is a result that recognition for growing
context-sensitive grammars is polynomial for fixed grammars [Dahlhaus
and Warmuth, 1986]. This article also tries to define a border
between nearly-efficient and just-efficient models.
We can define the notions uniform (or univer- 
sal) recognition and recognition for a fixed gram- 
mar as follows. 
UNIFORM RECOGNITION 
INSTANCE: A grammar G and a string w. 
QUESTION: Is w in the language generated by G?
The grammar as well as the input string are inputs for the problem
(these two types of input are easily confused!). The uniform
recognition problem is one problem.
There are infinitely many other problems: 
Suppose we have a grammar G. 
RECOGNITION FOR FIXED GRAMMAR G
INSTANCE: A string w.
QUESTION: Is w in the language generated by G?
Things get even more difficult when we say things like: "For every
grammar G RECOGNITION FOR FIXED GRAMMAR G ...". The difference
between uniform recognition and recognition for all fixed G can be
illustrated with an example from Barton Jr., Berwick and Ristad
[1987]. They show that uniform recognition for unordered context-free
grammar (UCFG) can be done in time $O(2^{|G|} n^{3})$. It has not
been shown that the uniform recognition problem is in P. For every G,
however, the fixed recognition problem can be solved in time
$O(n^{3})$ and all these problems are in P. Barton Jr., Berwick and
Ristad [1987] show the problem to be polynomial for any fixed grammar
by a compilation step. The UCFG is compiled into a big context-free
grammar. They use this grammar and the Earley algorithm in order to
prove a polynomial bound. Just forgetting about the grammar size
(replacing $|G|$ by a constant) gives a polynomial bound too. It is
not clear why Barton Jr., Berwick and Ristad [1987] always associate
the fixed grammar problem with compilation (cf. their pp. 27-30,
64-79 and 202-206).
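The remark about replacing the grammar size by a constant can be spelled out in one line. In this small worked step, $T_G(n)$ (the running time on inputs of length $n$ for a fixed grammar $G$) and $c_G$ are symbols introduced only for illustration; they do not appear in Barton Jr., Berwick and Ristad [1987].

```latex
T_G(n) \;=\; O\!\left(2^{|G|}\, n^{3}\right)
       \;=\; O\!\left(c_G\, n^{3}\right)
       \;=\; O\!\left(n^{3}\right),
\qquad c_G := 2^{|G|} \text{ is a constant once } G \text{ is fixed.}
```

In the uniform problem $|G|$ is part of the input, so the factor $2^{|G|}$ cannot be absorbed in this way and the same bound is not polynomial in the input size.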
This article is about uniform recognition for one type of restricted
context-sensitive grammars, the acyclic context-sensitive grammars
(ACSG's). We prove it to be NP-complete. This means they are as
complex as the Agreement Grammars and the Unordered CFG's of Barton
Jr., Berwick and Ristad [1987]. ACSG's are the pure rewrite grammars
in this group. They fit in the Chomsky hierarchy.
[Figure: The Uniform Recognition Problem, with the complexity classes PSPACE-complete, NP-complete and P.]
One might ask when we can use acyclic context-sensitive grammars. One
can use them everywhere where one wants to use context-sensitive
grammars. But one has to be careful: cycles are not allowed. This
property of acyclicity can be checked easily¹. For most purposes one
does not need cycles at all. One field where context-sensitive
grammars can be used is e.g. morphology. Characters in a word are
often changed when some suffix is added. These changes in a word are
context-sensitive and can be described by a context-sensitive
grammar. Once a character is changed, we normally do not want to
change it back, so the grammar we use is an acyclic one.
¹ It is much easier than checking whether a CSG is a linear time CSG
as defined by Book [1978]. One has to reason about length of possible
derivations. In ACSG, derivations are short as a result of their
acyclicity.
The complexity of recognition for ACSG is lower than in the
unrestricted case (CSG, with complexity PSPACE) because we restrict
the amount of information that can be passed through the sentence.
The number of messages that can be sent is limited (and we do not
block the messages by barriers as in Baker [1974]!). In the
unrestricted case we can send messages that leave no trace. E.g.
after a message that changes 0's into 1's we can send a message that
does the reverse. In sending a message from one position in the
sentence to another, the intermediate symbols are not changed. In
fact they are changed twice: back and forth. With acyclic
context-sensitive grammars, this is not possible. Every message
leaves a trace and the amount of information that can be sent is
restricted by the grammar.
Definitions 
A grammar is a 4-tuple, $G = (V, \Sigma, R, S)$, where $V$ is a set
of symbols, $\Sigma \subset V$ is the set of terminal symbols.
$R \subset V^{+} \times V^{*}$ is a relation defined on strings.
Elements of $R$ are called rules. $S \in V \setminus \Sigma$ is the
start symbol.
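A grammar is context-sensitive if each rule is of the form
$\alpha Z \beta \to \alpha \gamma \beta$ where
$Z \in V \setminus \Sigma$; $\alpha, \beta, \gamma \in V^{*}$;
$\gamma \neq \epsilon$. A grammar is context-free if each rule is of
the form $Z \to \gamma$ where $Z \in V \setminus \Sigma$;
$\gamma \in V^{*}$.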
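Derivability ($\Rightarrow$) between strings is defined as follows:
$u \alpha v \Rightarrow u \beta v$ ($u, v, \alpha, \beta \in V^{*}$)
iff $(\alpha, \beta) \in R$. The transitive closure of $\Rightarrow$
is denoted by $\Rightarrow^{+}$. The transitive reflexive closure of
$\Rightarrow$ is denoted by $\Rightarrow^{*}$. The language generated
by $G$ is defined as
$L(G) = \{ w \in \Sigma^{*} \mid S \Rightarrow^{*} w \}$.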
A derivation of a string $\delta$ is a sequence of strings
$x_1, x_2, \ldots, x_n$ with $x_1 = S$, for all $i$ ($1 \le i < n$)
$x_i \Rightarrow x_{i+1}$ and $x_n = \delta$.
A context-free grammar is acyclic if there is no
$Z \in V \setminus \Sigma$ such that $Z \Rightarrow^{+} Z$. This
implies that there is no string $\sigma \in V^{*}$ such that
$\sigma \Rightarrow^{+} \sigma$.
We can map a context-sensitive grammar $G$ onto its associated
context-free grammar $G'$ as follows: If $G$ is $(V, \Sigma, R, S)$
then $G'$ is $(V, \Sigma, R', S)$ where for every rule
$\alpha Z \beta \to \alpha \gamma \beta \in R$ there is a rule
$Z \to \gamma \in R'$. There are no other rules in $R'$.
Note that the associated grammar does not contain empty productions.
We call $G$ acyclic iff the associated context-free grammar $G'$ is
acyclic.
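As a rough illustration of this definition, the following Python sketch drops the contexts of a context-sensitive grammar and tests the resulting context-free rules for acyclicity. It is only a sketch under stated assumptions: rules are encoded as hypothetical 4-tuples (left context, Z, γ, right context) over single-character symbols, and the names `associated_cfg` and `is_acyclic` are illustrative, not taken from the paper. Because γ is never empty, a derivation Z ⇒⁺ Z can only use rules whose right-hand side is a single symbol, so the check reduces to looking for a cycle among those unit rules.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: (left_context, Z, gamma, right_context),
# every symbol being a single character.
Rule = Tuple[str, str, str, str]


def associated_cfg(rules: List[Rule]) -> Set[Tuple[str, str]]:
    """Drop the contexts: alpha Z beta -> alpha gamma beta becomes Z -> gamma."""
    return {(z, gamma) for (_left, z, gamma, _right) in rules}


def is_acyclic(rules: List[Rule]) -> bool:
    """True iff no symbol Z satisfies Z =>+ Z in the associated CFG.

    Assumes gamma is never empty, so only unit rules can form a cycle.
    """
    unit: Dict[str, Set[str]] = {}
    for z, gamma in associated_cfg(rules):
        if len(gamma) == 1:
            unit.setdefault(z, set()).add(gamma)

    WHITE, GREY, BLACK = 0, 1, 2
    colour: Dict[str, int] = {}

    def has_cycle(node: str) -> bool:
        # Depth-first search; meeting a GREY node again closes a cycle.
        colour[node] = GREY
        for nxt in unit.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True
            if state == WHITE and has_cycle(nxt):
                return True
        colour[node] = BLACK
        return False

    return not any(colour.get(z, WHITE) == WHITE and has_cycle(z) for z in unit)


# The paper's three-rule example: 1 -> [0] 2, 0 -> 1 [2], 2 -> [1] 0
cyclic = [("0", "1", "2", ""), ("", "0", "1", "2"), ("1", "2", "0", "")]
# A hypothetical acyclic grammar: S -> A B, A -> a [B], B -> [a] b
toy = [("", "S", "AB", ""), ("", "A", "a", "B"), ("a", "B", "b", "")]
print(is_acyclic(cyclic), is_acyclic(toy))  # False True
```

The search visits every unit rule at most once, which is in line with the remark that acyclicity can be checked easily.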
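The notation we use for context-sensitive rules is as follows: the
rule $\alpha Z \beta \to \alpha \gamma \beta$ is written as
$Z \to [\alpha_1][\alpha_2] \ldots [\alpha_k]\ \gamma\ [\beta_1][\beta_2] \ldots [\beta_l]$
with $\alpha = \alpha_1 \alpha_2 \ldots \alpha_k$ and
$\beta = \beta_1 \beta_2 \ldots \beta_l$, $\alpha_i, \beta_j \in V$
($1 \le i \le k$, $1 \le j \le l$).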
An example of a context-sensitive grammar with the corresponding
context-free rules is:
context-sensitive rules      context-free part
$1 \to [0]\ 2$               $1 \to 2$
$0 \to 1\ [2]$               $0 \to 1$
$2 \to [1]\ 0$               $2 \to 0$
This context-sensitive grammar is cyclic. It is able to permute 0's
and 1's.
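To see the permutation concretely, the three rules can be applied by a small rewriting routine. This is a sketch, assuming a hypothetical tuple encoding (left context, Z, γ, right context) with single-character symbols; `successors` is an illustrative name, not part of the paper.

```python
from typing import Iterator, List, Tuple

Rule = Tuple[str, str, str, str]  # (left_context, Z, gamma, right_context)

# The three example rules: 1 -> [0] 2, 0 -> 1 [2], 2 -> [1] 0
RULES: List[Rule] = [
    ("0", "1", "2", ""),
    ("", "0", "1", "2"),
    ("1", "2", "0", ""),
]


def successors(form: str, rules: List[Rule]) -> Iterator[str]:
    """Yield every string reachable from `form` in one derivation step."""
    for left, z, gamma, right in rules:
        pattern = left + z + right
        start = form.find(pattern)
        while start != -1:
            # Rewrite Z only; the contexts are copied unchanged.
            pos = start + len(left)
            yield form[:pos] + gamma + form[pos + len(z):]
            start = form.find(pattern, start + 1)


step1 = list(successors("01", RULES))
step2 = list(successors("02", RULES))
step3 = list(successors("12", RULES))
print(step1, step2, step3)  # ['02'] ['12'] ['10']
```

The three steps 01 ⇒ 02 ⇒ 12 ⇒ 10 swap the two symbols, and the chain of context-free parts 1 → 2, 2 → 0, 0 → 1 is exactly the cycle that makes the grammar cyclic.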
Recognition is NP-complete 
UNIFORM RECOGNITION FOR 
ACYCLIC CONTEXT-SENSITIVE 
GRAMMAR 
INSTANCE: An acyclic context-sensitive grammar
$G = (V, \Sigma, R, S)$ and a string $w \in \Sigma^{*}$.
QUESTION: Is w in the language generated by G?
The proof can be found in Aarts [1991b]. To prove that it is in NP we
have to prove that derivations in ACSG's are short (have polynomial
length). This follows from the fact that derivations in acyclic
context-free grammars without empty productions have polynomial
length: every derivation in an acyclic CSG is also a derivation in
the associated context-free grammar. The proof of NP-hardness is more
complicated. The known NP-hard problem 3-SAT can be reduced to
UNIFORM RECOGNITION for ACSG. Any 3-SAT formula can be translated
into a grammar and an input for ACSG-recognition.
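Membership in NP rests on the fact that a derivation is a short certificate that can be checked step by step. A minimal sketch of such a check follows. The tuple encoding of rules, the tiny grammar `TOY` and the names `one_step` and `check_derivation` are assumptions made for illustration; this is not the construction of Aarts [1991b].

```python
from typing import List, Sequence, Tuple

Rule = Tuple[str, str, str, str]  # (left_context, Z, gamma, right_context)


def one_step(x: str, y: str, rules: Sequence[Rule]) -> bool:
    """True iff x => y by a single application of some rule."""
    for left, z, gamma, right in rules:
        lhs, rhs = left + z + right, left + gamma + right
        for i in range(len(x) - len(lhs) + 1):
            if x[i:i + len(lhs)] == lhs and x[:i] + rhs + x[i + len(lhs):] == y:
                return True
    return False


def check_derivation(steps: List[str], start: str, w: str,
                     rules: Sequence[Rule]) -> bool:
    """Verify that `steps` is a derivation of w from the start symbol."""
    if not steps or steps[0] != start or steps[-1] != w:
        return False
    return all(one_step(a, b, rules) for a, b in zip(steps, steps[1:]))


# Hypothetical acyclic grammar: S -> A B, A -> a [B], B -> [a] b
TOY: List[Rule] = [("", "S", "AB", ""), ("", "A", "a", "B"), ("a", "B", "b", "")]

print(check_derivation(["S", "AB", "aB", "ab"], "S", "ab", TOY))  # True
print(check_derivation(["S", "AB", "Ab", "ab"], "S", "ab", TOY))  # False
```

Each step is checked in time polynomial in the grammar and the string, so a derivation of polynomial length can be verified in polynomial time. The second call fails because B may only be rewritten with a to its left.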
Recognizing Power 
Any context-free grammar can be transformed into an acyclic
context-free grammar without loss of recognizing power. A cycle can
be removed by introduction of a new symbol. This symbol rewrites to
any member of the cycle. Any context-free grammar with empty
productions can be changed into a context-free grammar without empty
productions that recognizes the same language. There is one exception
here: languages containing the empty string can not be generated. Any
acyclic context-free grammar without empty productions is an acyclic
context-sensitive grammar. Therefore, ACSG's recognize all
context-free languages that do not contain the empty word.
Furthermore, acyclic context-sensitive grammars recognize languages
that are not context-free. One example is the language
$\{ a^{n} b^{2^{n}} c^{n} \mid n \ge 1 \}$
This language is recognized by the grammar ("X" is a nonterminal):
$X \to [A]\ A B B\ [B]$      $B \to [A]\ X\ [X]$      $A \to a$
$X \to [X]\ B B\ [B]$        $B \to [B]\ X\ [X]$      $B \to b$
$X \to [X]\ B B C\ [C]$      $B \to [B]\ X\ [C]$      $C \to c$
$S \to A B B C$
A derivation of "AABBBBCC":
$S \Rightarrow ABBC \Rightarrow ABXC \Rightarrow AXXC \Rightarrow
AXBBCC \Rightarrow AABBBBCC \Rightarrow^{*} aabbbbcc$.
With the pumping lemma one can prove that the language is not
context-free.
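The grammar can be tried out on short strings with an exhaustive search over sentential forms. The sketch below is only an illustration: the rule list is this editor's reading of the printed table, the tuple encoding (left context, Z, γ, right context) and the name `recognizes` are hypothetical, and the search is exponential in general. It is not the chart parser of Aarts [1991a].

```python
from collections import deque
from typing import List, Set, Tuple

Rule = Tuple[str, str, str, str]  # (left_context, Z, gamma, right_context)

# Rules as read from the grammar above, including A -> a, B -> b, C -> c.
GRAMMAR: List[Rule] = [
    ("", "S", "ABBC", ""),
    ("A", "X", "ABB", "B"),
    ("X", "X", "BB", "B"),
    ("X", "X", "BBC", "C"),
    ("A", "B", "X", "X"),
    ("B", "B", "X", "X"),
    ("B", "B", "X", "C"),
    ("", "A", "a", ""),
    ("", "B", "b", ""),
    ("", "C", "c", ""),
]


def recognizes(w: str, rules: List[Rule], start: str = "S") -> bool:
    """Breadth-first search over sentential forms no longer than w.

    Pruning by length is sound because no rule shrinks a string
    (gamma is never empty). Meant only for tiny inputs.
    """
    seen: Set[str] = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == w:
            return True
        for left, z, gamma, right in rules:
            lhs = left + z + right
            i = form.find(lhs)
            while i != -1:
                pos = i + len(left)
                # Z is a single symbol, so exactly one character is replaced.
                nxt = form[:pos] + gamma + form[pos + 1:]
                if len(nxt) <= len(w) and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                i = form.find(lhs, i + 1)
    return False


for candidate in ["abbc", "aabbbbcc", "aabbcc", "abc"]:
    print(candidate, recognizes(candidate, GRAMMAR))
```

With these rules the search accepts "abbc" and "aabbbbcc" and rejects "aabbcc" and "abc". An X can only be expanded next to the right neighbours, so a round that turns only some B's into X's gets stuck.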
Discussion 
We have proved that UNIFORM RECOGNITION FOR ACYCLIC
CONTEXT-SENSITIVE GRAMMAR is NP-complete. It turns out to
be important for complexity of recognition with 
context-sensitive grammars whether sending in- 
formation leaves a trace. 
We have reduced 3-SAT to the uniform recog- 
nition problem for acyclic context-sensitive gram- 
mars. Every 3-SAT formula results in a different 
grammar. Probably it is not possible to construct 
an acyclic context-sensitive grammar that recog- 
nizes all 3-SAT formulas. My conjecture is that 
ACSG-recognition is not NP-hard for any fixed 
grammar. If this is not true, there would exist a 
grammar that recognizes all 3-SAT formulas. For 
this grammar the recognition problem would be 
NP-hard. In such a grammar, not every 3-SAT 
variable is encoded in a different symbol in the 
grammar. The variables are numbered and their numbers are encoded,
e.g., in sequences of 0's and 1's. A grammar that recognizes all
3-SAT formulas must be able to compare such sequences. It must e.g.
be able to recognize the language $\{ ww \mid w \in V^{*} \}$. If w
is a number, two numbers are compared. Context-sensitive grammars can
recognize ww. Some can even recognize all 3-SAT formulas.
ACSG's are not that strong. They can not even 
recognize ww. Any ACSG can compare only a 
fixed number of characters (only fixed amounts of information can be
sent). Therefore my conjecture is that the recognition problem for
any fixed grammar is not so hard: it is polynomial. Chart parsers for
ACSG have been designed and implemented [Aarts, 1991a]. They
recognize inputs for many hard grammars in polynomial time. It is
hard to prove, however, that they run in polynomial time for every
grammar. If it could be proved, the complexity of ACSG-recognition
would be similar to the complexity of UCFG-recognition: NP-complete
for the uniform case and a known algorithm that runs in time
something like $O(2^{|G|} n^{3})$ (polynomial in $n$ but not in
$|G|$).
The polynomial bound (which has not been proved yet) would be an
explanation of the fact that humans can process language efficiently.
Humans have a fixed grammar in mind which does not change. The
complexity of recognition with a fixed grammar should be compared
with the speed of human language processing. The arguments of Barton
Jr., Berwick and Ristad [1987] against this are of two kinds. The
first has to do with compilation or preprocessing. We have polynomial
bounds without compilation or preprocessing (just fix $|G|$). These
arguments do not seem to hold. The other ones have to do with
language acquisition. When a child is learning a language, the
grammar she uses is changing. At every sentence utterance or
understanding the grammar seems to be fixed. The difference between
uniform recognition and recognition for any fixed grammar is so small
that we can not draw conclusions about what kind of processing
children perform when learning a language.
Acknowledgements 
I want to thank Peter van Emde Boas, Reinhard Muskens, Mart Trautwein
and Theo Jansen for their comments on earlier versions of this paper.
References 
Aarts, E., Recognition for Acyclic Context-Sensitive Grammars is
    probably Polynomial for Fixed Grammars, Tilburg University, ITK
    Research Memo no. 8, 1991a.
Aarts, E., Uniform Recognition for Acyclic Context-Sensitive Grammars
    is NP-complete, paper presented at Computing Science in the
    Netherlands, Amsterdam, 1991b.
Baker, B. S., Non-context-Free Grammars Generating Context-Free
    Languages, Inform. and Control, 24, 231-246, 1974.
Barton Jr., G. E., R. C. Berwick and E. S. Ristad, Computational
    complexity and natural language, MIT Press, Cambridge, MA, 1987.
Book, R. V., Terminal context in context-sensitive grammars, SIAM J.
    Comput., 1, 20-30, 1972.
Book, R. V., On the Structure of Context-Sensitive Grammars,
    Internat. J. Comput. Inform. Sci., 2, 129-139, 1973.
Book, R. V., On the Complexity of Formal Grammars, Acta Inform., 9,
    171-181, 1978.
Dahlhaus, E. and M. K. Warmuth, Membership for Growing
    Context-Sensitive Grammars Is Polynomial, Internat. J. Comput.
    Inform. Sci., 33, 456-472, 1986.
Earley, J., An Efficient Context-Free Parsing Algorithm, Comm. ACM,
    13(2), 94-102, Feb. 1970.
Garey, M. R. and D. S. Johnson, Computers and Intractability: A Guide
    to the Theory of NP-Completeness, W. H. Freeman and Company, San
    Francisco, CA, 1979.
Ginsburg, S. and S. A. Greibach, Mappings which Preserve Context
    Sensitive Languages, Inform. and Control, 9, 563-582, 1966.
Hibbard, T. N., Context-Limited Grammars, J. Assoc. Comput. Mach.,
    21(3), 446-453, July 1974.
Karp, R. M., Reducibility among combinatorial problems, in Complexity
    of Computer Computations, edited by R. E. Miller and J. W.
    Thatcher, pp. 85-103, Plenum Press, New York, 1972.
Kuroda, S.-Y., Classes of Languages and Linear-Bounded Automata,
    Inform. and Control, 7, 207-223, 1964.
Lewis, H. R. and C. H. Papadimitriou, Elements of the theory of
    computation, Prentice-Hall, Englewood Cliffs, NJ, 1981.
Mäkkinen, E., On Permutative Grammars Generating Context-Free
    Languages, BIT, 25, 604-610, 1985.
Peters Jr., P. S. and R. W. Ritchie, Context-Sensitive Immediate
    Constituent Analysis: Context-Free Languages Revisited, Math.
    Systems Theory, 6(4), 324-333, 1973.
