COMPUTATIONAL COMPLEXITY AND
LEXICAL FUNCTIONAL GRAMMAR
Robert C. Berwick 
MIT Artificial Intelligence Laboratory, Cambridge, MA 
1. INTRODUCTION 
An important goal of modern linguistic theory is to characterize as narrowly
as possible the class of natural languages. An adequate linguistic theory
should be broad enough to cover observed variation in human languages, and
yet narrow enough to account for what might be dubbed "cognitive
demands" -- among these, perhaps, the demands of learnability and
parsability. If cognitive demands are to carry any real theoretical weight, then
presumably a language may be a (theoretically) possible human language,
and yet be "inaccessible" because it is not learnable or parsable.
Formal results along these lines have already been obtained for certain kinds
of Transformational Generative Grammars: for example, Peters and Ritchie
[1] showed that Aspects-style unrestricted transformational grammars can
generate any recursively enumerable set, while Rounds [2] [3] extended this
work by demonstrating that modestly restricted transformational grammars
(TGs) can generate languages whose recognition time is provably
exponential. (In Rounds' proof, transformations are subject to a "terminal
length non-decreasing" condition, as suggested by Peters and Myhill.) Thus,
in the worst case TGs generate languages whose recognition is widely
recognized to be computationally intractable. Whether this "worst case"
complexity analysis has any real import for actual linguistic study has been
the subject of some debate (for discussion, see Chomsky [4]; Berwick and
Weinberg [5]). Without resolving that controversy here, however, one thing
can be said: to make TGs efficiently parsable one might provide additional
constraints. For instance, these additional strictures could be roughly of the
sort advocated in Marcus' work on parsing [6] -- constraints specifying that
TG-based languages must have parsers that meet certain "locality
conditions". Marcus' constraints apparently amount to an extension of
Knuth's LR(k) locality condition [7] to a (restricted) version of a two-stack
deterministic push-down automaton. (The need for LR(k)-like restrictions in
order to ensure efficient processability was also recognized by Rounds [2].)
Recently, a new theory of grammar has been advanced with the explicitly
stated aim of meeting the dual demands of learnability and parsability -- the
Lexical Functional Grammars (LFGs) of Bresnan [8]. The theory of Lexical
Functional Grammars is claimed to have all the descriptive merits of
transformational grammar, but none of its computational unruliness. In
LFG, there are no transformations (as classically described); the work
formerly ascribed to transformations such as "passive" is shouldered by
information stored in lexical entries associated with lexical items. The
elimination of transformational power naturally gives rise to the hope that a
lexically-based system would be computationally simpler than a
transformational one.
An interesting question then is to determine, as has already been done for the
case of certain brands of transformational grammar, just what the "worst
case" computational complexity for the recognition of LFG languages is. If
the recognition time complexity for languages generated by the basic LFG
theory can be as complex as that for languages generated by a modestly
restricted transformational system, then presumably LFG will also have to
adopt additional constraints, beyond those provided in its basic theory, in order
to ensure efficient parsability.
The main result of this paper is to show that certain Lexical Functional
Grammars can generate languages whose recognition time is very likely
computationally intractable, at least according to our current understanding
of what is or is not rapidly solvable. Briefly, the demonstration proceeds by
showing how a problem that is widely conjectured to be computationally
difficult -- namely, whether there exists an assignment of 1's and 0's (or "T"s
and "F"s) to the literals of a Boolean formula in conjunctive normal form that
makes the formula evaluate to "1" (or "true") -- can be re-expressed as the
problem of recognizing whether a particular string is or is not a member of
the language generated by a certain lexical functional grammar. This
"reduction" shows that in the worst case the recognition of LFG languages
can be just as hard as the original Boolean satisfiability problem. Since it is
widely conjectured that there cannot be a polynomial-time algorithm for
satisfiability (the problem is NP-complete), it is correspondingly unlikely
that there is a polynomial-time recognition algorithm for LFG's in general.
Note that this result
sharpens that in Kaplan and Bresnan [8]: there it is shown only that LFG's
(weakly) generate some subset of the class of context-sensitive languages
(including some strictly context-sensitive languages) and therefore, in the
worst case, exponential time is known to be sufficient (though not necessary)
to recognize any LFG language. The result in [8] thus does not address the
question of how much time, in the worst case, is necessary to recognize LFG
languages. The result of this paper indicates that in the worst case more than
polynomial time will probably be necessary. (The reason for the hedge
"probably" will become apparent below; it hinges upon the central unsolved
conjecture of current complexity theory.) In short then, this result places the
LFG languages more precisely in the complexity hierarchy.
It also turns out to be instructive to inquire into just why a lexically-based
approach can turn out to be computationally difficult, and how
computational tractability may be guaranteed. Advocates of lexically-based
theories may have thought (and some have explicitly stated) that the
banishment of transformations is a computationally wise move because
transformations are computationally "expensive." Eliminate the
transformations, so this casual argument goes, and one has eliminated all
computational problems. Intriguingly though, when one examines the proof
to be given below, the computational work done by transformations in older
theories re-emerges in the lexical grammar as the problem of choosing
between alternative categorizations for lexical items -- deciding, in a manner
of speaking, whether a particular terminal item is a Noun or a Verb (as with
the word kiss in English). This power of choice, coupled with an ability to
express co-occurrence constraints over arbitrary distances across terminal
tokens in a string (as in Subject-Verb number agreement) seems to be all that
is required to make the recognition of LFG languages intractable. The work
done by transformations has been exchanged for work done by lexical
schemas, but the overall computational burden remains roughly the same.
This leaves the question posed in the opening paragraph: just what sorts of
constraints on natural languages are required in order to ensure efficient
parsability? An informal argument can be made that Marcus' work [6]
provides a good first attack on just this kind of characterization. Marcus'
claim was that languages easily parsed (not "garden-pathed") by people could
be precisely modeled by the languages easily parsed by a certain type of
restricted, deterministic, two-stack parsing machine. But this machine can be
shown to be a (weak) non-canonical extension of the LR(k) grammars, as
proposed by Knuth [7].
Finally, this paper will discuss the relevance of this technical result for more
down-to-earth computational linguistics. As it turns out, even though general
LFG's may well be computationally intractable, it is easy to imagine a variety
of additional constraints for LFG theory that provide a way to sidestep
the reduction argument. All of these additional restrictions amount to
making the LFG theory more restricted, in such a way that the reduction
argument cannot be made to work. For example, one effective restriction is
to stipulate that there can only be a finite stock of features with which to label
lexical items. In any case, the moral of the story is an unsurprising one:
specificity and constraints can absolve a theory of computational
intractability. What may be more surprising is that the requisite locality
constraints seem to be useful for a variety of theories of grammar, from
transformational grammar to lexical functional grammar.
2. A REVIEW OF REDUCTION ARGUMENTS
The demonstration of the computational complexity of LFGs relies upon the
standard complexity-theoretic technique of reduction. Because this method
may be unfamiliar to many readers, a short review is presented immediately
below; this is followed by a sketch of the reduction proper.
The idea behind the reduction technique is to take a difficult problem, in this
case the problem of determining the satisfiability of Boolean formulas in
conjunctive normal form (CNF), and show that the known problem can be
quickly transformed into the problem whose complexity remains to be
determined, in this case the problem of deciding whether a given string is in
the language generated by a given Lexical Functional Grammar. Before the
reduction proper is reviewed, some definitional groundwork must be
presented. A Boolean formula in conjunctive normal form is a conjunction of
disjunctions. A formula is satisfiable just in case there exists some assignment
of T's and F's (or 1's and 0's) to the literals X_i of the formula that forces the
evaluation of the entire formula to be "T"; otherwise, the formula is said to be
unsatisfiable. For example,
$$(X_2 \lor \neg X_3 \lor \neg X_7) \land (X_1 \lor \neg X_2 \lor X_4) \land (X_3 \lor \neg X_1 \lor \neg X_7)$$
is satisfiable, since the assignment of X_2 = T (hence ¬X_2 = F), X_3 = F (hence
¬X_3 = T), X_7 = F (¬X_7 = T), X_1 = T (¬X_1 = F), and X_4 = F makes the whole
formula evaluate to "T". The reduction in the proof below uses a somewhat
more restricted format where every term consists of the disjunction of
exactly three literals, so-called 3-CNF (or "3-SAT"). This restriction entails
no loss of generality (see Hopcroft and Ullman [9], Chapter 12), since this
restricted format is also NP-complete.
How does a reduction show that the LFG recognition problem must be at
least as hard (computationally speaking) as the original problem of Boolean
satisfiability? The answer is that any decision procedure for LFG recognition
could be used as a correspondingly fast procedure for 3-CNF, as follows:
(1) Given an instance of a 3-CNF problem (the question of whether there
exists a satisfying assignment for a given formula in 3-CNF), apply the
transformational algorithm provided by the reduction; this algorithm is itself
assumed to execute quickly, in polynomial time or less. The algorithm
outputs a corresponding LFG decision problem, namely: (i) a lexical
functional grammar and (ii) a string to be tested for membership in the
language generated by the LFG. The LFG recognition problem represents or
mimics the decision problem for 3-CNF in the sense that the "yes" and "no"
answers to both the satisfiability problem and the membership problem must
coincide (if there is a satisfying assignment, then the corresponding LFG
decision problem should give a "yes" answer, etc.).
(2) Solve the LFG decision problem -- the string-LFG pair -- output by Step
1; if the string is in the LFG language, the original formula was satisfiable; if
not, unsatisfiable.
(Note that the grammar and string so constructed depend upon just what
formula is under analysis; that is, for each different CNF formula, the
procedure presented above outputs a different LFG grammar and string
combination. In the LFG case it is important to remember that "grammar"
really means "grammar plus lexicon" -- as one might expect in a
lexically-based theory. S. Peters has observed that a slightly different
reduction allows one to keep most of the grammar fixed across all possible
input formulas, constructing only different-sized lexicons for each different
CNF formula; for details, see below.)
To see how a reduction can tell us something about the "worst case" time or
space complexity required to recognize whether a string is or is not in an LFG
language, suppose for example that the decision procedure for determining
whether a string is in an LFG language takes polynomial time (that is, takes
time n^k on a deterministic Turing machine, for some integer k, where n = the
length of the input string). Then, since the composition of two polynomial
algorithms can be readily shown to take only polynomial time (see [9],
Chapter 12), the entire process sketched above, from input of the CNF
formula to the decision about its satisfiability, will take only polynomial time.
However, CNF (or 3-CNF) has no known polynomial time algorithm, and
indeed, it is considered exceedingly unlikely that one could exist. Therefore,
it is just as unlikely that LFG recognition could be done (in general) in
polynomial time.
The theory of computational complexity has a much more compact term for
problems like CNF: CNF is NP-complete. This label is easily deciphered:
(1) CNF is in the class NP, that is, the class of languages that can be
recognized by a non-deterministic Turing machine in polynomial time.
(Hence the abbreviation "NP", for "non-deterministic polynomial". To see
that CNF is in the class NP, note that one can simply guess all possible
combinations of truth assignments to literals, and check each guess in
polynomial time.)
(2) CNF is complete, that is, all other languages in the class NP can be quickly
reduced to some CNF formula. (Roughly, one shows that Boolean formulas
can be used to "simulate" any valid computation of a non-deterministic
Turing machine.)
Since the class of problems solvable in polynomial time on a deterministic
Turing machine (conventionally notated P) is trivially contained in the class
so solved by a non-deterministic Turing machine, the class P must be a subset
of the class NP. A well-known, well-studied, and still open question is whether
the class P is a proper subset of the class NP, that is, whether there are
problems solvable in non-deterministic polynomial time that cannot be
solved in deterministic polynomial time. Because all of the several thousand
NP-complete problems now catalogued have so far proved recalcitrant to
deterministic polynomial time solution, it is widely held that P must indeed
be a proper subset of NP, and therefore that the best possible algorithms for
solving NP-complete problems must take more than polynomial time. (In
general, the algorithms now known for such problems involve exponential
combinatorial search, in one fashion or another; these are essentially methods
that do no better than to brutally simulate -- deterministically, of course -- a
non-deterministic machine that "guesses" possible answers.)
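
The kind of exponential combinatorial search just described can be illustrated
with a brute-force satisfiability test. This is only a sketch: the function name
brute_force_sat and the integer encoding of literals (k for X_k, -k for its
complement) are assumptions introduced here, not part of the original argument.

```python
from itertools import product


def brute_force_sat(formula):
    """Deterministically try every 'guess' a non-deterministic machine could make.

    Returns a satisfying assignment (dict variable -> bool) or None.
    With v distinct variables the loop may run 2**v times.
    """
    variables = sorted({abs(lit) for term in formula for lit in term})
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # A literal is true when the variable's value matches its polarity.
        if all(
            any(assignment[abs(lit)] == (lit > 0) for lit in term)
            for term in formula
        ):
            return assignment
    return None


print(brute_force_sat([(1, 2, 3)]))               # {1: True, 2: True, 3: True}
print(brute_force_sat([(1, 1, 1), (-1, -1, -1)]))  # None
```

Each individual check is fast; the cost comes entirely from the number of
candidate assignments, which doubles with every additional variable.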
To repeat the force of the reduction argument then, if all LFG recognition
problems were solvable in polynomial time, then the ability to quickly reduce
CNF formulas to LFG recognition problems implies that all NP-complete
problems would be solvable in polynomial time, and that the class P = the
class NP. This possibility seems extremely remote. Hence, our assumption
that there is a fast (general) procedure for recognizing whether a string is or is
not in the language generated by an arbitrary LFG grammar is very likely false.
In the terminology of complexity theory, LFG recognition must be NP-hard
-- "as hard as" any other NP problem, including the NP-complete problems.
This means only that LFG recognition is at least as hard as other NP-complete
problems -- it could still be more difficult (lie in some class that contains the
class NP). If one could also show that the languages generated by LFGs are
in the class NP, then LFGs would be shown to be NP-complete. This paper
stops short of proving this last claim, but simply conjectures that LFGs are in
the class NP.
3. A SKETCH OF THE REDUCTION
To carry out this demonstration in detail one must explicitly describe the
transformation procedure that takes as input a formula in CNF and outputs a
corresponding LFG decision problem -- a string to be tested for membership
in an LFG language and the LFG itself. One must also show that this can be
done quickly, in a number of steps proportional to (at most) the length of the
original formula to some polynomial power. Let us dispose of the last point
first. The string to be tested for membership in the LFG language will simply
be the original formula, sans parentheses and logical symbols; the LFG
recognition problem is to find a well-formed derivation of this string with
respect to the grammar to be provided. Since the actual grammar and string
one has to write down to "simulate" the CNF problem turn out to be no
worse than linearly larger than the original formula, an upper bound of, say,
time n-cubed (where n = length of the original formula) is more than
sufficient to construct a corresponding LFG; thus the reduction procedure
itself can be done in polynomial time, as required. This paper will therefore
have nothing further to say about the time bound on the transformation
procedure.
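
As a rough illustration of how cheap the string-construction half of the
transformation is, the sketch below flattens a 3-CNF formula into its test
string. The integer encoding of literals and the "~" prefix used to mark a
complemented literal as its own terminal item are assumptions made for this
example.

```python
def literal_name(lit):
    """Terminal item for a literal; '~' marks a complement (assumed notation)."""
    return ("" if lit > 0 else "~") + "X" + str(abs(lit))


def formula_to_string(formula):
    """Drop parentheses and logical connectives, keeping the literals in order."""
    return [literal_name(lit) for term in formula for lit in term]


formula = [(2, -3, -7), (1, -2, 4)]
tokens = formula_to_string(formula)
print(" ".join(tokens))  # X2 ~X3 ~X7 X1 ~X2 X4
```

The output has exactly three tokens per term, so it is produced in a single
pass and is never longer than the formula it came from.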
Some caveats are in order before embarking on a proof sketch of this
reduction. First of all, the relevant details of the LFG theory will have to be
covered on-the-fly; see [8] for more discussion. Also, the grammar that is
output by the reduction procedure will not look very much like a grammar
for a natural language, although the grammatical devices that will be
employed will in every way be those that are an essential part of the LFG
theory (namely, feature agreement, the lexical analog of Subject or Object
"control", lexical ambiguity, and a garden variety context-free grammar). In
other words, although it is most unlikely that any natural language would
encode the satisfiability problem (and hence be intractable) in just the
manner outlined below, on the other hand, no "exotic" LFG machinery is
used in the reduction. Indeed, some of the more powerful LFG notational
formalisms -- long-distance binding, existential and negative feature operators
-- have not been exploited. (An earlier proof made use of an existential
operator in the feature machinery of LFG, but the reduction presented here
does not.)
To make good this demonstration one must set out just what the satisfiability
problem is and what the decision problem for membership in an LFG
language is. Recall that a formula in conjunctive normal form is satisfiable
just in case every conjunctive term evaluates to true, that is, at least one literal
in each term is true. The satisfiability problem is to find an assignment of T's
and F's to the literals at the bottom (note that the complement of literals is
also permitted) such that the root node at the top gets the value "T" (for
true). How can we get a lexical functional grammar to represent this
problem? What we want is for satisfying assignments to correspond to
well-formed sentences of some corresponding LFG grammar, and
non-satisfying assignments to correspond to sentences that are not
well-formed, according to the LFG grammar:
    satisfiable formula w           non-satisfiable formula w
              |                               |
    sentence w' IS                  sentence w'' IS NOT
    in LFG language L(G)            in LFG language L(G)

Figure 1. A Reduction Must Preserve Solutions to the Original Problem
Since one wants the satisfying/non-satisfying assignments of any particular
formula to map over into well-formed/ill-formed sentences, one must
obviously exploit the LFG machinery for capturing well-formedness
conditions for sentences. First of all, an LFG contains a base context-free
grammar. A minimal condition for a sentence (considered as a string) to be in
the language generated by a lexical-functional grammar is that it can be
generated by this base grammar; such a sentence is then said to have a
well-formed constituent structure. For example, if the base rules included
S => NP VP; VP => V NP, then (glossing over details of Noun Phrase rules)
the sentence John kissed the baby would be well-formed but John the baby
would not. Note that this assumes, as usual, the existence of a lexicon
that provides a categorization for each terminal item, e.g., that baby is of the
category N, kissed is a V, etc. Importantly then, this well-formedness
condition requires us to provide at least one legitimate parse tree for the
candidate sentence that shows how it may be derived from the underlying
LFG base context-free grammar. (There could be more than one legitimate
tree if the underlying grammar is ambiguous.) Note further that the choice of
categorization for a lexical item may be crucial. If baby was assumed to be of
category V, then both sentences above would be ill-formed.
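
The dependence of constituent-structure well-formedness on lexical
categorization can be illustrated with a toy recognizer. This is a sketch under
stated assumptions: the Noun Phrase rules (a proper name PN, or Det followed by
N), the category labels, and the function name derives are all introduced here
for illustration and are not taken from the LFG literature.

```python
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("PN",), ("Det", "N")],  # assumed Noun Phrase rules
}


def derives(category, words, lexicon):
    """True if `category` can derive exactly the word sequence `words`."""
    words = tuple(words)
    # A single word is derived directly if the lexicon lists this category.
    if len(words) == 1 and category in lexicon.get(words[0], ()):
        return True
    for rhs in RULES.get(category, ()):
        if len(rhs) == 1:
            if derives(rhs[0], words, lexicon):
                return True
        else:
            left, right = rhs
            # Try every way of splitting the words between the two daughters.
            for split in range(1, len(words)):
                if derives(left, words[:split], lexicon) and derives(
                    right, words[split:], lexicon
                ):
                    return True
    return False


LEXICON = {"John": {"PN"}, "kissed": {"V"}, "the": {"Det"}, "baby": {"N", "V"}}
VERB_ONLY = dict(LEXICON, baby={"V"})  # baby forced to be a Verb

print(derives("S", "John kissed the baby".split(), LEXICON))    # True
print(derives("S", "John the baby".split(), LEXICON))           # False
print(derives("S", "John kissed the baby".split(), VERB_ONLY))  # False
```

With baby restricted to category V, neither string has a parse, which matches
the observation that the choice of categorization can decide well-formedness.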
A second major component of the LFG theory is the provision for adding a
set of so-called functional equations to the base context-free rules. These
equations are used to account for the co-occurrence restrictions that are
so much a part of natural languages (e.g., Subject-Verb agreement). Roughly,
one is allowed to associate features with lexical entries and with the
non-terminals of specified context-free rules; these features have values. The
equation machinery is used to pass features in certain ways around the parse
tree, and conflicting values for the same feature are cause for rejecting a
candidate analysis. To take the Subject-Verb agreement example, consider
the sentence the baby is kissing John. The lexical entry for baby (considered
as a Noun) might have the Number feature, with the value singular. The
lexical entry for is might assert that the number feature of the Subject above
it in the parse tree must have the value singular; meanwhile, the feature
values for Subject are automatically found by another rule (associated with
the Noun Phrase portion of S => NP VP) that grabs whatever features it finds
below the NP node and copies them up above to the S node. Thus the S node
gets the Subject feature, with whatever value it has passed from baby below --
namely, the value singular; this accords with the dictates of the verb is, and all
is well. Similarly, in the sentence the boys in the band is kissing John, boys
passes up the number value plural, and this clashes with the verb's constraint;
as a result this sentence is judged ill-formed:
[Figure: parse tree for "the boys in the band is kissing John". The noun boys
passes up Number: plural, the verb is requires Number: singular, and the
Subject Number feature at the S node (singular or plural?) CLASHES.]

Figure 2. Co-occurrence Restrictions are Enforced by Feature Checking in an
LFG.
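
The agreement check can be mimicked with plain dictionaries. The sketch below
is a simplification made for illustration: the entry tables NOUN_ENTRIES and
VERB_ENTRIES and the function subject_agrees are hypothetical names, and the
parse tree is collapsed to a head noun plus a verb.

```python
# Hypothetical lexical entries: nouns carry features, verbs constrain the Subject.
NOUN_ENTRIES = {"baby": {"Number": "singular"}, "boys": {"Number": "plural"}}
VERB_ENTRIES = {"is": {"Subject": {"Number": "singular"}}}


def subject_agrees(head_noun, verb):
    """Copy the noun's features up as the Subject, then apply the verb's demands."""
    subject = dict(NOUN_ENTRIES[head_noun])  # features passed up from below
    for feature, required in VERB_ENTRIES[verb]["Subject"].items():
        if feature in subject and subject[feature] != required:
            return False  # conflicting values for the same feature: a clash
        subject[feature] = required
    return True


print(subject_agrees("baby", "is"))  # True  (the baby is kissing John)
print(subject_agrees("boys", "is"))  # False (the boys in the band is kissing John)
```

Note that the intervening words "in the band" play no part in the check: the
constraint holds between the head noun and the verb however far apart they are.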
It is important to note that the feature compatibility check requires (1) a
particular constituent structure tree (a parse tree); and (2) an assignment of
terminal items (words) to lexical categories -- e.g., in the first Subject-Verb
agreement example above, baby was assigned to be of the category N, a
Noun. The tree is obviously required because the feature checking
machinery propagates values according to the links specified by the
derivation tree; the assignment of terminal items to categories is crucial
because in most cases the values of features are derived from those listed in
the lexical entry for an item (as the value of the number feature was derived
from the lexical entry for the Noun form of baby). One and the same
terminal item can have two distinct lexical entries, corresponding to distinct
lexical categorizations; for example, baby can be both a Noun and a Verb. If
we had picked baby to be a Verb, and hence had adopted whatever features
are associated with the Verb entry for baby to be propagated up the tree, then
the string that was previously well-formed, the baby is kissing John, would
now be considered deviant. If a string is ill-formed under all possible
derivation trees and assignments of features from possible lexical
categorizations, then that string is not in the language generated by the LFG.
The possibility of multiple derivation trees and lexical categorizations (and
hence multiple feature bundles) for one and the same terminal item plays a
crucial role in the reduction proof: it is intended to capture the satisfiability
problem of deciding whether to give a literal X_i a value of "T" or "F".
Finally, LFG also provides a way to express the familiar patterning of
grammatical relations (e.g., "Subject" and "Object") found in natural
language. For example, transitive verbs must have objects. This fact of life
(expressed in an Aspects-style transformational grammar by subcategorization
restrictions) is captured in LFG by specifying a so-called PRED (for
predicate) feature with a Verb; the PRED can describe what grammatical
relations like "Subject" and "Object" must be filled in after feature passing
has taken place in order for the analysis to be well-formed. For instance, a
transitive verb like kiss might have the pattern kiss((Subject)(Object)), and
thus demand that the Subject and Object (now considered to be "features")
have some value in the final analysis. The values for Subject and Object
might of course be provided from some other branch of the parse tree, as
provided by the feature propagation machinery; for example, the Object
feature could be filled in from the Noun Phrase part of the VP expansion:
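[Figure: parse tree for a sentence with subject Sue, the verb kiss, and an
object NP John under the VP. The features at the S node include
SUBJECT: Sue and PRED: kiss((Subject)(Object)); the Object is filled in
from the NP.]

Figure 3. Predicate Templates Can Demand That a Subject or Object be
Filled In.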
But if the Object were not filled in, then the analysis is declared functionally
incomplete, and is ruled out. This device is used to cast out sentences such as
the baby kissed.
So much for the LFG machinery that is required for the reduction proof.
(There are additional capabilities in the LFG theory, such as long-distance
binding, but these will not be called upon in the demonstration below.)
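
The functional completeness condition amounts to a simple membership test on
the final feature structure. In this sketch the dictionary layout, the
"requires" key, and the function name functionally_complete are assumptions
chosen for illustration.

```python
def functionally_complete(features):
    """Every grammatical relation named by the PRED must have some value."""
    required = features.get("PRED", {}).get("requires", ())
    return all(relation in features for relation in required)


# Hypothetical encoding of the template kiss((Subject)(Object)).
kiss = {"name": "kiss", "requires": ("Subject", "Object")}

transitive = {"PRED": kiss, "Subject": "Sue", "Object": "John"}
missing_object = {"PRED": kiss, "Subject": "the baby"}

print(functionally_complete(transitive))      # True
print(functionally_complete(missing_object))  # False
```

The test does not care where a value came from, only that some branch of the
tree supplied one, which is the property the reduction relies on later.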
What then does the LFG representation of the satisfiability problem look
like? Basically, there are three parts to the satisfiability problem that must be
mimicked by the LFG: (1) the assignment of values to literals, e.g., X_2 -> "T";
X_4 -> "F"; (2) the co-ordination of value assignments across intervening literals
in the formula; e.g., the literal X_2 can appear in several different terms, but
one is not allowed to assign it the value "T" in one term and the value "F" in
another (and the same goes for the complement of a literal: if X_2 has the
value "T", ¬X_2 cannot have the value "T"); and (3) satisfiability must
correspond to LFG well-formedness, i.e., each term has the truth value "T"
just in case at least one literal in the term is assigned "T", and all terms must
evaluate to "T".
Let us now go over how these components may be reproduced in an LFG,
one by one.
(1) Assignments: The input string to be tested for membership in the LFG
will simply be the original formula, sans parentheses and logical symbols; the
terminal items are thus just a string of X_i's. Recall that the job of checking
the string for well-formedness involves finding a derivation tree for the string,
solving the ancillary co-occurrence equations (by feature propagation), and
checking for functional completeness. Now, the context-free grammar
constructed by the transformation procedure will be set up so as to generate a
virtual copy of the associated formula, down to the point where literals X_i are
assigned their values of "T" or "F". If the original CNF form had n terms,
this part of the grammar would look like:
S => T_1 T_2 ... T_n (one T for each term)
T_i => Y_i Y_j Y_k (one triple of Y's per term)
Several comments are in order here. 
(1) The context-free base that is built depends upon the original CNF
formula that is input, since the number of terms, n, varies from formula to
formula. In Stanley Peters' improved version of the reduction proof, the
context-free base is fixed for all formulas with the rules:
S => S S'
S' => T T T or S' => T T F or T F F or T F T or ...
(remaining twelve expansions that have at least one "T" in each triple)
The Peters grammar works by recursing until the right number of terms is
generated (any sentences that are too long or too short cannot be matched to
the input formula). Thus, the number of terms in the original CNF formula
need not be explicitly encoded into the base grammar.
(2) The subscripts i, j, and k depend on the actual subscripts in the original
formula.
(3) The Y_i are not terminal items, but are non-terminals.
(4) This grammar will have to be slightly modified in order for the reduction
to work, as will become apparent shortly.
Note that so far there are no rules to extend the parse tree down to the level
of terminal items, the X_i. The next step does this and at the same time adds
the power to choose between "T" and "F" assignments to literals. One
includes in the context-free base grammar two productions deriving each
terminal item X_i, namely, X_iT => X_i and X_iF => X_i, corresponding to an
assignment of "T" or "F" to the formula literal X_i (it is important not to get
confused here between the literals of the formula -- these are terminal
elements in the lexical functional grammar -- and the literals of the grammar
-- the non-terminal symbols). One must also add, obviously, the rules
Y_i => X_iT | X_iF, for each i, and rules corresponding to the negations of
variables, e.g., ¬X_iT => ¬X_i. Note that these are not "exotic" LFG rules: exactly the
same sort of rule is required in the baby case, i.e., N => baby or V => baby,
corresponding to whether baby is a Noun or a Verb. Now, the lexical entries
for the "X_iT" categorization of X_i will look very different from the "X_iF"
categorization of X_i, just as one might expect the N and V forms for baby to
be different. Here is what the entries for the two categorizations of X_i look
like:
X_i:   X_iT    (↑ Truth-assignment) = T
               (↑ Assign X_i) = T
X_i:   X_iF    (↑ Assign X_i) = F
The feature assignments for the negation of the literal X_i are simply the dual of
the entries above (since the sense of "T" and "F" is reversed):
¬X_i:  ¬X_iT   (↑ Truth-assignment) = T
               (↑ Assign X_i) = F
¬X_i:  ¬X_iF   (↑ Assign X_i) = T
The role of the additional "Truth-assignment" feature will be explained
below.
Figure 4. Sample Lexical Entries to Reproduce the Assignment of T's and F's
to a Literal X_i.
The upward-directed arrows in the entries reflect the LFG feature
propagation machinery. In the case of the X_iT entry, for instance, they say to
"make the Truth-assignment feature of the node above X_iT have the value
"T", and make the X_i portion of the Assign feature of the node above have
the value T." This feature propagation device is what reproduces the
assignment of T's and F's to the CNF literals. If we have a triple of such
elements, and at least one of them is expanded out to X_iT, then the feature
propagation machinery of LFG will merge the common feature names into
one large structure for the node above, reflecting the assignments made;
moreover, the term will get a filled-in truth assignment value just in case at
least one of the expansions selected an X_iT path:
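[Figure: a term node T' dominating the terminal string X_i X_j X_k, each literal
expanded through its "T" or "F" categorization. The feature structure at the
term node contains the Truth-assignment value together with the merged
Assign values for X_i, X_j, and X_k.]

Figure 5. The LFG Feature Propagation Machinery is Used to Percolate
Feature Assignments from the Lexicon.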
(The features are passed transparently through the intervening
Y_i nodes via the LFG "copy" device, (↑ = ↓);
this simply means that all the features of the node below the node to
which the "copy" up-and-down arrows are attached are to be
the same as those of the node above the up-and-down arrows.)
It is plain that this mechanism mimics the assignment of values to literals
required by the satisfiability problem.
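
To make the lexical entries and their percolation to the term node concrete,
here is a minimal sketch. It rests on several assumptions introduced only for
illustration: literals are encoded as integers (k for X_k, -k for its
complement), feature structures are plain dictionaries, and no variable is
repeated within a single term, so clash detection is left out.

```python
def lexical_entry(lit, choose_true):
    """Features contributed by one literal under its 'T' or 'F' categorization.

    For X_i:  the 'T' entry sets Assign X_i = T, the 'F' entry Assign X_i = F.
    For the complement the Assign values are reversed (the dual entries).
    Only the 'T' categorization supplies the Truth-assignment feature.
    """
    value = choose_true if lit > 0 else not choose_true
    entry = {"Assign": {abs(lit): value}}
    if choose_true:
        entry["Truth-assignment"] = True
    return entry


def term_features(term, choices):
    """Merge the three entries of a term into the structure at the term node."""
    features = {"Assign": {}}
    for lit, choose_true in zip(term, choices):
        entry = lexical_entry(lit, choose_true)
        features["Assign"].update(entry["Assign"])  # assumes distinct variables
        if "Truth-assignment" in entry:
            features["Truth-assignment"] = True
    return features


print(term_features((2, -3, -7), (False, True, False)))
```

If all three literals take the "F" categorization, the resulting structure
simply has no Truth-assignment feature at all; nothing marks it as false.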
(2) Co-ordination of assignments: One must also guarantee that the X_i value
assigned at one place in the tree is not contradicted by an X_i or ¬X_i elsewhere.
To ensure this, we use the LFG co-occurrence agreement machinery: the
Assign feature-bundle is passed up from each term T_i to the highest node in
the parse tree (one simply adds the (↑ = ↓) notation to each T_i rule in order to
indicate this). The Assign feature at this node will thus contain the union of
all Assign feature bundles passed up by all terms. If any X_i values conflict,
then the resulting structure is judged ill-formed. Thus, only compatible X_i
assignments are well-formed:
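[Figure: two lexical entries under different terms contribute
(↑ Assign Xi) = T and (↑ Assign Xi) = F; the Assign feature merged at the top
node would need Xi = T and Xi = F at once, a clash.]
Figure 6. The Feature Compatibility Machinery of LFG can Force
Assignments to be Co-ordinated Across Terms.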
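The compatibility check can be illustrated with a small Python sketch. It
assumes that each term's Assign bundle is a dictionary from variable names to
"T" or "F"; the function name unify_assign is hypothetical and stands in for
the merging that LFG performs at the topmost node.

```python
def unify_assign(bundles):
    """Union of the Assign bundles of all terms; None signals a clash."""
    root = {}
    for bundle in bundles:
        for var, val in bundle.items():
            # setdefault keeps the first value seen; a different later value clashes.
            if root.setdefault(var, val) != val:
                return None
    return root

compatible = [{"X1": "T", "X2": "F"}, {"X2": "F", "X3": "T"}]
clashing = [{"X1": "T"}, {"X1": "F"}]
print(unify_assign(compatible))  # merged bundle covering X1, X2, X3
print(unify_assign(clashing))    # None
```

Only sets of bundles that agree on every shared variable produce a merged
structure, which is the sense in which the assignments are co-ordinated.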
(3) Preservation of satisfying assignments. Finally, one has to reproduce the
conjunctive character of the 3-CNF problem -- that is, a sentence is satisfiable
(well-formed) iff each term has at least one literal assigned the value "T".
Part of the disjunctive character of the problem has already been encoded in
the feature propagation machinery presented so far: if at least one Xi in a
term Tj expands to the lexical entry XiT, then the Truth-assignment feature
gets the value T. This is just as desired. If one, two, or three of the literals Xi
in a term select XiT, then Tj's Truth-assignment feature is T, and the analysis
is well-formed. But how do we rule out the case where all three Xi's in a term
select the "F" path, XiF? And how do we ensure that all terms have at least
one T below them?
Both of these problems can be solved by resorting to the LFG functional
completeness constraint. The trick will be to add a Pred feature to a
"dummy" node attached to each term; the sole purpose of this feature will be
to refer to the feature Truth-assignment, just as the predicate template for the
transitive verb kiss mentions the feature Object. Since an analysis is not
well-formed if the "grammatical relations" a Pred mentions are not filled in
from somewhere, this will have the effect of forcing the Truth-assignment
feature to get filled in for every term. Since the "F" lexical entry does not
have a Truth-assignment value, if all the Xi in a term triple select the XiF
path (all the literals are "F") then no Truth-assignment feature is ever picked
up from the lexical entries, and that term never gets a Truth-assignment
feature. This violates what the predicate template demands, and so the whole
analysis is thrown out. (The ill-formedness is exactly analogous to the case
where a transitive verb never gets an Object.) Since this condition is applied
to each term, we have now guaranteed that each term must have at least one
literal below it that selects the "T" path -- just as desired. To actually add the
new predicate template, one simply adds a new (but dummy) branch to each
term Ti, with the appropriate predicate constraint attached to it:
[Figure: a term Ti with an added Dummy2 branch beside its literals. The lexical
entry 'dummy2' contributes the feature Pred: 'dummy2<(↑ Truth-assignment)>'
to Ti, and an XiT entry supplies (↑ Truth-assignment) = T.]
Figure 7. Predicates Can be Used to Force at least one T Per Term.
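A minimal Python sketch of this use of the completeness constraint follows. It
assumes a term's feature structure is a dictionary and that a Pred records the
list of features it governs; build_term and is_complete are hypothetical names
chosen for illustration.

```python
def build_term(picks):
    """Feature structure of a term whose three literals chose the entries in picks."""
    node = {"Pred": {"name": "dummy2", "governs": ["Truth-assignment"]}}
    if "T" in picks:
        # At least one literal took its "T" entry, which supplies the feature.
        node["Truth-assignment"] = "T"
    return node

def is_complete(node):
    """Completeness: every feature the Pred mentions must be filled in."""
    return all(name in node for name in node["Pred"]["governs"])

print(is_complete(build_term(["F", "T", "F"])))  # True
print(is_complete(build_term(["F", "F", "F"])))  # False
```

A term in which every literal chose "F" never acquires Truth-assignment, so the
dummy predicate's demand goes unmet and the term is rejected.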
There is a final subtle point here: one must prevent the Pred and
Truth-assignment features for each term from being passed up to the head
"S" node. The reason is that if these features were passed up, then since the
LFG machinery automatically merges the values of any features with the
same name at the topmost node of the parse tree, the LFG machinery would
form the union of the feature values for Pred and Truth-assignment over all
terms in the analysis tree. The result would be that if any term had at least
one "T" (hence satisfying the Truth-assignment predicate template in at least
one term), then the Pred and Truth-assignment would get filled in at the
topmost node as well. The string below would be well-formed if at least one
term were "T", and this would amount to a disjunction of disjunctions (an
"OR" of "OR"s), not quite what is sought. To eliminate this possibility, one
must add a final trick: each term Ti is given separate Predicate,
Truth-assignment, and Assign features, but only the Assign feature is
propagated to the highest node in the parse tree as such. In contrast, the
Predicate and Truth-assignment features for each term are kept "protected"
from merger by storing them under separate feature headings labelled
T1...Tn. The means by which just the Assign feature bundle is lifted out is
the LFG analogue of the natural language phenomenon of Subject or Object
"control", whereby just the features of the Subject or Object of a lower clause
are lifted out of the lower clause to become the Subject or Object of a matrix
sentence; the remaining features stay unmergeable because they stay
protected behind the individually labelled terms.
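The difference between the two designs can be made concrete with a short
Python sketch. It abstracts away from feature structures entirely and only
compares the two acceptance conditions; both function names are hypothetical.

```python
def root_merge_accepts(terms):
    """Flawed variant: Truth-assignment from all terms merged at the S node."""
    return any("T" in picks for picks in terms)   # an OR of ORs

def protected_accepts(terms):
    """Intended variant: each term keeps its own Truth-assignment under T1...Tn."""
    return all("T" in picks for picks in terms)   # an AND of ORs

# The second term has no literal that chose the "T" entry.
terms = [["T", "F", "F"], ["F", "F", "F"]]
print(root_merge_accepts(terms), protected_accepts(terms))  # True False
```

With merging at the root, the unsatisfied second term is masked by the first;
keeping the features protected per term restores the conjunctive reading.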
To actually "implement" this in an LFG one can add two new branches to
each Term expansion in the base context-free grammar, as well as two
"control" equation specifications that do the actual work of lifting the
features from a lower clause to the matrix sentence:
Natural language case (from [8], pp. 43-45):
The girl persuaded the baby to go.
(part of the) lexical entry for
persuaded:
V (↑ VCOMP Subject) = (↑ Object)
The notation (↑ VCOMP Subject) = (↑ Object) -- dubbed a "control
equation" -- means that the features of the Object above the V(erb) node are
to be the same as those of the features of the Subject of the verb complement
(VCOMP). Hence the top-most node of the parse tree eventually has a
feature bundle something like:
Subject: {bundle of features for NP subject "the girl"}
Predicate: 'persuade<(↑ Subject)(↑ Object)(↑ VCOMP)>'
Object: {bundle of features for NP Object "the baby"}   <-- COPIED
Verb Complement ("VCOMP"):
    Subject: {bundle of features for NP subject "the baby"}
    Predicate: 'go<(↑ Subject)>'
Note how the Object features have been copied from the Subject
features of the Verb Complement, via the notation described above, but
the Predicate features of the Verb Complement were left behind.
The satisfiability analogue of this machinery is almost identical:
Phrase structure tree:
[Figure: node Ai dominating Ti and TiCOMP, with an added dummy branch.]
One now attaches a "control equation" to the Ai node that forces the Assign
feature bundle from the TiCOMP side to be lifted up to get merged into the
Assign feature bundle of the Ti node (and then, in turn, to become merged at
the topmost node of the tree by the usual full copy up-and-down arrows):
(↑ TiCOMP Assign) = (↑ Assign)
Note how this is just like the copying of the Subject features of a Verb
Complement into the Object position of a matrix clause.
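The effect of such a control equation can be sketched in Python as follows.
The dictionary encoding and the name lift_assign are assumptions made for
illustration; the point is only that Assign becomes shared with the node
above while the other features stay behind under the term's own label.

```python
def lift_assign(label, lower):
    """Sketch of (↑ label Assign) = (↑ Assign) for a lower structure `lower`."""
    upper = {label: lower}             # the lower clause stays under its own label
    upper["Assign"] = lower["Assign"]  # one shared value reachable by two paths
    return upper

t1comp = {"Assign": {"X1": "T", "X2": "F"},
          "Truth-assignment": "T",
          "Pred": "dummy2<(↑ Truth-assignment)>"}
t1 = lift_assign("T1COMP", t1comp)

print(sorted(t1))                             # ['Assign', 'T1COMP']
print(t1["Assign"] is t1["T1COMP"]["Assign"])  # True
```

Pred and Truth-assignment remain reachable only through the T1COMP label, so a
later merge of same-named features at the top node sees only Assign.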
4. RELEVANCE OF COMPLEXITY RESULTS AND CONCLUSIONS
The demonstration of the previous section shows that LFGs have enough
power to "simulate" a probably computationally intractable problem. But
what are we to make of this result? On the positive side, a complexity result
such as this one places the LFG theory more precisely in the hierarchy of
complexity classes. If we conjecture, as seems reasonable, that LFG language
recognition is actually in the class NP (that is, LFG recognition can be done
by a non-deterministic Turing machine in polynomial time), then LFG
language recognition is NP-complete. (This conjecture seems reasonable
because a non-deterministic Turing machine should be able to "guess" all
feature propagation solutions using its non-deterministic power - including
any "long-distance" binding solutions, an LFG device not discussed here.
Since checking candidate solutions is quite rapid - it can be done in n^2 time
or less, as described in [8] - recognition should be possible in polynomial
time on such a machine.) Comparing this result to other known language
classes, note that context-sensitive language recognition is in the class
polynomial space ("PSPACE"), since (non-deterministic) linear bounded
automata accept exactly the class of context-sensitive languages.
(Non-deterministic and deterministic polynomial space classes collapse
together, because of Savitch's well-known result [9] that any function
computable in non-deterministic space N can be computed in deterministic
space N^2.) Furthermore, the class NP is clearly a subset of PSPACE (since if
a function uses Space N, it must use at least Time N), and it is suspected, but
not known for certain, that NP is a proper subset of PSPACE. (This being a
form of the P=NP question once again.) Our conclusion is that it is likely
that LFGs generate a proper subset of the context-sensitive languages. (In [8]
it is shown that this includes some strictly context-sensitive languages.) It is
interesting that several other "natural" extensions of the context-free
languages - notably, the class of languages generated by the so-called
"indexed grammars" - also generate a subset of the context-sensitive
languages, including those strictly context-sensitive languages shown to be
generable by LFGs in [8], but are provably NP-complete (see [2] for proofs).
Indeed, a cursory look at the power of the indexed grammars at least suggests
that they might subsume the machinery of the LFG theory; this would be a
good conjecture to check.
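The "guess and check" argument can be illustrated for the grammars built in
the reduction. The Python sketch below is not the general LFG checking
procedure of [8]; it assumes a formula encoded as a list of terms, each a list
of (variable, negated) pairs, and a guessed list of lexical choices per
literal. The name well_formed is hypothetical.

```python
def well_formed(formula, choices):
    """Check one guessed analysis in a single pass over the formula."""
    assign = {}
    for term, picks in zip(formula, choices):
        if "T" not in picks:
            return False  # completeness fails: no Truth-assignment for this term
        for (var, negated), pick in zip(term, picks):
            value = (pick == "T") != negated  # value the entry gives the variable
            if assign.setdefault(var, value) != value:
                return False  # clash in the merged Assign bundle
    return True

# (X1 or not-X2 or X3) and (not-X1 or X2 or X3)
formula = [[("X1", False), ("X2", True), ("X3", False)],
           [("X1", True), ("X2", False), ("X3", False)]]
print(well_formed(formula, [["T", "F", "F"], ["F", "T", "F"]]))  # True
print(well_formed(formula, [["T", "F", "F"], ["T", "F", "F"]]))  # False
```

Each guess is checked in time proportional to the length of the formula; the
difficulty lies entirely in the number of guesses a deterministic machine
might have to try.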
On the other side of the coin, how might one restrict LFG theory further so
as to avoid possible intractability? Several escape hatches immediately come
to mind; these will simply be listed here. Note that all of these "fixes" have
the effect of adding additional constraints to further restrict the LFG theory.
1. Rule out "worst case" languages as linguistically irrelevant.
The probable computational intractability arises because co-occurrence
restrictions (compatible assignment of Xi's) can be forced across arbitrary
distances in the terminal string in conjunction with lexical ambiguity for each
terminal item. If some device can be found in natural languages that filters
out or removes such ambiguity locally (so that the choice of whether an item
is "T" or "F" never depends on other items arbitrarily far away in the
terminal string), or if natural languages never employ such kinds of
co-occurrence restrictions, then the reduction is theoretically relevant, but
linguistically irrelevant. Note that such a finding would be a positive
discovery, since one would be able to further restrict the LFG theory in its
attempt to characterize all and only the natural languages. This discovery
would be on a par with, for example, Peters and Ritchie's observation that
although the context-sensitive phrase structure rules formally advanced in
linguistic theory have the power to generate non-context-free languages, that
power has apparently never been used in immediate constituent analysis [11].
2. Add "locality principles" for recognition (or parsing).
One could simply stipulate that LFG languages meet some condition known
to ensure efficient recognizability, e.g., Knuth's [7] LR(k) restriction, suitably
extended to the case of context-sensitive languages. (See [10] for more
3. Restrict the lexicon.
The reduction depends crucially upon having an infinite stock of lexical items
and an infinite number of features with which to label them -- several for
each literal Xi. This is necessary because as CNF formulas grow larger and
larger, the number of literals can grow arbitrarily large. If, for whatever
reason, the stock of lexical items or feature labels is finite, then the reduction
method must fail after a certain point. This restriction seems ad hoc in the
case of lexical items, but perhaps less so in the case of features. (Speculating,
perhaps features require "grounding" in terms of other language/cognitive
sub-systems -- e.g., a feature might be required to be one of a finite number
of primitive "basis" elements of a hypothetical conceptual or sensori-motor
cognitive system.)
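Why a finite stock of features blocks the reduction can be seen from a small
Python sketch. It assumes the same (variable, negated) encoding of literals
used for illustration here, with a fixed list of k variables; the function
name satisfiable_fixed_vars is hypothetical. With k fixed there are at most
2^k assignments to try, a constant that does not grow with the formula.

```python
from itertools import product

def satisfiable_fixed_vars(formula, variables):
    """Brute force over all assignments to a fixed, finite list of variables."""
    for values in product([True, False], repeat=len(variables)):
        assign = dict(zip(variables, values))
        # A literal (var, negated) is true when assign[var] differs from negated.
        if all(any(assign[var] != negated for var, negated in term)
               for term in formula):
            return True
    return False

variables = ["X1", "X2"]  # the whole (finite) stock of variables
sat = [[("X1", False), ("X2", True), ("X2", True)]]
unsat = [[("X1", False)] * 3, [("X1", True)] * 3]
print(satisfiable_fixed_vars(sat, variables))    # True
print(satisfiable_fixed_vars(unsat, variables))  # False
```

Each pass over the formula is linear in its length, so with a bounded number
of variables the whole search takes time linear in the formula, and the
source of intractability disappears.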
ACKNOWLEDGEMENTS
I would like to thank Ron Kaplan, Ray Perrault, Christos Papadimitriou, and
particularly Stanley Peters for various discussions about the contents of this
paper.
This report describes research done at the Artificial Intelligence Laboratory
of the Massachusetts Institute of Technology. Support for the Laboratory's
artificial intelligence research is provided in part by the Office of Naval
Research under Office of Naval Research contract N00014-80-C-0508.
REFERENCES
[1] Peters, S. and Ritchie, R. "On the generative power of transformational
grammars." Information Sciences 6, 1973, pp. 49-83.
[2] Rounds, W. "Complexity of recognition in intermediate-level languages,"
Proceedings of the 14th Ann. Symp. on Switching Theory and Automata,
1973.
[3] Rounds, W. "A grammatical characterization of exponential-time
languages," Proceedings of the 16th Ann. Symp. on Switching Theory and
Automata, 1975, pp. 135-143.
[4] Chomsky, N. Rules and Representations, New York: Columbia University
Press, 1980.
[5] Berwick, R. and Weinberg, A. The Role of Grammars in Models of
Language Use, unpublished MIT report, forthcoming, 1981.
[6] Marcus, M. A Theory of Syntactic Recognition for Natural Language,
Cambridge, MA: MIT Press, 1980.
[7] Knuth, D. "On the translation of languages from left to right,"
Information and Control, 8, 1965, pp. 607-639.
[8] Kaplan, R. and Bresnan, J. Lexical-functional Grammar: A Formal System
for Grammatical Representation, Cambridge, MA: MIT Cognitive Science
Occasional Paper #13, 1981. (Also forthcoming in Bresnan, ed., The Mental
Representation of Grammatical Relations, Cambridge, MA: MIT Press, 1981.)
[9] Hopcroft, J. and Ullman, J. Introduction to Automata Theory, Languages,
and Computation, Reading, MA: Addison-Wesley, 1979.
[10] Berwick, R. Locality Principles and the Acquisition of Syntactic
Knowledge, MIT PhD. dissertation, 1981 forthcoming.
[11] Peters, S. and Ritchie, R. "Context-sensitive immediate constituent
analysis: context-free languages revisited," Mathematical Systems Theory, 6:4,
1973, pp. 324-333.
