<?xml version="1.0" standalone="yes"?>
<Paper uid="J82-3001">
  <Title>Computational Complexity and Lexical-Functional Grammar</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 The phrase "input/output equivalence" simply means that
</SectionTitle>
    <Paragraph position="0"> the two systems - the linguistic grammar and the heuristic principles - produce the same (surface string, underlying structure) pairs.</Paragraph>
    <Paragraph position="1"> Note that the &amp;quot;internal constitution&amp;quot; of the two systems could be wildly different. The intuitive notion of &amp;quot;embedding a linguistic theory into a model of language use&amp;quot; as it is generally construed is much stronger than this, since it implies that the parsing system follows some (perhaps all) of the same operating principles as the linguistic system, and makes reference in its operation to the same system of rules. This intuitive description can be sharpened considerably. See Berwick and Weinberg 1983 for a more detailed discussion of &amp;quot;transparency&amp;quot; as it relates to the embeddability of a linguistic theory in a model of language use, in this case, a model of parsing.</Paragraph>
    <Paragraph position="2"> Copyright 1982 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the Journal reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission. 0362-613X/82/030097-13 $03.00 American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982 97 Robert C. Berwick Computational Complexity and LexicaI-Functional Grammar system can generate, and so can tell us whether the linguistic system is in principle descriptively adequate.</Paragraph>
    <Paragraph position="3"> This method of argument was used in Chomsky's original rejection of finite-state languages as an adequate characterization of human linguistic competence. Second, as mentioned, the resource bound on recognition given by a complexity-theoretic analysis tells us how long recognition will take in the worst possible case.</Paragraph>
    <Paragraph position="4"> Since unrestricted TGs can generate computationally &amp;quot;hard&amp;quot; languages, then plainly, in order to make TGs efficiently parsable, one must supply additional restrictions. These could be either modifications to the theory of TG itself, or constraints on the parsing mechanism. For example, the current theory of TG (see Chomsky 1981) contains several restrictions on the way in which displaced constituents such as wh-phrases may be linked to their &amp;quot;canonical&amp;quot; position in predicate-argument structure. (E.g., Who in Who did Bill kiss is assumed to be linked to a canonical argument position after the verb kiss.) As an example of a constraint on the parsing mechanism, one could proceed as did Marcus 1980, and posit constraints dictating that TG-generated languages must have parsers that meet certain &amp;quot;locality conditions&amp;quot;. 4 For instance, the Marcus constraints amount to an extension of Knuth's 1965 LR(k) locality condition to a (restricted) version of a two-stack deterministic push-down automaton. 5 Recently, a new theory of grammar has been advanced with the explicitly stated aim of meeting the dual demands of learnability and parsability - the Lexical-Functional Grammars (LFGs) of Kaplan and Bresnan 1981. The theory of Lexical-Functional Grammar is claimed to be at least as descriptively adequate as Transformational Grammar, if not more so. Moreover, it is claimed to have none of TG's com4 It is important not to confuse the requirement that TG-generated languages have parsers that meet certain constraints with the claim that such parsers transparently embed TGs. As stated, the only requirement is one of weak input/output equivalence - i.e., that the parser construct the same (surface string, underlying representation) pairs as the TG. Actually, one can show that a modified Marcus parsing system goes beyond this requirement and operates according to the same principles as the recent transformational theory of Chomsky. That is, such a modified Marcus parser makes reference to the same base constraints and representational units as the linguistic theory. Since it abides by the same rules and representations as TG, one is justified in claiming that the model embeds a TG. Note that the Marcus parser does not mimic earlier theories of TG (as presented in Aspects of the Theory of Syntax); there is no rule-for-rule correspondence between an Aspects grammar and the rules of the Marcus parser. But neither is there a rule-for-rule correspondence between modern theories of TG and the Aspects theory. For example, there is no longer a distinct rule of &amp;quot;passive&amp;quot; or &amp;quot;dative movement&amp;quot;. A detailed demonstration of this claim would go far beyond the purpose of this paper. See Berwick and Weinberg forthcoming.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 The possible need for LR(k)-like restrictions in order to
</SectionTitle>
    <Paragraph position="0"> ensure efficient processability was also suggested by Rounds 1973.</Paragraph>
    <Paragraph position="1"> putational unruliness, in the sense that it is claimed that there is a &amp;quot;natural&amp;quot; embedding of an LFG into a parsing mechanism (a performance model) that accounts for human sentence processing behavior. In LFG, there are no transformations (as classically described); the work formerly ascribed to transformations such as &amp;quot;passive&amp;quot; is shouldered by information stored in lexical entries associated with lexical items.</Paragraph>
    <Paragraph position="2"> The underlying representation of surface strings that is built is also different from the deep structures of classical transformational theory; the representation makes reference to functionally defined notions of grammatical terms like &amp;quot;Subject&amp;quot;, rather than defining them structurally, as was done in classical transformation theory. The elimination of transformational power and the use of a different kind of underlying representation for sentences naturally gives rise to the hope that a lexical-functional system would be computationally simpler than a transformational one.</Paragraph>
    <Paragraph position="3"> An interesting question then is to determine, as has already been done for the case of certain brands of Transformational Grammar, just what the &amp;quot;worst case&amp;quot; computational complexity for the recognition of LFG languages is. If the recognition time complexity for languages generated by the basic LFG theory can be as complex as that for languages generated by a moderately restricted transformational system, then presumably LFG will also have to add additional constraints, beyond those provided in its basic theory, in order to ensure efficient parsability. Just as with transformational theories, these could be constraints on either the theory or its performance model realization. null The main result of this paper is to show that certain Lexical-Functional Grammars can generate languages whose recognition time is very likely computationally intractable, at least according to our current understanding of algorithmic complexity. Briefly, the demonstration proceeds by showing how a problem that is widely conjectured to be computationally difficult - namely, whether there exists an assignment of l's and O's (or &amp;quot;T'&amp;quot;s and &amp;quot;F'&amp;quot;s) to the atoms of a Boolean formula in conjunctive normal form that makes the formula evaluate to &amp;quot;1&amp;quot; (or &amp;quot;true&amp;quot;) - can be re-expressed as the problem of recognizing whether a particular string is or is not a member of the language generated by a certain Lexical-Functional Grammar. This &amp;quot;reduction&amp;quot; shows that in the worst case the recognition of LFG languages can be just as hard as the original Boolean satisfiability problem.</Paragraph>
    <Paragraph position="4"> Since it is widely conjectured that there cannot be a polynomial-time algorithm for satisfiability (the problem is NP-complete), there cannot be a polynomial-time recognition algorithm for LFGs in general either. Note that this results sharpens that in Kaplan and Bresnan 1981; there it is shown only that LFGs 98 American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982 Robert C. Berwick Computational Complexity and LexicaI-Functional Grammar (weakly) generate some subset of the class of context-sensitive languages, and, therefore, in the worst case, exponential time is known to be sufficient (though not necessary) to recognize any LFG language. The result in Kaplan and Bresnan 1981 therefore does not address the question of how much time, in the worst case, is necessary to recognize LFG languages. 6 The result of this paper indicates that in the worst case more than polynomial time will probably be necessary.</Paragraph>
    <Paragraph position="5"> (The reason for the hedge &amp;quot;probably&amp;quot; will become apparent below; it hinges upon the central unsolved conjecture of current complexity theory.) In short then, this result places the LFG languages more precisely in the complexity hierarchy of languages.</Paragraph>
    <Paragraph position="6"> It also turns out to be instructive to inquire into just why a lexical-functional approach can turn out to be computationally difficult, and how computational tractability may be guaranteed. Advocates of lexical-functional theories may have thought (and some have explicitly stated) that the banishment of transformations is a computationally wise move because transformations are computationally costly. Eliminate the transformations, so this causal argument goes, and one has eliminated all computational problems. Intriguingly though, when one examines the proof to be given below, the ability to express co-occurrence constraints over arbitrary distances across terminal tokens in a string (as in Subject-Verb number agreement), when coupled with the possibility of alternative lexical entries, seems to be all that is required to make the recognition of LFG languages intractable.</Paragraph>
    <Paragraph position="7"> This leaves the question posed in the opening paragraph: just what sorts of constraints on natural languages are required in order to ensure efficient parsability? As it turns out, even though general LFGs may well be computationally intractable, it is easy to imagine a variety of additional constraints for LFG theory that provide a way to avoid this problem. All of these additional restrictions amount to making the LFG theory more restricted, in such a way that the reduction argument cannot be made to work. For example, one effective restriction is to stipulate that there can only be a finite stock of features with which to label lexical items. In any case, the moral of the story is an unsurprising one: specificity and constraints can absolve a theory of grammar from computational intractability.</Paragraph>
    <Paragraph position="8"> What may be more surprising is that the requisite locality constraints seem to be useful for a variety of theories of grammar, from Transformational Grammar to Lexical-Functional Grammar.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. A Review of Reduction Arguments
</SectionTitle>
    <Paragraph position="0"> The demonstration of the computational complexity of LFGs relies upon the standard complexity-theoretic technique of reduction. Because this method may be unfamiliar to many readers, a short review is presented immediately below; this is followed by a sketch of the reduction proper.</Paragraph>
    <Paragraph position="1"> The idea behind the reduction technique is to take a difficult problem, in this case the problem of determining the satisfiability of Boolean formulas in conjunctive normal form (CNF), and show that the problem can be quickly transformed into the problem whose complexity remains to be determined, in this case the problem of deciding whether a given string is in the language generated by a given Lexical-Functional Grammar. Before the reduction proper is reviewed, some definitional groundwork must be presented. A Boolean formula in conjunctive normal form is a conjunction or disjunction of literals, where a literal is just an atom (like Xi) or the negation of an atom (Xi). A formula is satisfiable just in case there exists some assignment of T's and F's (or l's and O's) to the atoms of a formula that forces the evaluation of the entire formula to be T (true); otherwise, the formula is said to be unsatisfiable. For example, the following formula is satisfiable: (X2VX3VX7)A(X1VX2VX4)A(X3VX1VX7) since the assignment of X2=T, X3=F, X7=F, XI=T, and X4=F makes the whole formula evaluate to &amp;quot;T&amp;quot;. The reduction in the proof below uses a somewhat more restricted format where every term comprises the disjunction of exactly three literals, so-called 3-CNF (or &amp;quot;3-SAT&amp;quot;). 7 How does a reduction show that the LFG recognition problem must be at least as hard (computationally speaking) as the original problem of Boolean satisfiability? The answer is that any decision procedure for LFG recognition could be used as a correspondingly fast decision procedure for 3-CNF, as follows: (1) Given an instance of a 3-CNF problem (the question of whether there exists a satisfying assignment for a given formula in 3-CNF), apply the transformational algorithm provided by the reduction; this algorithm is itself assumed to execute quickly, in polynomial time or less. The algorithm outputs a corresponding LFG decision problem, namely: (i) a Lexical-Functional Grammar and (ii) a string to be tested for membership in the language generated by the LFG. The LFG recognition problem represents or mimics the decision problem for 3-CNF in the sense that the &amp;quot;yes&amp;quot; and &amp;quot;no&amp;quot; answers to both satisfiability problem and member- null 6 This result can also be established by showing that LFGs can generate at least all the indexed languages as defined by Aho 1968. See Berwick 1981 for details.</Paragraph>
    <Paragraph position="2"> 7 This restriction entails no loss of generality (see Hopcroft and Ullman 1979, Chapter 12), since this restricted format can be easily shown to have the power to express any CNF formula. American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982 99 Robert C. Berwick Computational Complexity and LexicaI-Functional Grammar  ship problem must coincide (if there is a satisfying assignment, then the corresponding LFG decision problem should give a &amp;quot;yes&amp;quot; answer, etc.). (2) Solve the LFG decision problem - the string-LFG pair - output by Step 1. If the string is in the LFG language, the original formula is satisfiable; if not, it is unsatisfiable. 8 To see how a reduction can tell us something about the &amp;quot;worst case&amp;quot; time or space complexity required to recognize whether a string is or is not in an LFG language, suppose for example that the decision procedure for determining whether a string is in an LFG language took only polynomial time (that is, takes time n k on a deterministic Turing machine, for some integer k, where n is the length of the input string). Then, since the composition of two polynomial algorithms can be readily shown to take only polynomial time (see Hopcroft and Ullman 1979, Chapter 12), the entire process sketched above, from input of the CNF formula to the decision about its satisfiability, will take only polynomial time.</Paragraph>
    <Paragraph position="3"> However, CNF (or 3-CNF) has no known polynomial time algorithm, and indeed, it is considered exceedingly unlikely that one could exist. Therefore, it is just as unlikely that LFG recognition could be done (in general) in polynomial time. What the reduction shows is that LFG recognition is at least as hard as the problem of CNF. Since the latter problem is widely considered to be difficult, the former inherits the difficulty. null The theory of computational complexity has a  much more compact term for problems like CNF: CNF is NP-complete. This label is easily deciphered: (1) CNF satisfiability is in the class NP. That is, the problem of determining whether an arbitrary CNF  formula is satisfiable can be computed by a non-deterministic Turing machine in polynomial time. (Hence the abbreviation &amp;quot;NP&amp;quot;, for &amp;quot;nondeterministic polynomial&amp;quot;. To see that CNF is indeed in the class NP, note that one can simply guess all possible combinations of truth assignments to literals, and check each guess in polynomial time.) (2) CNF is complete. That is, all other problems in the class NP can be quickly reduced to some CNF formula. (Roughly, one shows that Boolean for8 Note that the grammar and string so constructed depend upon just what formula is under analysis; that is, for each different CNF formula, the procedure presented above outputs a different LFG grammar and string combination. In the LFG case it is important to remember that &amp;quot;grammar&amp;quot; really means &amp;quot;grammar plus lexicon&amp;quot; - as one might expect in a lexically-based theory. S. Peters has observed that a slightly different reduction allows one to keep most of the grammar fixed across' all possible input formulas, constructing only different-sized lexicons for each different CNF formula. Details are provided below.</Paragraph>
    <Paragraph position="4"> mulas can be used to &amp;quot;simulate&amp;quot; any valid computation of a non-deterministic Turning machine.) Since the class of problems solvable in polynomial time on a deterministic Turing machine (conventionally notated, P) is trivially contained in the class so solved by a non-deterministic Turing machine, the class P must be a subset of the class NP. A wellknown, well-studied, and still open question is whether the class P is a proper subset of the class NP. In other words, are there problems solvable in non-deterministic polynomial time that cannot be solved in deterministic polynomial time? Because all of the several thousand NP-complete problems now catalogued have so far proved recalcitrant to deterministic polynomial time solution, it is widely held that P must indeed be a proper subset of NP, and therefore that the best possible algorithms for solving NP-complete problems must take more than polynomial time. (In general, the algorithms now known for such problems involve exponential combinatorial search, in one fashion or another; these are essentially methods that do no better than to brutally simulate - deterministically, of course - a non-deterministic machine that &amp;quot;guesses&amp;quot; possible answers.) To repeat the force of the reduction argument then, if all LFG recognition problems were solvable in polynomial time, then the ability to quickly reduce CNF formulas to LFG recognition problems would imply that all NP-complete problems would be solvable in polynomial time, and that the class P = the class NP.</Paragraph>
    <Paragraph position="5"> This possibility seems extremely remote. Hence, our assumption that there is a fast (general) procedure for recognizing whether a string is or is not in the language generated by an arbitrary LFG must be false. In the terminology of complexity theory, LFG recognition must be NP-hard - &amp;quot;as hard as&amp;quot; any other NP problem, including the NP-complete problems. This means only that LFG recognition is at least as hard as other NP-complete problems - it could still be more difficult (lie in some class that contains the class NP). If one could also show that the languages generated by LFGs are in the class NP, then LFGs would be shown to be NP-complete. This paper stops short of proving this last claim, but simply conjectures that LFGs are in the  class NP.</Paragraph>
    <Paragraph position="6"> 3. A Sketch of the Reduction  To carry out this demonstration in detail one must explicitly describe the transformation procedure that takes as input a formula in CNF and outputs a corresponding LFG decision problem - a string to be tested for membership in a LFG language and the LFG itself.</Paragraph>
    <Paragraph position="7"> One must also show that this can be done quickly, in a number of steps proportional to (at most) the length of the original formula to some polynomial power.</Paragraph>
    <Paragraph position="8"> 100 American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982 Robert C. Berwick Computational Complexity and LexicaI-Functional Grammar One caveat is in order before embarking on a proof sketch of this reduction. The grammar that is output by the reduction procedure will not look very much like a grammar for a natural language, although the grammatical devices that will be employed will in every way be those that are an essential part of the LFG theory.9 In other words, although it is most unlikely that any natural language would encode the satisfiability problem (and hence be intractable) in just the manner outlined below, no &amp;quot;exotic&amp;quot; LFG machinery is used in the reduction. Indeed, some of the more powerful LFG notational formalisms - long-distance binding, existential and negative feature operators- have not been exploited. (An earlier proof made use of an existential operator in the feature machinery of LFG, but the reduction presented here does not.) To make good this demonstration one must set out just what the satisfiability problem is and what the decision problem for membership in an LFG language is. Recall that a formula in conjunctive normal form is satisfiable just in case every conjunctive term evaluates to true, that is, at least one literal in each term is true. The satisfiability problem is to find an assignment of T's and F's to the atoms at the bottom (note that complements of atoms are also permitted) such that the root node at the top gets the value &amp;quot;T&amp;quot; (for true). How can we get a Lexical-Functional Grammar to represent this problem? What we want is for satisfying assignments to correspond to well-formed sentences of some corresponding Lexical-Functional Grammar, and non-satisfying assignments to correspond to sentences that are not well-formed, according to the LFG, as indicated in Figure 1. Since one wants the satisfying/non-satisfying assignments of ahay particular formula to map over into well-formed/illformed sentences, one must obviously exploit the LFG machinery for capturing well-formedness conditions on sentences. To make the discussion clear to the reader will require a brief account of the LFG theory itself.</Paragraph>
    <Paragraph position="9"> satisfiable non-satisfiable formula formula sentence w IS sentence w IS NOT in LFG language L(G) in LFG language L(G)  Just as in a transformational theory, a Lexical-Functional Grammar associates with each generable surface string (sentence) a number of distinct repre9 These include feature agreement, the lexical analog of Sub-ject or Object &amp;quot;control&amp;quot;, lexical ambiguity, and a garden variety context-free base grammar.</Paragraph>
    <Paragraph position="10"> sentations. For our purposes here we need to focus on just two of these: the constituent structure of a sentence (its &amp;quot;c-structure&amp;quot;, roughly, a labeled bracketing of the surface string, annotated with certain feature complexes); and the functional structure of a sentence (its &amp;quot;f-structure&amp;quot;, roughly, a representation of the underlying predicate-argument structure of a sentence, described in terms of grammatical relations such as Subject and Object.) Unlike a Transformational Grammar, however, a Lexical-Functional Grammar does not generate surface sentences by first specifying an explicit, context-free deep structure followed by a series of categorially-based transformations. &amp;quot;Categorially-based&amp;quot; simply means that the transformations move constituents defined in terms of categories, like NP or PP.) Rather, predicate-argument structure is mapped directly into c-structure, on the basis of predicates that are grounded upon grammatical relations (like Subject and Object). The conditions for this mapping are provided by a set of so-called functional equations associated with the context-free rules for generating permissible c-structures, along with a set of conventions that in effect convert the functional equations into well-formedness predicates for cstructures. null In more detail, an LFG c-structure is generated by a base context-free ~rammar. A necessary condition for a sentence (considered as a string) to be in the language generated by a Lexical-Functional Grammar is that it can be generated by this base grammar; such a sentence is then said to have a well-formed constituent structure. For example, if the base rules included S=&gt;NP VP; VP=&gt;V NP, then (glossing over details of Noun Phrase rules) the sentence John kissed the baby would be well-formed but John the baby kissed would not. Note that this assumes, as usual, the existence of a lexicon that provides a categorization for each terminal item, e.g., that baby is of the category N, kissed is a V, etc. Importantly then, this well-formedness condition requires us to provide at least one legitimate parse tree for the candidate sentence that shows how it may be derived from the underlying LFG base context-free grammar. (There could be more than one legitimate tree if the underlying grammar is ambiguous.) Note further that the choice of categorization for a lexical item may be crucial. If baby was assumed to be of category V, then both sentences above would be ill-formed.</Paragraph>
    <Paragraph position="11"> Since the base grammar is context-free, there are well-known algorithms for checking the well-formedness of the strings it can generate in polynomial time. Intractability cannot arise on this score, then.</Paragraph>
    <Paragraph position="12"> A Lexical-Functi0nal Grammar consists of more than just a base context-free grammar, however. As mentioned, a second major component of the LFG theory is the provision for adding a set of so-called American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982 101 Robert C. Berwick Computational Complexity and LexicaI-Functional Grammar functional equations to the base context-free rules.</Paragraph>
    <Paragraph position="13"> The functional equations define an implicit f-structure associated with every c-structure, and this f-structure must itself be well-formed. Part of the linguistic role of f-structures is to account for the co-occurrence restrictions that are an obvious part of natural languages (e.g., Subject-Verb agreement).</Paragraph>
    <Paragraph position="14"> How exactly do the functional equations work? Their job is to specify how the f-structure of a sentence gets built. This is done by associating possibly complex features with lexical entries and with the non-terminals of specified context-free rules; these features have values. The features are pasted together under the direction of the functional equations to form f-structures associated with the sub-consIituents of the sentence; these (now possibly complex) f-structures are in turn assembled to form a master f-structure associated with the root node of the sentence.</Paragraph>
    <Paragraph position="15"> Note then that in this theory a &amp;quot;feature&amp;quot; can be something as simple as an atomic object that is binary valued; for example, a Subiect feature could be either plural or singular in value. But denominators can also have a range of values, and - more crucial for the purposes of the demonstration here - a feature can itself be a complex, hierarchically structured object that contains other features as sub-constituents. For example, the &amp;quot;feature&amp;quot; that eventually becomes associated with the root node of a sentence is in fact an f-structure that represents the full propositional structure of the sentence. Thus if the surface string was the sentence, The girl promised to kiss the baby, then the f-structure associated with the root node of the sentence is a complex &amp;quot;feature&amp;quot; that itself contains an embedded f-structure corresponding to the embedded proposition the girl to kiss the baby.</Paragraph>
    <Paragraph position="16"> As mentioned, well-formedness is also determined by functional equations, dictating (according to certain conventions) how feature complexes are to be assembled. By and large the f-structure complex at a node X is assembled compositionally in terms of the f-structure complexes of the nodes below it in the constituent structure tree. For example, the root node of a sentence will have an associated f-structure with Subject and Predicate sub-features. These structures are themselves complex - the entire Subject NP and Verb-Verb Complement structures, respectively. For instance, the Subject NP in turn has a sub-feature Number; the Predicate contains complex sub-features corresponding to the Verb and Verb Complements.</Paragraph>
    <Paragraph position="17"> The basic assembly directive is the notation (4=4). 1deg When attached to a particular node X, it states that the f-structure of the node above X is to share all the f-structure of the nodes below X. The effect is to merge and &amp;quot;pass up&amp;quot; all the f-structure values of the nodes below X to the node above X. One can also pass along just particular subfields of the f-structure below X by specifying a subfield on the right-hand side of the expansion rule. As an example, the notion (+=+Number) attached to a node X states that the f-structure of the node above X is to contain at least the value of the Number feature. (This &amp;quot;value&amp;quot; may itself be an f-structure.) Similarly, a particular sub-field of the f-structure above a node X may be specified by providing a subfield label on the left-hand side of the arrow notation. For example, the notation, (+Subject=4) means that the Subject subfield of the f-structure built at the node above X must contain the f-structure built below X.</Paragraph>
    <Paragraph position="18"> A basic constraint on f-structures is that the f-structure assembled at X must be uniquely determined; that is, it cannot contain a feature F 1 with conflicting values. This entails, for example, that the Subject sub-f-structure that is built at a root-S node cannot have a Number sub-field that is filled in from one place beneath with the value Singular and from another place with the value Plural. More generally, this restriction means that two or more f-structures that are &amp;quot;passed up&amp;quot; from below according to the dictates of an arrow notation at a single node above must be unifiable - any common sub-fields, no matter how hierarchically complex, must be mergeable without conflict.</Paragraph>
    <Paragraph position="19"> For example, consider Subject-Verb agreement and the sentence the baby is kissin~ John. The lexical entry for baby (considered as a Noun) might have the Number feature, with the value singular. The lexical entry for is might assert that the number feature of the Subiect above it in the parse tree must have the value singular, via the annotation (+Subject=singular) attached to the verb. Meanwhile, the feature values for Subject are automatically found by the annotation (+Subject=4) associated with the Noun Phrase portion of S=&gt;NP VP) that grabs whatever features it finds below the NP node and copies them up above to the S node. Thus the S node gets the Subject feature with whatever value it has passed from baby below namely, the value singular; this accords with the dictates of the verb is___z, and all is well. In contrast, in the sentence, th_.__~e boys in the band is kissin~ John, boys passes up the number value plural, and this clashes with the verb's constraint; as a result this sentence is judged ill-formed, as Figure 2 shows.</Paragraph>
    <Paragraph position="20"> 10 More generally, the assembly directive is specified via the notation (+featl=Sfeat2), where feat1 and feat2 are meta-variables specifying a subfield of the f-structure immediately above or below the node to which the the annotation is attached. If no field is given, then the entire f-structure is assumed. For example, the notation (+ Subject Number= 4) attached to a node X means that the Number subfield of the Subject subfield of the f-structure associated with the node above X is to be filled in with the value of the entire f-structure below X.</Paragraph>
    <Paragraph position="21">  It is important to note that the feature compatibility check requires (1) a particular constituent structure tree (a parse tree); and (2) an assignment of terminal items (words) to lexical categories - e.g., in the first Subject-Verb agreement example above, baby was assigned to the category N, a Noun. The tree is obviously required because the feature-checking machinery propagates values according to the links specified by the derivation tree; the assignment of terminal items to categories is crucial because in most cases the values of features are derived from those listed in the lexical entry for an item (as the value of the number feature was derived from the lexical entry for the Noun form of baby). One and the same terminal item can have two distinct lexical entries, corresponding to distinct lexical categorizations; for example, baby can be both a Noun and a Verb. If we had picked baby to be a Verb, and hence had adopted whatever features are associated with the Verb entry for baby to be propagated up the tree, then the string that was previously well-formed, th.__~e baby is kissin 8 John, would now be considered deviant. If a string is ill-formed under all possible derivation trees and assignments of features from possible lexical categorizations, than that string is not in the language generated by the LFG. The ability to have multiple derivation trees and lexical categorizations for one and the same terminal item plays a crucial role in the reduction proof: it is intended to capture the satisfiability problem of deciding whether to given an atom X i a value of &amp;quot;T&amp;quot; or &amp;quot;F&amp;quot;. Finally, LFG also provides a way to express the familiar patterning of grammatical relations (e.g., &amp;quot;Subject&amp;quot; and &amp;quot;Object&amp;quot;) found in natural language. For example, transitive verbs must have objects. This fact of life (expressed in an Aspects-style Transformational Grammar by subcategorization restrictions) is captured in LFG by specifying a so-called PRED (for predicate) feature with a Verb; the PRED can describe what grammatical relations like &amp;quot;Subject&amp;quot; and &amp;quot;Object&amp;quot; must be filled in after feature passing has taken place in order for the analysis to be wellformed. For instance, a transitive verb like kiss might have the pattern, kiss&lt;(Subject)(Obiect)&gt; , and thus demand that the Subject and Object (now considered to be &amp;quot;features&amp;quot;) have some value in the final analysis. The values for Subject and Object might of course be provided from some other branch of the parse tree, as provided by the feature propagation machinery; for example, the Object feature could be filled in from the Noun Phrase part of the VP expansion. See Figure 3.</Paragraph>
    <Paragraph position="22"> But if the Object were not filled in, then the analysis is declared functionally incomplete, and is ruled out. This device is used to cast out sentences such as th._ee baby kissed.</Paragraph>
    <Paragraph position="23"> So much for the LFG machinery that is required for the reduction proof. (There are additional capabilities in the LFG theory, such as long-distance binding, but these will not be called upon in the demonstration below.) What then does the LFG representation of the CNF satisfiability problem look like? Basically, there are three parts to the satisfiability problem that must be mimicked by the LFG: (1) the assignment of values to atoms, e.g., X2=&gt;&amp;quot;T&amp;quot;; X4=&gt;&amp;quot;F&amp;quot;; (2) the consistency of value assignments in the formula; e.g., the atom X 2 can appear in several different terms, but one is not allowed to assign it the value &amp;quot;T&amp;quot; in one term and the value &amp;quot;F&amp;quot; in another; and (3) the preservation of CNF satisfiability, in that a string will be in the LFG language to be defined just in case its associated CNF formula is satisfiable. Let us now go over how these components may be reproduced in an LFG, one by one.</Paragraph>
    <Paragraph position="24">  (1) Assignments: The input string to be tested  for membership in the LFG will simply be the original formula, sans parentheses and the operators A and V; the terminal items are thus just a string of Xi's. Recall that the job of checking the string for well-formedness involves finding a derivation tree for the string, solving the ancillary co-occurrence equations (by feature propagation), and checking for functional completeness.</Paragraph>
    <Paragraph position="25"> Now, the context-free grammar constructed by the transformation procedure will be set up so as to generate a virtual copy of the associated formula, down to the point where literals X i are assigned their value of &amp;quot;T&amp;quot; or &amp;quot;F&amp;quot;. If the original 3-CNF form had n terms, then denoting each by the symbol Ep, p=l, ..., n, this part of the grammar would look like the following:U</Paragraph>
    <Paragraph position="27"> The subscripts i, j, and k correspond to the actual subscripts in the original formula. Further, the Yi are not terminal items, but are non-terminals that will be expanded into one of the non-terminals T i or Fi.12 Note that so far there are no rules to extend the parse tree down to the level of terminal items, name the X i. The next step does this and at the same time adds the power to choose between &amp;quot;T&amp;quot; and &amp;quot;F&amp;quot; assignments to atoms. One adds to the context-free base grammar two productions deriving each terminal item X i, namely, Ti=&gt;X i and Fi--&gt;Xi, corresponding to an assignment of &amp;quot;T&amp;quot; or &amp;quot;F&amp;quot; to the atoms of the formula (it is important not to get confused here between the atoms of the formula - these are terminal elements in the Lexical-Functional Grammar - and the non-terminals of the grammar.) Plainly, one must also add the rules Yi=&gt;Ti\] Fi, for each i, and rules corresponding to the assignment of truth-values to the negations of literals, Ti=&gt;X i and Fi---&gt;X i. Note that these are not &amp;quot;exotic&amp;quot; LFG rules: exactly the same sort of rule is required in the baby case, i.e., N=&gt;baby or V=&gt;baby, corresponding to whether baby is a Noun or a Verb. Now, the lexical entries for the &amp;quot;Ti&amp;quot; categorization of X i will look very different from the &amp;quot;Fi&amp;quot; categorization of Xi, just as one might expect the N and V forms for baby to be different.</Paragraph>
    <Paragraph position="28"> Here is what the entries for the two categorizations of</Paragraph>
    <Paragraph position="30"> 11 The context-free base that is built depends upon the original CNF formula that is input, since the number of terms, n, varies from formula to formula. In Stanley Peters's improved version of the reduction proof \[personal communication\], the context-free base is fixed for all formulas with the rules:</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
S => S S'
</SectionTitle>
    <Paragraph position="0"> S'=&gt;T T T or T T F or T F F or T F T or ...</Paragraph>
    <Paragraph position="1"> (remaining twelve triples containing at least one &amp;quot;T&amp;quot;) The Peters grammar works by recursing until the right number of terms is generated (any sentences that are too long or too short cannot be matched to the input formula). Thus, the number of terms in the original CNF formula need not be explicitly encoded into the base grammar.</Paragraph>
    <Paragraph position="2"> Xi: F i (+Assign Xi) =F Putting aside for the moment the &amp;quot;Truth-assignment&amp;quot; feature in this entry, the feature assignments for the negation of the literal X i must be the complement of this entry: Xi: T i (+Truth-assignment) =T</Paragraph>
    <Paragraph position="4"> Xi: F i (+Assign X i) =T The upward-directed arrows in the entries reflect the LFG feature propagation machine. Remember that T i and F i are just non-terminal categories, like Noun and Verb. For example, if the T i categorization for X i is selected, the entry says to &amp;quot;make the Truth-assignment feature of the node above T i have the value T, and make the X i portion of the Assign feature of the node above have the value T.&amp;quot; This feature propagation device reproduces the assignment of T's and F's to the CNF literals. If we have a triple of such elements, and at least one of them is expanded out to Ti, then the feature propagation machinery of LFG will merge the common feature names into one large structure for the node above, reflecting the assignments made; moreover, the term will get a filled-in truth assignment value just in case at least one of the expansions selected a T i path. This is depicted in Figure 4. Features are passed transparently through the intervening Yi nodes via the LFG &amp;quot;copy&amp;quot; device, (+ = +); this simply means that all the features of the node below the node to which the &amp;quot;copy&amp;quot; up-and-down arrows are attached are to be the same as those of the node above the up-and-down arrows.</Paragraph>
    <Paragraph position="5"> It should be plain that this mechanism mimics the assignment of values to literals required by the satisfiability problem.</Paragraph>
    <Paragraph position="6"> (2) Coordination of assignments: One must also guarantee that the X i value assigned at one place in the tree is not contradicted by the value of an X i or X i elsewhere. To ensure this, we use the LFG co12 This grammar will have to be slightly modified in order for the reduction to work, as will become apparent shortly 104 American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982  occurrence agreement machinery: the Assign featurebundle is passed up from each term to the highest node in the parse tree (one simply adds the (+=4,) notation to each E i rule in order to indicate this). The Assign feature at this node will contain the union of all assign feature bundles passed up by all terms. If any X i values conflict, then the resulting structure is judged ill-formed. Thus, only compatible X i assignments are well-formed. Figure 5 depicts this situation.  (3) Preservation of satisfying assignments: Finally, one has to reproduce the conjunctive character of the 3-CNF problem - that is, a sentence is satisfiable (well-formed) if and only if each term has at least one  literal assigned the value &amp;quot;T&amp;quot;. Part of the disjunctive character of the problem has already been encoded in the feature propagation machinery presented so far; if at least one X i in a term E 1 expands to the lexical categorization Ti, then the Truth-assignment feature gets the value T. This is just as desired. If one, two, or three of the literals X i in a term select Ti, then El'S Truth-assignment feature is T, and the analysis is wellformed. But how do we rule out the case where all three Xi's in a term select the &amp;quot;F&amp;quot; path, Fi? And how do we ensure that all terms have at least one T below them? Both of these problems can be solved by resorting to the LFG functional completeness constraint. The trick is to add a Pred feature to a &amp;quot;dummy&amp;quot; node attached to each term; the sole purpose of this feature will be to refer to the feature Truth-assisnment, just as the predicate template for the transitive verb kiss mentions the feature Obiect. Since an analysis is not well-formed if the &amp;quot;grammatical relations&amp;quot; a Pred mentions are not filled in from somewhere, this will have the effect of forcing the Truth-assignment feature to get filled in every term. Since the &amp;quot;F&amp;quot; lexical entry does not have a Truth-assignment value, if all the Xi's in a term triple select the F i path (all the literals are &amp;quot;F&amp;quot;), then no Truth-assignment feature is ever picked up from the lexical entries, and that term never gets a value for the Truth-assignment feature. This violates what the predicate template demands, and so the whole analysis is thrown out. (The ill-formedness is exactly analogous to the case where a transitive verb never gets an Object.) Since this condition is applied to each term, we have now guaranteed that each term must have at least one literal below it that selects the &amp;quot;T&amp;quot; path - just as desired. To actually add the new predicate template, one simply adds a new (but dummy) branch to each term El, with the appropriate predicate constraint attached to it. See Figure 6.</Paragraph>
    <Paragraph position="8"> There is one final subtle point here: one must also prevent the Pred and Truth-assignment features for each term from being passed up to the head &amp;quot;S&amp;quot; node. The reason is that if these features were passed up, then, since the LFG machinery automatically merges the values of any features with the same name at the topmost node of the parse tree, the LFG machinery would force the union of the feature values for Pred and Truth-assignment over all terms in the analysis tree. The result would be that if any term had at least one &amp;quot;T&amp;quot; (hence satisfying the Truth-assignment predicate template in at least one term), then the Pred and Truth-assignment features would get filled in at the topmost node as well. The string below would be well-formed if at least one term were &amp;quot;T&amp;quot;, and this would amount to a disjunction of disjunctions (an &amp;quot;OR&amp;quot; of &amp;quot;OR's), not quite what is sought. To eliminate this possibility, one must add a final trick: each term E 1 is given separate Pred, Truth-assignment, and Assign features, but only the Assign feature is propagated to the highest node in the parse tree as such. In contrast, the Pred and Truth-assignment features for each term are kept &amp;quot;protected&amp;quot; from merger by storing them under separate feature headings labeled E 1 ..... E n.</Paragraph>
    <Paragraph position="9"> The means by which just the Assign feature bundle is lifted out is the LFG analogue of the natural language phenomenon of Subject or Object &amp;quot;control&amp;quot;, whereby just the features of the Subject or Object of a lower clause are lifted out of the lower clause to become the Subject of Object of a matrix sentence; the remaining features stay unmergeable because they stay protected behind the individually labeled terms.</Paragraph>
    <Paragraph position="10"> To actually &amp;quot;implement&amp;quot; this in an LFG, one can add two new branches to each term expansion in the base context-free grammar, as well as two &amp;quot;control&amp;quot; equation specifications that do the actual work of lifting the features from a lower clause to the matrix sentence. A natural language example of this phenomenon is the following (from Kaplan and Bresnan 1981, pp. 43-45): The girl persuaded the baby to go.</Paragraph>
    <Paragraph position="11"> (part of the) lexical entry for persuaded: V (/ Vcomp Subject) = (/ Object) According to this lexical entry, the Object feature structure of a root sentence containing a verb like persuade is to be the same as the feature structure of the Subject of the Complement of persuade - a &amp;quot;control&amp;quot; equation. Since this Subject is the baby, this means that the features associated with the NP the baby are shared with the features of the Object of the matrix sentence.</Paragraph>
    <Paragraph position="12"> The satisfiability analogue of this machinery is quite similar to this; see Figure 7.</Paragraph>
    <Paragraph position="13"> As Figure 7 shows, a &amp;quot;control equation&amp;quot; should be attached to the A i node that forces the Assign feature bundle from the C i side to be lifted up and ultimately merged into the Assign feature bundle of the E 1 node (and then, in turn, to become merged at the topmost node of the tree by the usual full copy up-and-down arrows):</Paragraph>
    <Paragraph position="15"> The satisfiability analogue is just like the sharing of the Subject features of a Verb Complement with the Object position of a matrix clause.</Paragraph>
    <Paragraph position="16"> To finish off the reduction argument, it must be shown that, given any 3-CNF formula, the corresponding LFG grammar and string as just described can be constructed in a time that is a polynomial function of the length of the original input formula. This is not a difficult task, and only an informal sketch of how it can be done will be given. All one has to do is scan the original formula from left to right, outputting an appropriate cluster of base rules as each triple of literals is scanned: Ei=&gt;AiCi; Ci=&gt;Dummy2 YiYjYk; Yi=&gt;TilFi (similarly for Yj and Yk); Ti=&gt;Xi, Fi=&gt;X i (similarly for Tj and Tk). Note that for each triple of literals in the original input formula the appropriate grammar rules can be output in an amount of time that is just a constant times n. In addition, one must also maintain a counter to keep track of the 106 American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982  equations in CNF analogue of natural language case.</Paragraph>
    <Paragraph position="17"> number of triples so far encountered. This adds at most a logarithmic factor, to do the actual counting.</Paragraph>
    <Paragraph position="18"> At the end of processing the input formula, one must also output the rule S=&gt;E1E2,...,Em, where m is the number of triples in the CNF formula. Since rn is less than n, this procedure too is easily seen to take time that is a polynomial function of the length of the original input formula. Finally, one must also construct the lexical entry for each X i and X i. This too can be done as the input formula is scanned left to right. The only difficulty here is that one must check to see if the entry for each X i has been previously constructed. In the worst case, this involves rescanning the list of lexical entries built so far. Since there are at most n such entries, and since the time to actually output a single entry is constant, at worst the time spent constructing a single lexical entry could be proportional to n. Thus for n entries the total time spent in construction could be at most of order n 2. Since the time to construct the entire grammar is just the sum of the times spent in constructing its production rules and its lexicon, the total time to transform the input formula is bounded above by some constant times n 2.</Paragraph>
  </Section>
class="xml-element"></Paper>