<?xml version="1.0" standalone="yes"?> <Paper uid="P95-1014"> <Title>Memoization of Coroutined Constraints</Title> <Section position="3" start_page="0" end_page="101" type="metho"> <SectionTitle> 2 Lexical rules in Categorial Grammar </SectionTitle> <Paragraph position="0"> This section reviews Bouma and van Noord's (1994) (BN henceforth) constraint-based categorial grammar analysis of modification in Dutch, which we use as our primary example in this paper. The memoizing CLP interpreter presented below has, however, also been applied to GB and HPSG parsing, both of which benefit from constraint coroutining.</Paragraph> <Paragraph position="1"> BN can explain a number of puzzling scope phenomena by proposing that heads (specifically, verbs) subcategorize for adjuncts as well as arguments (rather than allowing adjuncts to subcategorize for the arguments they modify, as is standard in Categorial Grammar). For example, the first reading of the Dutch sentence (1) Frits opzettelijk Marie lijkt te ontwijken (gloss: opzettelijk 'deliberately', lijkt 'seems', ontwijken 'avoid'), which means either 'Fritz deliberately seems to avoid Marie' or 'Fritz seems to deliberately avoid Marie', is obtained by the analysis depicted in Figure 1. The other reading of this sentence is produced by a derivation in which the adjunct addition rule 'A' adds an adjunct to lijkt te, and applies vacuously to ontwijken. [Figure 1: 'A' is the lexical rule which adds adjuncts to verbs, 'D' is a lexical 'division' rule which enables a control or raising verb to combine with arguments of higher arity, and '#' is a unary modal operator which diacritically marks infinitival verbs.] It is easy to formalize this kind of grammar in pure Prolog. In order to simplify the presentation of the proof procedure interpreter below, we write clauses
as 'H ::- B' where H is an atom (the head) and B is a list of atoms (the negative literals).</Paragraph> <Paragraph position="2"> The atom x(Cat, Left, Right) is true iff the substring between the two string positions Left and Right can be analyzed as belonging to category Cat.</Paragraph> <Paragraph position="3"> (As is standard, we use suffixes of the input string for string positions.)</Paragraph> <Paragraph position="4"> The modal operator '#' is used to diacritically mark untensed verbs (e.g., ontwijken), and prevent them from combining with their arguments. Thus untensed verbs must combine with other verbs which subcategorize for them (e.g., lijkt te), forcing all verbs to appear in a 'verb cluster' at the end of a clause.</Paragraph> <Paragraph position="5"> For simplicity we have not provided a semantics here, but it is easy to add a 'semantic interpretation' as a fourth argument in the usual manner. The forward and backward application rules are specified as clauses of x/3. Note that the application rules are left-recursive, so a top-down parser will in general fail to terminate with such a grammar.</Paragraph> <Paragraph position="6"> :- op(990, xfx, ::- ). % Clause operator</Paragraph> <Paragraph position="7"> :- op(400, yfx, \ ). % Backward combinator</Paragraph> <Paragraph position="8"> :- op(300, fy, # ). % Modal operator</Paragraph> <Paragraph position="9"> x(X, Left, Right) ::- [ x(X/Y, Left, Mid), x(Y, Mid, Right) ]. % Forward application</Paragraph> <Paragraph position="10"> x(X, Left, Right) ::- [ x(Y, Left, Mid), x(X\Y, Mid, Right) ]. % Backward application</Paragraph> <Paragraph position="11"> x(X, [Word|Words], Words) ::- [ lex(Word, X) ].</Paragraph> <Paragraph position="12"> Lexical entries are formalized using a two-place relation lex(Word, Cat), which is true if Cat is a category that the lexicon assigns to Word.</Paragraph> <Paragraph position="14"> The add_adjuncts/2 and division/2 predicates formalize the lexical rules 'A' 
(which adds adjuncts to verbs) and 'D' (the division rule).</Paragraph> <Paragraph position="15"> add_adjuncts(s, s) ::- [].</Paragraph> <Paragraph position="16"> add_adjuncts(X, Y\adv) ::- [ add_adjuncts(X, Y) ].</Paragraph> <Paragraph position="17"> add_adjuncts(X\A, Y\A) ::- [ add_adjuncts(X, Y) ].</Paragraph> <Paragraph position="18"> add_adjuncts(X/A, Y/A) ::- [ add_adjuncts(X, Y) ].</Paragraph> <Paragraph position="19"> division(X, X) ::- [].</Paragraph> <Paragraph position="20"> division(X0/Y0, (X\Z)/(Y\Z)) ::- [ division(X0/Y0, X/Y) ].</Paragraph> <Paragraph position="21"> Note that the definitions of add_adjuncts/2 and division/2 are recursive, and have an infinite number of solutions when only their first arguments are instantiated. This is necessary because the number of adjuncts that can be associated with any given verb is unbounded. It is therefore infeasible to enumerate all of the categories that could be associated with a verb when it is retrieved from the lexicon, so, following BN, we treat the predicates add_adjuncts/2 and division/2 as coroutined constraints which are only resolved when their second arguments become sufficiently instantiated. As noted above, this kind of constraint coroutining is built into a number of Prolog implementations. Unfortunately, the left recursion inherent in the combinatory rules mentioned earlier dooms any standard backtracking top-down parser to nontermination, no matter how coroutining is applied to the lexical constraints. As is well known, memoizing parsers do not suffer from this deficiency, and we present a memoizing interpreter below which does terminate.</Paragraph> </Section> <Section position="4" start_page="101" end_page="103" type="metho"> <SectionTitle> 3 The Lemma Table proof procedure </SectionTitle> <Paragraph position="0"> This section presents a coroutining, memoizing CLP proof procedure. 
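</Paragraph> <Paragraph position="0"> As a point of reference for what 'coroutining' means for the lexical constraints of Section 2, the following is a minimal sketch, assuming a Prolog system that provides when/2 (for example SWI-Prolog or SICStus Prolog). It is not the interpreter presented in this paper: ordinary ':-' clauses stand in for '::-' clauses, the wrapper add_adjuncts_delayed/2 is a hypothetical name, and the wake-up condition nonvar/1 is a simplification of 'sufficiently instantiated'.

```prolog
:- op(400, yfx, (\)).          % Backward combinator, as in Section 2

% add_adjuncts_delayed(X, Y): hypothetical wrapper (not from the paper).
% The constraint is suspended until Y is bound to a non-variable term.
add_adjuncts_delayed(X, Y) :-
    when(nonvar(Y), add_adjuncts(X, Y)).

% The lexical rule 'A' of Section 2. Recursive calls are delayed as well,
% so the infinite set of solutions is never enumerated eagerly.
add_adjuncts(s, s).
add_adjuncts(X, Y\adv) :- add_adjuncts_delayed(X, Y).
add_adjuncts(X\A, Y\A) :- add_adjuncts_delayed(X, Y).
add_adjuncts(X/A, Y/A) :- add_adjuncts_delayed(X, Y).
```

With these definitions, the query add_adjuncts_delayed(s\np, C) succeeds at once and leaves the constraint suspended; it is only run when C becomes bound, e.g. by a later C = (V\adv), which in turn leaves a new suspended constraint on V.</Paragraph> <Paragraph position="0"> 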
The basic intuition behind our approach is quite natural in a CLP setting like the one of Höhfeld and Smolka, which we sketch now.</Paragraph> <Paragraph position="1"> A program is a set of definite clauses of the form p(X) ← q1(X1) ∧ ... ∧ qn(Xn) ∧ φ where the Xi are vectors of variables, p(X) and qi(Xi) are relational atoms and φ is a basic constraint coming from a basic constraint language C. φ will typically refer to some (or all) of the variables mentioned. The language of basic constraints is closed under conjunction and comes with (computable) notions of consistency (of a constraint) and entailment (φ1 ⊨_C φ2) which have to be invariant under variable renaming (see footnote 1). Given a program P and a goal G, which is a conjunction of relational atoms and constraints, a P-answer of G is defined as a consistent basic constraint φ such that φ → G is valid in every model of P. SLD-resolution is generalized in this setting by performing resolution only on relational atoms and simplifying (conjunctions of) basic constraints thus collected in the goal list. When finally only a consistent basic constraint remains, this is an answer constraint φ. Observe that this use of basic constraints generalizes the use of substitutions in ordinary logic programming and the (simplification of a) conjunction of constraints generalizes unification. 
Actually, pure Prolog can be viewed as a syntactically sugared variant of such a CLP language with equality constraints as basic constraints, where a standard Prolog clause p(T) ← q1(T1), ..., qn(Tn) is seen as an abbreviation for a clause in which the equality constraints have been made explicit by means of new variables and new equalities: p(X) ← X = T, X1 = T1, ..., Xn = Tn, q1(X1), ..., qn(Xn).</Paragraph> <Paragraph position="2"> Here the Xi are vectors of variables and the Ti are vectors of terms.</Paragraph> <Paragraph position="3"> Now consider a standard memoizing proof procedure such as Earley Deduction (Pereira and Warren 1983) or the memoizing procedures described by Tamaki and Sato (1986), Vieille (1989) or Warren (1992) from this perspective. Each memoized goal is associated with a set of bindings for its arguments; so in CLP terms each memoized goal is a conjunction of a single relational atom and zero or more equality constraints. [Footnote 1: This essentially means that basic constraints can be recast as first-order predicates.]</Paragraph> <Paragraph position="4"> A completed (i.e., atomic) clause p(T) with an instantiated argument T abbreviates the non-atomic clause p(X) ← X = T, where the equality constraint makes the instantiation specific. Such equality constraints are 'inherited' via resolution by any clause that resolves with the completed clause.</Paragraph> <Paragraph position="5"> In the CLP perspective, variable-binding or equality constraints have no special status; informally, all constraints can be treated in the same way that pure Prolog treats equality constraints. This is the central insight behind the Lemma Table proof procedure: general constraints are permitted to propagate into and out of subcomputations in the same way that Earley Deduction propagates variable bindings.</Paragraph> <Paragraph position="6"> Thus the Lemma Table proof procedure generalizes Earley Deduction in the following ways: 1. 
Memoized goals are in general conjunctions of relational atoms and constraints. This allows constraints to be passed into a memoized subcomputation. We do not use this capability in the categorial grammar example (except to pass in variable bindings), but it is important in GB and HPSG parsing applications. For example, memoized goals in our GB parser consist of conjunctions of X' and ECP constraints. Because the X' phrase-structure rules freely permit empty categories, every string has infinitely many well-formed analyses that satisfy the X' constraints, but the conjoined ECP constraint rules out all but a very few of these empty nodes.</Paragraph> <Paragraph position="7"> 2. Completed clauses can contain arbitrary negative literals (rather than just equality constraints, as in Earley Deduction). This allows constraints to be passed out of a memoized subcomputation. In the categorial grammar example, the add_adjuncts/2 and division/2 constraints associated with a lexical entry cannot be finitely resolved, as noted above, so e.g., a clause</Paragraph> <Paragraph position="9"> is classified as a completed clause; the add_adjuncts/2 constraint in its body is inherited by any clause which uses this lemma.</Paragraph> <Paragraph position="10"> 3. Subgoals can be selected in any order (Earley Deduction always selects goals in left-to-right order). This allows constraint coroutining within a memoized subcomputation.</Paragraph> <Paragraph position="11"> In the categorial grammar example, a category becomes more instantiated when it combines with arguments, eventually allowing the add_adjuncts/2 and division/2 constraints to be deterministically resolved. Thus we use the flexibility in the selection of goals to run constraints whenever their arguments are sufficiently instantiated, and delay them otherwise.</Paragraph> <Paragraph position="12"> 4. 
Memoization can be selectively applied (Earley Deduction memoizes every computational step).</Paragraph> <Paragraph position="13"> This can significantly improve overall efficiency. In the categorial grammar example only x/3 goals are memoized (and thus only these goals incur the cost of table management).</Paragraph> <Paragraph position="14"> The 'abstraction' step, which is used in most memoizing systems (including complex feature grammar chart parsers where it is somewhat confusingly called 'restriction', as in Shieber 1985), receives an elegant treatment in a CLP approach; an 'abstracted' goal is merely one in which not all of the equality constraints associated with the variables appearing in the goal are selected with that goal (see footnote 2). For example, because of the backward application rule and the left-to-right evaluation our parser uses, it will eventually search at every left string position for an uninstantiated category (the variable Y in the clause), so we might as well abstract all memoized goals of the form x(C, L, R) to x(_, L, _), i.e., goals in which the category and right string position are uninstantiated. Making the equality constraints explicit, we see that the abstracted goal is obtained by merely selecting the subset x(X1, X2, X3), X2 = L of the following: x(X1, X2, X3), X1 = C, X2 = L, X3 = R.</Paragraph> <Paragraph position="15"> While our formal presentation does not discuss abstraction (since it can be implemented in terms of constraint selection as just described), our implementation provides an explicit abstraction operation, because it uses the underlying Prolog's unification mechanism to solve equality constraints over terms. Now we turn to the specification of the algorithm itself, beginning with the basic computational entities it uses.</Paragraph> <Paragraph position="16"> Definition 1 A (generalized) goal is a multiset of relational atoms and constraints. 
A (generalized) clause H0 ← B0 is an ordered pair of generalized goals, where H0 contains at least one relational atom. A relational interpretation A (see Höhfeld and Smolka 1988 for definition) satisfies a goal G iff A satisfies each element of G, and it satisfies a clause [Footnote 2: A more general formulation of abstraction is required for systems using a hierarchy of types, such as typed feature structure constraints (Carpenter 1992). In applications of the Lemma Table proof procedure to such systems it may be desirable to abstract from a 'strong' type constraint in the body of a clause to a logically 'weaker' type constraint in the memoized goal. Such a form of abstraction cannot be implemented using the selection rule alone.]</Paragraph> <Paragraph position="17"> This generalizes the standard notion of clause by allowing the head H0 to consist of more than one atom. The head H0 is interpreted conjunctively; i.e., if each element of B0 is true, then so is each element of H0. The standard definition of resolution extends unproblematically to such clauses.</Paragraph> <Paragraph position="18"> Definition 2 We say that a clause c0 = H0 ← B0 resolves with a clause c1 = H1 ← B1 on a non-empty set of literals C ⊆ B0 iff there is a variant c1' of c1 of the form C ← B1' such that V(c0) ∩ V(B1') ⊆ V(C) (i.e., the variables common to c0 and B1' also appear in C, so there is no accidental variable sharing).</Paragraph> <Paragraph position="19"> If c0 resolves with c1 on C, then the clause</Paragraph> <Paragraph position="21"> Now we define items, which are the basic computational units that appear on the agenda and in the lemma tables, which record memoized subcomputations. Definition 3 An item is a pair (t, c) where c is a clause and t is a tag, i.e., one of program, solution or table(B) for some goal B. A lemma table for a goal G is a pair (G, LG) where LG is a finite list of items. 
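</Paragraph> <Paragraph position="21"> The two structures of Definition 3 can be pictured as Prolog terms. The sketch below is only an illustration under assumed encodings: the functors item/2, cl/2 and lemma_table/2 and the predicate table_for/3 are hypothetical names, not part of the implementation described in this paper, and goals are compared up to renaming of variables with =@=/2 as provided by SWI-Prolog.

```prolog
:- use_module(library(lists)).

% Assumed encodings (not from the paper):
%   item(Tag, cl(Head, Body))   Tag is program, solution or table(Goal)
%   lemma_table(Goal, Items)    Items is the list of items recorded for Goal

% table_for(+Goal, +Tables, -Items)
% Finds the lemma table whose goal is a variant of Goal. member/2 binds only
% the fresh variables G and Items; =@=/2 compares without binding anything.
table_for(Goal, Tables, Items) :-
    member(lemma_table(G, Items), Tables),
    G =@= Goal,
    !.
```

For instance, given the tables [lemma_table([x(_, [l_t, o], _)], []), lemma_table([x(_, [o], _)], [i])], the call table_for([x(_, [o], _)], Tables, Items) binds Items to [i], while a goal with a different left string position finds no table.</Paragraph> <Paragraph position="21"> 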
The algorithm manipulates a set T of lemma tables which has the property that the first components of any two distinct members of T are distinct. This justifies speaking of the (unique) lemma table in T for a goal G.</Paragraph> <Paragraph position="22"> Tags are associated with clauses by a user-specified control rule, as described below. The tag associated with a clause in an item identifies the operation that should be performed on that clause. The solution tag labels 'completed' clauses, the program tag directs the proof procedure to perform a non-memoizing resolution of one of the clause's negative literals with program clauses (the particular negative literal is chosen by a user-specified selection rule, as in standard SLD resolution), and the table(B) tag indicates that a subcomputation with root goal B (which is always a subset of the clause's negative literals) should be started.</Paragraph> <Paragraph position="23"> Definition 4 A control rule is a function from clauses G ← B to one of program, solution or table(C) for some goal C ⊆ B. 
A selection rule is a function from clauses G ← B, where B contains at least one relational atom, to relational atoms a, where a appears in B.</Paragraph> <Paragraph position="24"> Because program steps do not require memoization and given the constraints on the control rule just mentioned, the list LG associated with a lemma table (G, LG) will only contain items of the form (t, G ← B) where t is either solution or table(C) for some goal C ⊆ B.</Paragraph> <Paragraph position="25"> Definition 5 To add an item e =</Paragraph> <Paragraph position="27"> [Figure 2, cases on the tag of the item e = (t, c) being processed; S is the selection rule, R the control rule and A the agenda.] program: For each clause p ∈ P such that c resolves with p on S(c), choose a corresponding resolvent c' and add (R(c'), c') to A.</Paragraph> <Paragraph position="28"> table(B): Add e to its table (see footnote 3). If T contains a table (B', L) where B' is a variant of B, then for each item (solution, c') ∈ L such that c resolves with c' on B, choose a corresponding resolvent c'' and add (R(c''), c'') to A. Otherwise, add a new table (B, ∅) to T, and add (program, B ← B) to the agenda. solution: Add e to its table.</Paragraph> <Paragraph position="29"> Let c = H ← B. Then for each item of the form (table(H'), c') in any table in T where H' is a variant of H and c' resolves with c on H', choose a corresponding resolvent c'' and add (R(c''), c'') to A. The formal description of the Lemma Table proof procedure is given in Figure 2. We prove the soundness and completeness of the proof procedure in Dörre and Johnson (in preparation). In fact, soundness is easy to show, since all of the operations are resolution steps. 
Completeness holds because Lemma Table proofs can be 'unfolded' into standard SLD search trees (this unfolding is well-founded because the first step of every table-initiated subcomputation is required to be a program resolution), so it follows from Höhfeld and Smolka's completeness theorem for SLD resolution in CLP.</Paragraph> </Section> <Section position="5" start_page="103" end_page="104" type="metho"> <SectionTitle> 4 A worked example </SectionTitle> <Paragraph position="0"> Returning to the categorial grammar example above, the control rule and selection rule are specified by the Prolog code below, which can be informally described as follows. All x/3 literals are classified as 'memo' literals, and add_adjuncts/2 and division/2 literals whose second arguments are not sufficiently instantiated are classified as 'delay' literals. If the clause contains a memo literal G, then the control rule returns table([G]). Otherwise, if the clause contains any non-delay literals, then the control rule returns program and the selection rule chooses the left-most such literal.</Paragraph> <Paragraph position="1"> [Footnote 3: In order to handle the more general form of abstraction discussed in footnote 2, which may be useful with typed feature structure constraints, replace B with a(B) in this step, where a(B) is the result of applying the abstraction operation to B. The abstraction operation should have the property that a(B) is exactly the same as B, except that zero or more constraints in B are replaced with logically weaker constraints.]</Paragraph> <Paragraph position="2"> If none of the above apply, the control rule returns solution. 
To simplify the interpreter code, the Prolog code for the selection rule and for the table(G) output of the control rule also returns the remaining literals along with the chosen goal.</Paragraph> <Paragraph position="3"> :- ensure_loaded(library(lists)).</Paragraph> <Paragraph position="4"> :- op(990, fx, [delay, memo]).</Paragraph> <Paragraph position="5"> delay division(_, X/Y) :- var(X), var(Y).</Paragraph> <Paragraph position="6"> delay add_adjuncts(_, X/Y) :- var(X), var(Y).</Paragraph> <Paragraph position="7"> memo x(_, _, _).</Paragraph> <Paragraph position="8"> control(Gs0, Control) :- memo(G), select(G, Gs0, Gs) -> Control = table([G], Gs) ; member(G, Gs0), \+ delay(G) -> Control = program ; Control = solution.</Paragraph> <Paragraph position="9"> selection(Gs0, G, Gs) :- select(G1, Gs0, Gs1), \+ delay(G1) -> G = G1, Gs = Gs1.</Paragraph> <Paragraph position="10"> Because we do not represent variable binding as explicit constraints, we cannot implement 'abstraction' by means of the control rule and require an explicit abstraction operation. The abstraction operation here unbinds the first and third arguments of x/3 goals, as discussed above.</Paragraph> <Paragraph position="11"> abstraction([x(_,Left,_)], [x(_,Left,_)]). [Figure 3 items, numbered as in the text:] (1) x(A, [l_t, o], B) ← x(A, [l_t, o], B).</Paragraph> <Paragraph position="12"> (2) x(A, [l_t, o], B) ← x(A/C, [l_t, o], D), x(C, D, B). (3) x(A, [l_t, o], B) ← x(C, [l_t, o], D), x(A\C, D, B). (4) x(A, [l_t, o], [o]) ← lex(l_t, A).</Paragraph> <Paragraph position="13"> (5) x(A/#B, [l_t, o], [o]) ← add(s\np/(s\np), C), div(C, A/B). (6) x(A, [l_t, o], B) ← add(s\np/(s\np), C), div(C, A/D), x(#D, [o], B). (7) x(A, [o], B) ← x(A, [o], B).</Paragraph> <Paragraph position="14"> (8) x(A, [o], B) ← x(A/C, [o], D), x(C, D, B). (9) x(A, [o], B) ← x(C, [o], D), x(A\C, D, B). 
(10) x(A, [o], []) ← lex(o, A).</Paragraph> <Paragraph position="15"> (11) x(#A, [o], []) ← add(s\np\np, A).</Paragraph> <Paragraph position="16"> (12) x(A, [l_t, o], []) ← add(s\np\np, B), add(s\np/(s\np), C), div(C, A/B). (13) x(A, [l_t, o], B) ← add(s\np\np, C), add(s\np/(s\np), D), div(D, A/E/C), x(E, [], B). (14) x(A, [], B) ← x(A, [], B).</Paragraph> <Paragraph position="17"> (15) x(A, [], B) ← x(A/C, [], D), x(C, D, B).</Paragraph> <Paragraph position="18"> (16) x(A, [], B) ← x(C, [], D), x(A\C, D, B).</Paragraph> <Paragraph position="19"> (17) x(A, [l_t, o], B) ← add(s\np\np, C), add(s\np/(s\np), D), div(D, E/C), x(A\E, [], B). (18) x(A, [o], B) ← add(s\np\np, C), x(A\#C, [], B). (19) x(A, [l_t, o], B) ← add(s\np/(s\np), C), div(C, D/E), x(A\(D/#E), [o], B). Figure 3: Items generated with the control and selection rules specified in the text. The prefix t.n[a] T identifies the table t to which this item belongs, assigns this item a unique identifying number n, provides the number(s) of the item(s) a which caused this item to be created, and displays its tag T (P for 'program', T for 'table' and S for 'solution'). The selected literal(s) are shown underlined. To save space, 'add_adjuncts' is abbreviated by 'add', 'division' by 'div', 'lijkt_te' by 'l_t', and 'ontwijken' by 'o'.</Paragraph> <Paragraph position="20"> Figure 3 depicts the proof of a parse of the verb cluster in (1). Item 1 is generated by the initial goal; its sole negative literal is selected for program resolution, producing items 2-4 corresponding to three program clauses for x/3. Because items 2 and 3 contain 'memo' literals, the control rule tags them table; there already is a table for a variant of these goals (after abstraction). Item 4 is tagged program because it contains a negative literal that is not 'memo' or 'delay'; the resolution of this literal with the program clauses for lex/2 produces item 5 containing the constraint literals associated with lijkt te. 
Both of these are classified as 'delay' literals, so item 5 is tagged solution, and both are 'inherited' when item 5 resolves with the table-tagged items 2 and 3, producing item 6 (corresponding to a right application analysis with lijkt te as functor) and item 19 (corresponding to a left application analysis with ontwijken as functor) respectively. Item 6 is tagged table, since it contains an x/3 literal; because this goal's second argument (i.e., the left string position) differs from that of the goal associated with table 0, a new table (table 1) is constructed, with item 7 as its first item.</Paragraph> <Paragraph position="21"> The three program clauses for x/3 are used to resolve the selected literal in item 7, just as in item 1, yielding items 8-10. The lex/2 literal in item 10 is resolved with the appropriate program clause, producing item 11. Just as in item 5, the second argument of the single literal in item 11 is not sufficiently instantiated, so item 11 is tagged solution, and the unresolved literal is 'inherited' by item 12. Item 12 contains the partially resolved analysis of the verb complex. Items 13-16 analyze the empty string; notice that there are no solution items for table 2.</Paragraph> <Paragraph position="22"> Items 17-19 represent partial alternative analyses of the verb cluster where the two verbs combine using rules other than forward application; again, these yield no solution items, so item 12 is the sole analysis of the verb cluster.</Paragraph> </Section> </Paper>