<?xml version="1.0" standalone="yes"?> <Paper uid="P90-1027"> <Title>AUTOMATED INVERSION OF LOGIC GRAMMARS FOR GENERATION</Title> <Section position="5" start_page="212" end_page="212" type="metho"> <SectionTitle> 2 A chain rule is one where the main binding-carrying argument </SectionTitle> <Paragraph position="0"> is passed unchanged from the left-hand side to the right. For example, assert(P) --> subj(P1), verb(P2), obj(P1, P2, P). is a chain rule with respect to the argument P.</Paragraph> <Paragraph position="1"> goals (i.e., those which are not responsible for making a rule a &quot;chain rule&quot;), or resort to dynamic ordering of such goals, putting the goal freezing back into the picture.</Paragraph> <Paragraph position="2"> In contrast with the above, the parser inversion procedure described in this paper does not require a run-time overhead and can be performed by an off-line compilation process. It may, however, require that the grammar be normalized prior to its inversion.</Paragraph> <Paragraph position="3"> We briefly discuss the grammar normalization problem at the end of this paper.</Paragraph> </Section> <Section position="6" start_page="212" end_page="213" type="metho"> <SectionTitle> IN AND OUT ARGUMENTS </SectionTitle> <Paragraph position="0"> Arguments in a PROLOG literal can be marked as either &quot;in&quot; or &quot;out&quot; depending on whether they are bound at the time the literal is submitted for execution or after the computation is completed. For example, in tovo([to, eat, fish], T4, [np, [n, john]], P3) the first and the third arguments are &quot;in&quot;, while the remaining two are &quot;out&quot;. When tovo is used for generation, i.e., tovo(T1, T4, P1, [eat, [np, [n, john]], [np, [n, fish]]]) then the last argument is &quot;in&quot;, while the first and the third are &quot;out&quot;; T4 is neither &quot;in&quot; nor &quot;out&quot;. 
The information about the &quot;in&quot; and &quot;out&quot; status of arguments is important in determining the &quot;direction&quot; in which predicates containing them can be run. 3 Below we present a simple method for computing &quot;in&quot; and &quot;out&quot; arguments in PROLOG literals. 4 An argument X of literal pred(... X ...) on the rhs of a clause is &quot;in&quot; if (A) it is a constant; or (B) it is a function and all its arguments are &quot;in&quot;; or (C) it is &quot;in&quot; or &quot;out&quot; in some previous literal on the rhs of the same clause, i.e., l(Y) :- r(X,Y), pred(X); or (D) it is &quot;in&quot; in the head literal L on the lhs of the same clause.</Paragraph> <Paragraph position="1"> An argument X is &quot;in&quot; in the head literal L = pred(... X ...) of a clause if (A), or (B), or (E) L is the top-level literal and X is &quot;in&quot; in it (known a priori); or (F) X occurs more than once in L and at 3 For a discussion on directed predicates in PROLOG see (Shoham and McDermott, 1984), and (Debray, 1989).</Paragraph> <Paragraph position="2"> 4 This simple algorithm is all we need to complete the experiment at hand. A general method for computing &quot;in&quot;/&quot;out&quot; arguments is given in (Strzalkowski, 1989). In this and further algorithms we use abbreviations rhs and lhs to stand for right-hand side and left-hand side (of a clause), respectively.</Paragraph> <Paragraph position="3"> least one of these occurrences is &quot;in&quot;; or (G) for every literal L1 = pred(... Y ...) unifiable with L on the rhs of any clause with the head predicate pred1 different from pred, and such that Y unifies with X, Y is &quot;in&quot; in L1.</Paragraph> <Paragraph position="4"> A similar algorithm can be proposed for computing &quot;out&quot; arguments. 
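As a rough illustration of rules (A) through (D) for rhs literals, the following Python sketch marks each argument of a literal as "in" given the variables already known to be bound. The term encoding (tuples for functional expressions and literals, capitalized strings for variables) and the names is_var, is_in and in_flags are assumptions made for this sketch only; rules (E) through (G) for head literals are not covered.

```python
def is_var(term):
    # Assumed encoding: a variable is a string starting with an uppercase
    # letter or an underscore, following the usual PROLOG convention.
    return isinstance(term, str) and (term[0].isupper() or term[0] == "_")


def is_in(term, bound_vars):
    """Return True if `term` counts as "in".

    bound_vars holds the variables that are "in" in the head literal
    (rule D) or "in"/"out" in an earlier rhs literal (rule C).
    """
    if isinstance(term, tuple):
        # Rule (B): a functional expression (functor, arg1, ...) is "in"
        # when all of its arguments are "in".
        return all(is_in(arg, bound_vars) for arg in term[1:])
    if is_var(term):
        # Rules (C) and (D): the variable is already bound elsewhere.
        return term in bound_vars
    # Rule (A): anything else is treated as a constant.
    return True


def in_flags(literal, bound_vars):
    # literal is (predicate_name, arg1, arg2, ...); one flag per argument.
    return [is_in(arg, bound_vars) for arg in literal[1:]]


# Parsing direction for to(T1,T2), with T1 "in" in the head:
print(in_flags(("to", "T1", "T2"), {"T1"}))
```

Under these assumptions, once to(T1,T2) has succeeded T2 can be added to the bound set, so that T2 is "in" when the next literal v(T2,T3,P2) is examined.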
We introduce &quot;unknwn&quot; as a third status marker for arguments occurring in certain recursive clauses.</Paragraph> <Paragraph position="5"> An argument X of literal pred(... X ...) on the rhs of a clause is &quot;out&quot; if (A) it is &quot;in&quot; in pred(... X ...); or (B) it is a functional expression and all its arguments are either &quot;in&quot; or &quot;out&quot;; or (C) for every clause with the head literal pred(... Y ...) unifiable with pred(... X ...) and such that Y unifies with X, Y is either &quot;in&quot;, &quot;out&quot; or &quot;unknwn&quot;, and Y is marked &quot;in&quot; or &quot;out&quot; in at least one case.</Paragraph> <Paragraph position="6"> An argument X of literal pred(... X ...) on the lhs of a clause is &quot;out&quot; if (D) it is &quot;in&quot; in pred(... X ...); or (E) it is &quot;out&quot; in literal pred1(... X ...) on the rhs of this clause, providing that pred1 ≠ pred; 5 if pred1 = pred then X is marked &quot;unknwn&quot;.</Paragraph> <Paragraph position="7"> Note that this method predicts the &quot;in&quot; and &quot;out&quot; status of arguments in a literal only if the evaluation of this literal ends successfully. In case it does not (a failure or a loop) the &quot;in&quot;/&quot;out&quot; status of arguments becomes irrelevant.</Paragraph> </Section> <Section position="7" start_page="213" end_page="215" type="metho"> <SectionTitle> COMPUTING ESSENTIAL ARGUMENTS </SectionTitle> <Paragraph position="0"> Some arguments of every literal are essential in the sense that the literal cannot be executed successfully unless all of them are bound, at least partially, at the time of execution. For example, the predicate</Paragraph> <Paragraph position="2"> &quot;to+verb+object&quot; object strings can be executed only if either T1 or P3 is bound. 6 7 If tovo is used to parse then T1 must be bound; if it is used to generate then P3 must be bound. 
In general, a literal may have several alternative (possibly overlapping) sets of essential arguments. If all arguments in any one of such sets of essential arguments are bound, 5 Again, we must take provisions to avoid infinite descent, cf. (G) in the &quot;in&quot; algorithm.</Paragraph> <Paragraph position="3"> 6 Assuming that tovo is defined as follows (simplified): tovo(T1,T4,P1,P3) :- to(T1,T2), v(T2,T3,P2), object(T3,T4,P1,P2,P3).</Paragraph> <Paragraph position="4"> 7 An argument is considered fully bound if it is a constant or it is bound by a constant; an argument is partially bound if it is, or is bound by, a functional expression (not a variable) in which at least one variable is unbound.</Paragraph> <Paragraph position="5"> then the literal can be executed. Any set of essential arguments which has the above property is called essential. We shall call a set MSEA of essential arguments a minimal set of essential arguments if it is essential, and no proper subset of MSEA is essential. The collection of minimal sets of essential arguments (MSEA's) of a predicate depends upon the way this predicate is defined. If we alter the ordering of the rhs literals in the definition of a predicate, we may also change its set of MSEA's. We call the set of MSEA's existing for the current definition of a predicate the set of active MSEA's for this predicate. To run a predicate in a certain direction requires that a specific MSEA be among the currently active MSEA's for this predicate, and if this is not already the case, then we have to alter the definition of this predicate so as to make this MSEA active. 
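The minimality condition in the definition of an MSEA can be pictured with a small Python sketch. It assumes that some collection of essential sets has already been found by other means and simply discards every set that properly contains another essential set. The function name minimal_sets and the example sets for tovo are illustrative assumptions, not results taken from the procedure described in this paper.

```python
def minimal_sets(essential_sets):
    """Keep only those essential sets that have no essential proper subset."""
    candidates = {frozenset(s) for s in essential_sets}
    return {
        s
        for s in candidates
        # s is dropped as soon as some other candidate is contained in it.
        if not any(other != s and other.issubset(s) for other in candidates)
    }


# Hypothetical essential sets for tovo(T1,T4,P1,P3): binding T1, binding P3,
# or binding both is assumed to be enough for execution.
found = minimal_sets([{"T1"}, {"P3"}, {"T1", "P3"}])
print(sorted(sorted(s) for s in found))
```

In this toy case only {T1} and {P3} survive, matching the observation that tovo needs T1 for parsing and P3 for generation.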
Consider the following abstract clause defining predicate Ri:</Paragraph> <Paragraph position="7"> a,(...).</Paragraph> <Paragraph position="8"> Suppose that, as defined by (D1), Ri has the set MSi = {m1, ..., mj} of active MSEA's, and let MRi ⊇ MSi be the set of all MSEA's for Ri that can be obtained by permuting the order of literals on the right-hand side of (D1). Let us assume further that Ri occurs on the rhs of some other clause, as shown below:</Paragraph> <Paragraph position="10"> We want to compute MS, the set of active MSEA's for P, as defined by (C1), where s ≥ 0, assuming that we know the sets of active MSEA's for each Ri on the rhs. 8 If s = 0, that is, P has no rhs in its definition, then if P(X1, ..., Xn) is a call to P on the rhs of some clause and X* is a subset of {X1, ..., Xn} then X* is a MSEA in P if X* is the smallest set such that all arguments in X* consistently unify (at the same time) with the corresponding arguments in at most 1 occurrence of P on the lhs anywhere in the program. 9 8 MSEA's of basic predicates, such as concat, are assumed to be known a priori; MSEA's for recursive predicates are first computed from non-recursive clauses.</Paragraph> <Paragraph position="11"> 9 The at most 1 requirement is the strictest possible, and it can be relaxed to at most n in specific applications. The choice of n may depend upon the nature of the input language being processed (it may be n-degree ambiguous), and/or the cost of backing up from unsuccessful calls. For example, consider the words every and all: both can be translated into a single universal quantifier, but upon generation we face ambiguity. If the representation from When s ≥ 1, that is, P has at least one literal on the rhs, we use the recursive procedure MSEAS to compute the set of MSEA's for P, providing that we already know the set of MSEA's for each literal occurring on the rhs. 
Let T be a set of terms, that is, variables and functional expressions; then VAR(T) is the set of all variables occurring in the terms of T.</Paragraph> <Paragraph position="12"> Thus VAR({f(X),Y,g(c,f(Z),X)}) = {X,Y,Z}. We assume that symbols Xi in definitions (C1) and (D1) above represent terms, not just variables. The following algorithm is suggested for computing sets of active MSEA's in P where s ≥ 1.</Paragraph> <Paragraph position="14"> ∅, i=1, and OUT = ∅. When the computation is completed, MS is bound to the set of active MSEA's for P.</Paragraph> <Paragraph position="15"> (2) Let MR1 be the set of active MSEA's of R1, and let MRU1 be obtained from MR1 by replacing all variables in each member of MR1 by their corresponding actual arguments of R1 on the rhs of (C1).</Paragraph> <Paragraph position="16"> (3) If R1 = P then for every m1,k ∈ MRU1, if every argument Yt ∈ m1,k is always unifiable with its corresponding argument Xt in P then remove m1,k from MRU1. For every set m1,kj = m1,k ∪ {X1,j}, where X1,j is an argument in R1 such that it is not already in m1,k and it is not always unifiable with its corresponding argument in P, and m1,kj is not a superset of any other m1,l remaining in MRU1, add m1,kj to MRU1. 10 (4) For each m1,j ∈ MRU1 (j=1...r1) compute</Paragraph> <Paragraph position="18"> not be executed.</Paragraph> <Paragraph position="19"> which we generate is devoid of any constraints on the lexical number of surface words, we may have to tolerate multiple choices, at some point. Any decision made at this level as to which arguments are to be essential, may affect the reversibility of the grammar.</Paragraph> <Paragraph position="20"> 10 An argument Y is always unifiable with an argument X if they unify regardless of the possible bindings of any variables occurring in Y (variables standardized apart), while the variables occurring in X are unbound. 
Thus, any term is always unifiable with a variable; however, a variable is not always unifiable with a nonvariable. For example, variable X is not always unifiable with f(Y) because if we substitute g(Z) for X then the terms so obtained do not unify. The purpose of including steps (3) and (7) is to eliminate from consideration certain 'obviously' ill-formed recursive clauses. A more elaborate version of this condition is needed to take care of less obvious cases.</Paragraph> <Paragraph position="21"> (5) For each μ1,j ∈ MP1 we do the following: (a) assume that μ1,j is &quot;in&quot; in R1; (b) compute set OUT1,j of &quot;out&quot; arguments for R1; (c) call</Paragraph> <Paragraph position="23"> (6) In some i-th step, where 1 &lt; i ≤ s, and MSEA = μi-1,k, let us suppose that MRi and MRUi are the sets of active MSEA's and their instantiations with actual arguments of Ri, for the literal Ri on the rhs of (C1).</Paragraph> <Paragraph position="24"> (7) If Ri = P then for every mi,u ∈ MRUi, if every argument Yt ∈ mi,u is always unifiable with its corresponding argument Xt in P then remove mi,u from MRUi. For every set mi,uj = mi,u ∪ {Xi,j} where Xi,j is an argument in Ri such that it is not already in mi,u and it is not always unifiable with its corresponding argument in P and mi,uj is not a superset of any other mi,t remaining in MRUi, add mi,uj to MRUi.</Paragraph> <Paragraph position="25"> (8) Again, we compute the set MPi = {μi,j | j=1...ri}, where μi,j = (VAR(mi,j) - OUTi-1,k), where OUTi-1,k is the set of all &quot;out&quot; arguments in literals R1 to Ri-1.</Paragraph> <Paragraph position="26"> (9) For each μi,j remaining in MPi where i ≤ s do the following:</Paragraph> <Paragraph position="28"> (b) otherwise, if μi,j ≠ ∅ then find all distinct minimal size sets vt ⊆ VP such that whenever the arguments in vt are &quot;in&quot;, then the arguments in μi,j are &quot;out&quot;. 
If such vt's exist, then for every vt do: (i) assume vt is &quot;in&quot; in P; (ii) compute the set OUTi,jt of &quot;out&quot; arguments in all literals from R1 to Ri; (iii) call MSEAS(MSi,jt, μi-1,k ∪ vt, VP, i+1, OUTi,jt); (c) otherwise, if no such vt exist, MSi,j := ∅.</Paragraph> <Paragraph position="30"> The procedure presented here can be modified to compute the set of all MSEA's for P by considering all feasible orderings of literals on the rhs of (C1) and using information about all MSEA's for Ri's. This modified procedure would regard the rhs of (C1) as an unordered set of literals, and use various heuristics to consider only selected orderings.</Paragraph> </Section> <Section position="8" start_page="215" end_page="218" type="metho"> <SectionTitle> REORDERING LITERALS IN CLAUSES </SectionTitle> <Paragraph position="0"> When attempting to expand a literal on the rhs of any clause the following basic rule should be observed: never expand a literal before at least one of its active MSEA's is &quot;in&quot;, which means that all arguments in at least one MSEA are bound. The following algorithm uses this simple principle to reorder the rhs of parser clauses for reversed use in generation. This algorithm uses the information about &quot;in&quot; and &quot;out&quot; arguments for literals and sets of MSEA's for predicates. If the &quot;in&quot; MSEA of a literal is not active then the rhs of every definition of this predicate is recursively reordered so that the selected MSEA becomes active. We proceed top-down, altering definitions of predicates of the literals to make their MSEA's active as necessary. When reversing a parser, we start with the top-level predicate pars_gen(S, P), assuming that variable P is bound to the regularized parse structure of a sentence. 
We explicitly identify and mark P as &quot;in&quot; and add the requirement that S must be marked &quot;out&quot; upon completion of rhs reordering.</Paragraph> <Paragraph position="1"> We proceed to adjust the definition of pars_gen to reflect that now {P} is an active MSEA. We continue until we reach the level of atomic or non-reversible primitives such as concat, member, or dictionary look-up routines. If this top-down process succeeds at reversing predicate definitions at each level down to the primitives, and the primitives need no redefinition, then the process is successful, and the reversed-parser generator is obtained. The algorithm can be extended in many ways, including inter-clausal reordering of literals, which may be required in some situations (Strzalkowski, 1989).</Paragraph> <Paragraph position="2"> INVERSE(&quot;head :- old-rhs&quot;,ins,outs); {ins and outs are subsets of VAR(head) which are &quot;in&quot; and are required to be &quot;out&quot;, respectively} begin compute M the set of all MSEA's for head; for every MSEA m ∈ M do</Paragraph> <Paragraph position="4"> if m is an active MSEA such that m ⊆ ins then begin compute &quot;out&quot; arguments in head; add them to OUT; if outs ⊆ OUT then DONE(&quot;head :- old-rhs&quot;)</Paragraph> <Paragraph position="6"> {done only once during the inversion} repeat mark &quot;in&quot; old-rhs-1 arguments which are either constants, or marked &quot;in&quot; in head, or marked &quot;in&quot; or &quot;out&quot; in new-rhs; select a literal L in old-rhs-1 which has an &quot;in&quot; MSEA mL and if mL is not active in L then either ML = ∅ or mL ∈ ML; set up a backtracking point containing all the remaining alternatives to select L from old-rhs-1; if L exists then begin if mL is non-active in L then begin if ML = ∅ then ML := ML ∪ {mL}; for every clause &quot;L1 :- rhsL1&quot; such that L1 has the same predicate as L do</Paragraph> <Paragraph position="8"> if GIVEUP returned then back up, undoing all 
changes, to the latest backtracking point and select another alternative end end; compute &quot;in&quot; and &quot;out&quot; arguments in L; add &quot;out&quot; arguments to OUT;</Paragraph> <Paragraph position="10"> else begin back up, undoing all changes, to the latest backtracking point and select another alternative; if no such backtracking point exists then</Paragraph> <Paragraph position="12"> We have implemented an interpreter, which translates a Definite Clause Grammar dually into a parser and a generator. The interpreter first transforms a DCG grammar into equivalent PROLOG code, which is subsequently inverted into a generator. For each predicate we compute the minimal sets of essential arguments that would need to be active if the program were used in the generation mode. Next, we rearrange the order of the right-hand side literals for each clause in such a way that the set of essential arguments in each literal is guaranteed to be bound whenever the literal is chosen for expansion. To implement the algorithm efficiently, we compute the minimal sets of essential arguments and reorder the literals in the right-hand sides of clauses in one pass through the parser program. As an example, we consider the following rule in our DCG grammar: 11 The parser program is now inverted using the algorithms described in previous sections. As a result, the assertion clause above is inverted into a generator clause by rearranging the order of the literals on its right-hand side. The literals are examined from left to right: if a set of essential arguments is bound, the literal is put into the output queue, otherwise the 11 The grammar design is based upon string grammar (Sager, 1981). Nonterminal sa stands for a string of sentence adjuncts, such as prepositional or adverbial phrases; :: is a PROLOG-defined predicate. We show only one rule of the grammar due to the lack of space.</Paragraph> <Paragraph position="13"> literal is put into the waiting stack. 
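The left-to-right pass with an output queue and a waiting stack can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implemented interpreter: each literal is given exactly one MSEA and a fixed set of "out" variables, there is no backtracking, and an ordinary list stands in for the waiting stack. The function name reorder and the MSEA and "out" sets in the example are hypothetical.

```python
def reorder(head_in, literals):
    """Reorder rhs literals so each one runs only when its MSEA is bound.

    head_in:  variables assumed "in" in the head literal.
    literals: list of (name, msea, outs) triples, where msea is the single
              assumed MSEA of the literal and outs the variables it binds.
    Returns (output, stuck): the new literal order and any literals whose
    MSEA never became bound.
    """
    bound = set(head_in)
    output, waiting = [], []

    # First pass: examine the literals from left to right.
    for name, msea, outs in literals:
        if set(msea).issubset(bound):
            output.append(name)
            bound.update(outs)
        else:
            waiting.append((name, msea, outs))

    # Re-examine the waiting literals until none can be released.
    progress = True
    while waiting and progress:
        progress = False
        for item in list(waiting):
            name, msea, outs = item
            if set(msea).issubset(bound):
                output.append(name)
                bound.update(outs)
                waiting.remove(item)
                progress = True

    return output, [name for name, _, _ in waiting]


# Hypothetical generation-mode data for the simplified tovo clause.
clause = [
    ("to", {"T2"}, {"T1"}),
    ("v", {"P2"}, {"T2", "T3"}),
    ("object", {"P3"}, {"P1", "P2", "T3", "T4"}),
]
print(reorder({"P3"}, clause))
```

With these invented sets the parser order to, v, object is turned into object, v, to; a literal that ends up in the second component of the result signals that the clause could not be inverted under the given assumptions.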
In the example at hand, the literal sa(S1, L1, L3) is examined first.</Paragraph> <Paragraph position="14"> Its MSEA is {S1}, and since it is not a subset of the set of variables appearing in the head literal, this set cannot receive a binding when the execution of assertion starts. It may, however, contain &quot;out&quot; arguments in some other literals on the right-hand side of the clause. We thus remove the first sa literal from the clause and place it on hold until its MSEA becomes fully instantiated. We proceed to consider the remaining literals in the clause in the same manner, until we reach S:verb:head::Vp:head. One MSEA for this literal is {S}, which is a subset of the arguments in the head literal. We also determine that S is not an &quot;out&quot; argument in any other literal in the clause, and thus it must be bound in assertion whenever the clause is to be executed. This means, in turn, that S is an essential argument in assertion. As we continue this process we find that no further essential arguments are required, that is, {S} is a MSEA for assertion.</Paragraph> <Paragraph position="15"> The literal S:verb:head::Vp:head is output and becomes the top element on the right-hand side of the inverted clause. After all literals in the original clause are processed, we repeat this analysis for all those remaining in the waiting stack until all the literals are output. We add prefix g_ to each inverted predicate in the generator to distinguish them from their non-inverted versions in the parser.</Paragraph> <Paragraph position="16"> The inverted assertion predicate as it appears in the generator is shown below.</Paragraph> <Paragraph position="18"> A single grammar is thus used both for sentence parsing and for generation. 
The parser or the generator is invoked using the same top-level predicate pars_gen(S,P) depending upon the binding status of its arguments: if S is bound then the parser is invoked; if P is bound the generator is called.</Paragraph> <Paragraph position="20"> such a way that it can be executed by a top-down interpreter, such as the one used by PROLOG. If this is not the case, that is, if the grammar requires a different kind of interpreter, then the question of invertibility can only be related to this particular type of interpreter. If we want to use the inversion algorithm described here to invert a parser written for an interpreter other than top-down and left-to-right, we need to convert the parser, or the grammar on which it is based, into a version which can be evaluated in a top-down fashion.</Paragraph> <Paragraph position="21"> One situation where such normalization may be required involves certain types of non-standard recursive goals, as depicted schematically below.</Paragraph> <Paragraph position="22"> If vp is invoked by a top-down, left-to-right interpreter, with the variable P instantiated, and if P1 is the essential argument in comp1, then there is no way we can successfully execute the first clause, even if we alter the ordering of the literals on its right-hand side, unless, that is, we employ the goal skipping technique discussed by Shieber et al. However, we can easily normalize this code by replacing the first two clauses with functionally equivalent ones that get the recursion firmly under control, and that can be evaluated in a top-down fashion. We assume that P is the essential argument in v(A, P) and that A is &quot;out&quot;. The normalized grammar is given below. 
vp(A,P) -> v(B,P), vp1(B,A).</Paragraph> <Paragraph position="23"> vp1(f(B,P1),A) -> vp1(B,A), comp1(P1).</Paragraph> <Paragraph position="24"> vp1(A,A).</Paragraph> <Paragraph position="25"> v(A,P) -> lex.</Paragraph> <Paragraph position="26"> In this new code the recursive second clause will be used so long as its first argument has a form f(α,β), where α and β are fully instantiated terms, and it will stop otherwise (either succeed or fail depending upon initial binding to A). In general, the fact that a recursive clause is unfit for a top-down execution can be established by computing the collection of minimal sets of essential arguments for its head predicate. If this collection turns out to be empty, the predicate's definition needs to be normalized.</Paragraph> <Paragraph position="27"> Other types of normalization include elimination of some of the chain rules in the grammar, especially if their presence induces undue non-determinism in the generator. We may also, if necessary, tighten the criteria for selecting the essential arguments, to further enhance the efficiency of the generator, providing, of course, that this move does not render the grammar non-reversible. For a further discussion of these and related problems the reader is referred to (Strzalkowski, 1989).</Paragraph> </Section></Paper>