AUTOMATED INVERSION OF LOGIC GRAMMARS FOR GENERATION 
Tomek Strzalkowski and Ping Peng 
Courant Institute of Mathematical Sciences 
New York University 
251 Mercer Street 
New York, NY 10012 
ABSTRACT 
We describe a system of reversible grammar in 
which, given a logic-grammar specification of a 
natural language, two efficient PROLOG programs are 
derived by an off-line compilation process: a parser 
and a generator for this language. The centerpiece of 
the system is the inversion algorithm designed to 
compute the generator code from the parser's PRO- 
LOG code, using the collection of minimal sets of 
essential arguments (MSEA) for predicates. The sys- 
tem has been implemented to work with Definite 
Clause Grammars (DCG) and is a part of an 
English-Japanese machine translation project 
currently under development at NYU's Courant Insti- 
tute. 
INTRODUCTION 
The results reported in this paper are part of the 
ongoing research project to explore possibilities of an 
automated derivation of both an efficient parser and 
an efficient generator for natural language, such as 
English or Japanese, from a formal specification for 
this language. Thus, given a grammar-like descrip- 
tion of a language, specifying both its syntax as well 
as "semantics" (by which we mean a correspondence 
of well-formed expressions of natural language to 
expressions of a formal representation language) we 
want to obtain, by a fully automatic process, two pos- 
sibly different programs: a parser and a generator. 
The parser will translate well-formed expressions of 
the source language into expressions of the language 
of "semantic" representation, such as regularized 
operator-argument forms, or formulas in logic. The 
generator, on the other hand, will accept well-formed 
expressions of the semantic representation language 
and produce corresponding expressions in the source 
natural language. 
Among the arguments for adopting the bidirec- 
tional design in NLP the following are perhaps the 
most widely shared: 
• A bidirectional NLP system, or a system whose 
inverse can be derived by a fully automated pro- 
cess, greatly reduces effort required for the sys- 
tem development, since we need to write only one 
program or specification instead of two. The 
actual amount of savings ultimately depends upon 
the extent to which the NLP system is made 
bidirectional, for example, how much of the 
language analysis process can be inverted for gen- 
eration. At present we reverse just a little more 
than a syntactic parser, but the method can be 
applied to more advanced analyzers as well. 
• Using a single specification (a grammar) underly- 
ing both the analysis and the synthesis processes 
leads to more accurate capturing of the language. 
Although no NLP grammar is ever complete, the 
grammars used in parsing tend to be "too loose", 
or unsound, in that they would frequently accept 
various ill-formed strings as legitimate sentences, 
while the grammars used for generation are usu- 
ally made "too tight" as a result of limiting their 
output to the "best" surface forms. A reversible 
system for both parsing and generation requires a 
finely balanced grammar which is sound and as 
complete as possible. 
• A reversible grammar provides, by design, the 
match between system's analysis and generation 
capabilities, which is especially important in 
interactive systems. A discrepancy in this capa- 
city may mislead the user, who tends to assume 
that what is generated as output is also acceptable 
as input, and vice-versa. 
• Finally, a bidirectional system can be expected to 
be more robust, easier to maintain and modify, 
and altogether more perspicuous. 
In the work reported here we concentrated on 
unification-based formalisms, in particular Definite 
Clause Grammars (Pereira & Warren, 1980), which 
can be compiled dually into PROLOG parser and gen- 
erator, where the generator is obtained from the 
parser's code with the inversion procedure described 
below. As noted by Dymetman and Isabelle (1988), 
this transformation must involve rearranging the 
order of literals on the right-hand side of some 
clauses. We noted that the design of the string gram- 
mar (Sager, 1981) makes it more suitable as a basis 
of a reversible system than other grammar designs, 
although other grammars can be "normalized" 
(Strzalkowski, 1989). We also would like to point out 
that our main emphasis is on the problem of 
reversibility rather than generation, the latter involv- 
ing many problems that we don't deal with here (see, 
e.g. Derr & McKeown, 1984; McKeown, 1985). 
RELATED WORK 
The idea that a generator for a language might 
be considered as an inverse of the parser for the same 
language has been around for some time, but it was 
only recently that more serious attention started to be 
paid to the problem. We look here only very briefly 
at some of the most recent work in unification-based 
grammars. Dymetman and Isabelle (1988) address the 
problem of inverting a definite clause parser into a 
generator in the context of a machine translation system 
and describe a top-down interpreter with dynamic 
selection of AND goals¹ (and therefore more flexible 
than, say, a left-to-right interpreter) that can execute a 
given DCG grammar in either direction depending 
only upon the binding status of arguments in the top- 
level literal. This approach, although conceptually 
quite general, proves far too expensive in practice. 
The main source of overhead comes, it is pointed out, 
from employing the trick known as goal freezing 
(Colmerauer, 1982; Naish, 1986), that stops expan- 
sion of currently active AND goals until certain vari- 
ables get instantiated. The cost, however, is not the 
only reason why the goal freezing techniques, and 
their variations, are not satisfactory. As Shieber et al. 
(1989) point out, the inherently top-down character 
of goal freezing interpreters may occasionally cause 
serious troubles during execution of certain types of 
recursive goals. They propose to replace the 
dynamic ordering of AND goals by a mixed top- 
down/bottom-up interpretation. In this technique, cer- 
tain goals, namely those whose expansion is defined 
by the so-called "chain rules",² are not expanded during 
the top-down phase of the interpreter, but instead 
they are passed over until a nearest non-chain rule is 
reached. In the bottom-up phase the missing parts of 
the goal-expansion tree will be filled in by applying 
the chain rules in a backward manner. This tech- 
nique, still substantially more expensive than a 
fixed-order top-down interpreter, does not by itself 
guarantee that we can use the underlying grammar 
formalism bidirectionally. The reason is that in order 
to achieve bidirectionality, we need either to impose 
a proper static ordering of the "non-chain" AND 
1 Literals on the right-hand side of a clause create AND 
goals; literals with the same predicate names on the left-hand sides 
of different clauses create OR goals. 
2 A chain rule is one where the main binding-carrying argument 
is passed unchanged from the left-hand side to the right. For 
example, assert(P) --> subj(P1), verb(P2), 
obj(P1,P2,P). is a chain rule with respect to the argument P. 
goals (i.e., those which are not responsible for mak- 
ing a rule a "chain rule"), or resort to dynamic order- 
ing of such goals, putting the goal freezing back into 
the picture. 
In contrast with the above, the parser inversion 
procedure described in this paper does not require a 
run-time overhead and can be performed by an off- 
line compilation process. It may, however, require 
that the grammar is normalized prior to its inversion. 
We briefly discuss the grammar normalization prob- 
lem at the end of this paper. 
IN AND OUT ARGUMENTS 
Arguments in a PROLOG literal can be marked 
as either "in" or "out" depending on whether they are 
bound at the time the literal is submitted for execu- 
tion or after the computation is completed. For 
example, in 
    tovo([to,eat,fish],T4,
         [np,[n,john]],P3) 
the first and the third arguments are "in", while the 
remaining two are "out". When tovo is used for 
generation, i.e., 
    tovo(T1,T4,P1,
         [eat,[np,[n,john]],
              [np,[n,fish]]]) 
then the last argument is "in", while the first and the 
third are "out"; T4 is neither "in" nor "out". The 
information about "in" and "out" status of arguments 
is important in determining the "direction" in which 
predicates containing them can be run.³ Below we 
present a simple method for computing "in" and 
"out" arguments in PROLOG literals.⁴ 
An argument X of literal pred(... X ...) on 
the rhs of a clause is "in" if (A) it is a constant; or (B) 
it is a function and all its arguments are "in"; or (C) it 
is "in" or "out" in some previous literal on the rhs of 
the same clause, e.g., l(Y) :- r(X,Y), pred(X); or (D) 
it is "in" in the head literal L on the lhs of the same 
clause. 
An argument X is "in" in the head literal 
L = pred(... X ...) of a clause if (A), or (B), or (E) 
L is the top-level literal and X is "in" in it (known a 
priori); or (F) X occurs more than once in L and at 
3 For a discussion of directed predicates in PROLOG see (Shoham 
and McDermott, 1984), and (Debray, 1989). 
4 This simple algorithm is all we need to complete the exper- 
iment at hand. A general method for computing "in"/"out" argu- 
ments is given in (Strzalkowski, 1989). In this and further algo- 
rithms we use abbreviations rhs and lhs to stand for right-hand side 
and left-hand side (of a clause), respectively. 
least one of these occurrences is "in"; or (G) for 
every literal L1 = pred(... Y ...) unifiable with L 
on the rhs of any clause with the head predicate 
pred1 different from pred, and such that Y unifies 
with X, Y is "in" in L1. 
A similar algorithm can be proposed for com- 
puting "out" arguments. We introduce "unknwn" as a 
third status marker for arguments occurring in certain 
recursive clauses. 
An argument X of literal pred(... X ...) on 
the rhs of a clause is "out" if (A) it is "in" in 
pred(... X ...); or (B) it is a functional expression 
and all its arguments are either "in" or "out"; or (C) 
for every clause with the head literal 
pred(... Y ...) unifiable with pred(... X ...) and 
such that Y unifies with X, Y is either "in", "out" or 
"unknwn", and Y is marked "in" or "out" in at least 
one case. 
An argument X of literal pred(... X ...) on 
the lhs of a clause is "out" if (D) it is "in" in 
pred(... X ...); or (E) it is "out" in literal 
pred1(... X ...) on the rhs of this clause, provided 
that pred1 ≠ pred;⁵ if pred1 = pred then X is marked 
"unknwn". 
Note that this method predicts the "in" and 
"out" status of arguments in a literal only if the 
evaluation of this literal ends successfully. In case it 
does not (a failure or a loop) the "in"/"out" status of 
arguments becomes irrelevant. 
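
The left-to-right marking of rhs arguments can be illustrated with a small sketch. It is written in Python rather than PROLOG purely for illustration and is not the authors' implementation. It covers only rules (A) to (D) for rhs literals, and it makes one simplifying assumption that is not in the text: after a literal succeeds, every variable it mentions counts as "in" or "out" for the literals that follow. The term encoding (tuples for functional expressions, capitalised strings for variables) and the names is_variable, variables, is_in and mark_rhs are hypothetical.

```python
def is_variable(term):
    # Assumed encoding: a variable is a string starting with an upper-case letter.
    return isinstance(term, str) and term[:1].isupper()


def variables(term):
    # Collect the variables of a term; tuples are ("functor", arg1, arg2, ...).
    if isinstance(term, tuple):
        found = set()
        for arg in term[1:]:
            found |= variables(arg)
        return found
    return {term} if is_variable(term) else set()


def is_in(term, known):
    if isinstance(term, tuple):      # rule (B): all arguments must be "in"
        return all(is_in(arg, known) for arg in term[1:])
    if is_variable(term):            # rules (C) and (D): already seen, or "in" in the head
        return term in known
    return True                      # rule (A): a constant


def mark_rhs(head_in, body):
    # body is a list of (predicate, [arguments]) pairs in rhs order.
    known = set(head_in)
    marks = []
    for pred, args in body:
        marks.append((pred, ["in" if is_in(a, known) else "not-in" for a in args]))
        # Simplifying assumption: a successful literal leaves all of its
        # variables marked either "in" or "out".
        for a in args:
            known |= variables(a)
    return marks


# The simplified tovo clause from footnote 6, used for parsing
# (T1 and P1 are "in" in the head).
tovo_body = [
    ("to", ["T1", "T2"]),
    ("v", ["T2", "T3", "P2"]),
    ("object", ["T3", "T4", "P1", "P2", "P3"]),
]
parse_marks = mark_rhs({"T1", "P1"}, tovo_body)
```

Under these assumptions the first argument of each rhs literal comes out "in" when tovo is used for parsing, which is what makes the original literal order executable in that direction.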
COMPUTING ESSENTIAL ARGUMENTS 
Some arguments of every literal are essential in 
the sense that the literal cannot be executed success- 
fully unless all of them are bound, at least partially, at 
the time of execution. For example, the predicate 
tovo(T1,T4,P1,P3) that recognizes 
"to+verb+object" object strings can be executed only 
if either T1 or P3 is bound.⁶ ⁷ If tovo is used to 
parse then T1 must be bound; if it is used to generate 
then P3 must be bound. In general, a literal 
may have several alternative (possibly overlapping) 
sets of essential arguments. If all arguments in any 
one of such sets of essential arguments are bound, 
5 Again, we must make provisions to avoid infinite descent, 
cf. (G) in the "in" algorithm. 
6 Assuming that tovo is defined as follows (simplified): 
    tovo(T1,T4,P1,P3) :- to(T1,T2), v(T2,T3,P2), 
        object(T3,T4,P1,P2,P3). 
7 An argument is considered fully bound if it is a constant or 
it is bound by a constant; an argument is partially bound if it is, or 
is bound by, a functional expression (not a variable) in which at 
least one variable is unbound. 
then the literal can be executed. Any set of 
arguments which has the above property is called 
essential. We shall call a set MSEA of essential arguments 
a minimal set of essential arguments if it is 
essential, and no proper subset of MSEA is essential. 
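
The minimality condition can be made concrete with a short Python sketch (an illustration only, not part of the system described here). It assumes the essential sets of a predicate are already given as plain sets of argument names; the function name minimal_sets and the sample data for tovo are hypothetical.

```python
def minimal_sets(essential_sets):
    # Keep only those essential sets that have no essential proper subset.
    candidates = {frozenset(s) for s in essential_sets}
    return {s for s in candidates
            if not any(other < s for other in candidates)}


# Hypothetical essential sets for tovo(T1,T4,P1,P3): binding T1 or P3 is
# enough, so any superset of {T1} or {P3} is essential but not minimal.
tovo_essential = [{"T1"}, {"P3"}, {"T1", "P3"}, {"T1", "T4"}]
tovo_msea = minimal_sets(tovo_essential)
```

With this input the result contains exactly {T1} and {P3}, matching the two directions (parsing and generation) discussed for tovo.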
A collection of minimal sets of essential argu- 
ments (MSEA's) of a predicate depends upon the way 
this predicate is defined. If we alter the ordering of 
the rhs literals in the definition of a predicate, we 
may also change its set of MSEA's. We call the set 
of MSEA's existing for a current definition of a predi- 
cate the set of active MSEA's for this predicate. To 
run a predicate in a certain direction requires that a 
specific MSEA is among the currently active MSEA's 
for this predicate, and if this is not already the case, 
then we have to alter the definition of this predicate 
so as to make this MSEA become active. Consider 
the following abstract clause defining predicate $R_i$: 
    R_i(X_1, ..., X_k) :-                (D1) 
        Q_1(...), 
        Q_2(...), 
        ... 
        Q_n(...). 
Suppose that, as defined by (D1), $R_i$ has the set $MS_i = \{m_1, \ldots, m_j\}$ 
of active MSEA's, and let $MR_i \supseteq MS_i$ 
be the set of all MSEA's for $R_i$ that can be obtained by 
permuting the order of literals on the right-hand side 
of (D1). Let us assume further that $R_i$ occurs on the rhs 
of some other clause, as shown below: 
    P(X_1, ..., X_n) :-                  (C1) 
        R_1(X_{1,1}, ..., X_{1,k_1}), 
        R_2(X_{2,1}, ..., X_{2,k_2}), 
        ... 
        R_s(X_{s,1}, ..., X_{s,k_s}). 
We want to compute MS, the set of active MSEA's 
for P, as defined by (C1), where $s \geq 0$, assuming that 
we know the sets of active MSEA's for each $R_i$ on the 
rhs.⁸ If $s = 0$, that is, P has no rhs in its definition, then 
if $P(X_1, \ldots, X_n)$ is a call to P on the rhs of some 
clause and $X^*$ is a subset of $\{X_1, \ldots, X_n\}$ then $X^*$ is 
a MSEA in P if $X^*$ is the smallest set such that all 
arguments in $X^*$ consistently unify (at the same time) 
with the corresponding arguments in at most 1 
occurrence of P on the lhs anywhere in the program.⁹ 
8 MSEA's of basic predicates, such as concat, are assumed to 
be known a priori; MSEA's for recursive predicates are first 
computed from non-recursive clauses. 
9 The at most 1 requirement is the strictest possible, and it 
can be relaxed to at most n in specific applications. The choice of n 
may depend upon the nature of the input language being processed 
(it may be n-degree ambiguous), and/or the cost of backing up 
from unsuccessful calls. For example, consider the words every 
and all: both can be translated into a single universal quantifier, but 
upon generation we face ambiguity. If the representation from 
When $s \geq 1$, that is, P has at least one literal on 
the rhs, we use the recursive procedure MSEAS to 
compute the set of MSEA's for P, provided that we 
already know the set of MSEA's for each literal 
occurring on the rhs. Let T be a set of terms, that is, 
variables and functional expressions, then VAR(T) is 
the set of all variables occurring in the terms of T. 
Thus VAR({f(X),Y,g(c,f(Z),X)}) = {X,Y,Z}. We 
assume that symbols $X_i$ in definitions (C1) and (D1) 
above represent terms, not just variables. The following 
algorithm is suggested for computing sets of 
active MSEA's in P where $s \geq 1$. 
MSEAS(MS, MSEA, VP, i, OUT) 
(1) Start with $VP = VAR(\{X_1, \ldots, X_n\})$, $MSEA = \emptyset$, 
$i = 1$, and $OUT = \emptyset$. When the computation is 
completed, MS is bound to the set of active 
MSEA's for P. 
(2) Let $MR_1$ be the set of active MSEA's of $R_1$, and 
let $MRU_1$ be obtained from $MR_1$ by replacing all 
variables in each member of $MR_1$ by their 
corresponding actual arguments of $R_1$ on the rhs 
of (C1). 
(3) If $R_1 = P$ then for every $m_{1,k} \in MRU_1$ if every 
argument $Y_t \in m_{1,k}$ is always unifiable with its 
corresponding argument $X_t$ in P then remove 
$m_{1,k}$ from $MRU_1$. For every set $m_{1,kj} = m_{1,k} \cup \{X_{1,j}\}$, 
where $X_{1,j}$ is an argument in $R_1$ such 
that it is not already in $m_{1,k}$ and it is not always 
unifiable with its corresponding argument in P, 
and $m_{1,kj}$ is not a superset of any other $m_{1,l}$ 
remaining in $MRU_1$, add $m_{1,kj}$ to $MRU_1$.¹⁰ 
(4) For each $m_{1,j} \in MRU_1$ ($j = 1 \ldots r_1$) compute 
$\mu_{1,j} := VAR(m_{1,j}) \cap VP$. Let $MP_1 = \{\mu_{1,j} \mid \phi(\mu_{1,j}),\ j = 1 \ldots r\}$, 
where $r > 0$, and $\phi(\mu_{1,j}) = [\mu_{1,j} \neq \emptyset$ or 
$(\mu_{1,j} = \emptyset$ and $VAR(m_{1,j}) = \emptyset)]$. If 
$MP_1 = \emptyset$ then QUIT: (C1) is ill-formed and cannot 
be executed. 
which we generate is devoid of any constraints on the lexical 
number of surface words, we may have to tolerate multiple 
choices, at some point. Any decision made at this level as to which 
arguments are to be essential, may affect the reversibility of the 
grammar. 
10 An argument Y is always unifiable with an argument X if 
they unify regardless of the possible bindings of any variables oc- 
curring in Y (variables standardized apart), while the variables oc- 
curring in X are unbound. Thus, any term is always unifiable with 
a variable; however, a variable is not always unifiable with a non- 
variable. For example, variable X is not always unifiable with f (Y) 
because if we substitute g (Z) for X then the so obtained terms do 
not unify. The purpose of including steps (3) and (7) is to eliminate 
from consideration certain 'obviously' ill-formed recursive 
clauses. A more elaborate version of this condition is needed to 
take care of less obvious cases. 
(5) For each $\mu_{1,j} \in MP_1$ we do the following: (a) 
assume that $\mu_{1,j}$ is "in" in $R_1$; (b) compute set 
$OUT_{1,j}$ of "out" arguments for $R_1$; (c) call 
MSEAS($MS_{1,j}$, $\mu_{1,j}$, VP, 2, $OUT_{1,j}$); (d) assign 
$MS := \bigcup_{j=1 \ldots r} MS_{1,j}$. 
(6) In some i-th step, where $1 < i \leq s$, and $MSEA = \mu_{i-1,k}$, 
let's suppose that $MR_i$ and $MRU_i$ are the 
sets of active MSEA's and their instantiations 
with actual arguments of $R_i$, for the literal $R_i$ on 
the rhs of (C1). 
(7) If $R_i = P$ then for every $m_{i,u} \in MRU_i$ if every 
argument $Y_t \in m_{i,u}$ is always unifiable with its 
corresponding argument $X_t$ in P then remove 
$m_{i,u}$ from $MRU_i$. For every set $m_{i,uj} = m_{i,u} \cup \{X_{i,j}\}$ 
where $X_{i,j}$ is an argument in $R_i$ such that it 
is not already in $m_{i,u}$ and it is not always 
unifiable with its corresponding argument in P 
and $m_{i,uj}$ is not a superset of any other $m_{i,t}$ 
remaining in $MRU_i$, add $m_{i,uj}$ to $MRU_i$. 
(8) Again, we compute the set $MP_i = \{\mu_{i,j} \mid j = 1 \ldots r_i\}$, 
where $\mu_{i,j} = (VAR(m_{i,j}) - OUT_{i-1,k})$, 
where $OUT_{i-1,k}$ is the set of all "out" 
arguments in literals $R_1$ to $R_{i-1}$. 
(9) For each $\mu_{i,j}$ remaining in $MP_i$ where $i \leq s$ do the 
following: 
(a) if $\mu_{i,j} = \emptyset$ then: (i) compute the set $OUT_j$ of 
"out" arguments of $R_i$; (ii) compute the union 
$OUT_{i,j} := OUT_j \cup OUT_{i-1,k}$; (iii) call 
MSEAS($MS_{i,j}$, $\mu_{i-1,k}$, VP, i+1, $OUT_{i,j}$); 
(b) otherwise, if $\mu_{i,j} \neq \emptyset$ then find all distinct 
minimal size sets $v_t \subseteq VP$ such that whenever 
the arguments in $v_t$ are "in", then the arguments 
in $\mu_{i,j}$ are "out". If such $v_t$'s exist, then 
for every $v_t$ do: (i) assume $v_t$ is "in" in P; (ii) 
compute the set $OUT_{i,jt}$ of "out" arguments in 
all literals from $R_1$ to $R_i$; (iii) call 
MSEAS($MS_{i,jt}$, $\mu_{i-1,k} \cup v_t$, VP, i+1, $OUT_{i,jt}$); 
(c) otherwise, if no such $v_t$ exist, $MS_{i,j} := \emptyset$. 
(10) Compute $MS := \bigcup_{j=1 \ldots r} MS_{i,j}$; 
(11) For $i = s+1$ set MS := {MSEA}. 
The procedure presented here can be modified to 
compute the set of all MSEA's for P by considering 
all feasible orderings of literals on the rhs of (C1) and 
using information about all MSEA's for Ri's. This 
modified procedure would regard the rhs of (C1) as 
an unordered set of literals, and use various heuristics 
to consider only selected orderings. 
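
Two building blocks of MSEAS, the function VAR(T) and the residual set computed in step (8), are simple enough to sketch directly. The following Python fragment is an illustration only. The term encoding (tuples for functional expressions, capitalised strings for variables) and the names var_set and residual are assumptions of the sketch, not notation from the algorithm.

```python
def is_variable(term):
    # Assumed encoding: a variable is a string starting with an upper-case letter.
    return isinstance(term, str) and term[:1].isupper()


def var_set(terms):
    # VAR(T): all variables occurring anywhere in the terms of T.
    found = set()
    for term in terms:
        if isinstance(term, tuple):          # ("functor", arg1, arg2, ...)
            found |= var_set(term[1:])
        elif is_variable(term):
            found.add(term)
    return found


def residual(msea, out):
    # Step (8): variables of an MSEA not yet produced as "out"
    # arguments by the literals to the left.
    return var_set(msea) - set(out)


# The example from the text: VAR({f(X), Y, g(c, f(Z), X)}).
example = var_set([("f", "X"), "Y", ("g", "c", ("f", "Z"), "X")])
```

An empty residual corresponds to case (9a), where the literal can be executed with what is already bound; a non-empty one corresponds to case (9b), where further head arguments must be assumed "in".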
REORDERING LITERALS IN CLAUSES 
When attempting to expand a literal on the rhs 
of any clause the following basic rule should be 
observed: never expand a literal before at least one of its 
active MSEA's is "in", which means that all argu- 
ments in at least one MSEA are bound. The following 
algorithm uses this simple principle to reorder rhs of 
parser clauses for reversed use in generation. This 
algorithm uses the information about "in" and "out" 
arguments for literals and sets of MSEA's for predi- 
cates. If the "in" MSEA of a literal is not active then 
the rhs of every definition of this predicate is recursively 
reordered so that the selected MSEA becomes 
active. We proceed top-down altering definitions of 
predicates of the literals to make their MSEA's active 
as necessary. When reversing a parser, we start with 
the top-level predicate pars_gen(S,P) assuming 
that variable P is bound to the regularized parse 
structure of a sentence. We explicitly identify and 
mark P as "in" and add the requirement that S must 
be marked "out" upon completion of rhs reordering. 
We proceed to adjust the definition of pars_gen to 
reflect that now {P} is an active MSEA. We continue 
until we reach the level of atomic or non-reversible 
primitives such as concat, member, or dictionary 
look-up routines. If this top-down process succeeds at 
reversing predicate definitions at each level down to 
the primitives, and the primitives need no re- 
definition, then the process is successful, and the 
reversed-parser generator is obtained. The algorithm 
can be extended in many ways, including inter- 
clausal reordering of literals, which may be required 
in some situations (Strzalkowski, 1989). 
INVERSE("head :- old-rhs", ins, outs); 
{ins and outs are subsets of VAR(head) which 
 are "in" and are required to be "out", respectively} 
begin 
  compute M the set of all MSEA's for head; 
  for every MSEA m ∈ M do 
  begin 
    OUT := ∅; 
    if m is an active MSEA such that m ⊆ ins then 
    begin 
      compute "out" arguments in head; 
      add them to OUT; 
      if outs ⊆ OUT then DONE("head :- old-rhs") 
    end 
    else if m is a non-active MSEA and m ⊆ ins then 
    begin 
      new-rhs := ∅; QUIT := false; 
      old-rhs-1 := old-rhs; 
      for every literal L do 
        M_L := ∅; 
        {done only once during the inversion} 
      repeat 
        mark "in" old-rhs-1 arguments which are 
          either constants, or marked "in" in head, 
          or marked "in" or "out" in new-rhs; 
        select a literal L in old-rhs-1 which has 
          an "in" MSEA m_L and if m_L is not active in L 
          then either M_L = ∅ or m_L ∈ M_L; 
        set up a backtracking point containing 
          all the remaining alternatives 
          to select L from old-rhs-1; 
        if L exists then 
        begin 
          if m_L is non-active in L then 
          begin 
            if M_L = ∅ then M_L := M_L ∪ {m_L}; 
            for every clause "L1 :- rhs_L1" such that 
              L1 has the same predicate as L do 
            begin 
              INVERSE("L1 :- rhs_L1", M_L, ∅); 
              if GIVEUP returned then backup, undoing 
                all changes, to the latest backtracking 
                point and select another alternative 
            end 
          end; 
          compute "in" and "out" arguments in L; 
          add "out" arguments to OUT; 
          new-rhs := APPEND-AT-THE-END(new-rhs, L); 
          old-rhs-1 := REMOVE(old-rhs-1, L) 
        end {if} 
        else begin 
          backup, undoing all changes, to the latest 
            backtracking point and select another 
            alternative; 
          if no such backtracking point exists then 
            QUIT := true 
        end {else} 
      until old-rhs-1 = ∅ or QUIT; 
      if outs ⊆ OUT and not QUIT then 
        DONE("head :- new-rhs") 
    end {elseif} 
  end; {for} 
  GIVEUP("can't invert as specified") 
end; 
THE IMPLEMENTATION 
We have implemented an interpreter, which 
translates Definite Clause Grammar dually into a 
parser and a generator. The interpreter first 
transforms a DCG grammar into equivalent PROLOG 
code, which is subsequently inverted into a generator. 
For each predicate we compute the minimal sets of 
essential arguments that would need to be active if 
the program were used in the generation mode. Next, 
we rearrange the order of the right-hand side literals 
for each clause in such a way that the set of essential 
arguments in each literal is guaranteed to be bound 
whenever the literal is chosen for expansion. To 
implement the algorithm efficiently, we compute the 
minimal sets of essential arguments and reorder the 
literals in the right-hand sides of clauses in one pass 
through the parser program. As an example, we consider 
the following rule in our DCG grammar:¹¹ 
    assertion(S) --> 
        sa(S1), 
        subject(Sb), 
        sa(S2), 
        verb(V), 
        {Sb:np:number :: V:number}, 
        sa(S3), 
        object(O,V,Vp,Sb,Sp), 
        sa(S4), 
        {S:verb:head :: Vp:head}, 
        {S:verb:number :: V:number}, 
        {S:tense :: [V:tense,O:tense]}, 
        {S:subject :: Sp}, 
        {S:object :: O:core}, 
        {S:sa :: 
            [S1:sa,S2:sa,S3:sa,O:sa,S4:sa]}. 
When translated into PROLOG, it yields the following 
clause in the parser: 
    assertion(S,L1,L2) :- 
        sa(S1,L1,L3), 
        subject(Sb,L3,L4), 
        sa(S2,L4,L5), 
        verb(V,L5,L6), 
        Sb:np:number :: V:number, 
        sa(S3,L6,L7), 
        object(O,V,Vp,Sb,Sp,L7,L8), 
        sa(S4,L8,L2), 
        S:verb:head :: Vp:head, 
        S:verb:number :: V:number, 
        S:tense :: [V:tense,O:tense], 
        S:subject :: Sp, 
        S:object :: O:core, 
        S:sa :: 
            [S1:sa,S2:sa,S3:sa,O:sa,S4:sa]. 
The parser program is now inverted using the algo- 
rithms described in previous sections. As a result, the 
assertion clause above is inverted into a genera- 
tor clause by rearranging the order of the literals on 
its right-hand side. The literals are examined from the 
left to right: if a set of essential arguments is bound, 
the literal is put into the output queue, otherwise the 
11 The grammar design is based upon string grammar (Sager, 
1981). Nonterminal sa stands for a string of sentence adjuncts, 
such as prepositional or adverbial phrases; :: is a PROLOG-defined 
predicate. We show only one rule of the grammar due to the lack 
of space. 
literal is put into the waiting stack. In the example at 
hand, the literal sa(S1,L1,L3) is examined first. 
Its MSEA is {S1}, and since it is not a subset of the 
set of variables appearing in the head literal, this set 
cannot receive a binding when the execution of 
assertion starts. It may, however, contain "out" 
arguments in some other literals on the right-hand 
side of the clause. We thus remove the first sa 
literal from the clause and place it on hold until its 
MSEA becomes fully instantiated. We proceed to 
consider the remaining literals in the clause in the 
same manner, until we reach S:verb:head :: 
Vp:head. One MSEA for this literal is {S}, which is 
a subset of the arguments in the head literal. We also 
determine that S is not an "out" argument in any 
other literal in the clause, and thus it must be bound 
in assertion whenever the clause is to be exe- 
cuted. This means, in turn, that S is an essential 
argument in assertion. As we continue this pro- 
cess we find that no further essential arguments are 
required, that is, {S} is a MSEA for assertion. 
The literal S:verb:head :: Vp:head is output 
and becomes the top element on the right-hand 
side of the inverted clause. After all literals in the 
original clause are processed, we repeat this analysis 
for all those remaining in the waiting stack until all 
the literals are output. We add prefix g_ to each 
inverted predicate in the generator to distinguish 
them from their non-inverted versions in the parser. 
The inverted assertion predicate as it appears in 
the generator is shown below. 
    g_assertion(S,L1,L2) :- 
        S:verb:head :: Vp:head, 
        S:verb:number :: V:number, 
        S:tense :: [V:tense,O:tense], 
        S:subject :: Sp, 
        S:object :: O:core, 
        S:sa :: 
            [S1:sa,S2:sa,S3:sa,O:sa,S4:sa], 
        g_sa(S4,L3,L2), 
        g_object(O,V,Vp,Sb,Sp,L4,L3), 
        g_sa(S3,L5,L4), 
        Sb:np:number :: V:number, 
        g_verb(V,L6,L5), 
        g_sa(S2,L7,L6), 
        g_subject(Sb,L8,L7), 
        g_sa(S1,L1,L8). 
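
The queue-and-stack reordering described above can be imitated with a short Python sketch. It is an illustration under stated assumptions, not the implemented system: each literal is given by hand as a triple (text, alternative essential sets, variables it binds), partial binding is treated the same as binding, string-position arguments are left out, and the particular essential and bound sets listed for sa, subject, verb, object and the :: constraints are hypothetical choices made only to reproduce the example. The function name reorder_rhs is likewise hypothetical.

```python
def reorder_rhs(head_in, literals):
    # literals: list of (text, [alternative essential sets], set of bound variables)
    bound = set(head_in)
    output, waiting = [], []

    def ready(lit):
        return any(essential <= bound for essential in lit[1])

    # First pass, left to right: output ready literals, stack the rest.
    for lit in literals:
        if ready(lit):
            output.append(lit[0])
            bound.update(lit[2])
        else:
            waiting.append(lit)

    # Then drain the waiting stack (last in, first out) until it is empty.
    while waiting:
        deferred, progressed = [], False
        while waiting:
            lit = waiting.pop()
            if ready(lit):
                output.append(lit[0])
                bound.update(lit[2])
                progressed = True
            else:
                deferred.append(lit)
        if not progressed:
            raise ValueError("cannot reorder: no remaining literal is executable")
        waiting = deferred[::-1]   # keep the original stack order for the retry
    return output


# Hypothetical mode information for the assertion clause.
assertion_rhs = [
    ("sa(S1)", [{"S1"}], set()),
    ("subject(Sb)", [{"Sb"}], set()),
    ("sa(S2)", [{"S2"}], set()),
    ("verb(V)", [{"V"}], set()),
    ("Sb:np:number :: V:number", [{"Sb"}, {"V"}], {"Sb", "V"}),
    ("sa(S3)", [{"S3"}], set()),
    ("object(O,V,Vp,Sb,Sp)", [{"O", "Vp", "Sp"}], {"V", "Sb"}),
    ("sa(S4)", [{"S4"}], set()),
    ("S:verb:head :: Vp:head", [{"S"}], {"Vp"}),
    ("S:verb:number :: V:number", [{"S"}], {"V"}),
    ("S:tense :: [V:tense,O:tense]", [{"S"}], {"V", "O"}),
    ("S:subject :: Sp", [{"S"}], {"Sp"}),
    ("S:object :: O:core", [{"S"}], {"O"}),
    ("S:sa :: [S1:sa,S2:sa,S3:sa,O:sa,S4:sa]", [{"S"}],
     {"S1", "S2", "S3", "S4", "O"}),
]
generator_order = reorder_rhs({"S"}, assertion_rhs)
```

With {S} as the only "in" argument of the head, the six :: constraints on S are output first and the stacked literals follow in reverse order of their original positions, which is the literal order of g_assertion.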
A single grammar is thus used both for sentence pars- 
ing and for generation. The parser or the generator is 
invoked using the same top-level predicate 
pars_gen(S,P) depending upon the binding 
status of its arguments: if S is bound then the parser 
is invoked, if P is bound the generator is called. 
| ?- load_gram(grammar). 
yes 
| ?- pars_gen([jane,takes,a,course],P). 
P = [[cat|assertion], 
     [tense,present,[]], 
     [verb|take], 
     [subject, 
      [np,[head|jane], 
       [number|singular], 
       [class|nstudent], 
       [tpos], 
       [apos], 
       [modifier,null]]], 
     [object, 
      [np,[head|course], 
       [number|singular], 
       [class|ncourse], 
       [tpos|a], 
       [apos], 
       [modifier,null]]], 
     [sa,[],[],[],[],[]]] 
yes 
| ?- pars_gen(S, 
       [[cat|assertion], 
        [tense,present,[]], 
        [verb|take], 
        [subject, 
         [np,[head|jane], 
          [number|singular], 
          [class|nstudent], 
          [tpos], 
          [apos], 
          [modifier,null]]], 
        [object, 
         [np,[head|course], 
          [number|singular], 
          [class|ncourse], 
          [tpos|a], 
          [apos], 
          [modifier,null]]], 
        [sa,[],[],[],[],[]]]). 
S = [jane,takes,a,course] 
yes 
GRAMMAR NORMALIZATION 
Thus far we have tacitly assumed that the 
grammar upon which our parser is based is written in 
such a way that it can be executed by a top-down 
interpreter, such as the one used by PROLOG. If this is 
not the case, that is, if the grammar requires a dif- 
ferent kind of interpreter, then the question of inverti- 
bility can only be related to this particular type of 
interpreter. If we want to use the inversion algorithm 
described here to invert a parser written for an inter- 
preter different than top-down and left-to-right, we 
need to convert the parser, or the grammar on which 
it is based, into a version which can be evaluated in a 
top-down fashion. 
One situation where such normalization may 
be required involves certain types of non-standard 
recursive goals, as depicted schematically below. 
    vp(A,P) --> vp(f(A,P1),P), compl(P1). 
    vp(A,P) --> v(A,P). 
    v(A,P)  --> lex. 
If vp is invoked by a top-down, left-to-right inter- 
preter, with the variable P instantiated, and if P1 is 
the essential argument in compl, then there is no 
way we can successfully execute the first clause, 
even if we alter the ordering of the literals on its 
right-hand side, unless, that is, we employ the goal 
skipping technique discussed by Shieber et al. How- 
ever, we can easily normalize this code by replacing 
the first two clauses with functionally equivalent ones 
that get the recursion firmly under control, and that 
can be evaluated in a top-down fashion. We assume 
that P is the essential argument in v (A, P) and that 
A is "out". The normalized grammar is given below. 
    vp(A,P) --> v(B,P), vp1(B,A). 
    vp1(f(B,P1),A) --> vp1(B,A), compl(P1). 
    vp1(A,A). 
    v(A,P) --> lex. 
In this new code the recursive second clause will be 
used so long as its first argument has a form f(α,β), 
where α and β are fully instantiated terms, and it will 
stop otherwise (either succeed or fail depending upon 
initial binding to A). In general, the fact that a recur- 
sive clause is unfit for a top-down execution can be 
established by computing the collection of minimal 
sets of essential arguments for its head predicate. If 
this collection turns out to be empty, the predicate's 
definition needs to be normalized. 
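
The behaviour of the normalized vp1 clauses on an instantiated first argument can be pictured with a small Python sketch. It is an illustration only: it follows the deterministic reading given above, in which the recursive clause is used for as long as the argument has the form f(α,β), and it ignores the string arguments and the alternative of stopping early with vp1(A,A). Terms are encoded as tuples ("f", α, β), and the name unwind is hypothetical.

```python
def unwind(term):
    # Mirrors vp1(f(B,P1),A) --> vp1(B,A), compl(P1): peel one f-layer,
    # recurse on the inner term first, then emit the complement P1.
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "f":
        _, inner, complement = term
        core, complements = unwind(inner)
        return core, complements + [complement]
    # Mirrors vp1(A,A): no f-layer left, so A is the remaining term.
    return term, []


# B = f(f(a, p1), p2), as it might be returned by v(B,P).
core, complements = unwind(("f", ("f", "a", "p1"), "p2"))
```

Because the recursion is driven by the structure of an argument that is already bound, each step consumes one layer of f and the recursion is guaranteed to terminate, which is the property the original left-recursive vp clause lacks.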
Other types of normalization include elimination 
of some of the chain rules in the grammar, especially 
if their presence induces undue non-determinism 
in the generator. We may also, if necessary, 
tighten the criteria for selecting the essential 
arguments, to further enhance the efficiency of the 
generator, providing, of course, that this move does 
not render the grammar non-reversible. For a further 
discussion of these and related problems the reader is 
referred to (Strzalkowski, 1989). 
CONCLUSIONS 
In this paper we presented an algorithm for 
automated inversion of a unification parser for 
natural language into an efficient unification genera- 
tor. The inverted program of the generator is obtained 
by an off-line compilation process which directly 
manipulates the PROLOG code of the parser program. 
We distinguish two logical stages of this transforma- 
tion: computing the minimal sets of essential argu- 
ments (MSEA's) for predicates, and generating the 
inverted program code with INVERSE. The method 
described here is contrasted with the approaches that 
seek to define a generalized but computationally 
expensive evaluation strategy for running a grammar 
in either direction without manipulating its rules 
(Shieber, 1988), (Shieber et al., 1989), (Wedekind, 
1988), and see also (Naish, 1986) for some relevant 
techniques. We have completed a first implementa- 
tion of the system and used it to derive both a parser 
and a generator from a single DCG grammar for 
English. We note that the present version of 
INVERSE can operate only upon the declarative 
specification of a logic grammar and is not prepared 
to deal with extra-logical control operators such as 
the cut. 
ACKNOWLEDGMENTS 
Ralph Grishman and other members of the 
Natural Language Discussion Group provided valu- 
able comments to earlier versions of this paper. We 
also thank anonymous reviewers for their sugges- 
tions. This paper is based upon work supported by 
the Defense Advanced Research Projects Agency 
under Contract N00014-85-K-0163 from the Office 
of Naval Research. 
REFERENCES 
Colmerauer, Alain. 1982. PROLOG II: 
Manuel de référence et modèle théorique. Groupe 
d'Intelligence Artificielle, Faculté des Sciences de 
Luminy, Marseille. 
Debray, Saumya, K. 1989. "Static Inference 
Modes and Data Dependencies in Logic Programs." 
ACM Transactions on Programming Languages and 
Systems, 11(3), July 1989, pp. 418-450. 
Derr, Marcia A. and McKeown, Kathleen R. 
1984. "Using Focus to Generate Complex and Simple 
Sentences." Proceedings of the 10th COLING, 
Bonn, Germany, pp. 319-326. 
Dymetman, Marc and Isabelle, Pierre. 1988. 
"Reversible Logic Grammars for Machine Transla- 
tion." Proc. of the Second Int. Conference on 
Machine Translation, Pittsburgh, PA. 
Grishman, Ralph. 1986. Proteus Parser Refer- 
ence Manual. Proteus Project Memorandum #4, 
Courant Institute of Mathematical Sciences, New 
York University. 
McKeown, Kathleen R. 1985. Text Genera- 
tion: Using Discourse Strategies and Focus Con- 
straints to Generate Natural Language Text. Cam- 
bridge University Press. 
Naish, Lee. 1986. Negation and Control in 
PROLOG. Lecture Notes in Computer Science, 238, 
Springer. 
Pereira, Fernando C.N. and Warren, David 
H.D. 1980. "Definite clause grammars for language 
analysis." Artificial Intelligence, 13, pp. 231-278. 
Sager, Naomi. 1981. Natural Language Infor- 
mation Processing. Addison-Wesley. 
Shieber, Stuart M. 1988. "A uniform architec- 
ture for parsing and generation." Proceedings of the 
12th COLING, Budapest, Hungary (1988), pp. 614- 
619. 
Shieber, Stuart M., van Noord, Gertjan, Moore, 
Robert C. and Pereira, Fernando C.N. 1989. "A 
Semantic-Head-Driven Generation Algorithm for 
Unification-Based Formalisms." Proceedings of the 
27th Meeting of the ACL, Vancouver, B.C., pp. 7-17. 
Shoham, Yoav and McDermott, Drew V. 1984. 
"Directed Relations and Inversion of PROLOG Programs." 
Proc. of the Int. Conference of Fifth Generation 
Computer Systems. 
Strzalkowski, Tomek. 1989. Automated Inver- 
sion of a Unification Parser into a Unification Gen- 
erator. Technical Report 465, Department of Com- 
puter Science, Courant Institute of Mathematical Sci- 
ences, New York University. 
Strzalkowski, Tomek. 1990. "An algorithm 
for inverting a unification grammar into an efficient 
unification generator." Applied Mathematics Letters, 
vol. 3, no. 1, pp. 93-96. Pergamon Press. 
Wedekind, Jürgen. 1988. "Generation as 
structure driven derivation." Proceedings of the 12th 
COLING, Budapest, Hungary, pp. 732-737. 
