A Symmetrical Approach to Parsing and Generation 
Marc Dymetman, Pierre Isabelle and François Perrault
CCRIT, Communications Canada. 1575 Bld Chomedey. Laval (Québec) H7V 2X2 CANADA
Abstract. Lexical Grammars are a class of unification grammars which share a fixed rule component,
for which there exists a simple left-recursion elimination transformation. The parsing and generation
programs are seen as two dual non-left-recursive versions of the original grammar, and are implemented
through a standard top-down Prolog interpreter. Formal criteria for termination are given as conditions
on lexical entries: during parsing as well as during generation the processing of a lexical entry consumes
some amount of a guide; the guide used for parsing is a list of words remaining to be analyzed, while the
guide for generation is a list of the semantics of constituents waiting to be generated.
1. Introduction
Symmetry between parsing and generation. There is a natural appeal to the
attempt to characterize parsing and generation in a symmetrical way. This is
because the statement of the problem of reversibility is naturally symmetrical:
parsing is concerned with recovering semantic content from phonological
content, generation with recovering phonological content from semantic content.
It has been noted by several researchers ([S88], [N89], [SNMP89]) that certain
problems (left-recursion) and techniques (left-corner processing, linking,
Earley deduction) encountered in the parsing domain have correlates in the
generation domain. It is then natural to try and see parsing and generation as
instances of a single paradigm; [S88] and [DI88, DI90] are attempts in this
direction, but are hindered by the fact that there is no obvious correlate in
generation of the string indexing techniques so prominent in parsing (string
indices in chart parsing, differential lists in DCG parsing).
Guides. What we propose here is to take a step back and abstract the notion of
string index to that of a guide. This general notion will apply to both parsing
and generation, but it will be instantiated differently in the two modes. The
purpose of a guide is to orient the proof procedure, specific to either parsing
or generation, in such a way that: (i) the guide is initialized as a direct
function of the input (the string in parsing, the semantics in generation),
(ii) the current state of the guide strongly constrains the next access to the
lexicon, (iii) after lexical access, the size of the guide strictly decreases
(guide-consumption condition, see section 3). Once a guide is specified, the
generation problem (respectively the parsing problem ¹) then reduces to a
problem formally similar to the problem of parsing with a DCG [PW80] containing
no empty productions ² (i.e. rules whose right-hand side is the empty string []).
Several parsing techniques can be applied to this problem; we will be concerned
here with a top-down parsing approach directly implementable through a standard
Prolog interpreter. This approach relies on a left-recursion-elimination
transformation for a certain class of definite clause programs (see section 3).
The ability to specify guides, for parsing or for generation, depends on
certain compositionality hypotheses which the underlying grammar has to
satisfy.
1 This half of the statement may seem tautological, but it is not: see the attempt
at a reinterpretation of left extraposition in terms of guides in section 5.
2 Also called null rules [H78].
Hypotheses on compositionality. The parsing and generation problems can be
rendered tractable only if certain hypotheses are made concerning the
composition of linguistic structures. Thus generation can be arduous if the
semantics associated with the composition of two structures is the unrestricted
lambda-application ³ of the first structure's semantics on the second
structure's semantics: this is because knowledge of the mother's semantics does
not constrain in a usable way the semantics of the daughters. ⁴ On the
contrary, parsing is greatly simplified if the string associated with the
composition of two structures is the concatenation of the strings associated
with each structure: one can then use string indexing to orient and control the
progression of the parsing process, as is done in DCG under the guise of
"differential lists".
Lexical Grammar. The formalism of Lexical Grammar (LG) makes explicit certain
compositionality hypotheses which ensure the existence of guides for parsing as
well as for generation.
A Lexical Grammar has two parts: a (variable) lexicon and a (fixed) rule
component. The rule component, a definite clause specification, spells out
basic linguistic compositionality rules: (i) how a well-formed linguistic
structure A is composed from well-formed structures B and C; (ii) what are the
respective statuses of B and C (left constituent vs right constituent,
syntactic head vs syntactic dependent, semantic head vs semantic dependent);
and (iii) how the string (resp. semantics, subcategorization list, ...)
associated with A is related to the strings (resp. semantics, subcategorization
lists, ...) associated with B and C (see section 2).
The ability to define a guide for parsing is a (simple) consequence of the fact
that the string associated with A is the concatenation of the strings
associated with B and C ⁵. The ability to define a guide for generation is a
(less simple) consequence of LG's hypotheses on subcategorization (see sections
2 and 4).
3 By unrestricted lambda-application, we mean functional application
followed by rewriting to a normal form.
4 In theories favoring such an approach (such as GPSG [GKPS87]), parsing
may be computationally tractable, but generation does not seem to be. These
theories can be questioned as plausible computational models, for they should
be judged on their ability to account for production behavior (generation) as
well as for understanding behavior (parsing).
5 A fairly standard assumption. If empty string realizations are allowed, then
extraposition can still be handled, as sketched in section 5.
[Figure: (P0) Lexical Grammar rules
   -> by conservative addition of a parsing guide:    (P1p) guided parsing program (left-recursive)
   -> by conservative addition of a generation guide: (P1g) guided generation program (left-recursive)
 (P1p) -> by left-recursion elimination: (P2p) guided parsing program (non-left-recursive)
 (P1g) -> by left-recursion elimination: (P2g) guided generation program (non-left-recursive)]
Fig. 1. A symmetrical approach to parsing and generation: paper overview
Parsing and Generation with Lexical Grammar. Fig. 1 gives an overview of our
approach to parsing and generation. Let us briefly review the main points:
-- (P0) is a definite clause specification of the original LG rules. It
   contains a purely declarative definition of linguistic compositionality,
   but is unsuitable for direct implementation (see section 2).
-- (P1p) (resp. (P1g)) is a guided conservative extension of (P0) for parsing
   (resp. for generation); that is, (P1p) (resp. (P1g)) is a specification
   which describes the same linguistic structures as (P0), but adds a certain
   redundancy (guiding) to help constrain the parsing (resp. generation)
   process. However, these definite clause programs are not yet adequate for
   direct top-down implementation, since they are left-recursive (see
   section 3).
-- (P1p) and (P1g) can be seen as symmetrical instantiations of a common
   program schema (P1); (P1) can be transformed into (P2), an equivalent
   non-left-recursive program schema (see section 3).
-- (P2p) (resp. (P2g)) is the non-left-recursive version of (P1p) (resp.
   (P1g)). Under the guide-consumption condition, it is guaranteed to
   terminate in top-down interpretation, and to enumerate all solutions to the
   parsing (resp. generation) problem (see section 4).
For lack of space, theorems are stated here without proofs; these, and more
details, can be found in [D90b].
2. Lexical Grammar 
Rule component. The fixed rule component of LG (see Fig. 3) describes in a
generic way the combination of constituents. A constituent A is either
lexically specified (second clause in the phrase definition), or is a
combination of two constituents B and C (first clause in the phrase
definition). B and C play complementary roles along the following three
dimensions:
-- combine_strings: B is to the left of C in the surface order, or conversely
   to the right of C. This information is attached to each constituent
   through the string_order feature.
-- combine_syns: B is the syntactic-head and C the syntactic-dependent, or
   conversely (syn_order feature).
-- combine_sems: B is the semantic-head and C the semantic-dependent, or
   conversely (sem_order feature).
Because B and C play symmetrical roles ⁶, these seemingly eight combinations
actually reduce to four different cases. To avoid duplicating cases, in the
definition of the phrase predicate, the symmetry has been "broken" by
arbitrarily imposing that B be the left constituent. ⁷
Fig. 2 gives an example of a derivation tree in LG, 
using the lexicon of Fig. 4. 
A
  B          mary
  C *
    D *      often
    E
      F *    visited
      G      notre dame

A.subcat = []        A.sem = C.sem = D.sem = often(visit(mary,nd))
B.subcat = []        E.sem = F.sem = visit(mary,nd)
C.subcat = [B]       B.sem = mary
D.subcat = [E]       G.sem = nd
E.subcat = [B]
F.subcat = [G,B]
G.subcat = []
Fig. 2. A derivation in LG
(semantic-heads, drawn with heavy lines in the original figure, are marked *)
Our notion of semantic-head is a variant of that given in [SNMP89], where a
daughter is said to be a semantic-head if it shares the semantics of its
mother. The combine_sems predicate is responsible for assigning sem_head status
(versus sem_dep status) to a phrase, and for imposing the following
constraints:
i. the semantic-head shares its semantics with its mother,
ii. the semantic-head always subcategorizes its sister ((b) in Fig. 3),
iii. the mother's subcategorization list is the concatenation of the
   semantic-dependent list and of the semantic-head list minus the element
   just incorporated ((c) in Fig. 3). ⁸
The subcategorization list attached to a constituent X corresponds to
constituents higher in the derivation tree which are expected to fill semantic
roles inside X. Subcategorization lists are percolated from the lexical entries
up the derivation tree according to iii.
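Constraints ii and iii can be illustrated with a minimal Python sketch (an illustration only, not part of the LG rule component). It tracks subcategorization lists alone and ignores strings, categories and the unification of semantics; the names Constituent and combine_sems_sketch are hypothetical, and "subcategorizes its sister" is approximated by object identity.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity-based equality: a subcat slot names one specific constituent
class Constituent:
    name: str
    subcat: list = field(default_factory=list)

def combine_sems_sketch(head, dep, name):
    """Build the mother of a semantic-head and its semantic-dependent."""
    # (ii) the semantic-head must subcategorize its sister (first element of its list)
    if not head.subcat or head.subcat[0] is not dep:
        raise ValueError(f"{head.name} does not subcategorize {dep.name}")
    # (iii) dependent's list, followed by the head's list minus the incorporated element
    return Constituent(name, dep.subcat + head.subcat[1:])

# Re-creating the subcat values of Fig. 2, bottom-up.
B = Constituent("B")                    # mary
G = Constituent("G")                    # notre dame
F = Constituent("F", [G, B])            # visited
E = combine_sems_sketch(F, G, "E")      # E.subcat = [B]
D = Constituent("D", [E])               # often
C = combine_sems_sketch(D, E, "C")      # C.subcat = [B]
A = combine_sems_sketch(C, B, "A")      # A.subcat = []
```

In the actual grammar the element of D's list is a partially specified structure that unifies with E; the sketch sidesteps this by constructing E first.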
6 Remark: the rules are not DCG rules, but simply definite (or Horn) clauses.
7 If line (a) in the definition of phrase were omitted, the same set of linguistic
structures would result, but some structures would be described twice. Line
(a) is simply one means of eliminating these spurious ambiguities. The same
effect would be produced by replacing (a) by B.sem_order = sem_head or by
B.syn_order = syn_head.
8 In fact, because of the constraints imposed by combine_syns (see discussion
below) one of these two lists has to be empty.
phrase(A) :- phrase(B), phrase(C),
    B.string_order = left,                              % (a)
    combine(B,C,A).
phrase(A) :- term(A).

combine(B,C,A) :-
    (combine_strings(B,C,A) ; combine_strings(C,B,A)),
    (combine_syns(B,C,A) ; combine_syns(C,B,A)),
    (combine_sems(B,C,A) ; combine_sems(C,B,A)).

combine_strings(B,C,A) :-
    B.string_order = left, C.string_order = right,
    append(B.string,C.string,A.string).

combine_sems(B,C,A) :-
    B.sem_order = sem_head, C.sem_order = sem_dep,
    A.sem = B.sem,
    B.subcat = [C|Rest],                                % (b)
    append(C.subcat,Rest,A.subcat).                     % (c)

combine_syns(B,C,A) :-
    B.syn_order = syn_head, C.syn_order = syn_dep,
    A.cat = B.cat,
    ( B.sem_order = sem_head, C.subcat = []             % complement
    ; C.sem_order = sem_head, C.subcat = [_] ).         % modifier

Fig. 3. The rules of Lexical Grammar ⁹
Semantic-heads need not correspond to syntactic-heads. In the case of a
modifier like often, in paris, or hidden by john, the modifier phrase, which is
the syntactic-dependent, is the semantic-head and semantically subcategorizes
its sister: thus, in the example of Fig. 2, the modifier phrase D semantically
subcategorizes its sister E; combine_sems has then the effect of unifying the
semantics of E (visit(mary,nd)) to the substructure X in the semantics
(often(X)) attached to D (see the lexical entry for often in Fig. 4). This is
reminiscent of work done in categorial grammar (see for instance [ZKC87]),
where a modifier is seen as having a category of the form A/A, and acts as a
functor on the group it modifies.
The combine_syns predicate is responsible for assigning syn_head status (versus
syn_dep status) to a phrase, and for ensuring the following constraints:
i. The category cat of the syntactic-head is transmitted to the mother. The
   category of a phrase is therefore always a projection of the category
   (n, v, p, ...) of some lexical item.
ii. When the syntactic-dependent is the same as the semantic-dependent, then
   the syntactic-dependent is semantically saturated (its subcat is empty).
   This is the case when the syntactic-dependent plays the syntactic role of a
   complement to its syntactic-head.
iii. When the syntactic-dependent is the same as the semantic-head, then the
   syntactic-dependent's subcat contains only one element ¹⁰. This is the case
   when the syntactic-dependent plays the syntactic role of a modifier to its
   syntactic-head.
The lexicon in LG. Because LGs have a
fixed rule component, all specific linguistic knowledge
9 Here, as in the sequel, we have made use of a "dot notation" for functional
access to the different features of a linguistic structure A: for instance, A.cat
represents the content of the cat feature in A.
10 The "external argument" of the modifier, identified with the semantic-
dependent by the semantic combination rule.
term(T) :- T.sem = mary,
    T.string = [mary],
    T.cat = n, T.subcat = [].
term(T) :- T.sem = notredame,
    T.string = [notre,dame],
    T.cat = n, T.subcat = [].
term(T) :- T.sem = paris,
    T.string = [paris],
    T.cat = n, T.subcat = [].
term(T) :- T.sem = die(S.sem),
    T.string = [died],
    T.cat = v, T.subcat = [S],
    S.string_order = left,
    S.cat = n, S.syn_order = syn_dep.
term(T) :- T.sem = visit(S.sem,O.sem),
    T.string = [visited],
    T.cat = v, T.subcat = [O,S],
    S.string_order = left, S.cat = n,
    S.syn_order = syn_dep,
    O.string_order = right, O.cat = n,
    O.syn_order = syn_dep.
term(T) :- T.sem = in(S.sem,O.sem),
    T.string = [in],
    T.cat = p, T.subcat = [O,S],
    S.string_order = left, S.cat = v,
    S.syn_order = syn_head,
    O.string_order = right, O.cat = n,
    O.syn_order = syn_dep.
term(T) :- T.sem = often(S.sem),
    T.string = [often],
    T.cat = adv, T.subcat = [S],
    S.string_order = _,            % may be left or right
    S.cat = v, S.syn_order = syn_head.
Fig. 4. Lexical entries in LG ¹¹
is contained in the lexicon. Fig. 4 lists a few possible lexical entries.
Consider a typical entry, for instance the entry for in. This entry specifies a
possible leaf T of a derivation tree. T has the following properties:
i. T has string [in], and is of category p (preposition).
ii. T semantically subcategorizes two phrases: O (the object of the
   preposition), of category n, and S (the "implicit subject" of the
   preposition), of category v. By the general constraints associated with
   combine_sems, this means that S and O will both have semantic-dependent
   status.
iii. In the surface order, S is to the left of its semantic-head, while O is
   to the right of its semantic-head.
iv. The semantics in(S.sem,O.sem) of T is obtained by unification from the
   semantics of its subcategorized constituents S and O.
v. S is constrained to having syntactic-head status, and O to having
   syntactic-dependent status. Because of the constraints imposed by
   combine_syns, this means that O will be a syntactic complement of the
   preposition, and that the prepositional phrase will be a modifier of its
   "subject" S.
Idioms. The lexical apparatus allows for a direct 
account of certain types of idiomatic constructions. For 
instance, if the lexical entries of Fig. 5 are added to the 
11 For ease of exposition, the contribution of tense to the semantics of
verbs is ignored here.
lexicon, then the expression "X kicked the bucket" will
be assigned the semantics die(X). Entry (a) expresses
the fact that (in its idiomatic use), the verb form kicked
subcategorizes for a subject S and an object O whose
semantics is the_bucket, and is itself assigned the
semantics die(S.sem).
term(T) :- T.sem = die(S.sem),                          % (a)
    T.string = [kicked],
    T.cat = v, T.subcat = [O,S],
    S.string_order = left, S.cat = n,
    S.syn_order = syn_dep,
    O.string_order = right, O.cat = n,
    O.syn_order = syn_dep,
    O.sem = the_bucket.
term(T) :- T.sem = the_bucket,                          % (b)
    T.string = [the,bucket],
    T.cat = n, T.subcat = [].
Fig. 5. Idioms in LG
3. Guides and left-recursion elimination
Guides. Consider a finite string l1, and let l2 be a proper suffix of l1, l3 be
a proper suffix of l2, and so on. This operation can only be iterated a finite
number of times. The notion of guide-structure generalizes this situation.
DEFINITION 3.1. A guide-structure is a partially ordered set G which respects
the descending chain condition, i.e. the condition that in G all strictly
decreasing ordered chains l1 > l2 > ... > li > ... are finite.
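The motivating case of proper suffixes can be checked with a small Python sketch (illustrative only; the helper name is_strictly_smaller and the sample word list are assumptions, not taken from the definition).

```python
def is_strictly_smaller(l1, l2):
    """True iff l1 is a proper suffix of l2 (the strict order on word lists)."""
    return len(l1) < len(l2) and l2[len(l2) - len(l1):] == l1

words = ["mary", "often", "visited", "notre", "dame"]

# One strictly decreasing chain: drop one word from the front at each step.
chain = [words[i:] for i in range(len(words) + 1)]
is_decreasing = all(is_strictly_smaller(chain[i + 1], chain[i])
                    for i in range(len(chain) - 1))

# Each step removes at least one word, so no chain starting at `words`
# can have more than len(words) + 1 elements.
max_chain_length = len(words) + 1
```

Here the chain ends at the empty list, which has no proper suffix, so the descending chain condition holds for this order.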
Consider now the following elementary definite clause program (P0) ¹²:
a(A) :- a(B), 𝒟(B,A).                                   (P0)
a(A) :- t(A).
We assume here that 𝒟 is an abbreviation which stands for a disjunction
(C1 ; ... ; Ck) of conjunctions Ci of goals of the form a(A), t(A), or {T=S}
(unification goals) where the T, S are variables or partially instantiated
terms. Among the variables appearing inside 𝒟, only the "interface" variables
A, B are explicitly mentioned. We further assume that the defining clauses (not
shown) for the t predicate have right-hand sides which are conjunctions of term
unification goals {T=S}. We call t the lexicon predicate, and a the generic
nonterminal predicate.
Consider now the following program (P1), called a guided extension of (P0):
a'(A,Lin,Lout) :- a'(B,Lin,Linter),                     (P1)
                  𝒟'(B,A,Linter,Lout).
a'(A,Lin,Lout) :- t'(A,Lin,Lout).
(P1) is obtained from (P0) in the following way: (i) guide variables (Lin,
Linter, Lout) have been threaded throughout (P0), and (ii) the 1-predicate t
has been replaced by a 3-predicate t' which is assumed to be a refinement of t,
i.e., for all A, Lin, Lout, t'(A,Lin,Lout) implies t(A).
Program (P1) is a more constrained version of
program (P0): t' can be seen as a version of t which is
able to "consult" Lin, thus constraining lexical access at
each step. We will be interested in programs (P1) which
respect two conditions: (i) the guide-consumption
12 Only programs of the (P0) form are discussed here, but the subsequent
discussion of guides generalizes easily to arbitrary definite clause programs.
condition, and (ii) the conservative extension
condition.
DEFINITION 3.2. Program (P1) is said to satisfy the guide-consumption
condition iff (i) the guide variables take their values in some
guide-structure G, and (ii) any call to t'(A,Lin,Lout) with Lin fully
instantiated returns with Lout fully instantiated and strictly smaller in G.
DEFINITION 3.3. Program (P1) is said to be a conservative extension of (P0)
iff: a(A) is provable in (P0) ⇔ there exist Lin, Lout such that a'(A,Lin,Lout)
is provable in (P1).
The ⇐ part of the previous definition is automatically satisfied by any program
(P1) defined as above, since t' is a refinement of t. The ⇒ part, on the other
hand, is not, but depends on further conditions on the refinement t' of t.
Saying that (P1) is a conservative extension of (P0) is tantamount to saying
that (P1) adds some redundancy to (P0), which can be computationally exploited
to constrain processing.
Left-recursion elimination ¹³. Program (P1) is left-recursive: in a top-down
interpretation, a call to a' will result in another immediate call to a', and
therefore will loop. On the other hand the following program (P2) is not
left-recursive, and Theorem 3.4 shows that it is equivalent to (P1):
a'(An,Lin,Ln) :- t'(A0,Lin,L0), aux(A0,An,L0,Ln).       (P2)
aux(An,An,Ln,Ln).
aux(Ai,An,Li,Ln) :- 𝒟'(Ai,Ai+1,Li,Li+1),
                    aux(Ai+1,An,Li+1,Ln).
Here, 𝒟' and t' are the same as in (P1), and a new predicate aux, called the
auxiliary nonterminal predicate, has been introduced. ¹⁴
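The shape of (P2) can be imitated in Python with generators standing in for Prolog backtracking. This is a sketch under strong simplifying assumptions, not the paper's program: the toy left-recursive grammar a -> a '-' t | t over integer tokens, the guide being the list of remaining tokens, and the names t_prime, d_prime, aux and a_prime are all hypothetical.

```python
def t_prime(l_in):
    """Lexicon t': consume one integer token, yield (value, remaining guide)."""
    if l_in and isinstance(l_in[0], int):
        yield l_in[0], l_in[1:]

def d_prime(b, l_inter):
    """D'(B, A, Linter, Lout): consume '-' and one more term, yield (A, Lout)."""
    if l_inter and l_inter[0] == "-":
        for c, l_out in t_prime(l_inter[1:]):
            yield b - c, l_out

def aux(a_i, l_i):
    """aux(Ai, An, Li, Ln): either stop here, or apply D' once and continue."""
    yield a_i, l_i                        # aux(An, An, Ln, Ln).
    for a_next, l_next in d_prime(a_i, l_i):
        yield from aux(a_next, l_next)    # aux(Ai,An,Li,Ln) :- D'(...), aux(...).

def a_prime(l_in):
    """a'(An, Lin, Ln) :- t'(A0, Lin, L0), aux(A0, An, L0, Ln)."""
    for a0, l0 in t_prime(l_in):
        yield from aux(a0, l0)

# All (value, remaining guide) pairs for the token list 7 - 2 - 1.
solutions = list(a_prime([7, "-", 2, "-", 1]))
```

The last solution consumes the whole guide and groups to the left, (7 - 2) - 1, as the left-recursive grammar would, while a_prime never calls itself as its first action.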
THEOREM 3.4. Programs (P1) and (P2) are equivalent in predicate a'. ¹⁵
The fact that (P2) is not left-recursive does not alone guarantee termination
of top-down interpretation. However, if (P1) respects the guide-consumption
condition and a further condition, the no-chain condition, then (P2) does
indeed terminate. ¹⁶
DEFINITION 3.5. Program (P1) is said to respect the no-chain condition iff
each goal conjunction Ci' appearing in 𝒟' contains at least one call to a' or
to t'.
THEOREM 3.6. Suppose (P1) satisfies both the guide-consumption condition and
the no-chain condition. Then relative to top-down, depth-first interpretation
of (P2), the query a'(A,L0,Ln), with L0 completely instantiated, has a finite
SLD search tree ¹⁷ associated with it (in other words, all its solutions will
be enumerated through backtracking, and the program will terminate).
4. Parsing and generation in Lexical 
Grammar 
The rules of Fig. 3 are completely symmetrical in 
their specification of syntactic compositionality, 
13 The general problem of left-recursion elimination in DCGs (including
chain rules and null rules [H78]) is studied in [D90a]; the existence of a
Generalized Greibach Normal Form is proven, and certain decidability results
are given.
14 The (P1) → (P2) transformation is closely related to left-corner parsing
[MTHMY83], which can in fact be recovered from this transformation
through a certain encoding procedure (see [D90b]).
15 That is: a'(A,Lin,Lout) is a consequence of (P1) iff a'(A,Lin,Lout) is a
consequence of (P2).
16 In the context of CFGs, the no-chain condition would correspond to a
grammar without chain rules, and the guide-consumption condition to a
grammar without null rules.
17 See [L87] for a definition of SLD search tree.
(A,L0,L4)
  (B,L0,L1)          mary
  (C,L1,L4)
    (D,L1,L2)        often
    (E,L2,L4)
      (F,L2,L3)      visited
      (G,L3,L4)      notre dame

L0 = [mary,often,visited,notre,dame]
L1 = [often,visited,notre,dame]
L2 = [visited,notre,dame]
L3 = [notre,dame]
L4 = []
Fig. 6. A guide for parsing
"string" compositionality and semantic
compositionality ¹⁸. The symmetry between string
compositionality and semantic compositionality will
allow us to treat parsing and generation as dual aspects
of the same algorithm.
Orienting the rules. The phrase predicate can be rewritten in either one of
the two forms: phrase_p, where emphasis is put on the relative linear order of
constituents (left vs. right), and phrase_g, where emphasis is put on the
relative semantic status (semantic head vs. semantic dependent) of
constituents.
phrase_p(A) :- phrase_p(B), P(B,A).                     (P0p)
phrase_p(A) :- term(A).
where P(B,A) stands for:
P(B,A) ≡ phrase_p(C),
         B.string_order = left,
         combine(B,C,A).
and
phrase_g(A) :- phrase_g(B), G(B,A).                     (P0g)
phrase_g(A) :- term(A).
where G(B,A) stands for:
G(B,A) ≡ phrase_g(C),
         B.sem_order = sem_head,
         combine(B,C,A).
LEMMA 4.1. phrase_p and phrase_g are both equivalent to phrase.
The phrase_p (resp. phrase_g) programs are now each in the format of the (P0)
program of section 3, where a has been renamed phrase_p (resp. phrase_g), and
𝒟 has been renamed P (resp. G).
These programs can be extended into guided programs (P1p) and (P1g), as was
done in section 3:
phrase_p'(A,Lin,Lout) :-                                (P1p)
    phrase_p'(B,Lin,Linter), P'(B,A,Linter,Lout).
phrase_p'(A,Lin,Lout) :- term_p'(A,Lin,Lout).
where:
P'(B,A,Linter,Lout) ≡ phrase_p'(C,Linter,Lout),         (Dp)
                      B.string_order = left,
                      combine(B,C,A).
and
phrase_g'(A,Lin,Lout) :-                                (P1g)
    phrase_g'(B,Lin,Linter), G'(B,A,Linter,Lout).
phrase_g'(A,Lin,Lout) :- term_g'(A,Lin,Lout).
where:
G'(B,A,Linter,Lout) ≡ phrase_g'(C,Linter,Lout),         (Dg)
                      B.sem_order = sem_head,
                      combine(B,C,A).
In these programs, term_p' and term_g' are the refinements of term
(corresponding to t' in program (P1) of section 3) used for parsing and
generation respectively. Their definitions, which contain the substance of the
guiding technique, are given below.
N.B. Programs (P1p) and (P1g) respect the no-chain condition: phrase_p' is
called inside P', and phrase_g' is called inside G'.
A conservative guide for parsing. Let us define term_p' in the following way:
term_p'(A,Lin,Lout) :- term(A),                         (Gp)
                       append(A.string,Lout,Lin).
It is obvious that term_p' is a refinement of term. Using the definition of
combine_strings in section 2, one can easily show that program (P1p) is a
conservative extension of program (P0p).
The guide-structure Gp is the set of character strings, ordered in the
following way: st1 ≤ st2 iff st1 is a suffix of st2. If the lexicon is such
that for any entry term(A), A.string is instantiated and is different from the
empty list, then it can easily be shown that (P1p) respects the
guide-consumption condition.
The guide just introduced for parsing is simply a 
restatement in terms of guides of the usual differential 
lists used in the Prolog translation of DCG rules. 
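A rough Python analogue of term_p' may make the consumption of the parsing guide concrete. It is only a sketch: the dictionary-based LEXICON and the function name term_p are assumptions, and the loop below merely replays the successive lexical accesses of Fig. 6 rather than running the parser.

```python
# Hypothetical, stripped-down lexicon: only the string feature is kept.
LEXICON = [
    {"string": ["mary"]},
    {"string": ["notre", "dame"]},
    {"string": ["visited"]},
    {"string": ["often"]},
]

def term_p(l_in):
    """Yield (entry, l_out) for each entry whose string is a prefix of l_in."""
    for entry in LEXICON:
        s = entry["string"]
        # Analogue of append(A.string, Lout, Lin), with A.string non-empty.
        if s and l_in[:len(s)] == s:
            yield entry, l_in[len(s):]

# Replay the lexical accesses on the sentence of Fig. 6.
guide = ["mary", "often", "visited", "notre", "dame"]
trace = [guide]
while guide:
    entry, guide = next(term_p(guide))
    trace.append(guide)
```

Under these assumptions trace holds the five guide values L0 to L4 of Fig. 6, each a proper suffix of the previous one.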
A conservative guide for generation. Let us define term_g' in the following
way (using the auxiliary predicate extract_sems):
term_g'(A,Lin,Lout) :- term(A),                         (Gg)
    Lin = [A.sem|Linter],
    extract_sems(A.subcat,SubcatSems),
    append(SubcatSems,Linter,Lout).
extract_sems([],[]).
extract_sems([X|Rest],[X.sem|RestSems]) :-
    extract_sems(Rest,RestSems).
The guide structure L used for generation is a list of semantic structures,
initially instantiated to [S.sem], where S is the linguistic structure to be
generated, of which the semantics S.sem is known. When a call
term_g'(A,Lin,Lout) to the lexicon is made, with Lin instantiated to a list of
semantic structures, the lexical structure A selected is constrained to be such
that its semantics A.sem is the first item on the Lin list. The A.sem element
is "popped" from the guide, and is replaced by the list of the semantics of the
phrases subcategorized by A. (Fig. 7 illustrates the evolution of the guide in
generation.)
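The pop-and-replace behaviour of term_g' can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the paper's definition: semantic terms are encoded as nested tuples, uppercase strings play the role of Prolog variables, full unification is replaced by one-way matching against a ground term, and each subcat list is reduced to the variables naming the semantics of the subcategorized phrases. All names (match, term_g, LEXICON) are hypothetical.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, env):
    """One-way match of a lexical pattern against a ground term; bindings or None."""
    if is_var(pattern):
        if pattern in env and env[pattern] != term:
            return None
        return {**env, pattern: term}
    if isinstance(pattern, tuple):
        if (not isinstance(term, tuple) or len(term) != len(pattern)
                or term[0] != pattern[0]):
            return None
        for p, t in zip(pattern[1:], term[1:]):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None

LEXICON = [
    {"string": ["mary"], "sem": "mary", "subcat": []},
    {"string": ["notre", "dame"], "sem": "nd", "subcat": []},
    {"string": ["visited"], "sem": ("visit", "S", "O"), "subcat": ["O", "S"]},
    {"string": ["often"], "sem": ("often", "S"), "subcat": ["S"]},
]

def term_g(l_in):
    """Yield (entry, l_out): pop the first semantics, push the subcat semantics."""
    if not l_in:
        return
    head, l_inter = l_in[0], l_in[1:]
    for entry in LEXICON:
        env = match(entry["sem"], head, {})
        if env is not None:
            yield entry, [env[v] for v in entry["subcat"]] + l_inter

# Replay the lexical accesses of Fig. 7.
guide = [("often", ("visit", "mary", "nd"))]
trace, words = [guide], []
while guide:
    entry, guide = next(term_g(guide))
    words.append(entry["string"])
    trace.append(guide)
```

With this encoding the trace reproduces L0 to L4 of Fig. 7, and the lexical entries are accessed in the order often, visited, notre dame, mary; the surface order of the words is decided separately by combine_strings.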
18 This symmetry should not be obscured by the fact that, in order to avoid
duplicating clauses with the same logical content, the presentation of the rules
appears otherwise (see above the discussion of "broken symmetry").
(A,L0,L4)
  (C,L0,L3)
    (D,L0,L1)        often
    (E,L1,L3)
      (F,L1,L2)      visited
      (G,L2,L3)      notre dame
  (B,L3,L4)          mary

L0 = [often(visit(mary,nd))]
L1 = [visit(mary,nd)]
L2 = [nd,mary]
L3 = [mary]
L4 = []
Fig. 7. A guide for generation
It is obvious that term_g' is then a refinement of term, and furthermore,
using the definition of combine_sems in section 2, one can prove:
LEMMA 4.2. Program (P1g) is a conservative extension of program (P0g).
The guide-consumption condition in generation. Let us define recursively the
size of an LG semantic representation as the function from terms to natural
numbers such that:
size[atom] = 1
size[atom(T1,...,Tn)] = 1 + size[T1] + ... + size[Tn]
Assume now that, for any entry term(A), the lexicon respects the following
condition:
If A.sem is fully instantiated, then the A.subcat list is instantiated
sufficiently so that, for any element X of this list, (i) X.sem is fully
instantiated, and (ii) X.sem has a strictly smaller size than A.sem.
Under these conditions, one can define a guide-structure Gg (see [D90b]), and
one can prove:
LEMMA 4.3. Program (P1g) satisfies the guide-consumption condition.
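The size function and the condition on lexical entries can be written down directly in Python. This is a sketch only: terms are encoded as nested tuples (an assumption), the helper name respects_size_condition is hypothetical, and the guide-structure Gg itself, which is defined in [D90b], is not reproduced here.

```python
def size(term):
    """size[atom] = 1; size[atom(T1,...,Tn)] = 1 + size[T1] + ... + size[Tn]."""
    if isinstance(term, tuple):
        return 1 + sum(size(arg) for arg in term[1:])
    return 1

def respects_size_condition(sem, subcat_sems):
    """True iff every subcategorized semantics is strictly smaller than sem."""
    return all(size(x) < size(sem) for x in subcat_sems)

visit = ("visit", "mary", "nd")
often = ("often", visit)

# The guide values of Fig. 7, and the summed size of each.
fig7_guides = [[often], [visit], ["nd", "mary"], ["mary"], []]
totals = [sum(size(s) for s in guide) for guide in fig7_guides]
```

In this particular example the summed size of the guide drops by one at each lexical access; the sketch does not claim that this simple sum is the order used to define Gg in general.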
The resulting programs for parsing and generation. After the left-recursion
elimination transformation of section 3 is performed, the parsing and
generation programs take the following forms:
phrase_p'(An,Lin,Ln) :- term_p'(A0,Lin,L0),
                        aux_p(A0,An,L0,Ln).
aux_p(An,An,Ln,Ln).
aux_p(Ai,An,Li,Ln) :- P'(Ai,Ai+1,Li,Li+1),
                      aux_p(Ai+1,An,Li+1,Ln).
phrase_g'(An,Lin,Ln) :- term_g'(A0,Lin,L0),
                        aux_g(A0,An,L0,Ln).
aux_g(An,An,Ln,Ln).
aux_g(Ai,An,Li,Ln) :- G'(Ai,Ai+1,Li,Li+1),
                      aux_g(Ai+1,An,Li+1,Ln).
That is, after making term_p', term_g', P' and G' explicit (see (Gp), (Gg),
(Dp), (Dg), above), these programs take the forms (P2p) and (P2g) in Fig. 8;
for convenience, interface predicates parse and generate are provided.

parse(S.string,S.sem) :-
    S.cat = v, S.subcat = [],              % S is a sentence
    phrase_p'(S,S.string,[]).
phrase_p'(An,Lin,Ln) :- term(A0),
    append(A0.string,L0,Lin),
    aux_p(A0,An,L0,Ln).
aux_p(An,An,Ln,Ln).
aux_p(Ai,An,Li,Ln) :- phrase_p'(C,Li,Li+1),
    Ai.string_order = left,
    combine(Ai,C,Ai+1),
    aux_p(Ai+1,An,Li+1,Ln).
(P2p)

generate(S.string,S.sem) :-
    S.cat = v, S.subcat = [],              % S is a sentence
    phrase_g'(S,[S.sem],[]).
phrase_g'(An,Lin,Ln) :- term(A0),
    Lin = [A0.sem|Linter],
    extract_sems(A0.subcat,SubcatSems),
    append(SubcatSems,Linter,L0),
    aux_g(A0,An,L0,Ln).
aux_g(An,An,Ln,Ln).
aux_g(Ai,An,Li,Ln) :- phrase_g'(C,Li,Li+1),
    Ai.sem_order = sem_head,
    combine(Ai,C,Ai+1),
    aux_g(Ai+1,An,Li+1,Ln).
extract_sems([],[]).
extract_sems([X|Rest],[X.sem|RestSems]) :-
    extract_sems(Rest,RestSems).
(P2g)
Fig. 8. The final parsing and generation programs parse and generate
Under the conditions on the lexicon given above (which are satisfied by the
lexicon of Fig. 4), programs (P1p) and (P1g) both respect the
guide-consumption condition; they also respect the no-chain condition (see
remark following the description of (P1p) and (P1g)); Theorem 3.6 applies, and
we have the following result:
If parse(A.string,A.sem) (resp. generate(A.string,A.sem)) is called with
A.string instantiated (resp. A.sem instantiated), then all solutions will be
enumerated on backtracking, and the query will terminate.
5. Further research 
Handling extraposition with guides. The
specific guides defined above for parsing and generation
are not the only possible ones. If for some reason
certain conditions on the lexicon are to be relaxed,
then more sophisticated guides must and can be defined.
Thus, the guide introduced above for parsing
essentially assumes that no lexical entry has an empty
string realization. This condition may be too strict for
certain purposes, such as handling traces.
Interestingly, however, the guide-consumption
condition can still be imposed in these cases, if one
takes care to suitably enrich the notion of guide.
Let us assume, for instance, that there be a general
syntactic constraint to the effect that two empty lexical
items cannot immediately follow each other ¹⁹. Let us then posit as a guide
structure, instead of a list L of words, a couple <L,B>, where B is a variable
restricted to taking values 0 or 1. Suppose further that these couples are
ordered "lexicographically", i.e. that:
∀ L, L', B, B':
L < L'           ⇒ <L,B> < <L',B'>
L = L' ∧ B < B'  ⇒ <L,B> < <L,B'>.
It is easy to see that the set of guides is then a partially ordered set which
respects the descending chain condition.
Let us finally assume that term_p' is redefined in 
the following manner: 
term p'(A,<Lin,Bin>,<Lout,Bout>) :- 
term(A), 
append( A.strin g ,Lout,Lin ) , 
( A.string = \[\], Bin =l, Bout = 0 
; A.string #\[\],Bin = ,Bout = 1 ). 
It can be shown that this definition of guide_parse is 
sufficient to ensure the guide-consumption condition, 
and therefore guarantees the termination of the parsing 
process. 
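The behaviour of the enriched guide can be tried out in a minimal sketch in standard Prolog (SWI-Prolog is assumed). It is an illustration only, not the paper's implementation: the toy lexicon, the term/2 representation of lexical entries, the guide(L,B) term standing for the couple <L,B>, and the scan/2 driver are all hypothetical names introduced here.

```prolog
:- use_module(library(lists)).   % append/3

% Hypothetical toy lexicon (not from the paper): term(Category, String).
term(det,   [the]).
term(n,     [person]).
term(trace, []).                 % an empty lexical item

% term_p(?A, +guide(Lin,Bin), -guide(Lout,Bout))
% An empty item is licensed only when Bin = 1 and resets the flag to 0;
% a non-empty item accepts any Bin and sets the flag back to 1.
term_p(A, guide(Lin, Bin), guide(Lout, Bout)) :-
    term(A, S),
    append(S, Lout, Lin),
    (   S == []
    ->  Bin = 1, Bout = 0
    ;   Bout = 1
    ).

% scan(+Guide, -Items): read lexical items until the word list is exhausted.
scan(guide([], _), []).
scan(G0, [A|As]) :-
    term_p(A, G0, G1),
    scan(G1, As).
```

A query such as `?- scan(guide([the,person],1), Items).` enumerates on backtracking the analyses with and without traces and then fails finitely, because each call to term_p/3 either shortens the word list or lowers the flag from 1 to 0. Two adjacent traces are never produced, since the second one would require Bin = 1 after the first has set the flag to 0.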
Variations on this idea are possible: for instance, 
one could define the guide as a couple <L,X> where X is 
a list of left-extraposed constituents (see \[P81\]). Any 
time a constituent is added to the extraposition list X, 
this operation is required to consume some words from 
L, and any time a trace is encountered, it is required to 
"cancel" an element of X. Because the lexicographical 
order defined on such guides in the following way: 
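$$
\begin{aligned}
&\forall L, L', X, X' \\
&L < L' \;\Rightarrow\; \langle L,X\rangle < \langle L',X'\rangle \\
&L = L' \wedge X < X' \;\Rightarrow\; \langle L,X\rangle < \langle L,X'\rangle
\end{aligned}
$$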
respects the descending chain condition, the parsing 
process will be guaranteed to terminate. 
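The two operations on this guide can be sketched in standard Prolog as follows. This is a hedged illustration under stated assumptions rather than a definitive implementation: the predicate names extrapose/4, cancel_trace/3, guide_lt/2 and proper_suffix/2 are hypothetical, guide(L,X) stands for the couple <L,X>, and both L < L' and X < X' are read as "is a proper suffix of", which the text does not state explicitly.

```prolog
:- use_module(library(lists)).   % append/3

% extrapose(+C, +Words, +guide(Lin,X), -guide(Lout,[C|X]))
% Adding constituent C to the extraposition list must consume
% a non-empty prefix Words of the remaining input.
extrapose(C, Words, guide(Lin, X), guide(Lout, [C|X])) :-
    Words = [_|_],
    append(Words, Lout, Lin).

% cancel_trace(?C, +guide(L,[C|X]), -guide(L,X))
% A trace consumes no words but removes one element of X.
cancel_trace(C, guide(L, [C|X]), guide(L, X)).

% guide_lt(+G1, +G2): G1 is strictly below G2 in the lexicographic order.
guide_lt(guide(L1, _), guide(L2, _)) :-
    proper_suffix(L1, L2).
guide_lt(guide(L, X1), guide(L, X2)) :-
    proper_suffix(X1, X2).

% proper_suffix(?S, +List): S is a suffix of List, and S differs from List.
proper_suffix(S, [_|T]) :-
    (   S = T
    ;   proper_suffix(S, T)
    ).
```

For instance, starting from guide([who,john,saw],[]), calling extrapose(np, [who], ...) yields guide([john,saw],[np]), and cancel_trace(np, ...) then yields guide([john,saw],[]); guide_lt/2 holds at each step, which is the guide-consumption property the termination argument relies on.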
6. Conclusion 
This paper shows that parsing and generation can 
be seen as symmetrical, or dual, processes exploiting 
one and the same grammar and lexicon, and using a 
basic left-recursion elimination transformation. 
Emphasis is on the simplicity and symmetry of 
linguistic description, which is mostly contained in 
the lexicon; compositionality appears under three 
aspects: string compositionality, semantic 
compositionality, and syntactic compositionality. The 
analysis and generation processes each favor one 
aspect: string compositionality in analysis, semantic 
compositionality in generation. These give rise to two 
guides (analysis guide and generation guide), which are 
generalizations of string indexes. The left-recursion 
elimination transformation described in the paper is 
stated using the general notion of guide, and is 
provably guaranteed, under certain explicit conditions, 
to lead to termination of the parsing and generation 
processes. We claim that the approach provides a 
simple, yet powerful solution to the problem of 
grammatical bidirectionality, and are currently testing it 
as a possible replacement for a more rule-oriented 
grammatical component in the context of the CRITTER 
translation system \[IDM88\]. 

19 A counter-example to this simplistic assumption is not hard to come by: the 
person who_i John persuaded e_i PRO to drink. However, the assumption gives 
the flavor of a possible set of strategies for handling empty categories. 
Acknowledgments 
Thanks to Michel Boyer, Jean-Luc Cochard and 
Elliott Macklovitch for discussion and comments. 

References 

\[D90a\] Dymetman, Marc. A Generalized Greibach 
Normal Form for Definite Clause Grammars. Laval, 
Québec: Ministère des Communications Canada, Centre 
Canadien de Recherche sur l'Informatisation du Travail. 

\[D90b\] Dymetman, Marc. Left-Recursion 
Elimination, Guiding, and Bidirectionality in Lexical 
Grammars (to appear). 

\[DI88\] Dymetman, Marc and Pierre Isabelle. 
1988. Reversible Logic Grammars for Machine 
Translation. In Proceedings of the Second International 
Conference on Theoretical and Methodological Issues in 
Machine Translation of Natural Languages. Pittsburgh: 
Carnegie Mellon University, June. 

\[DI90\] Dymetman, Marc and Pierre Isabelle. 
1990. Grammar Bidirectionality through Controlled 
Backward Deduction. In Logic and Logic Grammars for 
Language Processing, eds. Saint Dizier, P. and S. 
Szpakowicz. Chichester, England: Ellis Horwood. 

\[GKPS87\] Gazdar, Gerald, Ewan Klein, Geoffrey 
Pullum and Ivan Sag. 1985. Generalized Phrase Structure 
Grammar. Oxford: Basil Blackwell. 

\[H78\] Harrison, Michael A. 1978. Introduction 
to Formal Language Theory. Reading, MA: Addison- 
Wesley. 

\[IDM88\] Isabelle, Pierre, Marc Dymetman and 
Elliott Macklovitch. 1988. CRITTER: a Translation 
System for Agricultural Market Reports. In Proceedings 
of the 12th International Conference on Computational 
Linguistics, 261-266. Budapest, August. 

\[L87\] Lloyd, John Wylie. 1987. Foundations of 
Logic Programming, 2nd ed. Berlin: Springer-Verlag. 

\[MTHMY83\] Matsumoto, Y., H. Tanaka, H. 
Hirakawa, H. Miyoshi and H. Yasukawa. 1983. BUP: a 
bottom-up parser embedded in Prolog. New Generation 
Computing 1:2, 145-158. 

\[PW80\] Pereira, Fernando C. N. and David H. D. 
Warren. 1980. Definite Clause Grammars for Language 
Analysis. Artificial Intelligence 13, 231-78. 

\[P81\] Pereira, Fernando C. N. 1981. 
Extraposition Grammars. Computational Linguistics 
7:4, 243-56. 

\[S88\] Shieber, Stuart M. 1988. A Uniform 
Architecture for Parsing and Generation. In Proceedings 
of the 12th International Conference on Computational 
Linguistics, 614-19. Budapest, August. 

\[SNMP89\] Shieber, Stuart M., Gertjan van 
Noord, Robert Moore and Fernando Pereira. 1989. A 
Semantic-Head-Driven Generation Algorithm for 
Unification-Based Formalisms. In Proceedings of the 
27th Annual Meeting of the Association for 
Computational Linguistics, 7-17. Vancouver, BC, 
Canada, June. 

\[N89\] Van Noord, Jan. 1989. BUG: A Directed 
Bottom-up Generator for Unification Based Formalisms. 
Working Papers in Natural Language Processing No. 4. 
Utrecht, Holland: RUU, Department of Linguistics. 

\[ZKC87\] Zeevat, H., E. Klein, and J. Calder. 
1987. Unification Categorial Grammar. Edinburgh: 
University of Edinburgh, Centre for Cognitive Science, 
Research Paper EUCCS/RP-21. 
