A Generalized Greibach Normal Form 
for Definite Clause Grammars 
Marc Dymetman 
CCRIT, Communications Canada 
1575 boul. Chomedey 
Laval (Qutbec) H7V 2X2, Canada 
dymetman@ccrit.doc.ca 
Abstract 
An arbitrary definite clause grammar can be transforaled 
into a so-called Generalized Greibach Normal Form 
(GGNF), a generalization of the classical Greibach Nor- 
mat Form (GNF) for context-free grammars. 
The normalized definite clause grammar is declara- 
tively equivalent to the original definite clause grammar, 
that is, it assigns the same analyses to the same strings. 
Offline-parsability of the original grammar is reflected in 
an elementary textual property of the transformed gram- 
mar. When this property holds, a direct (top-down) Pro- 
log implementation of the normalized grammar solves 
the parsing problem: all solutions are enumerated on 
backtracking and execution terminates. 
When specialized to the simpler case of context-free 
grammars, the GGNF provides a variant to file GNF, 
where the transformed context-free grammar not only 
generates the same strings as the original grammar, but 
also preserves their degrees of ambiguity (this last prop- 
erty does not hold for the GNF). 
The GGNF seems to be the first normal form result 
for DCGs. It provides an explicit factorization of the 
potential sources of uudeeidability for the parsing prob- 
lem, and offers valuable insights on the computational 
structure of unification grammars in general. 
1 Introduction 
From the point of view of decidability, the parsing prob- 
lem with a definite clause grammar is much more com- 
plex than the corresponding problem with a context-free 
grammar. The parsing problem with a context-live gram- 
mar is decidable, i.e. there exists an algorithm which de- 
termines whethex a given string is accepted or not by the 
grammar; t by contrast, the parsing problem with a deft- 
nite clause grammar is undecidable, even if the auxiliary 
Prolog predicates are restricted to be unifications. 2 
t For instance the standard top-down parsing algorithm using 
a GNF of the grammar. 
2This assumption, or a similar one, is always made when 
the decidability proparfies of logical (or unification) grammars 
are studied. The reason is simple: if the auxiliary predicates 
are defined through an unrestricted Prolog program, there is 
no means of guaranteeing that calling some auxill~y predicate 
will not result in nontexmination, for reasons quite independent 
from the structure of the grammar under consideration. The 
In other words, there does not exist, in general, an al- 
gorithm for deciding whether a definil~ clause grammar 
DCGI accepts a string String, i.e. whether there exists 
some linguistic structure S such that DCG1 "analyses 
String into S'. On the other hand, ander the condi- 
tion that DCCx is offline-parsable, that is, under the 
condition that the context-free skeleton of DCG1 is not 
infinitely ambiguous, then the parsing problem is indeed 
decidable \[7\]. 
The fact that the parsing problem with DCGt is de- 
cidable in theory should be carefully distinguished from 
the fact that a given algorithm for parsing is able to ex- 
ploit this potentiality. A parsing algorithm for which this 
is the case, that is, an algorithm which is able, for any 
offline-parsable DCGx, to decide whether a given string 
is accepted or not by DCG1 is said to be strongly stable 
\[7\]. Most parsing algorithms are not strongly stable s, a 
notable exception being the "Earley deduction" algorithm 
\[7\], once modified according to the proposals in \[10\]. 4 
Top-down implementations--and in particular the 
standard Prolog implementation of a DCG---are espe- 
cially fragile in this respect, for they fail to terminate as 
soon as the grammar contains a left-recursive rule such 
as: 
vp(VP2) ~ vp(VPt),pp(PP), 
{ combine(V P1, P P, V P2) }. 
In \[2\], automatic local transformations were performed 
on a DCG in order to eliminate some limited, but frequent 
in practice, types of left-recarsion in such a way that 
the resulting grammar could be directly implemented in 
Prolog. This initial work led us naturally to the more 
general question: 
Question: Is it possible to automatically transform an 
assumption that auxiliary predicates are unifications takes care 
of this interfering problem. 
aAn important cl~s of offline-parsable grammars---which 
we will call here the class of explicitly offline-parsable 
grammars--is the class of DCGs whose context free skeleton 
is a proper context-free grammar, that is, a grammar without 
rules of the type A ~ \[ \] (empty productions) or of the type 
A ~ B (chain rules) \[1\]. This subclass is much less problem- 
atic to parse than the full class of offline-parsuble DCGs (for 
instance a left-comer parsing algorithm will work). However, 
it is an easy consequence of the GGNF result that, for any 
offline-parsable DCG, there exists an explicitly offline-parsable 
DCG equivalent to it. 
4Stuart Shieber, personal communication. 
AcrEs DE COLIN'G-92, NANT~.s, 23-28 hOt\]T 1992 3 6 6 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
arbitrary oflliue-parsable DUG1 into an equivalent 
DCG2 in such a way that the standard top-down ex- 
ecution of DCG2 terminates (in success or failure) 
for any given input string? 
To answer this question (in the positive), we have taken 
an indirect route, and, adapting to the case of definite 
clause grammars some of the standard techniques used 
to transform context-free grammars into GNF, s have 
established the following results: 
1. Generalized Greibach Normal Form for definite 
clause grammars It is possible to translorm any 
definite clause grammar DCG1 (oflline-parsable or 
not) into a definite clause grammar DUG2 equiva- 
lent to DCGI~at is, assigning the same analyses 
to the same strings--and which is in a certain form, 
called the Generalized Greibach Normal Form, or 
GGNF, of DCG1. 
2. Explicit representation of offline-parsability in the 
GGNF The offline-parsability of DCGI is equiva- 
lent to an elementary textual property of DCG2: the 
so-called "unit subgrammar" contains no cycle, i.e. 
no nonterminal calling itself directly or indirectly. ~ 
3. Termination condition for top-down parsing with the 
GGNF If DCG1 is offline-parsable, and such that 
its auxiliary predicates it are unifications, then, for 
any input String, the standard top-down (Prulog) 
execution of DCG2 enumerates all solutions to the 
parsing problem for String and terminates. 
2 The Generalized Greibach Normal 
Form for Definite Clause Gram- 
mars 
2.1 Definite Grammar Schemes 
As usually defined (see \[6\]), a definite clause grammar 
consists in two separate sets of clauses: 
1. Nonterminal clauses, written as: 
a(Tl ..... 7h) --, a , 
where n is a nonterminal, T~ ..... T. am terms (vari- 
able.s, constants, or complex terms), and a is a se- 
quence of "terminal goals" \[term\], of "nontenni- 
nal goals" b(T{,..., 7;'), and of"auxiliary predicate 
goals" {p(Ti', .... T~')}. 
SSee the Appendix for some indications on the methods 
used. 
SThus. a side-effect of the GGNF is to provide a deci- 
sion procedure for the problem of knowing whether a DCG is 
offline-parsable or not. This is equivalent to deciding whether a 
eomext-free grammar is infinitely ambiguous or not, a problem 
the decidability of which seems to be "quasi.folk-knowledge", 
although I was innocent of this until the fact was brought to my 
attention by Yves Schabes, among others: the proof is more or 
less implicit in the usual technique to make a CFG "cycle-free", 
\[1, p. 150\] . See also \[4\] for a special case of this problem. 
(Caveat. The notion of "cycle" in "cycle-free" is technically 
different from the notion used here, which simply means: cycle 
in the graph associated with the relation "callable from". See 
note 10.) 
2. Auxiliary clauses, constituting an autonomous def- 
inite program, defining rite auxiliary predicates ap~ 
pcaring in the right-hand sides of nonterminal rules. 
These clauses am written: 
p(7~ ..... 7)) :- f~ , 
where fl is some sequence of predicate goals. 
A definite grammar scheme DGS is syntactically iden- 
tical to a definite clause grammar, except for the fact that: 
1. Tile '/} arguments appearing in the nonterminal and 
auxiliary predicate goals ,are restricted to beiug vari- 
ables: no constants or complex terms are allowed; 
2. Only nonterminal clauses appear in the definite 
grannnar scheme, but no auxiliary clause; the auxil- 
iary predicates which may appear in the right-hand 
sides of clauses do not receive a definition. 
A definite grammar scheme can be seen as an un- 
completely specified definite clause grammar, that is, a 
definite clause granmtar "lacking" a definition for the 
auxiliary predicates {p(Xt ..... X,)} that it "uses". The 
auxiliary predicates are "free": the interpretation of p is 
not fixed a priori, but can be an arbitrarily chosen n-ary 
relation on a certain Herbrand universe of terms. 7 
Example 1 The following clauses define a definite 
grammar scheme DGSI :s 
a l(X)--* a3(E), a l(A), a2(B), {q(E,A, B, X)} 
al(X) ~ \[oh\], {pl(X)} 
.2(X) ~ \[dull, {p2(X)} 
a3(X) ~ \[ \], {p3(X)} 
In this definite grammar scheme, only variables appear 
as argutnents; the auxiliary predicates pl, p2, p3 and q 
do not receive a definition. 
If a definite program definiug these auxiliary predicates 
is added to the definite grammar scheme, one obtains a 
full-fledged definite clause grammar, which can be inter- 
preted in the usual manner. 
Conversely, every definite clause grammar can be seen 
as the conjunction of a definite grammar scheme and of 
a definite clause program. In order to do so, a minor 
transforntation must be performed: each complex term 
T appearing as an argument in the head or body of a 
nonterminal clause must be replaced by a variable X, 
this variable being implicitly constrained to unify with T 
through the addition of an ad-hoc unification goal in the 
body of the clause (see \[3\]). 
rThe domain of interpretation can in fact be any set. Taking 
it to be the Ilerbrand universe over a certain vocabulary of 
functional symbols permits to "simulate" a DCG, by fixing the 
interpretation of the free auxiliary predicates in this domain. 
Another linguistically relevant domain of interpretation is the 
set of directed acyclic graphs built over a certain vocabulary 
of labels and atomic symbols, which pemtits tile simulation of 
unification grammars of the PATR-II type. 
SThe usual symbol for the initial nonterminal is s; we prefer 
to use al for reasons of notational coherence. 
ACRES DE COLING-92, NAhq~S, 23-28 AO~r 1992 3 6 7 PROC, O1: COLING-92, NAh'rt:,s, AUG. 23-28, 1992 
Example 2 Consider the following definite clause gram- 
mar: 
al(cons(B, A)) ---, a3(E), al(A), a2(B). 
al(X) -. \[oh\], {pl(x)}. 
a2(f) ~ toni\]. 
a3(X) ---* \[ \], {p3(X)}. 
pl(nil). 
p3(r). 
Let us define two new predicates p2 and q by the fol- 
lowing clauses: 
p2(/). 
q(E, A, B, cons(B, A)). 
then the definite clause grammar above can be rewritten 
as: 
al(X) ---, a3(E), al(A), a2(B), {q(E, A, B, X)}. 
al(X) -. \[oh\], {pl(x)}. 
a2(X) ---, \[oui\], {p2(X)}. 
a3(X) --* \[ 1, {p3(x)}. 
pl(nil). 
p2(f). 
pS(r). 
q(E, A, B, cons(B, A)). 
that is, in the form of a definite grammar scheme to 
which has been added a set of auxiliary clauses defining 
its auxiliary predicates. This definite grammar scheme is 
in fact identical with DGSI (see previous example). 
In the sequel of this paper, we will be interested, not 
directly in transformations of definite clause grammars, 
but in transformations of definite grammar schemes. The 
transformation of a definite grammar scheme DGS into 
DGS ~ will respect the following conditions: 
• The auxiliary predicates of DGS and of DGS' are 
the same; 
• For any definite clause program P which defines 
the auxiliary predicates in DGS (and therefore also 
those in DGSI), the definite clause grammar DCG 
obtained through the adjunction of P to DGS 
has the same denotational semantics as the definite 
clause grammar DCG' obtained through the adjunc- 
tion of P to DGS j. 
Under the preceding conditions, DGS and DGS' are 
stud to be equivalent definite grammar schemes. 
The grammar Vansformatious thus defined are, in a 
certain sense, universal transformations:, they are valid 
independently from the interpretation given to the auxil- 
iary predicates. 
2.2 GGNF for definite clause grammars 
Structure of the GGNF The definite grammar scheme 
DGS, on the terminal vocabulary V, having Q as its 
set of auxiliary predicates, is said to be in Generalized 
Greibach Normal Form if: 
1. The nonterminals of DGS are partitioned in three 
distinct subsets: A = {al}, called the initial set; U, 
called the set of unit nonterminals; N, called the set 
of co-unit nonterminals. 
2. The rules of DGS are partitioned into three groups 
of rules, called the factorization group (defining the 
elements of A), the unit group (defining the elements 
of U), and the co-unit group (defining the elements 
of N), graphically presented in the following man- 
net: 
factorization rules: definition of the elements of A 
unit rules: definition of the elements of U 
co-unit rules: definition of the elements of N 
3. Factofization rules are taken among the two follow- 
ing rules: 
al(Xl ..... X.) --, nl(X1 ..... X,~) , 
and/or: 
al(Xl,...,Xn) "~ ul(Xl ..... Xn) , 
where nl E N and ul E U, and where n E N is 
the arity of the initial nonterminal al. 
4. Unit rules are of the form: 
u(Xl ..... X,,) --* tl , 
where u E U is a unit nonterminal of arity m, m E 
N, and where H is a finite sequence of nonterminal 
unit goals of U, of auxiliary predicates of Q, or is 
the empty string \[\]. The group of unit rules forms a 
subscheme of the GGNF definite grammar scheme 
(see below). 
5. Co-unit rules are of the form: 
n( Xt , . . . , X~ ) --+ \[term\] Af , 
where n E N is a co-unit nonterminal of arity k, 
k E N. where \[term\] E V and where .hi is a finite 
sequence of terminal goals of V, of nonterminal unit 
goals of U, of auxiliary predicates of Q, or of non- 
terminal co-unit goals of N. 
6. The context-free skeleton of DGS, considered as a 
context-free grammar, is reduced. 9 
°A context-free grammar is said to be reduced iff it all its 
nonterminals are accessible from al and are productive (see \[5, 
pp. 73-78\]) 
ACrEs DE COLING-92, Nx~rI~s. 23-28 AO~T 1992 3 6 8 P~toc. Or COL1NG-92, NANTES, AU(~. 23-28, 1992 
The nonterminals of A, U, and N am defined in func- 
tion of one another, as well as in function of \[ \], of the 
terminals of V, and of the auxiliary predicates of Q, ac- 
cording to the definitional hierarchy illastrated below: 
G i 
v :o:: ti: 
2.3 Structure of the unit subscheme and 
offline-parsability 
One can remark that: 
• The group of unit rules is closed: file definition of 
unit nontenninals involves only unit nonterminals 
(but no co-unit nonterminal). For this reason, the 
group of unit roles is called the unit subscheme (or, 
loosely, the unit subgrammar) of the GGNF definite 
grammar scheme. 
• The unit subscheme can only generate the empty 
string \[ \]. 
The unit subscheme of the GGNF is said to contain a 
cycle iff there exists a unit nonterminal u(X1, .... X,,~) 
which "calls itself recursively", directly or indirectly, in- 
side this group./° One can show that this property is 
equivalent to the fact that the context-free skeleton of 
DGS is infinitely ambiguous, or, in other words, to the 
fact that DGS is not offline-parsable \[3\]. 
2.4 'lop-down parsing with the GGNF 
Let DGS be a definite grammar scheme in GGNF, having 
Q for its set of auxiliary predicates. Assume that every 
element p of Q, of arity n, is defined tlLrough a head 
clause t ~, of the form: 
p(TI ..... T,,). 
where 7'1, ... ,Tn can be any terms; In other words, the 
auxiliary predicates are constrained to be simply unifica- 
tions. Let DCG be the definite clause grammar obtained 
through adjunction of these clauses to DGS. The gram- 
mar DCG has the following properties: 
t°For example, the scheme: 
t~t(X) ~ u2(g), u3(Z) .... 
,,2(x) ~ udY). m(Z) .... 
contains a cycle in ul • 
n We use the terminology 'head clause' for a clause without 
body. A more standard terminology would be 'unit clause', 
but this would conflict with our technical notion of "unit' (a 
nonterminai generating the empty string \[ \]). 
I. If tile unit subscheme does not contain a cycle, then, 
for arty input string Siring, the standard top-down 
parsing algorithm terminates, after enumerating all 
the analyses for Siring; 
2. If the unit subscheme contains a cycle, the top-down 
parsing algorithm can terminate or not, depending 
on the definition given to the auxiliary predicates. 
We give below three exmnples of definite grammar 
schemes, and of the equivalent definite grammar schemes 
in GGNF. 
2.5 Examples 
Example 3 Consider the definite grammar scheme 
DGS1 given in Example 1, repeated below: 
al(X)-~ a3(E), al(A), a2(B), {q(E,A, B,X)} 
ul(X) ~ \[oh\], {pl(X)} 
a2(X) --, \[dull, {p2(X)} 
~3(x) -, \[ j, {p3(x)) 
The following definite grammar scheme DGS;u is in 
GGNF and is equivalent to DGSI: 
al(X) --~ nl(X) 
¢ 
nl(X) --* \[oh\], {pl(X)} 
nl(X) --* \[dill, {pl(Y)}, h(Y,X) 
h(Y,X) ~ \[dull, {p2(B)}, {p3(E)}, 
{q(E, Y, I3, X)} 
h(Y,X)--, \[oui\], {p2(B)}, {p3(E)}, 
{q(E, 1I, B, Z)}, h(Z, X) 
Suppose P is any auxiliary definite program which 
defines the auxiliary predicates p l,p2,p3, q. Then the 
definite clause grammars DCGt and DCG2, obtained 
by adjunction of this program to DGSt and DGS2, re- 
spectively, are equivalent. 
The unit subscheme of DGS~ does not contain a cy- 
cle (it is empty12). One can conclude that DCGI, as 
well as DCG~, are offline-parsable. If, moreover, it is 
assumed that P defines the auxiliary predicates as be- 
ing unifications, then it can be concluded that top-down 
parsing with DCG~ will enumerate all possible analyses 
for a given string and terminate. 
For instance, assume that the auxiliary program con- 
sists in the following four clauses (see Example 2): 
pl(nil). 
p2(f). 
p3(r). 
q(E, A, fl, cons(B, A)). 
12See note 14. 
ACRES DE COLING-92, NAhn'as, 23-28 AOl)r 1992 3 6 9 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
Then, DCGt becomes: 
al(X)-* a3(E), at(A), a2(B), {q(E,A,B,X)}. 
al(X)--~ \[oh\], {pl(X)}. 
a2(X) ---, \[oui\], {p2(X)}. 
a3(X)--, \[1, {p3(X)}. 
pt(nil). 
p2(f). 
p3(r). 
q(E, A, B, cons(B, A)). 
and DCG2: 
at(X) ---* nl(X) 
nt(X) ---* \[ohl, {pl(X)} 
nl(X)---.* \[oh\], {pt(Y)}, h(Y,X) 
h(Y, X)~ \[null, {p2(B)}, {p3(E)\], 
{q(E, Y, B, X)} 
h(Y, X)--~ \[null, {p2(B)}, {p3(E)}, 
{q(E, V, B, Z)}, h(Z, X) 
pl(nit). 
p2(f). 
p3(r). 
q(E, A, B, cons(B, A)). 
These two definite clause grammars are declaratively 
equivalent. They both accept strings of the form: 
oh oui ... oui 
where oui is repeated k times, k E N, and assign to each 
of these strings the (single) analysis represented by the 
term: 
ifk=0 : nil, 
ifk > 0 : cons(f .... eons(f, nil)...) 
(cons repeated k times.) 
On the other hand, from the operational point of view, 
if a top-down parsing algorithm is used, DCGt loops on 
any input string, ts while DCG2 enumerates all solutions 
on backtracking---here, zero or one solution, depending 
on whether the string is in the language generated by the 
grammar--and terminates. 
Example 4 Consider the following definite grammar 
scheme DGSz: 
al(X)---, \[null, at(Y), {p(X,Y)} 
~l(X) --. ~2(Y), {~(X,Y)} 
a2(X) .-* \[\], {q(X)} 
taRemark that DCG1 is left recursive in a "vicious" (covert) 
way: nontexminal al calls itself, net immediately, but arc* 
calling a3, which does not consume anything in the input string. 
The GGNF of DGSs is DGS4 below: 
al(X) --, nt(X) 
al(X) ---* ut(X) 
ul(X) --* u2(Y), {r(X,Y)} 
u2(x) ~ \[\], {q(X)\] 
nt(X) --* \[ouil, nl(Y), {v(x, Y)} 
nl(X) --, \[oui\], ulW), {p(X, Y)} 
From an inspection of DGS4 it can be concluded that: 
• The unit subscheme does not contain a cycle} 4 
Therefore DGS4, and consequently DGSa, is 
offline-parsable. 
• If DCG3 (resp. DCG4) is the definite clause gram- 
mar obtained through the adjunction to SDG3 (resp. 
SDG4) of clauses defining the auxiliary predicates 
p, q, r, then DCG3 and DCG4 are equivalent; Fur- 
thermore, ff these definitions make p, q, r unification 
predicates, then top-down parsing with DCG4 ter- 
minates, after enumerating all solutions. 
Example 5 Consider the following definite grammar 
scheme DGS~: 
al(X) --* \[oh\], {p(X)} 
at(X) ---, al(Y), {q(X,Y)} 
The GGNF of DGS~ is DGS6 given below: 
al(X) ---, nt(X) 
u(X, X) ~ \[\] 
u(X,Y) --* {q(X,Z)}, u(Z,Y) 
nl(X) ~ \[oh\], {PW)}, u(X, Y) 
From an inspection of DGS6 it can be concluded that: 
• The unit subscheme contains acycle. Therefore nei- 
ther DGS6 nor DGS~ are oflline-parsable. 
• If DCG5 (resp. DCG6) is the definite clause gram- 
mar obtained through the adjunction to DGS5 (resp. 
DGS~) of clauses defining the auxiliary predicates 
p, q, then DCG~ and DCG6 are equivalent; 
• Even if p,q are defined as unifications, top-down 
parsing with DCG6 may not terminate. 
Regarding the last point, let us show that different 
definitions for p and q result in different computational 
behaviors: 
a4 It can easily be shown that, iff this is the ease, then the 
unit nonterminals can be completely eliminated, as in the case 
of example 3 above. 
ACn'ES DE COLING-92, NANTES, 23-28 AOUT 1992 3 7 0 PRoc. OF COLING-92, NANTES, Auo. 23-28. 1992 
Situation 1 Assume that p,q are defined 
through the following clauses: 
p(nil). 
q(f(X), X). 
In such a situation, top-down parsing with 
DCG6 of the input string oh does not 
terminate: an infinite number of solutions 
(X = nil, X = f(nil), . . .) are enumerated 
on backtracking and the program loops, t5 
Situation 2 Assume that p,q are defined by 
the following clauses: 
p(X) :- fail. 
q(nil, nil). 
The first clause defines p as being the 
'false' (omitting giving a clause for p 
would have the same result). In such a 
situation, top-down parsing with DCG6 
terminates. 
3 Conclusions: 
• Few, if any, norn'tal form results for DCGs (and for 
their close relatives, unificatiun grammars) were previ- 
ously known. The GGNF transformation can be applied 
to any DCG, whether oflline-parsable or not. 
• In the GGNF, the potential sources of undecidability 
of the parsing problem are factorized in the unit subgmm- 
mar, a grammar "over" the empty string \[ \]. The GGNF 
as a whole is oflline-parsable exactly when its unit sub- 
grammar is. This is the case iff tile unit subgrammar 
does not contain a nontenninal calling itself recursively. 
• The GGNF seems to provide the closest analogue to 
the GNF that one can hope to find for DCGs. 16 
• If the DCG (or equivalently its GGNF) is oflline- 
parsable then top-down parsing with the GGNF finds all 
solutions to the parsing problem and terminates. 
• The transformatiou under GGNF can be specialized 
to the simpler case of a context-free grammar. In this 
case, the GGNF provides a variant of the standard GNF 
preserving degrees of ambiguity, t 7 
l~This is not the worst possible case: here, at least, a/l so~ 
lutioos end up being enumerated on backtracking. This would 
not be the case with more complex definitions for p and q. 
leJustificatlon for this claim is given in \[3\]. 
lrFor lack of space, this point was not discussed in the paper. 
Consider the context-free grammar GFG (which is the skeleton 
of example 5): 
al ~ \[oh\] 
algal 
The GNF for this grantmar is fire grammar: 
al ~ \[oh\] 
The original grammar assigns an infinite degree of ambiguity 
to \[oh\], while its GNF does not (in fact it is easy to show that 
a GNF can never be infinitely ambiguous). On the other hand, 
Appendix: Some indications on the 
transformation method 
We can only give here some brief indications in tile hope 
that rite interested reader will be motivated to look into 
the full description given ill \[3\]. We start with some 
comments on the GGNF in the CFG case and then move 
on to the case of definite grammar schemes. 
CFGs, Algebraic Systems, and the GGNF. The most 
powerful transformation methods existing for context- 
free grammars are "algebraic ("matrix based" \[8\]) ones 
relying on the concepts of formal power series and al- 
gebraic systems (see \[5, 91). Using such concepts, a 
context4ree grammar such as: 
al ~ ala2 I a2 I \[\] a~ --, \[v\] 
is refommlated into the algebraic system: 
al = ala2+a2+ l 
a s = \[ v \] 
which represents a fixpoint equation in the variables (or 
"nontermitmls") al,a2 on a certain algebraic structure 
(a non-commutative semiring) of formal power series 
Noo<<V'~, where Net is the set of non-negative in- 
tegers, extended to infinity, hfformally, an element of 
N,,~<<V'>> represents a language on the vocabulary V 
(such that \[v\] E V), where each string in the language 
is associated with a number, finite or infinite, which can 
be interpreted as the degree of ambiguity of this string 
relative to the system (or, equivalently the corresponding 
CFG). 
In the example at hand, it can be easily verified that the 
following assigments of formal power series to at, a~: 
al = l+2\[v\]+2\[v\]'J+2\[v\]a+ ''" 
a2 ~ Col 
satisfy the system, as In terms of rile corresponding CFG, 
this fact implies that (1) the entpty string \[ \] is recognized 
exactly once by the grammar, and that each of the strings 
\[v\], \[v\]\[v\], \[v\]\[v\]iv\] ..... is recognized exactly twice by 
tile gramutar. 
From the point of view of transformations, algebraic 
systems have certain impoltallt advantages over context- 
free grammars: (1) they make an~biguity degrees explicit, 
(2) they involve equations (rather ttmn rewriting rules), 
tile GGNF of Ut/G is: 
~1 ~ nl 
u~u 
nl ~ \[ohl u 
and it can be verified that it preserves degrees of ambiguity. 
This difference, which may be considered minor in tile case of 
CFGS, plays an important role in the transformation of DCGs. 
1o Furthermore, in the case at hand, they represent file unique 
solution to this system. 
Adds DE COLING-92. NANTES. 23-28 AO~r 1992 3 7 1 PI~oc. OF COLING-92. NANTES, AUG. 23-28. 1992 
where "equals can be replaced by equals", and (3) they 
possess a rich algebraic structure (addition, multiplica- 
tion) which endows them with mathematical perspicuity 
and power. 
There are some substantial differences between the 
transformation steps used to obtain the GGNF of an al- 
gebraic system and the standard ones used to obtain its 
GNF, the principal one lying in the necessity to preserve 
degrees of ambiguity at each step. In the GNF case, the 
initial step consists in first transforming the initial sys- 
tem into a proper system (a notion analoguous to that of 
proper CFG)--an operation which does not preserve de- 
grees of ambiguity--and then performing the main trans- 
formation. For this reason, the transformation steps in the 
GGNF case must be formulated in a more global way, 
which, among other complications, involves the use of 
certain identities on regular languages) 9 However, there 
are also important similarities between the GNF and the 
GGNF transformations, among them the observation that 
the elementary algebraic system in the variable a on the 
vocabulary V = {Iv\], It\]}: 
a = a Iv\] + \[t\] 
has the unique solution a = \[t\] Iv\]*, an observation which 
can be much generalized, and which plays a central role 
in both cases. 
DCGs, Mixed Systems, and the GGNF. In ordcr 
to define the GGNF in the case of Definite Grammar 
Schemes (or, equivalently. DCGs), we have introduced 
so-eallcd mixed systems, a generalization of algebraic 
systems capable of representing association of structures 
to strings. Without going into details, let's consider the 
following definite grammar scheme: 
ai(x ) -+ al(y)ax(z){p(z,y,z)} I 
a~(y) {q(z,y)} I \[ \] {r(x)} 
a2(~:) -+ \[vl{s(z)} 
This scheme is reformulated as the mixed system: 
alx = aly a2z IY~ z + a2y qYz + r,: 
a2~ = \[v\] s~ 
In this system, the variables (or nonterminals) at, a~ are 
seen as functions: /g -~ B<<V*>> (where B is the set 
of booleans {0, 1}), that is, as functions mapping ele- 
ments of a set E (often taken to be a Herbrand universe), 
representing linguistic structures, into formal series of 
B~V*>>, that is, into languages over V. This can be 
seen to correspond to the intuitive notion that a nonter- 
minal "associates" structures to strings. As for p, q, r, s, 
they are seen respectively as fonctions from E ~, E ~, E, E 
into B C B<¢:V*>>, that is, as predicates of different ari- 
ties over E. The system represents a fixpoint equation on 
the variables ax, a2, given the constants \[v\], p, q, r, s. 2° 
Although mixed systems are defined on more complex 
structures than are algebraic systems, the transformation 
methods for algebraic systems generalize without diffi- 
cuhy to their case, and these methods form the mathe- 
matieal basis of the results reported in this paper. 
l~Such as the identity (e + \])* ~ e*(ye*) °. 
Z°For the interested reader, the given system expresses (us- 
ing "conventions of summation" familiar in tensor algebra) the 
Acknowledgments 
Thanks to C. \]3oitet, M. Boyer, F. Guenthner, F. Perrault, 
Y. Schabes, M. Simard, and G. van Noord for comments 
and suggestions. 
relations: 
AcrEs DE COLING-92, NANTES, 23-28 AO'O'r 1992 3 7 2 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 

References 

\[1\] A.V. Aho and J.D. Ullman. The Theory of Pars- 
ing, Translation and Compiling, volume 1: Parsing. 
Prentice-Hall, Englewood Cliffs, NJ, 1972. 

\[2\] M. Dymetman and P. lsabelle. Reversible logic 
grammars for machine translation. In Proceedings 
of the Second International Conference on Theoret- 
ical and Methodological Issues in Machine Trans- 
lation of Natural Languages, Pittsburgh, PA, June 
1988. Carnegie Mellon University. 

\[3\] M. Dymetman. Transformations de gram- 
maires logiques. Applications au probl~me de la 
rdversibilit~ en Traduction Automatique. ThSse 
d'Etat. Universit6 de Grenoble, Grenoble, France, 
July 1992. 

\[4\] A. Haas. A generalization of the offline-parsable 
grammars. In Proceedings of the 27th Annual Meet- 
ing of the ACL, 237-42, Vancouver, BC, Canada, 
June 1989. 

\[5\] Harrison, M.A. 1978. Introduction to Formal Lan- 
guage Theory. Reading, MA: Addison-Wesley. 

\[6\] Pereim, EC.N. and D.H.D. Warren. 1980. Definite 
Clause Grammars for Language Analysis. Artificial 
lntelligence:13, 231-78. 

\[7\] Pereira, F.C.N. and D.H.D. Warren. 1983. Parsing 
as Deduction. In Proceedings of the 21th Annual 
Meeting of the ACL, 137-44. Cambridge, MA, June. 

\[8\] Rosencmntz D. J. 1967. Matrix equations and nor- 
real forms for context-free grammars. JACM 14:3, 
501-07. 

\[9\] Salomaa, A. and M. Soittola. 1978. Automata The- 
oretic Aspects of Formal Power Series. New York: 
Springer Verlag. 

\[10\] S.M. Shieber. 1985. Using restriction to extend al- 
gorithms for complex feature-based formalisms. In 
Proceedings of the 23rd Annual Meeting of the ACL, 
145-152. Chicago, 1L, June. 
