JPSG Parser on Constraint Logic Programming 
TUDA, Hirosi * 
Department of Information Science
Faculty of Science
University of Tokyo 7-3-1 Hongo, Bunkyo-ku Tokyo, 113 Japan 
e-mail: a30728%tansei.cc.u-tokyo.junet@relay.cs.net
HASIDA, Kôiti
Institute for New Generation Computer Technology (ICOT) 
1-4-28 Mita, Minato-ku Tokyo, 108 Japan 
e-mail: hasida@icot.jp@relay.cs.net 
SIRAI, Hidetosi 
Tamagawa University 
6-1-1 Tamagawa gakuen, Machida-shi Tokyo, 194 Japan 
e-mail: a88868%tansei.cc.u-tokyo.junet@relay.cs.net 
Abstract 
This paper presents a constraint logic programming 
language cu-Prolog and introduces a simple Japanese 
parser based on Japanese Phrase Structure Grammar 
(JPSG) as a suitable application of cu-Prolog. 
cu-Prolog adopts constraint unification instead of 
the normal Prolog unification. In cu-Prolog, con- 
straints in terms of user defined predicates can be 
directly added to the program clauses. Such a clause
is called a Constraint Added Horn Clause (CAHC).
Unlike conventional CLP systems, cu-Prolog deals 
with constraints about symbolic or combinatorial ob- 
jects. For natural language processing, such con- 
straints are more important than those on numeri- 
cal or boolean objects. In comparison with normal 
Prolog, cu-Prolog has more descriptive power, and is 
more declarative. It enables a natural implementa- 
tion of JPSG and other unification based grammar 
formalisms. 
*From this April, Fujitsu Corporation 
1 Introduction 
Prolog is frequently used to implement natural language parsers and generators based on unification-based grammars. This is because Prolog is itself based on unification and therefore has a declarative character, and a declarative formalization of grammar is also an important characteristic of unification-based grammar [11].
However, Prolog does not have sufficient power to express constraints, because it executes every part of its programs as a procedure and because every variable of Prolog can be instantiated with any object. Hence, the constraints in unification-based grammar are forced to be implemented not declaratively but procedurally.
We developed a new constraint logic programming language, cu-Prolog, which is free from this defect of traditional Prolog [13]. In cu-Prolog, user-defined constraints can be directly added to a program clause (constraint added Horn clause), and constraint unification [12, 8] ¹ is adopted instead of the normal unification. This paper outlines the cu-Prolog system and presents a Japanese parser based on JPSG (Japanese Phrase Structure Grammar) [7] as a suitable application of cu-Prolog.

¹ In these earlier papers, "constraint unification" was called "conditioned unification."
2 Constraint Added Horn Clause (CAHC)
Most constraint logic programming language systems (CAL [2], Prolog III [5], etc.) deal with constraints about algebraic equations, i.e., constraints about numerical domains such as that of real numbers.
However, in the problems arising in Artificial Intelligence, constraints on symbolic or combinatorial objects are far more important than those on numerical objects. cu-Prolog handles constraints described in terms of sequences of atomic formulas of Prolog. The program clauses of cu-Prolog are of the following type, which we call Constraint Added Horn Clauses (CAHCs):
1. H :- B1, B2, ..., Bn; C1, C2, ..., Cm.
(H is called the head, B1, B2, ..., Bn is the body, and C1, C2, ..., Cm is the constraint. The body and the constraint can be empty.)
C1, C2, ..., Cm comprise a set of constraints on the variables occurring in the rest of the clause. In the current implementation, C1, C2, ..., Cm must be modular in the sense that it has the following canonical form.
[Def. 1] (modular) A sequence of atomic formulas C1, C2, ..., Cm is modular when
1. every argument of Ci is a variable,
2. no variable occurs in two distinct places, and
3. the predicate of Ci is modularly defined (1 ≤ i ≤ m).
[Def. 2] (modularly defined) A predicate p is modularly defined when, in every definition clause of p,
    P :- D.,
D is empty, or
1. every argument of D is a variable,
2. no variable occurs in two distinct places, and
3. every predicate occurring in D is p or modularly defined.
For example,
    member(X, Y), member(U, V) is modular,
    member(X, Y), member(Y, Z) is not modular, and
    append(X, Y, [a, b, c, d]) is not modular.
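The two definitions above can be checked mechanically. The following Python sketch is only an illustration, not part of cu-Prolog: the term encoding (variables as capitalized strings, an atomic formula as a `(predicate, arguments)` pair) and the names `shape_ok`, `modularly_defined`, `is_modular`, and `BODIES` are assumptions made here. Recursive predicates are accepted by assuming that a predicate already under inspection is fine, which is one possible reading of condition 3 of Def. 2.

```python
def is_var(term):
    """Prolog convention: a variable is a name starting with an upper-case letter."""
    return isinstance(term, str) and term[:1].isupper()

def shape_ok(atoms):
    """Conditions 1 and 2: every argument is a variable and no variable repeats."""
    seen = set()
    for _pred, args in atoms:
        for arg in args:
            if not is_var(arg) or arg in seen:
                return False
            seen.add(arg)
    return True

def modularly_defined(pred, bodies, in_progress=frozenset()):
    """Def. 2: every clause body of pred is empty or itself of modular shape."""
    if pred in in_progress:          # assumption: recursion back to pred is allowed
        return True
    in_progress = in_progress | {pred}
    for body in bodies.get(pred, []):
        if not shape_ok(body):
            return False
        if not all(modularly_defined(q, bodies, in_progress) for q, _ in body):
            return False
    return True

def is_modular(atoms, bodies):
    """Def. 1: modular shape plus modularly defined predicates."""
    return shape_ok(atoms) and all(modularly_defined(p, bodies) for p, _ in atoms)

# Clause bodies of member/2 and append/3 (heads are not inspected by Def. 2).
BODIES = {
    "member": [[], [("member", ["X", "Z"])]],
    "append": [[], [("append", ["X", "Y", "Z"])]],
}

ex1 = is_modular([("member", ["X", "Y"]), ("member", ["U", "V"])], BODIES)
ex2 = is_modular([("member", ["X", "Y"]), ("member", ["Y", "Z"])], BODIES)
ex3 = is_modular([("append", ["X", "Y", ("a", "b", "c", "d")])], BODIES)
```

Under this encoding, `ex1` is true, `ex2` fails because Y occurs twice, and `ex3` fails because the third argument is not a variable, matching the three examples in the text.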
Seen from the declarative semantics, the program clause of cu-Prolog is equivalent to the following program clause of Prolog:
1. H :- B1, B2, ..., Bn, C1, C2, ..., Cm.
3 cu-Prolog 
3.1 Constraint Unification 
cu-Prolog employs constraint unification [12, 8], which is the usual Prolog unification plus constraint transformation (normalization).
Using constraint unification, the inference rule of cu-Prolog is as follows:
$$
\frac{Q, R;\, C. \qquad Q' \;\text{:-}\; S;\, D. \qquad \theta = \mathrm{mgu}(Q, Q'), \quad B = \mathrm{mf}(C\theta, D\theta)}{S\theta, R\theta;\, B}
$$
(Q is an atomic formula. R, C, S, D, and B are sequences of atomic formulas. mgu(Q, Q') is a most general unifier of Q and Q'.)
mf(C1, ..., Cm) is a modular constraint which is equivalent to C1, ..., Cm. If C1, ..., Cm is inconsistent, mf(C1, ..., Cm) is not defined. In this case, the above inference rule is inapplicable.
For example,
    mf(member(X, [a, b, c]), member(X, [b, c, d]))
returns a new constraint c0(X), where the definition of c0 is
    c0(b).
    c0(c).
and
    mf(member(X, [a, b, c]), member(X, [k, l, m]))
is undefined, because the two constraints are inconsistent.
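For the special case in this example, where every constraint is a membership test of one variable against a ground list, the effect of the normalization can be imitated directly. The sketch below is a simplified stand-in written for illustration: the function name `mf_member` and the returned data layout are assumptions, and cu-Prolog itself obtains the result by unfold/fold transformation rather than by intersecting lists.

```python
def mf_member(var, *lists, name="c0"):
    """Normalize member(var, L1), ..., member(var, Lk) for ground lists.

    Returns (new_constraint, facts), or None when the constraints are
    inconsistent, i.e. when the modular form is undefined.
    """
    first, rest = lists[0], lists[1:]
    common = [x for x in first if all(x in other for other in rest)]
    if not common:
        return None
    new_constraint = (name, var)             # e.g. c0(X)
    facts = [(name, value) for value in common]   # e.g. c0(b). c0(c).
    return new_constraint, facts

consistent = mf_member("X", ["a", "b", "c"], ["b", "c", "d"])
inconsistent = mf_member("X", ["a", "b", "c"], ["k", "l", "m"])
```

Here `consistent` holds the new constraint c0(X) together with the facts c0(b) and c0(c), while `inconsistent` is `None`, which corresponds to the inference rule being inapplicable.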
This transformation is done by repeating un- 
fold/fold transformations as described later. 
3.2 Comparison with conventional approaches
In normal Prolog, constraints are inserted in a goal and processed as procedures. This is not desirable for a declarative programming language, and the execution can be inefficient when constraints are inserted at an inappropriate place.
As constraints are rewritten at every unification, cu-Prolog has more descriptive power than the bind-hook technique. For example, freeze in Prolog II [4] can impose constraints on one variable, so that when the variable is instantiated, the constraints are executed as a procedure. Freeze has, however, two disadvantages. First, freeze cannot impose a constraint on several variables at one time. For example, it cannot express the following CAHC.
    f(X), g(Y, Z); append(X, Y, Z).
Second, since a contradiction between constraints is not detected until the variable is instantiated, useless computation may be carried out while the constraints are deadlocked. For example, X and Y are unifiable even after executing
    freeze(X, member(X, [a, b]))
and
    freeze(Y, member(Y, [u, v]))
In cu-Prolog,
    f(X); member(X, [a, b]). ²
and
    f(Y); member(Y, [u, v]).
are not unifiable.
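The difference in when the contradiction is noticed can be imitated with a toy model. This Python sketch is not an implementation of freeze or of cu-Prolog: the class `Var` and the functions `unify_with_freeze`, `bind`, and `constraint_unify` are hypothetical names, and each constraint is reduced to a set of allowed values.

```python
class Var:
    """A logic variable carrying the constraint member(Var, allowed)."""
    def __init__(self, name, allowed):
        self.name = name
        self.allowed = set(allowed)

def unify_with_freeze(x, y):
    """freeze-style: goals wait for a value, so aliasing two variables succeeds."""
    pending = [x.allowed, y.allowed]      # delayed goals, not yet examined
    return True, pending

def bind(pending, value):
    """Only when a value arrives are the delayed goals actually run."""
    return all(value in allowed for allowed in pending)

def constraint_unify(x, y):
    """Constraint-unification-style: combine the constraints immediately."""
    common = x.allowed & y.allowed
    return Var(x.name, common) if common else None

X = Var("X", ["a", "b"])
Y = Var("Y", ["u", "v"])

aliased, pending = unify_with_freeze(X, Y)            # succeeds despite the conflict
late_failures = [bind(pending, v) for v in "abuv"]    # every later binding fails
early_result = constraint_unify(X, Y)                 # fails at unification time
```

In the delayed model the conflict surfaces only after a value is tried, whereas `constraint_unify` rejects the pair at once, which is the behaviour the text attributes to cu-Prolog.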
3.3 Constraint Transformation 
This subsection explains the mechanism of constraint 
transformation in cu-Prolog. 
Let $\mathcal{T}$ be the definition clauses of modularly defined constraints, $\Sigma$ be a set of constraints $\{C_1, \ldots, C_n\}$ that contains variables $x_1, \ldots, x_m$, and $p$ be a new $m$-ary predicate.
Let $\mathcal{D}$ be the definition clauses of new predicates, and
$$\mathcal{P}_0 = \mathcal{T} \cup \mathcal{D}.$$
$\mathcal{D}$ is initially
$$\{\, p(x_1, \ldots, x_m) \;\text{:-}\; C_1, \ldots, C_n. \,\}$$
and other new predicates are included through the constraint normalization.
Then, mf($\Sigma$) returns $p(x_1, \ldots, x_m)$ if there exists a sequence of sets of program clauses
$$\mathcal{P}_0, \mathcal{P}_1, \ldots, \mathcal{P}_n$$
such that $\mathcal{P}_n$ is modularly defined, where $\mathcal{P}_{i+1}$ is derived from $\mathcal{P}_i$ ($0 \le i < n$) by one of the following three types of transformations.
1. unfold transformation
Select one clause $C$ from $\mathcal{P}_i$ and one atomic formula $A$ from the body of $C$. Let $C_1, \ldots, C_n$ be all the clauses whose heads unify with $A$, and $C'_j$ be the result of applying $C_j$ to $A$ of $C$ ($j = 1, \ldots, n$). $\mathcal{P}_{i+1}$ is obtained by replacing $C$ in $\mathcal{P}_i$ with $C'_1, \ldots, C'_n$.
² member(X, [a, b]) is not modular, but is equivalent to p1(X), where
    p1(a).
    p1(b).
2. fold transformation
Let $C$ (A :- K & L.) be a clause in $\mathcal{P}_i$, $D$ (B :- K'.) be a clause in $\mathcal{D}$, and $\theta$ be mgu(K, K') that meets the following conditions.
(a) No variables occur in both K and L, and
(b) $C$ is not contained in $\mathcal{D}$.
Then, $\mathcal{P}_{i+1}$ is obtained by replacing $C$ in $\mathcal{P}_i$ with
    Aθ :- Bθ & L.
3. integration
Let $C$ (H :- B & R.) be a clause in $\mathcal{P}_i$, where B is not modular and contains variables $x_1, \ldots, x_m$, and there are no common variables between B and R. Let $p$ be a new $m$-ary predicate and the following clause $E$:
    p(x1, ..., xm) :- B.
be the definition of $p$. Then, $\mathcal{P}_{i+1}$ is obtained by replacing $C$ in $\mathcal{P}_i$ with
    H :- p(x1, ..., xm) & R.
and adding $E$. $E$ is also added to $\mathcal{D}$.
The third transformation can be seen as a special case of the fold transformation. Hence, these three transformations preserve the semantics of programs, because unfold/fold transformation has been proved valid [6].
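As an illustration of the unfold transformation alone, the following Python sketch resolves one body atom of a clause against every program clause whose head unifies with it. It is a minimal sketch under stated assumptions, not the cu-Prolog implementation: terms are encoded as tuples, variables as capitalized strings, lists as `("cons", Head, Tail)` with `"nil"`, the unifier omits the occurs check, and all function names are chosen here. The data are the member and append clauses and the clause p1(A,X,Y,Z) :- member(A,Z), append(X,Y,Z) used in the worked example of this subsection.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Return an extended substitution, or None on failure (no occurs check)."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def subst(t, s):
    """Apply substitution s to term t."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(x, s) for x in t[1:])
    return t

_fresh = itertools.count()

def rename(clause):
    """Give a program clause fresh variable names before using it."""
    n = next(_fresh)
    def r(t):
        if is_var(t):
            return f"{t}_{n}"
        if isinstance(t, tuple):
            return (t[0],) + tuple(r(x) for x in t[1:])
        return t
    head, body = clause
    return r(head), [r(b) for b in body]

def unfold(clause, index, program):
    """Replace body atom number `index` by the bodies of all matching clauses."""
    head, body = clause
    results = []
    for defn in program:
        d_head, d_body = rename(defn)
        s = unify(body[index], d_head, {})
        if s is not None:
            new_body = body[:index] + d_body + body[index + 1:]
            results.append((subst(head, s), [subst(b, s) for b in new_body]))
    return results

PROGRAM = [
    (("member", "X", ("cons", "X", "Y")), []),
    (("member", "X", ("cons", "Y", "Z")), [("member", "X", "Z")]),
    (("append", "nil", "X", "X"), []),
    (("append", ("cons", "A", "X"), "Y", ("cons", "A", "Z")), [("append", "X", "Y", "Z")]),
]
D1 = (("p1", "A", "X", "Y", "Z"), [("member", "A", "Z"), ("append", "X", "Y", "Z")])

unfolded = unfold(D1, 0, PROGRAM)   # unfold member(A, Z) in D1
```

Up to variable renaming, the two clauses in `unfolded` have the shapes p1(A,X,Y,[A|Z]) :- append(X,Y,[A|Z]) and p1(A,X,Y,[B|Z]) :- member(A,Z), append(X,Y,[B|Z]).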
The following example shows a transformation of
    member(A, Z), append(X, Y, Z).
Here, $\mathcal{T}$ is {T1, T2, T3, T4}, where
    T1 = member(X,[X|Y]).
    T2 = member(X,[Y|Z]) :- member(X,Z).
    T3 = append([],X,X).
    T4 = append([A|X],Y,[A|Z]) :- append(X,Y,Z).
and $\Sigma$ is {member(A, Z), append(X, Y, Z)}. The new predicate p1 is defined as
    D1 = p1(A,X,Y,Z) :- member(A,Z), append(X,Y,Z).
and
    $\mathcal{P}_0$ = {T1, T2, T3, T4, D1}, $\mathcal{D}$ = {D1}
Unfolding the first formula of D1's body, we get
    T5 = p1(A,X,Y,[A|Z]) :- append(X,Y,[A|Z]).
    T6 = p1(A,X,Y,[B|Z]) :- member(A,Z), append(X,Y,[B|Z]).
So
    $\mathcal{P}_1$ = {T1, T2, T3, T4, T5, T6}
By integration,
    T5' = p1(A,X,Y,[A|Z]) :- p2(X,Y,A,Z).
    T6' = p1(A,X,Y,[B|Z]) :- p3(A,Z,X,Y,B).
    D2 = p2(X,Y,A,Z) :- append(X,Y,[A|Z]).
    D3 = p3(A,Z,X,Y,B) :- member(A,Z), append(X,Y,[B|Z]).
and
    $\mathcal{P}_2$ = {T1, T2, T3, T4, T5', T6', D2, D3}
    $\mathcal{D}$ = {D1, D2, D3}
By unfolding D2,
    T7 = p2([],[A|Z],A,Z).
    T8 = p2([A|X],Y,A,Z) :- append(X,Y,Z).
These clauses comprise the modular definition of p2. Thus
    $\mathcal{P}_3$ = {T1, T2, T3, T4, T5', T6', T7, T8, D3}.
Unfolding the second formula of D3's body, we have
    T9 = p3(A,Z,[],[B|Z],B) :- member(A,Z).
    T10 = p3(A,Z,[B|X],Y,B) :- member(A,Z), append(X,Y,Z).
    $\mathcal{P}_4$ = {T1, T2, T3, T4, T5', T6', T7, T8, T9, T10}.
Folding T10 by D1 will generate
    T10' = p3(A,Z,[B|X],Y,B) :- p1(A,X,Y,Z).
Accordingly
    $\mathcal{P}_5$ = {T1, T2, T3, T4, T5', T6', T7, T8, T9, T10'}.
As a result,
    member(A, Z), append(X, Y, Z)
has been transformed to p1(A,X,Y,Z) preserving equivalence, and the following new clauses have been defined.
    {T4, T5', T6', T7, T8, T9, T10'}.
3.4 Implementation 
The source code of cu-Prolog is, at present (Ver. 2.0), composed of 4,500 lines of C on UNIX. Its precise computation speed is under evaluation, but it is sufficient for practical use.
An effective implementation of the constraint transformation described in the above subsection requires some heuristics in applying the three transformations. In particular, in the unfold transformation, one atomic formula A is selected according to the following heuristic rules:
1. The atomic formula of the finite predicate. 
2. The atomic formula that has constants or \[ \] in 
its arguments. 
3. The atomic formula that has lists in its argu- 
ment. 
4. The atomic formula that has plural dependen- 
cies. 
Here,
[Def. 3] (finite predicate) A predicate p is finite when the body of every definition clause of p is
• nil, or
• expressed by finite predicates.
Figure 1 demonstrates constraint transformation.
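Def. 3 can be read as a least fixed point: predicates defined only by facts are finite, and a predicate becomes finite once every clause body mentions finite predicates only. The Python sketch below follows that reading, which is an interpretation adopted here (under it, a recursive predicate such as member is not finite). The function name and the encoding of a program as a map from predicate names to lists of clause bodies are assumptions.

```python
def finite_predicates(program):
    """Return the set of finite predicates.

    `program` maps each predicate name to a list of clause bodies,
    each body being a list of the predicate names it calls.
    """
    finite = set()
    changed = True
    while changed:
        changed = False
        for pred, bodies in program.items():
            if pred in finite:
                continue
            # every body is empty (nil) or calls only predicates known to be finite
            if all(all(q in finite for q in body) for body in bodies):
                finite.add(pred)
                changed = True
    return finite

PROGRAM = {
    "hasi_sem": [[], [], []],            # three facts
    "c1": [[], []],                      # two facts
    "c0": [["c1"]],                      # c0(X) :- c1(X).
    "member": [[], ["member"]],          # recursive definition
}

finite = finite_predicates(PROGRAM)
```

With this input, `finite` contains hasi_sem, c1, and c0 but not member, so an atomic formula of one of the first three predicates would be preferred by heuristic rule 1.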
4 A JPSG parser 
As an application of cu-Prolog, a natural language 
parser based on unification based grammar has been 
considered first of all. Since constraints can be added 
directly to the program clause representing a lexi- 
cal entry or a phrase structure rule, the grammar is 
implemented more naturally and declaratively than 
with ordinary Prolog. Here we describe a simple 
Japanese parser of JPSG in cu-Prolog. CAHC plays 
an important role in two respects. 
First, CAHC is used in the lexicon for homonyms or polysemous words. For example, the Japanese noun "hasi" is 3-way ambiguous: it means a bridge, chopsticks, or an edge. This polysemous word can be subsumed in the following single lexical entry.
    lexicon([hasi|X], X, [... sem SEM]);
        hasi_sem(SEM).
where hasi_sem is defined as follows.
    hasi_sem(bridge).
    hasi_sem(chopsticks).
    hasi_sem(edge).
The value of the semantic feature is a variable (SEM), and the constraint on SEM is hasi_sem(SEM). Note that the predicate hasi_sem is modularly defined. With a CAHC, such ambiguity can be considered all at once, instead of being divided into separate lexical entries. In Japanese, such ambiguity also appears in conjugation, postpositions, etc., and can be treated in the same manner.
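The practical benefit of keeping the ambiguity inside one entry can be sketched by counting analyses. In the Python fragment below, the senses of "hasi" come from the text, while the second word "kumo" with its two senses, the function names, and the data layout are hypothetical additions made only for illustration.

```python
from itertools import product

# Senses per word; "kumo" is an illustrative assumption, not from the paper.
SENSES = {
    "hasi": ["bridge", "chopsticks", "edge"],
    "kumo": ["cloud", "spider"],
}

def analyses_with_separate_entries(words):
    """One lexical entry per sense: every combination is a separate analysis."""
    return [list(choice) for choice in product(*(SENSES[w] for w in words))]

def analysis_with_constraints(words):
    """One entry per word: a single analysis whose SEM variables carry constraints."""
    return [
        {"word": w, "sem_var": f"SEM{i}", "constraint": set(SENSES[w])}
        for i, w in enumerate(words)
    ]

separate = analyses_with_separate_entries(["hasi", "kumo"])
single = analysis_with_constraints(["hasi", "kumo"])
```

Here `separate` lists six distinct analyses, whereas `single` is one analysis in which each semantic variable keeps its set of candidate values until other constraints narrow it.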
Second, a phrase structure rule is written naturally as a CAHC. In JPSG [7], the FFP (FOOT Feature Principle) is:
    The value of a FOOT feature of the mother unifies with the union of those of her daughters.
This principle is embedded in a phrase structure rule as follows:
    psr([slash MS], [slash LDS], [slash RDS]);
        union(LDS, RDS, MS).
However, this cannot be described in this manner in traditional Prolog.
_member(X,[X|Y]).
_member(X,[Y|Z]):-member(X,Z).
_append([],X,X).
_append([A|X],Y,[A|Z]):-append(X,Y,Z).
_@ member(X,[ga,wo,ni]),member(X,[no,wo,ni]).
solution = c0(X)
c1(wo).
c1(ni).
c0(X0):-c1(X0).
_@ member(A,Z),append(X,Y,Z).
solution = c7(A, Z, X, Y)
c8(X2, X2, X0, Y1, Y3):-append(X0, Y1, Y3).
c8(X2, Y3, X0, Y1, Z4):-c7(X2, Z4, X0, Y1).
c7(A0, X1, [], X1):-member(A0, X1).
c7(A0, [A1|Z4], [A1|X2], Y3):-c8(A0, A1, X2, Y3, Z4).
The first four lines are definitions of member and append. The lines that begin with "@" are the user's input atomic formulas (constraints). The system returns the constraint (c0(X)) that is equivalent to the input constraint, and its definitions.
Figure 1: Demonstration of the constraint transformation routine
Figure 2 shows a simple demonstration of our JPSG parser, and Figure 3 shows an example of treating ambiguity as a constraint. The current parser handles only a few features and has a small lexicon. However, it is easy to extend. It parses sentences of about ten to twenty words within a second on a VAX 8600. Since JPSG is a declarative grammar formalism and cu-Prolog also describes JPSG declaratively, the parser needs a parsing algorithm to be supplied independently. In the current implementation, we adopt the left-corner parsing algorithm [1]. Furthermore, we would even be able to abandon the parsing algorithm altogether [10].
5 Final Remarks
The further study of cu-Prolog has many prospects. For example, to expand the descriptive ability of constraints, the negation operator or the universal quantifier can be added. The constraint-based, alias partial, aspects of Situation Semantics [3] are naturally implemented in terms of an extended version of cu-Prolog [9]. For practical applications in Artificial Intelligence in general and natural language processing in particular, one needs a mechanism for carrying out computation partially, instead of totally as described above, where constraint transformation halts only when the constraint in question is entirely modular. So the most difficult problem one must tackle concerns heuristics about how to control computation.
Acknowledgments
This study owes much to our colleagues in the JPSG Working Group at ICOT. The implementation of cu-Prolog is supported by ICOT and the Ministry of International Trade and Industry in Japan.
References
[1] A. V. Aho and J. D. Ullman. The Theory of Parsing, Translation, and Compiling, Volume I: Parsing. Prentice-Hall, 1972.
[2] A. AIBA. Seiyaku Ronri Programming (Constraint Logic Programming). bit, 20(1):89-97, 1988. (in Japanese).
[3] J. Barwise and J. Perry. Situations and Attitudes. MIT Press, Cambridge, Mass., 1983.
[4] A. Colmerauer. Prolog II Reference Manual and Theoretical Model. Technical Report, ERA CNRS 363, Groupe d'Intelligence Artificielle, Université Aix-Marseille II, October 1982.
[5] A. Colmerauer. Prolog III. BYTE, August 1987.
_:-p([ken,ga,naomi,wo,ai,suru]).
v[Form_764, AJN{Adj_768}, SC{SubCat_772}]:SEM_776---[suff_p]
 |
 |--v[vs2, SC{Sc_752}]:[love,Sbj_120,Obj_124]---[subcat_p]
 |   |
 |   |--p[ga]:ken---[adjacent_p]
 |   |   |
 |   |   |--n[n]:ken---[ken]
 |   |   |
 |   |   |__p[ga, AJA{n[n]}]:ken---[ga]
 |   |
 |   |__v[vs2, SC{p[ga], Sc_752}]:[love,Sbj_120,Obj_124]---[subcat_p]
 |       |
 |       |--p[wo]:naomi---[adjacent_p]
 |       |   |
 |       |   |--n[n]:naomi---[naomi]
 |       |   |
 |       |   |__p[wo, AJA{n[n]}]:naomi---[wo]
 |       |
 |       |__v[vs2, SC{p[wo], p[ga], Sc_752}]:[love,Sbj_120,Obj_124]---[ai]
 |
 |__v[Form_764, AJA{v[vs2,SC{Sc_752}]}, AJN{Adj_768}, SC{SubCat_772}]:SEM_776---[suru]
cat   cat(v, Form_764, [], Adj_768, SubCat_772, SEM_776)
cond  [c2(Sc_752, Obj_124, Sbj_120, Form_764, SubCat_772, Adj_768, SEM_776)]
True.
_:-c2(_,_,_,F,SC,ADJ,SEM).
F = syusi  SC = []  ADJ = []  SEM = [love,ken,naomi]
The first line is the user's input. "Ken-ga Naomi-wo ai-suru" means "Ken loves Naomi." Then, the parser returns the parse tree and the category and constraint (c2()) of the top node. The user solves the constraint to get the actual values of the variables.
Figure 2: Demonstration of our JPSG parser
_:-p([ai,suru,hito]).
n[n]:Semantics_824---[adjunct_p]
 |
 |--v[Form_796, AJN{n[n]}, SC{_820}]:Semantics_824---[suff_p]
 |   |
 |   |--v[vs2, SC{Sc_376}]:[love,Sbj_152,Obj_156]---[ai]
 |   |
 |   |__v[Form_796, AJA{v[vs2,SC{Sc_376}]}, AJN{n[n]}, SC{_820}]:Semantics_824---[suru]
 |
 |__n[n]:inst(Obj_932,[people,Obj_932])---[hito]
cat   cat(n, n, [], [], [], Semantics_824)
cond  [c6(Sc_376, Obj_156, Sbj_152, Form_796, _820, Obj_932, Semantics_824)]
True.
_:-c6(_,_,_,_,_,_,Sem).
Sem = inst(Obj0_136, [and, [people,Obj0_136], [love,Sbj1_140,Obj0_136]])
Sem = inst(Sbj0_136, [and, [people,Sbj0_136], [love,Sbj0_136,Obj1_140]])
This is a parse tree of "ai-suru hito", which has two meanings: "people whom someone loves" or "people who love someone". This ambiguity appears as two solutions of the constraint.
Figure 3: Example of ambiguity
[6] K. FURUKAWA and F. MIZOGUTI, editors. Program Henkan (Program Transformation). Tisiki Johoshori Series No.7, Kyoritu, Tokyo, 1987. (in Japanese).
[7] T. GUNJI. Japanese Phrase Structure Grammar. Reidel, Dordrecht, 1986.
[8] K. HASIDA. Conditioned Unification for Natural Language Processing. In Proceedings of the 11th COLING, pages 85-87, 1986.
[9] K. HASIDA. A Constraint-Based View of Language. In Proceedings of Workshop on Situation Theory and its Application, 1989. (to appear).
[10] K. HASIDA and S. ISIZAKI. Dependency Propagation: A Unified Theory of Sentence Comprehension and Generation. In Proceedings of IJCAI, 1987.
[11] S. M. Shieber. An Introduction to Unification-Based Approaches to Grammar. CSLI Lecture Notes Series No.4, Stanford: CSLI, 1986.
[12] H. SIRAI and K. HASIDA. Zyookentuki Tanituka (Conditioned Unification). Computer Software, 3(4):28-38, 1986. (in Japanese).
[13] H. TUDA. A JPSG Parser in Constraint Logic Programming. Master's thesis, Department of Information Science, University of Tokyo, 1989. (to appear).
