COMMON HEURISTICS FOR
PARSING, GENERATION, AND WHATEVER...
HASIDA, Kôiti
Institute for New Generation Computer Technology (ICOT)
Mita Kokusai Bldg. 21F, 1-4-28 Mita, Minato-ku, Tokyo 108 JAPAN
Tel: +81-3-3456-3069, E-mail: hasida@icot.or.jp
ABSTRACT 
This paper discusses general heuristics to control
computation on symbolic constraints represented in
terms of first-order logic programs. These heuristics
are totally independent of specific domains and tasks.
Efficient computation for sentence parsing and generation
naturally emerges from these heuristics, capturing
the essence of standard parsing procedures and semantic
head-driven generation. Thus, the same representation
of knowledge, including grammar and lexicon, can
be exploited in a multi-directional manner in various
aspects of language use.
1 Introduction 
One lesson to learn from the repeated failure to design
large AI systems is that the information flow in
cognitive systems is too complex and diverse to
stipulate in the design of such systems. To capture
this diversity of information flow, therefore, AI systems
must be designed at a more abstract level where the
direction of information flow is not explicit.
This is where the constraint paradigm comes in. Since
constraints do not stipulate the direction of information
flow or processing order, constraint-based systems
could be tailored to have tractable complexity, unlike
procedural systems, which stipulate information flow
and thus quickly become too complex for human designers
to extend or maintain.
Naturally, the key issue in the constraint-based approach
is how to control information flow. A very general
control schema independent of any specific domain
or task is vitally necessary for the success of this
approach.
The present paper introduces a system of constraint
in the form of logic programs, and a set of very general
heuristics to control symbolic operations on the constraints.
The symbolic operations here are regarded
as transforming logic programs. They are quite permissive
operations as a whole, allowing very diverse
information processing involving top-down, bottom-up
and other directions of information flow. The heuristics
control this computation so that only relevant information
should be exploited and the resulting representation
should be compact. Parsing and generation of
sentences are shown to be efficiently done under these
heuristics, and a standard parsing algorithm and
semantic head-driven generation [8] naturally emerge
thereof.
The rest of the paper is organized as follows. Sec- 
tion 2 describes the syntax of our system of constraint. 
Section 3 defines the symbolic computation on these 
constraints, and proposes a set of general heuristics to 
control computation. Section 4 and Section 5 show 
how sentence parsing and generation are executed effi- 
ciently by those heuristics. Finally, Section 6 concludes 
the paper. 
2 Constraint Network 
A program is a set of clauses. A clause is a set of literals.
A literal is an atomic constraint with a sign in
front of it. The sign is '+', '-', or nil. A literal with
sign '+' or nil is called a positive literal, and one with
sign '-' is a negative literal. An atomic constraint is
an atomic formula such as p(X,Y,Z), a binding such as
X=f(Y), a feature specification such as a(X,Y), or an
equality such as X=Y. Names beginning with capital letters
represent variables, and the other names predicates
and functions. A feature specification may be regarded
as an atomic formula with a special binary predicate
called a feature. A feature is a partial function from the
first argument to the second argument; that is, if a is
a feature and both a(X,Y) and a(X,Z) hold, then Y=Z
must also hold. The other atomic constraints may be
understood in the standard fashion. The atomic constraints
other than equalities are called proper atomic
constraints.
A clause is written as a sequence of the literals it contains
followed by a semicolon. The order among literals is
not significant. So (1) is a clause, which may also be
written as (2).
(1) -p(U,Y) +q(Z) -U=f(X) -X=Z;
(2) +q(Z) -p(f(Z),Y);
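To make the clause notation concrete, the following sketch (ours, not the paper's machinery; all names are hypothetical) represents literals as signed terms and shows how folding the binding U=f(X) and the equality X=Z of clause (1) into a substitution yields clause (2).

```python
# Illustrative only: clauses as lists of signed literals, terms as
# nested tuples, variables as strings. Not the paper's implementation.

def substitute(term, env):
    """Recursively replace variables by their bindings in `env`."""
    if isinstance(term, str):                      # variable or constant
        return substitute(env[term], env) if term in env else term
    return (term[0],) + tuple(substitute(a, env) for a in term[1:])

# Clause (1): -p(U,Y) +q(Z) -U=f(X) -X=Z;
# The negative binding and equality literals become the environment.
env = {"U": ("f", "X"), "X": "Z"}
clause1 = [("-", ("p", "U", "Y")), ("+", ("q", "Z"))]

clause2 = [(sign, substitute(atom, env)) for sign, atom in clause1]
# clause2 now encodes clause (2): +q(Z) -p(f(Z),Y);
```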
A clause containing a literal with null sign is a definition
clause of the predicate of that literal. A predicate
having definition clauses is called a defined predicate,
and its meaning is defined in terms of completion based
on the definition clauses. For instance, if the definition
clauses of predicate p are those in (3), the declarative
meaning of p is given by (4).
(3) p(X) -q(X,a); p(f(X)) -r(X);
(4) ∀A{p(A) ⇔ {∃Y(q(A,Y) ∧ Y = a) ∨
∃X(A = f(X) ∧ r(X))}}
A predicate which is not a defined predicate is called a
free predicate. There is a special 0-ary defined predicate
true. Its definition clauses are called top clauses. A
top clause corresponds to the query clause of Prolog,
although the latter has false instead of true.
Programs are regarded as constraint networks. For
instance, the following program corresponds to the network
in Figure 1.
(i) true -member(a,X);
(ii) member(A,[A|S]);
(iii) member(A,[B|S]) -member(A,S);
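Read operationally in one fixed direction, which the constraint formulation deliberately leaves unspecified, clauses (ii) and (iii) correspond to the familiar recursive membership test. A sketch (ours):

```python
def member(a, lst):
    """One directional reading of the member clauses above."""
    if not lst:
        return False               # no definition clause matches []
    if lst[0] == a:
        return True                # clause (ii): member(A,[A|S]);
    return member(a, lst[1:])      # clause (iii): member(A,[B|S]) -member(A,S);
```

The top clause (i), by contrast, asks for some list X such that member(a,X) holds; the network representation keeps that generative direction equally available.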
Figure 1: Constraint Network 
In graphical representations like Figure 1, a '•' often
represents an argument of an atomic constraint. There
are two types of nodes: arguments and proper atomic
constraints. An argument is involved in at most one
proper atomic constraint, but in any number of equalities.
An argument bound to a constant is identified
with that constant. That is, the first argument of a
binding •=a, for instance, is represented simply by a.
A link, represented as a curve, connects two nodes. For
any two (possibly identical) nodes, there is at most
one link connecting them. A link connecting two arguments
is an equality between them. A link connecting
two proper atomic constraints is called an inference
link. No link connects an argument and an atomic
constraint. Although often not explicitly shown, an
inference link accompanies equalities between the corresponding
arguments of the two proper atomic constraints.
The clausal domain of clause Φ is the part of the
constraint network consisting of the atomic constraints
referred to as literals in Φ, except equalities concerning
constants. A clausal domain is depicted by a closed
curve enclosing the included atomic constraints. The
short thick arrows indicate the references to the atomic
constraints as positive literals in clauses. The predicate
domain of predicate π consists of all the proper atomic
constraints with π (binding X=f(Y) is regarded as having
binary free predicate =f, for instance), inference
links among them, and equalities accompanying these
inference links.
The instantiation possibilities of the constraint network
are defined by regarding nodes and links as sets.
These sets are disjoint from each other. An instance of
an argument corresponds to an individual in the domain
of interpretation, and an instance of an atomic
constraint corresponds to an atomic proposition. Constants
(bindings to constants) and 0-ary atomic formulas
are singleton sets. A link δ between nodes α and β
stands for a symmetric relation. That is, δ = R ∪ R⁻¹
for some relation R ⊆ α × β. We call {x ∈ α | ∃y xδy} the
α-domain of δ. Every link in a clausal domain or the
predicate domain of a defined predicate is of the form
R ∪ R⁻¹ for some bijection R. Let ≈ be the transitive
closure of the union of all the links. x≈y means that
x and y correspond to the same object in the domain
of interpretation if x and y belong to arguments, and
that they correspond to the same atomic proposition if
they belong to proper atomic constraints. We say that
node α subsumes node β when α/≈ ⊇ β/≈; that is, for
every y ∈ β there exists x ∈ α such that x≈y. For each
pair of a proper atomic constraint α and an argument
β of α, there is a bijection ρ from α to β, such that xρy
holds iff y ∈ β is an argument of x ∈ α. ρ is called a
role assignment.
Consider a part P of the constraint network and
the minimum equivalence relation including the links
and the role assignments in P. A layer of P is an
equivalence class with respect to this relation. A splitting
domain is a part S of the network in which every
link is of the form R ∪ R⁻¹ where R is the union of
(α ∩ ℓ) × (β ∩ ℓ) over all the layers ℓ of S, and α
and β are the two endnodes of that link. Thus, if a
link in a splitting domain splits into two links sharing
an endnode α and having disjoint α-domains, then the
entire splitting domain splits into two separate splitting
domains, each containing one of these two links. The
clausal domains and predicate domains are assumed to
be splitting domains.
A joint is a part of a node which connects the node
with one or more links. Figure 2 shows some joints. The
figures below illustrate the instantiation possibilities of
the networks shown above by depicting each node as an
ellipse enclosing its instances, and each link as a bundle
of curves representing the pairs belonging to the link.
A joint J of node α is depicted as an arc, convex towards
α, crossing the links involved in J. A joint involving just
one link, as in (a) and (b), is called a unitary joint, and
one containing several links, as in (c) and (d), is called
a multiple joint. Distinct links involved in the same
multiple joint on node α have disjoint α-domains. A
joint is partial if it stretches out of the involved links,
as in (b) and (d), and total otherwise, as in (a) and
(c). The union of the α-domains of the links involved in
the same joint on node α is equal to α. A total unitary
joint as in (a) is not explicitly shown as an arc. Partial
joints on node α are complementary when the union
of the α-domains of the links involved in them is α.
Complementary joints are indicated by a dashed arc
crossing these links. So the union of the α-domains of
the three links is α in (e). When nodes α and β are
connected by link δ and the joint of β involving δ is
total and unitary, α and δ are said to dominate β.
The initial structures of predicate domains are shown 
in Figure 3. Such structures, as well as the other 
structures, will change as computation proceeds. 
3 Computation 
Here we introduce a method of symbolic computation,
together with some general heuristics for controlling
it. There are two types of symbolic
operation: subsumption and deletion. Here we chiefly
Figure 2: Joints between Nodes and Links 
Bound Predicate Free Predicate 
Figure 3: Predicate Domains 
concern ourselves with subsumption. 
3.1 Subsumption
'Subsumption' means two things: the subsumption relation,
which we defined above, and the subsumption operation,
which we discuss below.
The purpose of a subsumption is to let information
flow from a node. A node α may have probes; α is
called the origin of these probes. Each probe is placed
on a link and directed towards an endnode. The origin
of a probe subsumes the node behind the probe. Probes
transmit information about their origins across the network
via subsumptions. The origin of probes has its scope.
The scope of node α is the part S of the constraint
network satisfying the following conditions.
• S is a connected graph containing α.
• A node β is behind a probe π on link δ with
origin α, iff β is in S but δ is not.
• α subsumes every node in S.
So the scope of α may be illustrated as in Figure 4,
where arrows are probes, which just cover the boundary
of the scope.
Figure 4: Scope of Node
Every node α can, just once, create probes on all the
links connected to α so that α is behind these probes.
Subsumption extends the scope of a node by advancing
probes, while preserving the instantiation possibilities
of the network described above. We consider a subsumption
from node ι to node ξ along link δ. ι, ξ,
and δ are called the input node, the target node, and the
axis, respectively, of this subsumption. The joint J of ξ
involving δ is called the target joint. This subsumption
extends the scopes of the origins of the probes on
δ directed towards ξ. It proceeds as follows.
First, the set Π of the probes on δ towards ξ is detached
from δ, and δ is shifted from J to another joint
J′, as illustrated in Figure 5. J′ is a copy of J and
is on a node ξ′ which is a copy of ξ. J′ and ξ′ may be
created here and now, but may also have been made in
a previous subsumption, as mentioned below. Below
we proceed so as to make ξ₀ = ξ₁ ∪ ξ′ ∧ ξ₁ ∩ ξ′ = ∅
Figure 5: Shifting of Link and Augmentation of Foldability
true, where ξ₀ and ξ₁ stand for ξ before and after this
subsumption, respectively.
A joint may be foldable to another joint by a set of
origins of probes. Each joint involved here, called a
foldable joint, is one obtained by copying zero or more
times a multiple joint in the initial state of computation.
Typically, a foldable joint is one involving links in
the predicate domain of a defined predicate. No joint
just created is foldable to any joint. For any joint G
and set O of nodes, there is at most one joint H such
that G is foldable to H by O.
Let Σ be the set of origins of the probes in Π. If J is
foldable, then for each joint G the foldability relation
extends in the following way, as illustrated in Figure
5, where the foldability relation is depicted by dashed
arrows.
• J is foldable to J′ by Σ.
• If G is foldable to J by O, then G is foldable to
J′ by O ∪ Σ.
• If J is foldable to G by O such that O ⊇ Σ, then
J′ is foldable to G by O − Σ.
If there has already been a joint to which J is foldable
by Σ, then J′ is that joint, ξ′ is the node on J′,
J′ becomes a total multiple joint, and the foldability
relation remains unchanged. Otherwise, J′ and ξ′ are
newly created, δ dominates ξ′, and the foldability relation
is augmented. We call the former case folding,
and the latter unfolding.
If α is a proper atomic constraint or an argument of
a proper atomic constraint, then ᾱ stands for the set
whose elements are this proper atomic constraint and
its arguments; otherwise ᾱ = {α}.
In the case of unfolding, each node ν in ξ̄ is copied to
ν′, and each link σ (σ ≠ δ) connecting ν and η is copied
to σ′ connecting ν′ and some node η′. η′ is the copy of
η if η ∈ ξ̄, and η′ = η otherwise. Relevant joints are
copied accordingly so as to preserve the instantiation
possibilities of the network.
There are two cases, splitting and non-splitting,
of how to create σ′. In the former, it is guaranteed
that no layer of the splitting domain including σ
before the copy overlaps with both ν and ν′ after the
copy. Such a guarantee is obtained iff σ = R ∪ R⁻¹ for
some bijection R or (inclusive) δ and σ belong to the
same splitting domain. There is no such guarantee in
the non-splitting case.
In the splitting case, as illustrated in Figure 6, the
η-domains of σ and σ′ are disjoint when η′ = η.
Figure 6: Copy of Links (Splitting) 
In the non-splitting case, as illustrated in Figure 7,
if σ was a loop, ν and ν′ are connected by an additional
link representing a relation pertaining to the layers
overlapping both ν and ν′. Further, if σ was involved
in a multiple joint of η, then a subsumption along σ to
η must be done before creating σ′; otherwise the right
instantiation possibilities cannot be represented.
Figure 7: Copy of Links (Non-Splitting)
In both the splitting and non-splitting cases, the probes
that ν had, if any, are deleted, and ν and ν′ are licensed
to generate new probes. Then every remaining probe
on σ is copied to a probe on σ′, towards ν′, and with the
same origin. Further, each probe in Π is advanced through
ξ′ onto every link τ (≠ δ) connected with ξ′ so that ξ′
should be behind the probe. If there is another probe
on τ towards ξ′ and with the same origin, then both
probes are deleted.
Finally, in both folding and unfolding, if δ dominated
ξ before this subsumption, ξ is deleted because it has
become the empty set now. This deletion propagates
across links and nodes until possibly non-empty sets
are encountered; that is, until one comes across partial
or multiple joints of remaining nodes.¹ Now the
subsumption is done.
To properly split splitting domains, we must augment
this subsumption procedure so that a probe may
carry, instead of its origin, some information about which
layers of the relevant splitting domain are involved in
the node behind the probe. Such probes are transmitted
from proper atomic constraints to their arguments
and vice versa. A link is deleted if it contains two
probes with opposite directions and associated with
disjoint sets of layers. Further details are omitted due
to space limitations.
So far we have discussed subsumption in general.
Below we describe the particularities of subsumptions
along equalities and subsumptions along inference
links.
A subsumption along an equality is triggered by a
dependency between arguments. We say that there is a
dependency between two arguments when they compete
with each other and are connected by a dependency
path. Nodes α and β compete with each other when
they are the first arguments of
• two bindings (as in •=f(•) and •=g(•)),
• a binding and a feature specification, or
• two feature specifications with the same feature.
A dependency path connecting α and β is a sequence
δ₁δ₂⋯δₙ of strong equalities such that the endpoints
of δᵢ are αᵢ₋₁ and αᵢ (1 ≤ i ≤ n), δᵢ and δᵢ₊₁ are
involved in different joints of αᵢ, one of which is total
(1 ≤ i < n), α₀ = α, and αₙ = β. An equality is strong
when it belongs to a clause or the predicate domain of
a defined predicate, or when a subsumption has taken
place along that equality.
A probe π on an equality δ might trigger a subsumption
to advance π when there is a dependency between
the origin α of π and another node β, and δ is included
in a dependency path connecting α and β.
Suppose the scope of α includes another node β competing
with α. If the proper atomic constraints A and
B, involving α and β as their first arguments, respectively,
are connected by an inference link δ, then δ
absorbs B, as shown in Figure 8.
Figure 8: Absorption by Link
That is, the joint
of B involving δ is modified so that A and δ dominate B,
because A has turned out to subsume B. Any other
inference link involved in this joint is deleted, because
it has turned out to be the empty set. Of course, each
equality accompanying δ must absorb its endnode in B
at the same time. If there is no inference link between A
and B, then B is deleted. Deletions of links and nodes
propagate so long as the empty set is encountered, as
said before.
¹This combination of copy and deletion is vacuous and thus
may be omitted in actual implementations for the unfolding cases.
The deletion of probes in the splitting case may also be avoided
in such a situation.
A subsumption along an inference link may be triggered
by a cost assigned to the input node. Each literal
in a clause may be assigned a cost. Similarly to the
assumability cost of Hobbs et al. [5], the cost of a literal
corresponds to the difficulty of abductively assuming its
negation. For instance, if you want to assume an atomic
constraint by using a clause backwards whereas the
cost of its negative literal in this clause is not zero, then you
are to do something in order to cancel the cost. In this
sense, an atomic constraint with a cost is regarded as a
goal to achieve, and the abductive usage of the clause
which gives rise to the goal is regarded as the motivation
to set up that goal. A cost may be canceled by
making the atomic constraint subsume another which
is more readily believable. That is, a goal is fulfilled
when it is established by some reason other than its
motivation.
The input node of a subsumption along an inference
link is the goal atomic constraint in the rest of the
paper.² Such a subsumption eliminates the cost if the
target node has been derived from the top clause without
recourse to that very subsumption. Otherwise the
cost is inherited into the clause which contains the output
node. In a Horn clause normally used with all the
atomic constraints therein being true, the head literal
inherits the cost from a body atomic constraint, and
the body atomic constraints inherit the cost from the
head literal. We neglect cost inheritance among
body atomic constraints.
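As a rough directional illustration of this cost inheritance (ours; the names and the flat goal representation are hypothetical simplifications), expanding a goal atom against a Horn clause passes the cost to the body atoms, while an already-established goal sheds its cost:

```python
# Hypothetical sketch of cost inheritance; names are ours, not the paper's.
# A clause is (head, body); a goal is an atom paired with a cost.

def inherit(goal, cost, clauses, established):
    """Expand `goal`: return new (atom, cost) goals inheriting `cost`,
    or [] if the goal is already established (the cost is canceled)."""
    if goal in established:                    # derived from the top clause
        return []
    new_goals = []
    for head, body in clauses:
        if head == goal:                       # backward (abductive) use
            new_goals.extend((b, cost) for b in body)
    return new_goals

clauses = [("vp", ["rel", "agt"])]             # cf. clause (D) in Section 5
subgoals = inherit("vp", 1, clauses, established=set())
# subgoals: the body atoms now carry the cost of the head goal
```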
3.2 Heuristics
Subsumptions along equalities and those along inference
links both encompass top-down and bottom-up
information flow. Some heuristics are necessary to control
such an otherwise promiscuous system of computation,
so that more relevant pieces of information are
exploited with greater preference.
The heuristics for a subsumption along an equality
are that each of the following conditions raises the preference
of such a subsumption.
(H1) The origin of a probe on the axis is close to (typically
included in) the top clause, or is a constant.
(H2) A dependency path involving the axis and connecting
an argument with the origin of a probe
on the axis is short.
Both these conditions are regarded as indicating that
the transmitted information (about the origin) is highly
relevant to the destination of this transmission. In this
connection, a subsumption along an equality is unlikely
to happen if the axis belongs to the predicate domain
of a free predicate and the target joint is partial, since
the conveyed information would not be very relevant
to the target node.
²Subsumptions for checking consistency need not be triggered
by cost.
As for subsumptions along inference links, the following
conditions each raise the preference.
(H3) Corresponding arguments of the input node and
the target node are connected via short dependency
paths with the same node. (That is, those
arguments are 'shared.')
(H4) The target node has already been derived from
the top clause.
(H3) raises the possibility for instances of the two arguments
to coincide in the domain of interpretation.
(H3) amounts to a generalization (or relaxation) of the
condition on which an inference link absorbs one of its
endnodes. (H4) guarantees that the subsumption in
question will lead to an immediate elimination of the
cost of the input node. Probably (H4) could be relaxed
to be a graded condition.
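Operationally, (H1)-(H4) can be read as a preference ordering over candidate subsumptions. The following scoring sketch is purely illustrative; the field names and weights are our assumptions, not the paper's:

```python
# Illustrative ranking of candidate subsumptions under (H1)-(H4).
# All field names and weights here are hypothetical.

def preference(c):
    score = 0.0
    if c.get("origin_in_top_clause") or c.get("origin_is_constant"):
        score += 2.0                                  # (H1)
    if c.get("dependency_path_length"):
        score += 1.0 / c["dependency_path_length"]    # (H2): shorter is better
    if c.get("shared_arguments"):
        score += 2.0                                  # (H3)
    if c.get("target_derived_from_top"):
        score += 2.0                                  # (H4)
    return score

candidates = [
    {"origin_in_top_clause": True, "dependency_path_length": 2},
    {"dependency_path_length": 6},
]
best = max(candidates, key=preference)    # the subsumption executed first
```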
4 Parsing
Let us consider a simple case of context-free parsing
based on the following grammar.
p → a    p → p p
A parsing based on this grammar is formulated by the
following program.
(5) true -p(A0,B) -A0=[a|A1] -A1=[a|A2] ...;
(φ) p([a|X],X);
(ψ) p(X,Z) -p(X,Y) -p(Y,Z);
Depicted in Figure 9 are the four types of clauses created
by this parsing process.
Figure 9: Clauses Produced through Parsing
A •=[a|•] is a shorthand
representation for a •=[•|•] plus an equality between
the second argument and (the argument bound by) a.
(a) is a copy of clause φ in (5), and the other clauses are
copies of ψ. A label i on a link means that the relevant
part of the network is in the scope of argument Ai. The
reason why only these types of clauses are generated
is that, in this case, every dependency arises between a
•=[a|•] in the top clause and another •=[a|•] somewhere
else, and the first argument of the former is the
origin of the subsumptions that resolve that dependency.
A strict proof can be obtained by mathematical induction.
Since the number of these clauses is O(n³) due to
(d), and each of them may be generated in constant
time, the time complexity of the entire parsing is O(n³),
where n is the sentence length. Each clause is guaranteed
to be generated in constant time, because each
foldability test can be performed in constant time, as
discussed later. By employing a general optimization
technique, we can eliminate the clauses of type (d), so
that the space complexity is reduced to O(n²). Thus,
our general control scheme naturally gives rise to standard
parsing procedures such as Earley's algorithm and
chart parsing.
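For comparison, a conventional CYK-style recognizer for the same grammar (a sketch of the classical algorithm that the heuristics are said to reproduce, not of the constraint machinery itself) makes the O(n³) bound explicit: the chart has O(n²) cells, each filled in O(n) time.

```python
def recognize(s):
    """CYK-style recognizer for the grammar p -> a | p p."""
    n = len(s)
    # chart[i][j] is True iff s[i:j] derives p
    chart = [[False] * (n + 1) for _ in range(n + 1)]
    for i, c in enumerate(s):
        chart[i][i + 1] = (c == "a")          # p -> a
    for span in range(2, n + 1):              # O(n) span lengths
        for i in range(n - span + 1):         # O(n) start positions
            j = i + span
            # p -> p p: O(n) split points, hence O(n^3) overall
            chart[i][j] = any(chart[i][k] and chart[k][j]
                              for k in range(i + 1, j))
    return chart[0][n]
```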
(5) is graphically represented as Figure 10.
Figure 10: Parsing (1)
We omit the links involved in the predicate domain of a
free predicate, until they are modified as in Figure 8.
Thus no links among the •=[a|•]s are shown in Figure
10. There is a dependency between the first •=[a|•] in
the top clause and the •=[a|•] in φ, as indicated by
the dependency paths, which consist of thick links. To
let information flow from the top clause following the
above heuristic (H1), we are to do the two subsumptions
indicated by the two thin arrows.
Those subsumptions copy φ to φ₁ and ψ to ψ₁, resulting
in Figure 11.
Figure 11: Parsing (2)
For expository convenience, we assume here, without loss
of generality, that copying a clause produces a separate
clause rather than one sharing atomic constraints with
the original clause. Note that the first argument of the
•=[a|•] in φ₁ is subsumed by A0.
Computation goes on in the same direction, and
the two subsumptions are to happen as shown in Figure
11. Folding takes place this time, and the result is
to shift the two inference links upwards, as in Figure 12.
Figure 12: Parsing (3)
Now the first •=[a|•] in the top clause dominates
the •=[a|•] in φ₁, as indicated by the inference link
between them, because, as indicated by the label 0 in
φ₁, the first argument of the former is within the scope
of the first argument of the latter. Now the equality
in the right-hand side of φ₁ is within the scope of A1,
as indicated in the figure. This subsumption also engenders
a new set of dependencies between the first
argument of the second •=[a|•] in the top clause and
that of the •=[a|•] in φ, as indicated again by thick links
in Figure 12. By executing the indicated subsumption
following (H1), φ is copied to φ₂, so that we obtain
Figure 13. Further advancing subsumptions as shown
there, we get Figure 14. Computation goes on in the
similar way.
As mentioned above, we are able to assume that each
foldability test is performed in constant time. This
assumption is justified by, for instance, sorting the foldability
information from each joint in the chronological
order of the first subsumption which advanced probes
with the relevant origin. In the present parsing example,
this order happens to be the increasing order of
the suffix i of Ai.
It is straightforward to integrate such a phrase-structure
parsing with computation on internal structures
of grammatical categories represented in terms of
feature bundles, for instance. See [2, 3] for further details
in this regard. Note that the above derivation of
the parsing process is more general than the parsing-as-deduction
approaches [6, 7], because it is free from
stipulation of the left-to-right and top-down processing
direction, and also from task-dependency with regard
to parsing or context-free grammar.
Figure 13: Parsing (4) 
Figure 14: Parsing (5) 
5 Generation
Here we consider how to verbalize the following semantic
content in English.
S ⊨ ⟨⟨laughed, kim⟩⟩
This means that Kim laughed, based on Situation
Theory [1]. That is, in some situation S there is an
event which is of the sort laughed and whose agent
is kim. So a sentence we might want to generate is
'Kim laughed.' S may be interpreted as, for instance,
the speaker's model of the hearer's model of the world.
The state of affairs ⟨⟨laughed, kim⟩⟩ will be regarded as
a variable L1 constrained by two feature specifications
rel(L1,laughed) and agt(L1,kim).
The initial state of computation could be formulated
in terms of a program including the following clauses,
among many others.
(A) true -s(SEM,W0,W1) -S⊨SEM -say(W0)
    -S⊨L1 $ -rel(L1,laughed) $
    -agt(L1,kim) $ ...;
(B) s(SEM,X,Z) -np(SBJSEM,X,Y)
    -vp(SEM,SBJSEM,Y,Z);
(C) np(kim,X,Y) -X=['kim'|Y] $;
(D) vp(L,AGT,X,Y) -X=['laughed'|Y] $
    -rel(L,laughed) -agt(L,AGT);
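For concreteness, a deliberately directional rendering of clauses (B)-(D) (our sketch; the constraint program itself fixes no direction of use) maps the semantics to the word list:

```python
# One directional reading of clauses (B)-(D); illustrative only.

def np(sem):
    if sem == "kim":
        return ["kim"]                      # clause (C)
    raise ValueError(sem)

def vp(rel, agt):
    if rel == "laughed":
        return ["laughed"]                  # clause (D)
    raise ValueError(rel)

def s(sem):
    # clause (B): the subject NP's semantics is the agent of the VP
    return np(sem["agt"]) + vp(sem["rel"], sem["agt"])

words = s({"rel": "laughed", "agt": "kim"})  # ['kim', 'laughed']
```

Semantic head-driven generation would instead work outwards from the semantic head; the point of this section is that the heuristics of Section 3 recover such an order without stipulating it.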
say(W0) means that the utterance beginning at W0
should actually be uttered. S⊨SEM and S⊨L1 exist
separately in (A), because the next utterance need
not directly refer to L1. For instance, one can mean
that Kim laughed by saying 'Do you know that Kim
laughed?' instead of just 'Kim laughed,' or by doing
something other than utterance. One might even just
give up the goal and say something quite different.
A '$' attached to an atomic constraint represents a
cost, so that the atomic constraint is a goal. The three
goals in (A) altogether amount to a macroscopic goal
to make the state of affairs ⟨⟨laughed, kim⟩⟩ hold in
situation S.
What we would like to demonstrate below is again 
that the control heuristics described in Section 3 tend 
to trigger the right operations depending upon the com- 
putational context, provided that the current goal is to 
be reached by some linguistic means; that is, by eventu- 
ally uttering some sentence. Below we pay attention to 
only one maximal consistent structure of the sentence 
at a time just for the sake of simplicity, but the actual 
generation process may involve OR-parallel computa- 
tion similar to that in parsing of the previous section. 
Figure 15 graphically represents clauses (A) and
(C). A proper atomic constraint with a binary predicate,
possibly together with equalities involving its
two arguments, is represented here as an arrow from
(an argument equalized with) the first argument to (an
argument equalized with) the second argument. Links
in predicate domains are selectively displayed for expository
simplicity.
The most probable operations to take place here
are subsumptions involving one of these three goals.
Figure 15: Generation (1)
There should be innumerable combinations for such
subsumptions, because the speaker's lexicon must include
a large number of atomic constraints of the forms
•⊨•, rel(•,•) and agt(•,•), even though subsumptions
with extralinguistic parts of the constraints are
excluded due to the above provision that the current
goal is to be fulfilled by linguistic means.
However, two such subsumptions are preferred to
the others. One is the subsumption concerning the
two •⊨•s in (A), and the other is from the rel(•,•)
in (A) to that in (D). In both cases, the two atomic
constraints share the same argument for the same argument
place, which raises the preference due to (H3).
Let us tentatively choose just the latter subsumption
in this particular presentation. No big difference would
follow in the long run, even if the former subsumption,
or both, were chosen instead.
By the subsumption concerning the two rel(.,e)s, 
we obtain the structure shown in Figure 16. We have 
i rpd°n 
subsumption- ...... ~ 
"laughed' 
Figure 16: Generation (2) 
88 
i 
copied clause (D) to (D'), because the rel(•,•) in (A)
is a goal. Now in Figure 16, the vp(•,•,•,•) in (D') is
a goal, inheriting the cost from the rel(•,•) of (A).
The cost of •=[•|•] in (D') is inherent, as indicated
in (D). Now the most probable next computation is
the sequence of subsumptions along the thick link(s)
constituting a dependency path. Following heuristic
(H1), those subsumptions propagate from the top
clause. After that, the inference link between the two
agt(•,•)s absorbs the cost in (B).
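The propagation prescribed by (H1) can be pictured as a pairwise walk along the dependency path, starting from the clause nearest the top. The sketch below is purely illustrative; the function name and the list encoding of the path are assumptions, not the paper's machinery.

```python
def propagate_subsumptions(path):
    """Given a dependency path listed from the top clause downwards,
    return the ordered subsumption steps (subsumer, subsumee) that
    (H1) suggests performing, top clause first."""
    return [(path[i], path[i + 1]) for i in range(len(path) - 1)]

# The path from (A) through (D') to (B) yields two subsumption steps,
# propagating from the top clause as (H1) prescribes.
steps = propagate_subsumptions(["(A)", "(D')", "(B)"])
assert steps == [("(A)", "(D')"), ("(D')", "(B)")]
```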
This gives us Figure 17.

Figure 17: Generation (3)

(D') has not been duplicated here, because the above
subsumptions did not actually duplicate any clause. In this
context, the subsumption concerning the two vp(•,•,•,•)s is possible,
since the one in (D') is a goal. Due to (H3), this subsumption
is preferred to the others concerning
two vp(•,•,•,•)s, because their first arguments are
both connected to kim (that is, the first argument of
•=kim) via short dependency paths. As a result, (B)
is copied to (B') and the vp(•,•,•,•) in (B') is dominated
by that in (D'), as in Figure 18.
Now that the s(•,•,•) in (B') is a new goal, it is caused
to subsume another s(•,•,•) in (A). According to
(H4), this subsumption is particularly preferable because
(A) is the top clause. On the other hand, the
subsumption from the first argument of np(•,•,•) in
(B') to the first argument of np(•,•,•) in (C) could
take place here, to resolve the cyclic dependency about
kim referred to from (A) and (C). This subsumption
is the most probable operation concerning this dependency
in this context, because it is along the shortest
relevant dependency path. We assume that the direction
of this subsumption is downwards, as indicated in
Figure 18. It would be the same in the long run if it were
in the opposite direction.
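The shortest-path criterion used here can be made concrete with a small breadth-first search over the dependency graph. This is a minimal sketch under assumed names (path_length, most_probable) and an assumed adjacency-dict encoding of the dependency structure; it is not the paper's implementation.

```python
from collections import deque

def path_length(graph, src, dst):
    """BFS shortest-path length in a dependency graph (adjacency dict);
    returns infinity when dst is unreachable from src."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == dst:
            return d
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return float("inf")

def most_probable(graph, candidates):
    """Among (src, dst) candidate subsumptions, pick the one whose
    endpoints are joined by the shortest relevant dependency path."""
    return min(candidates, key=lambda c: path_length(graph, *c))

# Hypothetical fragment of the dependency structure around kim.
graph = {"np_B'": ["np_C"], "np_C": ["kim"], "vp_B'": ["np_B'", "s_A"]}
cands = [("np_B'", "kim"), ("vp_B'", "kim")]
assert most_probable(graph, cands) == ("np_B'", "kim")
```

The candidate reaching kim in two hops beats the three-hop alternative, mirroring why the np(•,•,•) subsumption is the most probable operation here.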
The computation mentioned in Figure 18 takes us
to Figure 19. We have a new top clause (A'), which
shares most of itself with (A), except the copy
of s(•,•,•). Some of the previous goals have disappeared
due to the subsumption concerning the s(•,•,•)s.
Now the remaining goals are the •=[•|•]s in (C') and (D')
Figure 18: Generation (4) 
Figure 19: Generation (5) 
and the •~• in the intersection of (A) and (C'). We
might do a subsumption concerning the two •~•s, because
they share both arguments. This subsumption
could have happened earlier, of course, particularly
ever since both arguments came to be shared in
Figure 16 via (B) and (D'). As mentioned before, however,
it would have caused no essential difference eventually.
At the same time we could execute the procedure
say(•) to realize the goal •=[•|•] in (C'). It is
reasonable to assume that this computation is triggered
by the fact that the argument of say(•) subsumes the
first argument of this •=[•|•]. This heuristic for firing
procedures looks generally applicable not only to
utterance but also to every other output procedure.
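The firing condition just described can be caricatured as follows. This is a hedged illustration only: subsumption is modelled crudely as set containment, and should_fire, say and the variable names are all assumptions introduced here, not the paper's definitions.

```python
def should_fire(proc_arg: set, goal_first_arg: set) -> bool:
    """Fire an output procedure once its argument subsumes
    (modelled here as: is a superset of) the first argument
    of the goal constraint it realizes."""
    return proc_arg >= goal_first_arg

spoken = []

def say(words):
    """Toy stand-in for the utterance procedure say(•)."""
    spoken.extend(words)

# The argument of say(•) subsumes the first argument of the
# goal •=[•|•] in (C'), so the procedure fires and 'Kim' is uttered.
if should_fire({"kim", "laughed"}, {"kim"}):
    say(["Kim"])
assert spoken == ["Kim"]
```

The same trigger would serve any output procedure, matching the remark that the heuristic is not specific to utterance.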
Thus we move to a new computational context in
Figure 20. The execution of say(•) has created a new
Figure 20: Generation (6) 
•=[•|•], so that 'Kim' has been spoken aloud. This
•=[•|•] dominates the •=[•|•] in (C'), as indicated by
the thick link. Generation of 'Kim laughed' completes
when say(•) is executed one step further.
Note that this generation process captures the
bottom-up character of semantic head-driven generation
[8], especially when we move from Figure 15 through
Figure 18. The subsumption concerning the arguments
of the np(•,•,•)s happening between Figure 18 and Figure
19 captures the top-down aspect as well.
6 Concluding Remarks 
We have introduced a set of general heuristics for con- 
trolling symbolic computation on logic constraints, and 
demonstrated that sentence parsing and generation are 
attributed to these heuristics. In the above presenta- 
tion, parsing is for the most part based on truth main- 
tenance (resolution of dependencies among arguments) 
controlled by heuristics (H1) and (H2), whereas gener- 
ation is more dependent on goal satisfaction controlled 
by (H3) and (H4). In more realistic cases, however,
both processes would involve both kinds of computation
in a more intertwined way.
A related nice feature of our framework is that, in
principle, all the types of constraints -- syntactic, semantic,
pragmatic and extralinguistic -- are treated
uniformly and integrated naturally, though a really efficient
implementation of such an integrated system requires
further research. In this connection, we have undertaken
to study how to implement the above heuristics
in a more principled and flexible way, based on a
notion of potential energy [4], but the present paper
lacks the space for discussing the details.
In this paper we have discussed only task- 
independent aspects of control heuristics. Our conjec- 
ture is that we will be able to dispense with domain- 
dependent and task-dependent control heuristics alto- 
gether. The domain/task-dependent characteristics of 
information processing will be captured in terms of the 
assignment of energy functions to the relevant con- 
straints. The resulting system will still be free from 
stipulation of the directions of information flow, allow- 
ing multi-directional information processing, since nei- 
ther the symbolic component nor the analog compo- 
nent (that is, energy) of the constraint refers explicitly 
to information flow. 

References

[1] Barwise, J. (1990) The Situation in Logic, CSLI
Lecture Notes No. 17.

[2] Hasida, K. (1990) 'Sentence Processing as Constraint
Transformation,' Proceedings of ECAI '90.

[3] Hasida, K. and Tsuda, H. (1991) 'Parsing without
Parser,' International Workshop on Parsing Technologies,
pp. 1-10, Cancun.

[4] Hasida, K. (in preparation) Potential Energy of
Combinatorial Constraints.

[5] Hobbs, J., Stickel, M., Martin, P., and Edwards,
D. (1988) 'Interpretation as Abduction,' Proceedings
of the 26th Annual Meeting of ACL, pp. 95-103.

[6] Pereira, F.C.N. and Warren, D.H.D. (1983)
'Parsing as Deduction,' Proceedings of the 21st
Annual Meeting of ACL, pp. 137-144.

[7] Shieber, S.M. (1988) 'A Uniform Architecture for
Parsing and Generation,' Proceedings of the 12th
COLING, pp. 614-619.

[8] Shieber, S.M., van Noord, G., and Moore, R.C.
(1989) 'A Semantic-Head-Driven Generation Algorithm
for Unification-Based Formalisms,' Proceedings
of the 27th Annual Meeting of ACL, pp. 7-17.