Interpretation as Abduction 
Jerry R. Hobbs, Mark Stickel, 
Paul Martin, and Douglas Edwards 
Artificial Intelligence Center 
SRI International 
Abstract 
An approach to abductive inference developed in the TAC- 
ITUS project has resulted in a dramatic simplification of 
how the problem of interpreting texts is conceptualized. Its 
use in solving the local pragmatics problems of reference, 
compound nominals, syntactic ambiguity, and metonymy 
is described and illustrated. It also suggests an elegant and 
thorough integration of syntax, semantics, and pragmatics. 
1 Introduction 
Abductive inference is inference to the best explanation. 
The process of interpreting sentences in discourse can be 
viewed as the process of providing the best explanation of 
why the sentences would be true. In the TACITUS Project 
at SRI, we have developed a scheme for abductive inference 
that yields a significant simplification in the description of
such interpretation processes and a significant extension 
of the range of phenomena that can be captured. It has 
been implemented in the TACITUS System (Stickel, 1982; 
Hobbs, 1986; Hobbs and Martin, 1987) and has been and 
is being used to solve a variety of interpretation problems 
in casualty reports, which are messages about breakdowns 
in machinery, as well as in other texts. 1
It is well-known that people understand discourse so well
because they know so much. Accordingly, the aim of the 
TACITUS Project has been to investigate how knowledge 
is used in the interpretation of discourse. This has involved 
building a large knowledge base of commonsense and do- 
main knowledge (see Hobbs et al., 1986), and developing 
procedures for using this knowledge for the interpretation 
of discourse. In the latter effort, we have concentrated on 
problems in local pragmatics, specifically, the problems of 
reference resolution, the interpretation of compound nom- 
inals, the resolution of some kinds of syntactic ambiguity, 
and metonymy resolution. Our approach to these problems 
is the focus of this paper. 
In the framework we have developed, what the interpre- 
tation of a sentence is can be described very concisely: 
1Charniak (1986) and Norvig (1987) have also applied abductive
inference techniques to discourse interpretation.
(1) 
To interpret a sentence: 
Derive the logical form of the sentence, 
together with the constraints that predicates 
impose on their arguments, 
allowing for coercions, 
Merging redundancies where possible, 
Making assumptions where necessary. 
By the first line we mean "derive in the logical sense, or 
prove from the predicate calculus axioms in the knowledge
base, the logical form that has been produced by syntactic 
analysis and semantic translation of the sentence." 
In a discourse situation, the speaker and hearer both 
have their sets of private beliefs, and there is a large over- 
lapping set of mutual beliefs. An utterance stands with one 
foot in mutual belief and one foot in the speaker's private 
beliefs. It is a bid to extend the area of mutual belief to 
include some private beliefs of the speaker's. It is anchored 
referentially in mutual belief, and when we derive the logi- 
cal form and the constraints, we are recognizing this refer- 
ential anchor. This is the given information, the definite, 
the presupposed. Where it is necessary to make assump- 
tions, the information comes from the speaker's private 
beliefs, and hence is the new information, the indefinite, 
the asserted. Merging redundancies is a way of getting a 
minimal, and hence a best, interpretation. 2 
In Section 2 of this paper, we justify the first clause of 
the above characterization by showing that solving local 
pragmatics problems is equivalent to proving the logical 
form plus the constraints. In Section 3, we justify the last 
two clauses by describing our scheme of abductive infer- 
ence. In Section 4 we provide several examples. In Section 
5 we describe briefly the type hierarchy that is essential 
for making abduction work. In Section 6 we discuss future 
directions. 
2Interpreting indirect speech acts, such as "It's cold in here," meaning
"Close the window," is not a counterexample to the principle that
the minimal interpretation is the best interpretation, but rather can 
be seen as a matter of achieving the minimal interpretation coherent 
with the interests of the speaker. 
2 Local Pragmatics 
The four local pragmatics problems we have addressed can
be illustrated by the following "sentence" from the casualty 
reports: 
(2) Disengaged compressor after lube-oil alarm. 
Identifying the compressor and the alarm are reference 
resolution problems. Determining the implicit relation
between "lube-oil" and "alarm" is the problem of com- 
pound nominal interpretation. Deciding whether "af- 
ter lube-oil alarm" modifies the compressor or the disen- 
gaging is a problem in syntactic ambiguity resolution. 
The preposition "after" requires an event or condition as 
its object and this forces us to coerce "lube-oil alarm" into 
"the sounding of the lube-oil alarm"; this is an example 
of metonymy resolution. We wish to show that solving 
the first three of these problems amounts to deriving the
logical form of the sentence. Solving the fourth amounts to 
deriving the constraints predicates impose on their argu- 
ments, allowing for coercions. For each of these problems, 
our approach is to frame a logical expression whose deriva- 
tion, or proof, constitutes an interpretation. 
Reference: To resolve the reference of "compressor" in 
sentence (2), we need to prove (constructively) the following
logical expression:
\[ (\exists c)\; \mathit{compressor}(c) \tag{3} \]
If, for example, we prove this expression by using axioms 
that say C1 is a starting air compressor, and that a starting
air compressor is a compressor, then we have resolved the
reference of "compressor" to C1.
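The constructive proof described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fact table, the is-a table, and the function name `witnesses` are hypothetical and are not taken from the TACITUS implementation.

```python
# Hypothetical knowledge base: ground facts and "is-a" axioms.
FACTS = {("starting_air_compressor", "C1")}
ISA = {"starting_air_compressor": "compressor"}  # subtype -> supertype (acyclic)

def witnesses(pred):
    """Return every constant c for which pred(c) is provable."""
    found = {c for (p, c) in FACTS if p == pred}
    # Back-chain through axioms of the form sub(x) implies pred(x).
    for sub, sup in ISA.items():
        if sup == pred:
            found |= witnesses(sub)
    return found

print(witnesses("compressor"))
```

Each constant returned is a candidate referent; an empty result corresponds to the case in which the existence of the entity cannot be proved and must be assumed.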
In general, we would expect definite noun phrases to 
refer to entities the hearer already knows about and can 
identify, and indefinite noun phrases to refer to new entities
the speaker is introducing. However, in the casualty
reports most noun phrases have no determiner. There are 
sentences, such as 
Retained oil sample and filter for future analysis. 
where "sample" is indefinite, or new information, and "fil- 
ter" is definite, or already known to the hearer. In this 
case, we try to prove the existence of both the sample and 
the filter. When we fail to prove the existence of the sam- 
ple, we know that it is new, and we simply assume its 
existence. 
Elements in a sentence other than nominals can also 
function referentially. In 
Alarm sounded. 
Alarm activated during routine start of 
compressor. 
one can argue that the activation is the same as, or at least 
implicit in, the sounding. Hence, in addition to trying 
to derive expressions such as (3) for nominal reference, 
for possible non-nominal reference we try to prove similar
expressions.
\[ (\exists \ldots e, a, \ldots) \ldots \wedge \mathit{activate}'(e, a) \wedge \ldots \] 3
That is, we wish to derive the existence, from background 
knowledge or the previous text, of some known or implied 
activation. Most, but certainly not all, information con- 
veyed non-nominally is new, and hence will be assumed. 
Compound Nominals: To resolve the reference of the 
noun phrase "lube-oil alarm", we need to find two entities
o and a with the appropriate properties. The entity o must
be lube oil, a must be an alarm, and there must be some
implicit relation between them. Let us call that implicit
relation nn. Then the expression that must be proved is
\[ (\exists o, a, \mathit{nn})\; \textit{lube-oil}(o) \wedge \mathit{alarm}(a) \wedge \mathit{nn}(o, a) \]
In the proof, instantiating nn amounts to interpreting the 
implicit relation between the two nouns in the compound 
nominal. Compound nominal interpretation is thus just a 
special case of reference resolution. 
Treating nn as a predicate variable in this way seems to 
indicate that the relation between the two nouns can be 
anything, and there are good reasons for believing this to 
be the case (e.g., Downing, 1977). In "lube-oil alarm", for 
example, the relation is 
\[ \lambda x, y\; [\, y \text{ sounds if pressure of } x \text{ drops too low} \,] \]
However, in our implementation we use a first-order sim- 
ulation of this approach. The symbol nn is treated as a 
predicate constant, and the most common possible rela- 
tions (see Levi, 1978) are encoded in axioms. The axiom 
\[ (\forall x, y)\; \mathit{part}(y, x) \supset \mathit{nn}(x, y) \]
allows interpretation of compound nominals of the form 
"<whole> <part>", such as "filter element". Axioms of 
the form 
\[ (\forall x, y)\; \mathit{sample}(y, x) \supset \mathit{nn}(x, y) \]
handle the very common case in which the head noun is
a relational noun and the prenominal noun fills one of its
roles, as in "oil sample". Complex relations such as the
one in "lube-oil alarm" can sometimes be glossed as "for".
\[ (\forall x, y)\; \mathit{for}(y, x) \supset \mathit{nn}(x, y) \]
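A minimal sketch of this first-order simulation follows, assuming a small table of ground facts. The entity names (`filter1`, `element1`, `oil1`, `alarm1`) and the helper `interpret_nn` are hypothetical; only the three relation names come from the axioms above.

```python
# Hypothetical ground facts R(y, x).
FACTS = {("part", "element1", "filter1"),   # element1 is part of filter1
         ("for", "alarm1", "oil1")}         # alarm1 is "for" oil1

# Each listed relation R stands for an axiom  R(y, x) implies nn(x, y).
NN_AXIOMS = ("part", "sample", "for")

def interpret_nn(x, y):
    """Return the relations R with R(y, x) known, each of which licenses nn(x, y)."""
    return [r for r in NN_AXIOMS if (r, y, x) in FACTS]

print(interpret_nn("filter1", "element1"))  # "filter element"
print(interpret_nn("oil1", "alarm1"))       # "lube-oil alarm"
```

An empty list corresponds to the case where nn(x, y) cannot be derived and would have to be assumed at its high cost.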
Syntactic Ambiguity: Some of the most com- 
mon types of syntactic ambiguity, including prepositional 
phrase and other attachment ambiguities and very com- 
pound nominal ambiguities, can be converted into con- 
strained coreference problems (see Bear and Hobbs, 1988). 
3See Hobbs (1985a) for explanation of this notation for events.
For example, in (2) the first argument of after is taken to 
be an existentially quantified variable which is equal to either
the compressor or the disengaging. The logical form would
thus include
\[ (\exists \ldots e, c, y, a, \ldots) \ldots \wedge \mathit{after}(y, a) \wedge y \in \{c, e\} \wedge \ldots \]
That is, however after(y, a) is proved or assumed, y must
be equal to either the compressor c or the disengaging e.
This kind of ambiguity is often solved as a byproduct of the 
resolution of metonymy or of the merging of redundancies. 
Metonymy: Predicates impose constraints on their 
arguments that are often violated. When they are vio- 
lated, the arguments must be coerced into something re- 
lated which satisfies the constraints. This is the process of 
metonymy resolution. Let us suppose, for example, that 
in sentence (2), the predicate after requires its arguments 
to be events: 
\[ \mathit{after}(e_1, e_2) : \mathit{event}(e_1) \wedge \mathit{event}(e_2) \]
To allow for coercions, the logical form of the sentence is 
altered by replacing the explicit arguments by "coercion 
variables" which satisfy the constraints and which are re- 
lated somehow to the explicit arguments. Thus the altered 
logical form for (2) would include 
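\[ \begin{aligned}
(\exists \ldots k_1, k_2, y, a, \mathit{rel}_1, \mathit{rel}_2, \ldots) \ldots & \wedge \mathit{after}(k_1, k_2) \\
& \wedge \mathit{event}(k_1) \wedge \mathit{rel}_1(k_1, y) \\
& \wedge \mathit{event}(k_2) \wedge \mathit{rel}_2(k_2, a) \wedge \ldots
\end{aligned} \]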
As in the most general approach to compound nominal 
interpretation, this treatment is second-order, and suggests 
that any relation at all can hold between the implicit and 
explicit arguments. Nunberg (1978), among others, has in
fact argued just this point. However, in our implementation,
we are using a first-order simulation. The symbol rel
is treated as a predicate constant, and there are a number
of axioms that specify what the possible coercions are.
Identity is one possible relation, since the explicit arguments
could in fact satisfy the constraints.
\[ (\forall x)\; \mathit{rel}(x, x) \]
In general, where this works, it will lead to the best inter- 
pretation. We can also coerce from a whole to a part and 
from an object to its function. Hence, 
\[ (\forall x, y)\; \mathit{part}(x, y) \supset \mathit{rel}(x, y) \]
\[ (\forall x, e)\; \mathit{function}(e, x) \supset \mathit{rel}(e, x) \]
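The three coercion axioms can be read as a small candidate generator. The sketch below assumes hypothetical fact tables (`EVENTS`, `FUNCTION`, `PART`) and hypothetical entity names; it shows how "lube-oil alarm" could be coerced to the sounding of the alarm so that the event constraint imposed by "after" is satisfied.

```python
# Hypothetical ground facts.
EVENTS = {"sounding1", "disengaging1"}      # event(e)
FUNCTION = {("sounding1", "alarm1")}        # function(e, x): e is the function of x
PART = set()                                # part(x, y) facts; none in this example

def coercions(arg):
    """Yield every k such that rel(k, arg) follows from the three axioms."""
    yield arg                                            # rel(x, x): identity
    yield from (x for (x, y) in PART if y == arg)        # part(x, y) implies rel(x, y)
    yield from (e for (e, x) in FUNCTION if x == arg)    # function(e, x) implies rel(e, x)

def coerce_to_event(arg):
    """Keep only the coercions that satisfy the selectional constraint event(k)."""
    return [k for k in coercions(arg) if k in EVENTS]

print(coerce_to_event("alarm1"))
```

Identity is tried first, which matches the remark that, where it works, the uncoerced argument gives the best interpretation.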
Putting it all together, we find that to solve all the local 
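pragmatics problems posed by sentence (2), we must derive
the following expression:
\[ \begin{aligned}
(\exists e, x, c, k_1, k_2, y, a, o)\; & \mathit{Past}(e) \wedge \mathit{disengage}'(e, x, c) \\
& \wedge \mathit{compressor}(c) \wedge \mathit{after}(k_1, k_2) \\
& \wedge \mathit{event}(k_1) \wedge \mathit{rel}(k_1, y) \wedge y \in \{c, e\} \\
& \wedge \mathit{event}(k_2) \wedge \mathit{rel}(k_2, a) \wedge \mathit{alarm}(a) \\
& \wedge \mathit{nn}(o, a) \wedge \textit{lube-oil}(o)
\end{aligned} \]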
But this is just the logical form of the sentence 4 together 
with the constraints that predicates impose on their ar- 
guments, allowing for coercions. That is, it is the first 
half of our characterization (1) of what it is to interpret a 
sentence. 
When parts of this expression cannot be derived, as- 
sumptions must be made, and these assumptions are taken 
to be the new information. The likelihood of different 
atoms in this expression being new information varies ac- 
cording to how the information is presented, linguistically. 
The main verb is more likely to convey new information 
than a definite noun phrase. Thus, we assign a cost to 
each of the atoms--the cost of assuming that atom. This
cost is expressed in the same currency in which other fac- 
tors involved in the "goodness" of an interpretation are 
expressed; among these factors are likely to be the length 
of the proofs used and the salience of the axioms they rely 
on. Since a definite noun phrase is generally used referen- 
tially, an interpretation that simply assumes the existence 
of the referent and thus fails to identify it should be an expensive
one. It is therefore given a high assumability cost.
For purposes of concreteness, let's call this $10. Indefinite
noun phrases are not usually used referentially, so they are
given a low cost, say, $1. Bare noun phrases are given
an intermediate cost, say, $5. Propositions presented non-nominally
are usually new information, so they are given
a low cost, say, $3. One does not usually use selectional 
constraints to convey new information, so they are given 
the same cost as definite noun phrases. Coercion relations 
and the compound nominal relations are given a very high 
cost, say, $20, since to assume them is to fail to solve the 
interpretation problem. If we superscript the atoms in the 
above logical form by their assumability costs, we get the 
following expression: 
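\[ \begin{aligned}
(\exists e, x, c, k_1, k_2, y, a, o)\; & \mathit{Past}(e)^{\$3} \wedge \mathit{disengage}'(e, x, c)^{\$3} \\
& \wedge \mathit{compressor}(c)^{\$5} \wedge \mathit{after}(k_1, k_2)^{\$3} \\
& \wedge \mathit{event}(k_1)^{\$10} \wedge \mathit{rel}(k_1, y)^{\$20} \wedge y \in \{c, e\} \\
& \wedge \mathit{event}(k_2)^{\$10} \wedge \mathit{rel}(k_2, a)^{\$20} \wedge \mathit{alarm}(a)^{\$5} \\
& \wedge \mathit{nn}(o, a)^{\$20} \wedge \textit{lube-oil}(o)^{\$5}
\end{aligned} \]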
While this example gives a rough idea of the relative as- 
sumability costs, the real costs must mesh well with the in- 
ference processes and thus must be determined experimen- 
tally. The use of numbers here and throughout the next 
section constitutes one possible regime with the needed 
properties. We are at present working, and with some
optimism, on a semantics for the numbers and the proce- 
dures that operate on them. In the course of this work, we 
may modify the procedures to an extent, but we expect to 
retain their essential properties. 
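As a rough illustration of this cost regime, the sketch below tabulates the illustrative dollar figures given in the text and sums them for the atoms of sentence (2). The category labels and the assignment of each atom to a category are our reading of the prose, not part of the original system, and the resulting total has no significance beyond showing how the costs combine.

```python
# Illustrative assumability costs, in dollars, as given in the text.
COSTS = {
    "definite_np": 10,
    "indefinite_np": 1,
    "bare_np": 5,
    "non_nominal": 3,
    "selectional_constraint": 10,   # same cost as definite noun phrases
    "coercion_or_nn": 20,
}

# Atoms of the logical form of sentence (2), paired with an assumed category.
ATOMS = [
    ("Past(e)", "non_nominal"),
    ("disengage'(e,x,c)", "non_nominal"),
    ("compressor(c)", "bare_np"),
    ("after(k1,k2)", "non_nominal"),
    ("event(k1)", "selectional_constraint"),
    ("rel(k1,y)", "coercion_or_nn"),
    ("event(k2)", "selectional_constraint"),
    ("rel(k2,a)", "coercion_or_nn"),
    ("alarm(a)", "bare_np"),
    ("nn(o,a)", "coercion_or_nn"),
    ("lube-oil(o)", "bare_np"),
]

def cost_of_assuming(atoms):
    """Total cost of an interpretation that proves nothing and assumes every atom."""
    return sum(COSTS[kind] for _, kind in atoms)

print(cost_of_assuming(ATOMS))
```

Any atom that can be proved from the knowledge base drops out of this sum, so the worst-case total is an upper bound against which better interpretations are measured.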
4For justification for this kind of logical form for sentences with 
quantifiers and intensional operators, see Hobbs (1983) and Hobbs
(1985a). 
3 Abduction 
We now argue for the last half of the characterization (1)
of interpretation. 
Abduction is the process by which, from \((\forall x)\, p(x) \supset q(x)\)
and q(A), one concludes p(A). One can think of q(A)
as the observable evidence, of \((\forall x)\, p(x) \supset q(x)\) as a general
principle that could explain q(A)'s occurrence, and of
p(A) as the inferred, underlying cause of q(A). Of course, 
this mode of inference is not valid; there may be many 
possible such p(A)'s. Therefore, other criteria are needed 
to choose among the possibilities. One obvious criterion 
is consistency of p(A) with the rest of what one knows.
Two other criteria are what Thagard (1978) has called
consilience and simplicity. Roughly, simplicity is that p(A) 
should be as small as possible, and consilience is that q(A) 
should be as big as possible. We want to get more bang 
for the buck, where q(A) is bang, and p(A) is buck. 
There is a property of natural language discourse, noticed
by a number of linguists (e.g., Joos (1972), Wilks
(1972)), that suggests a role for simplicity and consilience
in its interpretation--its high degree of redundancy. Con- 
sider 
Inspection of oil filter revealed metal particles.
An inspection is a looking at that causes one to learn a
property relevant to the function of the inspected object.
The function of a filter is to capture particles from a fluid.
To reveal is to cause one to learn. If we assume the two
causings to learn are identical, the two sets of particles 
are identical, and the two functions are identical, then we 
have explained the sentence in a minimal fashion. A small 
number of inferences and assumptions have explained a 
large number of syntactically independent propositions in 
the sentence. As a byproduct, we have moreover shown 
that the inspector is the one to whom the particles are 
revealed and that the particles are in the filter. 
Another issue that arises in abduction is what might 
be called the "informativeness-correctness tradeoff". Most
previous uses of abduction in AI from a theorem-proving 
perspective have been in diagnostic reasoning (e.g., Pople, 
1973; Cox and Pietrzykowski, 1986), and they have assumed
"most specific abduction". If we wish to explain
chest pains, it is not sufficient to assume the cause is simply
chest pains. We want something more specific, such as
"pneumonia". We want the most specific possible expla- 
nation. In natural language processing, however, we often 
want the least specific assumption. If there is a mention of 
a fluid, we do not necessarily want to assume it is lube oil. 
Assuming simply the existence of a fluid may be the best 
we can do. 5 However, if there is corroborating evidence,
we may want to make a more specific assumption. In 
Alarm sounded. Flow obstructed. 
5Sometimes a cigar is just a cigar.
we know the alarm is for the lube oil pressure, and this 
provides evidence that the flow is not merely of a fluid but 
of lube oil. The more specific our assumptions are, the 
more informative our interpretation is. The less specific 
they are, the more likely they are to be correct. 
We therefore need a scheme of abductive inference with 
three features. First, it should be possible for goal expressions
to be assumable, at varying costs. Second, there
should be the possibility of making assumptions at vari- 
ous levels of specificity. Third, there should be a way of 
exploiting the natural redundancy of texts. 
We have devised just such an abduction scheme. 6 First,
every conjunct in the logical form of the sentence is given 
an assumability cost, as described at the end of Section 2. 
Second, this cost is passed back to the antecedents in Horn 
clauses by assigning weights to them. Axioms are stated
in the form
\[ P_1^{w_1} \wedge P_2^{w_2} \supset Q \tag{4} \]
This says that P1 and P2 imply Q, but also that if the
cost of assuming Q is c, then the cost of assuming P1 is
w1c, and the cost of assuming P2 is w2c. Third, factoring
or synthesis is allowed. That is, goal wffs may be unified,
in which case the resulting wff is given the smaller of the
costs of the input wffs. This feature leads to minimality
through the exploitation of redundancy.
Note that in (4), if w1 + w2 < 1, most specific abduction
is favored: why assume Q when it is cheaper to assume P1
and P2? If w1 + w2 > 1, least specific abduction is favored:
why assume P1 and P2 when it is cheaper to assume Q? But
in
\[ P_1^{.6} \wedge P_2^{.6} \supset Q \]
if P1 has already been derived, it is cheaper to assume P2
than Q. P1 has provided evidence for Q, and assuming the
"remainder" P2 of the necessary evidence for Q should be 
cheaper. 
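The way weights pass costs back to antecedents can be sketched as follows. The function names and the `already_proved` parameter are hypothetical, exact fractions are used only to avoid floating-point noise, and a tie is arbitrarily resolved in favor of assuming the consequent.

```python
from fractions import Fraction

def antecedent_costs(cost_q, weights):
    """Cost of assuming each antecedent P_i when assuming Q would cost cost_q."""
    return [Fraction(w) * cost_q for w in weights]

def cheaper_to_assume(cost_q, weights, already_proved=()):
    """Compare assuming Q outright with assuming the antecedents not yet proved."""
    costs = antecedent_costs(cost_q, weights)
    remaining = sum(c for i, c in enumerate(costs) if i not in already_proved)
    return "antecedents" if remaining < cost_q else "consequent"

print(cheaper_to_assume(10, ["0.4", "0.4"]))                      # w1 + w2 < 1
print(cheaper_to_assume(10, ["0.6", "0.6"]))                      # w1 + w2 > 1
print(cheaper_to_assume(10, ["0.6", "0.6"], already_proved={0}))  # P1 already derived
```

With weights of .6 and a $10 consequent, each antecedent costs $6, so assuming both ($12) loses to assuming Q ($10), but once P1 is proved the remaining $6 wins.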
Factoring can also override least specific abduction. 
Suppose we have the axioms 
\[ P_1^{.6} \wedge P_2^{.6} \supset Q_1 \]
\[ P_2^{.6} \wedge P_3^{.6} \supset Q_2 \]
and we wish to derive \(Q_1 \wedge Q_2\), where each conjunct has an
assumability cost of $10. Then assuming \(Q_1 \wedge Q_2\) will cost
$20, whereas assuming \(P_1 \wedge P_2 \wedge P_3\) will cost only $18, since
the two instances of P2 can be unified. Thus, the abduction 
scheme allows us to adopt the careful policy of favoring 
least specific abduction while also allowing us to exploit 
the redundancy of texts for more specific interpretations. 
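The arithmetic of this factoring example can be checked with a short sketch. The helper `total_cost` is hypothetical; it models factoring simply by merging identical atoms and keeping the smaller cost, which is all the propositional example requires.

```python
def total_cost(assumptions):
    """Sum assumption costs after factoring: identical atoms are unified
    and the merged atom keeps the smaller of the input costs."""
    merged = {}
    for atom, cost in assumptions:
        merged[atom] = min(cost, merged.get(atom, cost))
    return sum(merged.values())

# Assume the consequents directly: $10 each.
assume_consequents = [("Q1", 10), ("Q2", 10)]
# Back-chain on both axioms: every antecedent costs .6 * $10 = $6,
# and P2 is produced twice, once by each axiom.
assume_antecedents = [("P1", 6), ("P2", 6), ("P2", 6), ("P3", 6)]

print(total_cost(assume_consequents), total_cost(assume_antecedents))
```

Without the merge the antecedents would cost $24, so it is the shared P2 that tips the balance toward the more specific interpretation.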
In the above examples we have used equal weights on 
the conjuncts in the antecedents. It is more reasonable,
6The abduction scheme is due to Mark Stickel, and it, or a variant
of it, is described at greater length in Stickel (1988).
however, to assign the weights according to the "seman- 
tic contribution" each conjunct makes to the consequent. 
Consider, for example, the axiom 
\[ (\forall x)\; \mathit{car}(x)^{.8} \wedge \textit{no-top}(x)^{.4} \supset \mathit{convertible}(x) \]
We have an intuitive sense that car contributes more to
convertible than no-top does. 7 In principle, the weights in
(4) should be a function of the probabilities that instances 
of the concept Pi are instances of the concept Q in the cor- 
pus of interest. In practice, all we can do is assign weights 
by a rough, intuitive sense of semantic contribution, and 
refine them by successive approximation on a representa- 
tive sample of the corpus. 
One would think that since we are deriving the logical 
form of the sentence, rather than determining what can be 
inferred from the logical form of the sentence, we could not 
use superset information in processing the sentence. That
is, since we are back-chaining from the propositions in the 
logical form, the fact that, say, lube oil is a fluid, which 
would be expressed as 
\[ (\forall x)\; \textit{lube-oil}(x) \supset \mathit{fluid}(x) \tag{5} \]
could not play a role in the analysis. Thus, in the text 
Flow obstructed. Metal particles in lube oil filter. 
we know from the first sentence that there is a fluid. We 
would like to identify it with the lube oil mentioned in the 
second sentence. In interpreting the second sentence, we 
must prove the expression 
\[ (\exists x)\; \textit{lube-oil}(x) \]
If we had as an axiom
\[ (\forall x)\; \mathit{fluid}(x) \supset \textit{lube-oil}(x) \]
then we could establish the identity. But of course we 
don't have such an axiom, for it isn't true. There are lots 
of other kinds of fluids. There would seem to be no way 
to use superset information in our scheme. 
Fortunately, however, there is a way. We can make use 
of this information by converting the axiom into a bicon- 
ditional. In general, axioms of the form 
species D genus 
can be converted into a biconditional axiom of the form
\[ \mathit{genus} \wedge \mathit{differentiae} \equiv \mathit{species} \]
7To prime this intuition, imagine two doors. Behind one is a car.
Behind the other is something with no top. You pick a door. If there's 
a convertible behind it, you get to keep it. Which door would you 
pick? 
Often, of course, as in the above example, we will not 
be able to prove the differentiae, and in many cases the 
differentiae can not even be spelled out. But in our ab- 
ductive scheme, this does not matter. They can simply be 
assumed. In fact, we need not state them explicitly. We 
can simply introduce a predicate which stands for all the 
remaining properties. It will never be provable, but it will 
be assumable. Thus, we can rewrite (5) as 
\[ (\forall x)\; \mathit{fluid}(x) \wedge \mathit{etc}_1(x) \equiv \textit{lube-oil}(x) \]
Then the fact that something is fluid can be used as evi- 
dence for its being lube oil. With the weights distributed 
according to semantic contribution, we can go to extremes 
and use an axiom like 
\[ (\forall x)\; \mathit{mammal}(x)^{.2} \wedge \mathit{etc}_2(x)^{.9} \supset \mathit{elephant}(x) \]
to allow us to use the fact that something is a mammal as 
(weak) evidence that it is an elephant. 
In principle, one should try to prove the entire logical 
form of the sentence and the constraints at once. In this 
global strategy, any heuristic ordering of the individual 
problems is done by the theorem prover. From a practi- 
cal point of view, however, the global strategy generally 
takes longer, sometimes significantly so, since it presents 
the theorem-prover with a longer expression to be proved. 
We have experimented both with this strategy and with 
a bottom-up strategy in which, for example, we try to 
identify the lube oil before trying to identify the lube oil 
alarm. The latter is quicker since it presents the theorem- 
prover with problems in a piecemeal fashion, but the for- 
mer frequently results in better interpretations since it is 
better able to exploit redundancies. The analysis of the
sentence in Section 4.2 below, for example, requires either 
the global strategy or very careful axiomatization. The 
bottom-up strategy, with only a view of a small local re- 
gion of the sentence, cannot recognize and capitalize on 
redundancies among distant elements in the sentence. Ide- 
ally, we would like to have detailed control over the proof 
process to allow a number of different factors to interact in 
determining the allocation of deductive resources. Among
such factors would be word order, lexical form, syntactic
structure, topic-comment structure, and, in speech, pitch
accent. 8
4 Examples 
4.1 Distinguishing the Given and New 
We will examine two difficult definite reference problems in
which the given and the new information are intertwined 
and must be separated. In the first, new and old informa- 
tion about the same entity are encoded in a single noun 
phrase. 
8Pereira and Pollack's CANDIDE system (1988) is specifically designed
to aid investigation of the question of the most effective order
of interpretation. 
There was adequate lube oil. 
We know about the lube oil already, and there is a corre- 
sponding axiom in the knowledge base. 
\[ \textit{lube-oil}(O) \]
Its adequacy is new information, however. It is what the 
sentence is telling us. 
The logical form of the sentence is, roughly, 
\[ (\exists o)\; \textit{lube-oil}(o) \wedge \mathit{adequate}(o) \]
This is the expression that must be derived. The proof of 
the existence of the lube oil is immediate. It is thus old 
information. The adequacy can't be proved, and is hence 
assumed as new information. 
The second example is from Clark (1975), and illustrates 
what happens when the given and new information are 
combined into a single lexical item. 
John walked into the room. 
The chandelier shone brightly. 
What chandelier is being referred to? 
Let us suppose we have in our knowledge base the fact 
that rooms have lights. 
\[ (\forall r)\; \mathit{room}(r) \supset (\exists l)\; \mathit{light}(l) \wedge \mathit{in}(l, r) \tag{6} \]
Suppose we also have the fact that lights with numerous
fixtures are chandeliers.
\[ (\forall l)\; \mathit{light}(l) \wedge \textit{has-fixtures}(l) \supset \mathit{chandelier}(l) \tag{7} \]
The first sentence has given us the existence of a room,
room(R). To solve the definite reference problem in the
second sentence, we must prove the existence of a chandelier.
Back-chaining on axiom (7), we see we need to prove
the existence of a light with fixtures. Back-chaining from
light(l) in axiom (6), we see we need to prove the existence
of a room. We have this in room(R). To complete
the derivation, we assume the light l has fixtures. The
light is thus given by the room mentioned in the previous
sentence, while the fact that it has fixtures is new information.
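The chandelier derivation can be imitated by a very small cost-based back-chainer. The sketch below is a propositional simplification: variables are dropped, the weights are invented for illustration, and the function `prove` is hypothetical rather than a description of the TACITUS theorem prover.

```python
from fractions import Fraction

FACTS = {"room"}  # given by the first sentence
# consequent -> list of (antecedent, weight); the weights are illustrative only.
RULES = {
    "chandelier": [("light", Fraction(1, 2)), ("has_fixtures", Fraction(3, 5))],
    "light": [("room", Fraction(1))],
}

def prove(goal, cost):
    """Return (cost, assumed atoms) for the cheapest way to establish goal."""
    if goal in FACTS:
        return Fraction(0), frozenset()
    best = (Fraction(cost), frozenset([goal]))   # option 1: assume the goal itself
    if goal in RULES:                            # option 2: back-chain on its axiom
        total, assumed = Fraction(0), frozenset()
        for antecedent, weight in RULES[goal]:
            c, a = prove(antecedent, weight * cost)
            total, assumed = total + c, assumed | a
        if total < best[0]:
            best = (total, assumed)
    return best

print(prove("chandelier", 10))
```

The proved part (the light, via the room) is the given information, and the single assumed atom (the fixtures) is the new information, mirroring the split described above.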
4.2 Exploiting Redundancy 
We next show the use of the abduction scheme in solving 
internal coreference problems. Two problems raised by the 
sentence 
The plain was reduced by erosion to its present
level. 
are determining what was eroding and determining what 
"it" refers to. Suppose our knowledge base consists of the 
following axioms: 
\[ (\forall p, l, s)\; \mathit{decrease}(p, l, s) \wedge \mathit{vertical}(s) \wedge \mathit{etc}_3(p, l, s) \equiv (\exists e_1)\; \mathit{reduce}'(e_1, p, l) \]
or e1 is a reduction of p to l if and only if p decreases to l
on some vertical scale s (plus some other conditions).
\[ (\forall p)\; \mathit{landform}(p) \wedge \mathit{flat}(p) \wedge \mathit{etc}_4(p) \equiv \mathit{plain}(p) \]
or p is a plain if and only if p is a flat landform (plus some
other conditions).
\[ (\forall e, y, l, s)\; \mathit{at}'(e, y, l) \wedge \mathit{on}(l, s) \wedge \mathit{vertical}(s) \wedge \mathit{flat}(y) \wedge \mathit{etc}_5(e, y, l, s) \equiv \mathit{level}'(e, l, y) \]
or e is the condition of l's being the level of y if and only
if e is the condition of y's being at l on some vertical scale
s and y is flat (plus some other conditions).
\[ (\forall x, l, s)\; \mathit{decrease}(x, l, s) \wedge \mathit{landform}(x) \wedge \mathit{altitude}(s) \wedge \mathit{etc}_6(x, l, s) \equiv (\exists e)\; \mathit{erode}'(e, x) \]
or e is an eroding of x if and only if x is a landform that
decreases to some point l on the altitude scale s (plus some
other conditions).
\[ (\forall s)\; \mathit{vertical}(s) \wedge \mathit{etc}_7(s) \equiv \mathit{altitude}(s) \]
or s is the altitude scale if and only if s is vertical (plus
some other conditions).
Now the analysis. The logical form of the sentence is 
roughly 
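\[ \begin{aligned}
(\exists e_1, p, l, x, e_2, y)\; & \mathit{reduce}'(e_1, p, l) \wedge \mathit{plain}(p) \\
& \wedge \mathit{erode}'(e_1, x) \wedge \mathit{present}(e_2) \wedge \mathit{level}'(e_2, l, y)
\end{aligned} \]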
Our characterization of interpretation says that we must 
derive this expression from the axioms or from assumptions.
Back-chaining on reduce'(e1, p, l) yields
\[ \mathit{decrease}(p, l, s_1) \wedge \mathit{vertical}(s_1) \wedge \mathit{etc}_3(p, l, s_1) \]
Back-chaining on erode'(e1, x) yields
\[ \mathit{decrease}(x, l_2, s_2) \wedge \mathit{landform}(x) \wedge \mathit{altitude}(s_2) \wedge \mathit{etc}_6(x, l_2, s_2) \]
and back-chaining on altitude(s2) in turn yields
\[ \mathit{vertical}(s_2) \wedge \mathit{etc}_7(s_2) \]
We unify the goals decrease(p, l, s1) and decrease(x, l2, s2),
and thereby identify the object of the erosion with the
plain. The goals vertical(s1) and vertical(s2) also unify,
telling us the reduction was on the altitude scale. Back-chaining
on plain(p) yields
\[ \mathit{landform}(p) \wedge \mathit{flat}(p) \wedge \mathit{etc}_4(p) \]
and landform(x) unifies with landform(p), reinforcing our
identification of the object of the erosion with the plain.
Back-chaining on level'(e2, l, y) yields
\[ \mathit{at}'(e_2, y, l) \wedge \mathit{on}(l, s_3) \wedge \mathit{vertical}(s_3) \wedge \mathit{flat}(y) \wedge \mathit{etc}_5(e_2, y, l, s_3) \]
and vertical(s3) and vertical(s2) unify, as do flat(y) and 
flat(p), thereby identifying "it", or y, as the plain p. We 
have not written out the axioms for this, but note also that 
"present" implies the existence of a change of level, or a 
change in the location of "it" on a vertical scale, and a 
decrease of a plain is a change of the plain's location on a 
vertical scale. Unifying these would provide reinforcement 
for our identification of "it" with the plain. Now assum- 
ing the most specific atoms we have derived including all 
the "et cetera" conditions, we arrive at an interpretation 
that is minimal and that solves the internal coreference 
problems as a byproduct. 
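The factoring steps in this analysis amount to unifying goal atoms whose arguments are all existentially quantified variables. A minimal sketch of that bookkeeping follows; the helpers `walk` and `unify_atoms` are hypothetical, and the unifier handles only flat atoms over variables, which is all this example needs.

```python
def walk(term, subst):
    """Follow variable bindings until an unbound representative is reached."""
    while term in subst:
        term = subst[term]
    return term

def unify_atoms(a, b, subst):
    """Unify two flat atoms (predicate, arg, ...) whose arguments are variables.
    Returns an extended copy of subst, or None if the atoms cannot match."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    subst = dict(subst)
    for s, t in zip(a[1:], b[1:]):
        s, t = walk(s, subst), walk(t, subst)
        if s != t:
            subst[t] = s
    return subst

# The pairs of goals unified in the analysis of the "plain" sentence.
PAIRS = [
    (("decrease", "p", "l", "s1"), ("decrease", "x", "l2", "s2")),
    (("vertical", "s1"), ("vertical", "s2")),
    (("landform", "x"), ("landform", "p")),
    (("vertical", "s3"), ("vertical", "s2")),
    (("flat", "y"), ("flat", "p")),
]

bindings = {}
for left, right in PAIRS:
    bindings = unify_atoms(left, right, bindings)

print(walk("x", bindings) == walk("p", bindings) == walk("y", bindings))
```

After all five unifications, x, y, and p share one representative, which is the sense in which both the eroded object and "it" are identified with the plain.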
4.3 A Thorough Integration of Syntax, 
Semantics, and Pragmatics 
By combining the idea of interpretation as abduction with 
the older idea of parsing as deduction (Kowalski, 1980, pp. 
52-53; Pereira and Warren, 1983), it becomes possible to 
integrate syntax, semantics, and pragmatics in a very thor- 
ough and elegant way. 9 Below is a simple grammar written 
in Prolog style, but incorporating calls to local pragmatics. 
The syntax portion is represented in standard Prolog man- 
ner, with nonterminals treated as predicates and having as 
two of its arguments the beginning and end points of the 
phrase spanned by the nonterminal. The one modification 
we would have to make to the abduction scheme is to allow 
conjuncts in the antecedents to take costs directly as well 
as weights. Constraints on the application of phrase struc- 
ture rules have been omitted, but could be incorporated in 
the usual way. 
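\[ \begin{aligned}
(\forall i, j, k, x, p, \mathit{args}, \mathit{req}, e, c, \mathit{rel})\; & \mathit{np}(i, j, x) \wedge \mathit{vp}(j, k, p, \mathit{args}, \mathit{req}) \wedge p'(e, c)^{\$3} \wedge \mathit{rel}(c, x)^{\$20} \\
& \wedge \mathit{subst}(\mathit{req}, \mathit{cons}(c, \mathit{args}))^{\$10} \supset s(i, k, e)
\end{aligned} \]
\[ \begin{aligned}
(\forall i, j, k, e, p, \mathit{args}, \mathit{req}, e_1, c, \mathit{rel})\; & s(i, j, e) \wedge \mathit{pp}(j, k, p, \mathit{args}, \mathit{req}) \wedge p'(e_1, c)^{\$3} \wedge \mathit{rel}(c, e)^{\$20} \\
& \wedge \mathit{subst}(\mathit{req}, \mathit{cons}(c, \mathit{args}))^{\$10} \supset s(i, k, e \& e_1)
\end{aligned} \]
\[ \begin{aligned}
(\forall i, j, k, w, x, c, \mathit{rel})\; & v(i, j, w) \wedge \mathit{np}(j, k, x) \wedge \mathit{rel}(c, x)^{\$20} \\
& \supset \mathit{vp}(i, k, \lambda z\,[w(z, c)], \langle c \rangle, \mathit{Req}(w))
\end{aligned} \]
\[ (\forall i, j, k, x)\; \mathit{det}(i, j, \text{"the"}) \wedge \mathit{cn}(j, k, x, p) \wedge p(x)^{\$10} \supset \mathit{np}(i, k, x) \]
\[ (\forall i, j, k, x)\; \mathit{det}(i, j, \text{"a"}) \wedge \mathit{cn}(j, k, x, p) \wedge p(x)^{\$1} \supset \mathit{np}(i, k, x) \]
\[ (\forall i, j, k, w, x, y, p, \mathit{nn})\; n(i, j, w) \wedge \mathit{cn}(j, k, x, p) \wedge w(y)^{\$5} \wedge \mathit{nn}(y, x)^{\$20} \supset \mathit{cn}(i, k, x, p) \]
\[ \begin{aligned}
(\forall i, j, k, x, p_1, p_2, \mathit{args}, \mathit{req}, c, \mathit{rel})\; & \mathit{cn}(i, j, x, p_1) \wedge \mathit{pp}(j, k, p_2, \mathit{args}, \mathit{req}) \\
& \wedge \mathit{subst}(\mathit{req}, \mathit{cons}(c, \mathit{args}))^{\$10} \wedge \mathit{rel}(c, x)^{\$20} \\
& \supset \mathit{cn}(i, k, x, \lambda z\,[p_1(z) \wedge p_2(z)])
\end{aligned} \]
\[ (\forall i, j, w)\; n(i, j, w) \supset (\exists x)\; \mathit{cn}(i, j, x, w) \]
\[ \begin{aligned}
(\forall i, j, k, w, x, c, \mathit{rel})\; & \mathit{prep}(i, j, w) \wedge \mathit{np}(j, k, x) \wedge \mathit{rel}(c, x)^{\$20} \\
& \supset \mathit{pp}(i, k, \lambda z\,[w(c, z)], \langle c \rangle, \mathit{Req}(w))
\end{aligned} \]
9This idea is due to Stuart Shieber.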
For example, the first axiom says that there is a sentence 
from point i to point k asserting eventuality e if there 
is a noun phrase from i to j referring to x and a verb
phrase from j to k denoting predicate p with arguments
args and having an associated requirement req, and there
is (or, for $3, can be assumed to be) an eventuality e of
p's being true of c, where c is related to or coercible from
x (with an assumability cost of $20), and the requirement
req associated with p can be proved or, for $10, assumed to
hold of the arguments of p. The symbol e&e1 denotes the
conjunction of eventualities e and e1 (see Hobbs (1985b),
p. 35.) The third argument of predicates corresponding to 
terminal nodes such as n and det is the word itself, which 
then becomes the name of the predicate. The function 
Req returns the requirements associated with a predicate, 
and subst takes care of substituting the right arguments 
into the requirements. <c> is the list consisting of the 
single element c, and cons is the LISP function cons. The 
relations rel and nn are treated here as predicate variables,
but they could be treated as predicate constants, in which 
case we would not have quantified over them. 
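
The reading of the first axiom given above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TACITUS implementation: the classes `NP` and `VP`, the function `apply_first_axiom`, the encoding of atoms as tuples, and the example words are all hypothetical, the requirement req is simplified to a one-place predicate, and metonymy is ignored by taking c to be x itself. Each costed conjunct is either found among the known atoms at no cost or assumed at its stated cost.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NP:
    start: int      # i
    end: int        # j
    referent: str   # x

@dataclass(frozen=True)
class VP:
    start: int        # j
    end: int          # k
    predicate: str    # p
    requirement: str  # req, simplified here to a one-place predicate name

def apply_first_axiom(np, vp, known):
    """Try to derive s(i, k, e) from np(i, j, x) and vp(j, k, p, args, req).

    Returns (span, total_cost, assumed_atoms), or None if the two phrases
    are not adjacent. `known` is a set of atoms provable at no cost.
    """
    if np.end != vp.start:
        return None
    x = np.referent
    c = x  # simplifying assumption: no metonymy, so c is x itself
    costed_conjuncts = [
        ((vp.predicate + "'", "e", c), 3),  # p'(e, c) at $3
        (("rel", c, x), 20),                # rel(c, x) at $20
        ((vp.requirement, c), 10),          # requirement on c at $10
    ]
    assumed = [atom for atom, _ in costed_conjuncts if atom not in known]
    total = sum(cost for atom, cost in costed_conjuncts if atom not in known)
    return (np.start, vp.end), total, assumed

# Hypothetical input: a noun phrase over points 0-2 referring to c1,
# followed by a verb phrase over points 2-3 with predicate "fail".
known = {("rel", "c1", "c1"), ("machine", "c1")}
result = apply_first_axiom(NP(0, 2, "c1"), VP(2, 3, "fail", "machine"), known)
print(result)
```

With these inputs the coercion and the requirement are already known, so only the eventuality is assumed, for a total cost of 3. That assumed atom is the new information conveyed by the sentence.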
In this approach, s(0, n, e) can be read as saying there is 
an interpretable sentence from point 0 to point n (asserting 
e). Syntax is captured in predicates like np, vp, and s. 
Compositional semantics is encoded in, for example, the 
way the predicate p' is applied to its arguments in the first
axiom, and in the lambda expression in the third argument
of vp in the third axiom. Local pragmatics is captured
because, in order to prove s(0, n, e), one must
derive the logical form of the sentence together with the 
constraints predicates impose on their arguments, allowing 
for metonymy. 
Implementations of different orders of interpretation, 
or different sorts of interaction among syntax, composi- 
tional semantics, and local pragmatics, can then be seen 
as different orders of search for a proof of s(0, n, e). In
a syntax-first order of interpretation, one would try first 
to prove all the "syntactic" atoms, such as np(i,j,x), 
before any of the "local pragmatic" atoms, such as 
p'(e, c). Verb-driven interpretation would first try to prove 
vp(j, k, p, args, req) by proving v(i, j, w) and then using the 
information in the requirements associated with the verb 
to drive the search for the arguments of the verb, by de- 
riving subst(req, cons(c, args)) before trying to prove the 
various np atoms. But more fluid orders of interpretation
are obviously possible. This formulation allows one to
prove first whatever is easiest to prove. It is also easy
to see how processing could occur in parallel.
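
One way to picture this is to treat each order of interpretation as nothing more than a different ordering of the same goal list. The sketch below is an illustration only: the goal strings, the "syntax" and "pragmatics" tags, and the function names are assumptions of this example, and no actual proving is done.

```python
# Conjuncts of the first axiom, tagged by kind. The tags are labels
# introduced for this sketch, not part of the axioms themselves.
GOALS = [
    ("np(i,j,x)", "syntax"),
    ("vp(j,k,p,args,req)", "syntax"),
    ("p'(e,c)", "pragmatics"),
    ("rel(c,x)", "pragmatics"),
    ("subst(req,cons(c,args))", "pragmatics"),
]

def syntax_first(goals):
    """Syntactic atoms before local-pragmatic atoms (stable sort)."""
    return sorted(goals, key=lambda g: g[1] != "syntax")

def verb_driven(goals):
    """The verb phrase first, then its requirements, then everything else."""
    priority = {"vp(j,k,p,args,req)": 0, "subst(req,cons(c,args))": 1}
    return sorted(goals, key=lambda g: priority.get(g[0], 2))

for strategy in (syntax_first, verb_driven):
    print(strategy.__name__, [name for name, _ in strategy(GOALS)])
```

Both strategies visit the same five conjuncts, so they search for the same proof; only the order in which the conjuncts are attempted differs.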
It is moreover possible to deal with ill-formed or unclear
input in this framework, by having axioms such as this
revision of our first axiom above:
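$$
\begin{aligned}
&(\forall\, i,j,k,x,p,\mathit{args},\mathit{req},e,c,\mathit{rel})\; \mathit{np}(i,j,x)^{.4} \\
&\quad \wedge\, \mathit{vp}(j,k,p,\mathit{args},\mathit{req})^{.8} \wedge p'(e,c)^{\$3} \\
&\quad \wedge\, \mathit{rel}(c,x)^{\$20} \wedge \mathit{subst}(\mathit{req},\mathit{cons}(c,\mathit{args}))^{\$10} \\
&\quad \supset s(i,k,e)
\end{aligned}
$$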
This says that a verb phrase provides more evidence for 
a sentence than a noun phrase does, but either one can 
constitute a sentence if the string of words is otherwise 
interpretable. 
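
A small calculation may make the asymmetry concrete. It assumes the weighting convention of the abduction scheme, in which an antecedent with weight w can be assumed for w times the cost of the consequent, and it takes the weights on np and vp to be .4 and .8. The sentence cost of 10 and the function name `cost_of_missing` are hypothetical choices made for this sketch.

```python
# Weights on the syntactic conjuncts of the revised axiom (as read here).
WEIGHTS = {"np": 0.4, "vp": 0.8}

def cost_of_missing(found, sentence_cost):
    """Cost of assuming the syntactic conjuncts that the input did not supply."""
    return sum(w * sentence_cost for name, w in WEIGHTS.items() if name not in found)

SENTENCE_COST = 10  # hypothetical assumability cost of s(i, k, e)

only_vp = cost_of_missing({"vp"}, SENTENCE_COST)    # np must be assumed
only_np = cost_of_missing({"np"}, SENTENCE_COST)    # vp must be assumed
neither = cost_of_missing(set(), SENTENCE_COST)     # both must be assumed
print(only_vp, only_np, neither)
```

Under these assumptions, a string containing only a verb phrase leaves less to be assumed than one containing only a noun phrase, which is the sense in which the verb phrase provides more evidence for a sentence. Assuming both conjuncts costs more than the sentence itself, so that route gains nothing over simply assuming the sentence.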
It is likely that this approach could be extended to 
speech recognition by using Prolog-style rules to decom- 
pose morphemes into their phonemes and weighting them 
according to their acoustic prominence. 
5 Controlling Abduction: Type Hierarchy
The first example on which we tested the new abductive 
scheme was the sentence 
There was adequate lube oil. 
The system got the correct interpretation, that the lube oil 
was the lube oil in the lube oil system of the air compressor, 
and it assumed that that lube oil was adequate. But it 
also got another interpretation. There is a mention in the 
knowledge base of the adequacy of the lube oil pressure, so 
it identified that adequacy with the adequacy mentioned 
in the sentence. It then assumed that the pressure was 
lube oil. 
It is clear what went wrong here. Pressure is a magnitude
whereas lube oil is a material, and magnitudes can't
be materials. In principle, abduction requires a check for
the consistency of what is assumed, and our knowledge
base should have contained axioms from which it could be
inferred that a magnitude is not a material. In practice,
unconstrained consistency checking is undecidable and, at
best, may take a long time. Nevertheless, one can, through
the use of a type hierarchy, eliminate a very large number
of possible assumptions that are likely to result in an
inconsistency. We have consequently implemented a module
which specifies the types that various predicate-argument
positions can take on, and the likely disjointness relations
among types. This is a way of exploiting the specificity
of the English lexicon for computational purposes. This
addition led to a speed-up of two orders of magnitude.
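
The kind of check such a module performs can be sketched as follows. This is a minimal illustration, not the module described above: the type names, the `PARENT` and `DISJOINT` tables, and the function names are hypothetical, and a real hierarchy would be far larger.

```python
# Hypothetical fragment of a type hierarchy: child type -> parent type.
PARENT = {
    "lube-oil": "material",
    "pressure": "magnitude",
    "material": "entity",
    "magnitude": "entity",
}

# Pairs of types declared (heuristically) to be disjoint.
DISJOINT = {frozenset({"material", "magnitude"})}

def ancestors(t):
    """Return t together with all of its supertypes."""
    chain = [t]
    while t in PARENT:
        t = PARENT[t]
        chain.append(t)
    return chain

def compatible(t1, t2):
    """False if some supertype of t1 is declared disjoint from some supertype of t2."""
    return not any(
        frozenset({a, b}) in DISJOINT
        for a in ancestors(t1)
        for b in ancestors(t2)
    )

# The unwanted assumption "the pressure is lube oil" is rejected before
# any proof is attempted, because magnitudes and materials are disjoint.
print(compatible("pressure", "lube-oil"))
print(compatible("lube-oil", "material"))
```

The check costs only a walk up two short chains, which is why it can prune candidate assumptions cheaply where full consistency checking cannot.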
There is a problem, however. In an ontologically promiscuous
notation, there is no commitment in a primed proposition
to truth or existence in the real world. Thus,
lube-oil'(e, o) does not say that o is lube oil or even that it
exists; rather it says that e is the eventuality of o's being
lube oil. This eventuality may or may not exist in the real
world. If it does, then we would express this as Rexists(e),
and from that we could derive from axioms the existence
of o and the fact that it is lube oil. But e's existential
status could be something different. For example, e could
be nonexistent, expressed as not(e) in the notation, and
in English as "The eventuality e of o's being lube oil does
not exist," or as "o is not lube oil." Or e may exist only
in someone's beliefs. While the axiom
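$$
(\forall\, x)\; \mathit{pressure}(x) \supset \neg\, \mathit{lube\text{-}oil}(x)
$$
is certainly true, the axiom
$$
(\forall\, e_1, x)\; \mathit{pressure}'(e_1, x) \supset \neg (\exists\, e_2)\, \mathit{lube\text{-}oil}'(e_2, x)
$$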
would not be true. The fact that a variable occupies the
second argument position of the predicate lube-oil' does
not mean it is lube oil. We cannot properly restrict that
argument position to be lube oil, or fluid, or even a
material, for that would rule out perfectly true sentences like
"Truth is not lube oil."
Generally, when one uses a type hierarchy, one assumes
the types to be disjoint sets with cleanly defined
boundaries, and one assumes that predicates take arguments of
only certain types. There are a lot of problems with this
idea. In any case, in our work, we are not buying into this
notion that the universe is typed. Rather we are using the
type hierarchy strictly as a heuristic, as a set of guesses
not about what could or could not be but about what it
would or would not occur to someone to say. When two
types are declared to be disjoint, we are saying that they
are certainly disjoint in the real world, and that they are
very probably disjoint everywhere except in certain bizarre
modal contexts. This means, however, that we risk failing
on certain rare examples. We could not, for example, deal
with the sentence, "It then assumed that the pressure was
lube oil."
6 Future Directions 
Deduction is explosive, and since the abduction scheme
augments deduction with assumptions, it is even more
explosive. We are currently engaged in an empirical
investigation of the behavior of this abductive scheme on a
very large knowledge base performing sophisticated
processing. In addition to type checking, we have introduced
two other techniques that are necessary for controlling the
explosion: unwinding recursive axioms and making use of
syntactic noncoreference information. We expect our
investigation to continue to yield techniques for controlling
the abduction process.
We are also looking toward extending the interpretation
processes to cover lexical ambiguity, quantifier scope
ambiguity, and metaphor interpretation problems. We will
also be investigating the integration proposed in Section 4.3
and an approach that integrates all of this with
the recognition of discourse structure and the recognition 
of relations between utterances and the hearer's interests. 
Acknowledgements 
The authors have profited from discussions with Todd 
Davies, John Lowrance, Stuart Shieber, and Mabry Tyson 
about this work. The research was funded by the Defense 
Advanced Research Projects Agency under Office of Naval 
Research contract N00014-85-C-0013. 
References 
\[1\] Bear, John, and Jerry R. Hobbs, 1988. "Localizing the
Expression of Ambiguity", Proceedings, Second Conference
on Applied Natural Language Processing, Austin,
Texas, February 1988.
\[2\] Charniak, Eugene, 1986. "A Neat Theory of Marker 
Passing", Proceedings, AAAI-86, Fifth National Con- 
ference on Artificial Intelligence, Philadelphia, Pennsyl- 
vania, pp. 584-588. 
\[3\] Clark, Herbert, 1975. "Bridging". In R. Schank and
B. Nash-Webber (Eds.), Theoretical Issues in Natural
Language Processing, pp. 169-174. Cambridge,
Massachusetts.
\[4\] Cox, P. T., and T. Pietrzykowski, 1986. "Causes for
Events: Their Computation and Applications",
Proceedings, CADE-8.
\[5\] Downing, Pamela, 1977. "On the Creation and Use of 
English Compound Nouns", Language, vol. 53, no. 4, 
pp. 810-842. 
\[6\] Hobbs, Jerry R., 1983. "An Improper Treatment of
Quantification in Ordinary English", Proceedings of the
21st Annual Meeting, Association for Computational
Linguistics, pp. 57-63. Cambridge, Massachusetts, June
1983.
\[7\] Hobbs, Jerry R., 1985a. "Ontological Promiscuity",
Proceedings, 23rd Annual Meeting of the Association for
Computational Linguistics, pp. 61-69.
\[8\] Hobbs, Jerry R., 1985b, "The Logical Notation: Onto- 
logical Promiscuity", manuscript. 
\[9\] Hobbs, Jerry R., 1986. "Overview of the TACITUS
Project", Computational Linguistics, Vol. 12, No. 3.
\[10\] Hobbs, Jerry R., William Croft, Todd Davies,
Douglas Edwards, and Kenneth Laws, 1986. "Commonsense
Metaphysics and Lexical Semantics", Proceedings, 24th
Annual Meeting of the Association for Computational
Linguistics, New York, June 1986, pp. 231-240.
\[11\] Hobbs, Jerry R., and Paul Martin, 1987. "Local
Pragmatics", Proceedings, International Joint Conference on
Artificial Intelligence, pp. 520-523. Milano, Italy,
August 1987.
\[12\] Joos, Martin, 1972. "Semantic Axiom Number One", 
Language, pp. 257-265. 
\[13\] Kowalski, Robert, 1980. The Logic of Problem
Solving, North Holland, New York.
\[14\] Levi, Judith, 1978. The Syntax and Semantics of
Complex Nominals, Academic Press, New York.
\[15\] Norvig, Peter, 1987. "Inference in Text
Understanding", Proceedings, AAAI-87, Sixth National
Conference on Artificial Intelligence, Seattle, Washington, July
1987.
\[16\] Nunberg, Geoffrey, 1978. "The Pragmatics of
Reference", Ph.D. thesis, City University of New York, New
York.
\[17\] Pereira, Fernando C. N., and Martha E. Pollack, 1988.
"An Integrated Framework for Semantic and Pragmatic
Interpretation", to appear in Proceedings, 26th Annual
Meeting of the Association for Computational
Linguistics, Buffalo, New York, June 1988.
\[18\] Pereira, Fernando C. N., and David H. D. Warren,
1983. "Parsing as Deduction", Proceedings of the 21st
Annual Meeting, Association for Computational
Linguistics, pp. 137-144. Cambridge, Massachusetts, June
1983.
\[19\] Pople, Harry E., Jr., 1973. "On the Mechanization
of Abductive Logic", Proceedings, Third International
Joint Conference on Artificial Intelligence, pp. 147-152,
Stanford, California, August 1973.
\[20\] Stickel, Mark E., 1982. "A Nonclausal
Connection-Graph Theorem-Proving Program", Proceedings,
AAAI-82 National Conference on Artificial Intelligence,
Pittsburgh, Pennsylvania, pp. 229-233.
\[21\] Stickel, Mark E., 1988. "A Prolog-like Inference Sys- 
tem for Computing Minimum-Cost Abductive Explana- 
tions in Natural-Language Interpretation", forthcoming. 
\[22\] Thagard, Paul R., 1978. "The Best Explanation: Cri- 
teria for Theory Choice", The Journal of Philosophy, 
pp. 76-92. 
\[23\] Wilks, Yorick, 1972. Grammar, Meaning, and the
Machine Analysis of Language, Routledge and Kegan Paul,
London.
