A Critical Evaluation of Commensurable Abduction Models 
for Semantic Interpretation * 
Peter Norvig and Robert Wilensky 
University of California, Berkeley 
Computer Science Division, Evans Hall 
Berkeley, CA 94720, USA 
Introduction 
Language interpretation involves mapping from a string 
of words to a representation of an interpretation of those 
words. The problem is to be able to combine evidence 
from the lexicon, syntax, semantics, and pragmatics to
arrive at the best of the many possible interpretations.
Given the well-worn sentence "The box is in the pen,"
syntax may say that "pen" is a noun, while lexical knowledge
may say that "pen" most often means writing implement,
less often means a fenced enclosure, and very
rarely means a female swan. Semantics may say that the
object of "in" is often an enclosure, and pragmatics may
say that the topic is hiding small boxes of illegal drugs
inside aquatic birds. Thus there is evidence for multiple
interpretations, and one needs some way to decide between
them.
In the past few years, some general approaches to interpretation
have been advanced within an abduction framework.
Charniak (1986) and Norvig (1987, 1989) are two
examples. Abduction is a term coined by Peirce (1955)
to describe the (unsound) inference rule that concludes A
from the observation B and the rule \(A \supset B\), along with
the fact that there is no "better" rule explaining B.
In this paper we critically evaluate three recent abductive
interpretation models, those of Charniak and Goldman
(1989); Hobbs, Stickel, Martin and Edwards (1988);
and Ng and Mooney (1990). These three models add the
important property of commensurability: all types of ev- 
idence are represented in a common currency that can be 
compared and combined. While commensurability is a 
desirable property, and there is a clear need for a way to 
compare alternate explanations, it appears that a single 
scalar measure is not enough to account for all types of 
processing. We present other problems for the abductive 
approach, and some tentative solutions. 

*Sponsored by the Defense Advanced Research Projects
Agency (DoD), Arpa Order No. 4871, monitored by Space and
Naval Warfare Systems Command under Contract N00039-84-C-0089.
This paper benefited from discussions with Michael
Braverman, Dan Jurafsky, Nigel Ward, Dekai Wu, and other
members of the BAIR seminar.

Cost Based Commensurability
Hobbs et al. (1988) view interpreting sentences as "providing
the best explanation of why the sentences would
be true." In this view a given sentence (or an entire text)
is translated by an ambiguity-preserving parser into a logical
form, L. Each conjunct in the logical form is annotated
by a number indicating the cost, $C, of assuming the
conjunct to be true. Conjuncts corresponding to "new"
information have a low cost of assumability, while those
corresponding to "given" information have a higher cost,
since to assume them is to fail to find the proper connection
to mutual knowledge. Each conjunct must be either
assumed or proved, using a rule or series of rules from the
knowledge base. Each rule also has cost factors associated
with it, and the proper interpretation, I, is the set of
propositions with minimal cost that entails L.
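As an example, consider again the sentence "The box
is in the pen." The cost-annotated logical form (in a simplified
notation omitting quantifiers) is:

\[
L = \mathit{box}(x)^{\$10} \wedge \mathit{pen}(y)^{\$10} \wedge \mathit{in}(x, y)^{\$3}
\]

where \(P^{\$x}\) means the final interpretation must either
assume P for $x, or prove P, presumably for less. Consider
the proof rules:

\[
\begin{array}{l}
\mathit{writing-pen}(x)^{.9} \supset \mathit{pen}(x) \\
\mathit{enclosure}(x)^{.3} \wedge \mathit{fenced}(x)^{.3} \wedge \mathit{etc}_1(x)^{.3} \supset \mathit{pen}(x) \\
\mathit{female}(x)^{.3} \wedge \mathit{swan}(x)^{.6} \supset \mathit{pen}(x) \\
\mathit{enclosure}(y)^{.3} \wedge \mathit{inside}(x, y)^{.6} \supset \mathit{in}(x, y)
\end{array}
\]
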
The first rule says that anything that is a writing-pen
is also a member of the class 'pen'--things that can be
described with the word "pen". The superscripted numbers
are preference information: the first rule says that
\(\mathit{pen}(x)^{\$10}\) can be derived by assuming
\(\mathit{writing-pen}(x)^{\$9}\).
Predicates of the form \(\mathit{etc}_i(x)\), as in the second rule,
denote conditions that are stated elsewhere, or, for some
natural kind terms, cannot be fully enumerated, but can
only be assumed. They seem to be related to the abnormal
predicates, ab(x), used in circumscription theory
(McCarthy 1986).
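Below are two interpretations of L. The first just assumes
the entire logical form for $23, while the second
applies the rules and shares the enclosure(y) predicate
common to one of the definitions of pen(y) and the definition
of in(x, y) to arrive at a $20.80 solution.

\[
\mathit{box}(x)^{\$10} \wedge \mathit{pen}(y)^{\$10} \wedge \mathit{in}(x, y)^{\$3}
\]

\[
\begin{array}{l}
\mathit{box}(x)^{\$10} \wedge \mathit{enclosure}(y)^{\$3} \wedge \mathit{fenced}(y)^{\$3} \\
\quad \wedge\; \mathit{etc}_1(y)^{\$3} \wedge \mathit{enclosure}(y)^{\$0} \wedge \mathit{inside}(x, y)^{\$1.8}
\end{array}
\]
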
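The two totals can be checked mechanically. The following is a minimal sketch under our reading of the example, not code from Hobbs et al.: the helper names `rule_costs` and `total_cost` are hypothetical, costs are kept in cents and weights in hundredths to avoid floating-point noise, and a proposition that has already been assumed is charged nothing when it is reused.

```python
# Illustrative sketch only: costs are integers in cents, weights in hundredths.

def rule_costs(goal_cost, weighted_antecedents):
    """Cost of each antecedent when a goal costing `goal_cost` is proved
    by a rule whose antecedents carry the given weights."""
    return [(name, goal_cost * weight // 100)
            for name, weight in weighted_antecedents]

def total_cost(conjuncts):
    """Sum assumption costs, charging a repeated proposition only once."""
    assumed = set()
    total = 0
    for name, cost in conjuncts:
        if name not in assumed:
            total += cost
            assumed.add(name)
    return total

# Logical form: box(x) at $10, pen(y) at $10, in(x, y) at $3.
assume_all = [("box(x)", 1000), ("pen(y)", 1000), ("in(x,y)", 300)]

# Prove pen(y) by the fenced-enclosure rule and in(x, y) by the enclosure rule.
pen_proof = rule_costs(1000, [("enclosure(y)", 30), ("fenced(y)", 30), ("etc1(y)", 30)])
in_proof = rule_costs(300, [("enclosure(y)", 30), ("inside(x,y)", 60)])
shared = [("box(x)", 1000)] + pen_proof + in_proof

print(total_cost(assume_all) / 100)  # 23.0
print(total_cost(shared) / 100)      # 20.8
```

The saving comes entirely from the shared enclosure(y): it is paid for once, under pen(y), and reused for free under in(x, y).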
The second enclosure(y) gets a cost of $0 because
it has already been assumed. We stress that the details
here are ours, and the authors may have a different
treatment of this example. For example, they do not discuss
lexical ambiguity, although we believe we have been
faithful to the sense of their proposal.
This approach has several problems, as we see it:
(1) A single number is being used for two separate 
measures: the cost of the assumptions and the quality 
of the explanation. Hobbs et al. hint at this when they 
discuss the "informativeness-correctness tradeoff." Consider
their example "lube-oil alarm," which gets translated
as:

\[
\mathit{lubeoil}(o)^{\$5} \wedge \mathit{alarm}(a)^{\$5} \wedge \mathit{nn}(o, a)^{\$20}
\]

where nn means noun-noun compound. It is given a high
cost, $20, because failing to find the relation means failing
to fully understand the referent. Intuitively this motivation
is valid. However, the nn should have a very low
cost of assumption, because there is very strong evidence 
for it--the juxtaposition of two nouns in the input--so 
there is little doubt that nn holds. Thus we see nn should 
have two numbers associated with it: a low cost of as- 
sumption, and a low quality of explanation. It should not 
be surprising to see that two numbers are needed to search 
for an explanation: even in A* search one needs both a 
cost function, g, and a heuristic function, h'.
The low quality of explanation is often the sign of a 
need to search for a better explanation, but the need depends
on the task at hand. To diagnose a failure in the
compressor, it is useful to know that a "lube-oil alarm"
is an alarm that sounds when the lube-oil pressure is low,
and not, say, an alarm made out of lube-oil. However,
if the input was "Get me a box of lube-oil alarms from
the warehouse," then it may not be necessary to further
explain the nn relation.[1] Mayfield (1989) characterizes a
good explanation as being applicable to the needs of the 
explanation's user, grounded in what is already known, 
and completely accounting for the input. 
To put it another way, consider the situation where a 
magician pulls a rabbit out of his hat. One possible ex- 
planation is that the rabbit magically appeared in the hat. 
This explanation is of very high quality--it perfectly ex- 
plains the situation--but it has a prohibitive assumption 
cost. An alternate explanation is that the magician somehow
used sleight of hand to insert the rabbit in the hat
when the audience was distracted. This is of fairly low 
quality--it fails to completely specify the situation--but 
it has a much lower assumption cost. Whether this is a 
sufficient explanation depends on the task. For a casual 
observer it may well do, but for a rival magician trying to
steal the trick, a better explanation is needed. 

[1] Translating "lube-oil alarm" as \((\exists o)\,\mathit{lubeoil}(o)\) is suspect;
in the case of an alarm still in the box, there is not yet any
particular oil for which it is the alarm.

(2) Translating, say, "the pen" as \(\mathit{pen}(y)^{\$10}\) conflates
two issues: the final interpretation must find a referent, y,
and it must also disambiguate "pen". It is true that definite
noun phrases are often used to introduce new information,
and thus must be assumed, but an interpretation
that does not disambiguate "pen" is not just making an
assumption--rather it is failing altogether. One could accommodate
this problem by writing disambiguation rules
where the sum of the left-hand-side components is less
than 1. Thus, the system will always prefer to find some
interpretation for "pen", rather than leaving it ambiguous.
In the case of vagueness rather than ambiguity, one would
probably want the left-hand side to total greater than 1.
For example, in "He saw her duck", the word "duck" is
ambiguous between a water fowl and a downward movement,
and any candidate solution should be forced to decide
between the two meanings. In contrast, "he" is vague
between a boy and a man, but it is not necessary for a valid
interpretation to make this choice. We could model this
with the rules:
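
\[
\begin{array}{l}
\mathit{duckfowl}(x)^{.9} \supset \mathit{duck}(x) \\
\ldots(x)^{.9} \supset \mathit{duck}(x) \\
\mathit{male}(x)^{.9} \wedge \mathit{adult}(x)^{.2} \supset \mathit{he}(x) \\
\mathit{male}(x)^{.9} \wedge \mathit{child}(x)^{.2} \supset \mathit{he}(x)
\end{array}
\]
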
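Why the left-hand-side total matters can be seen in a small sketch. It assumes, as in the cost scheme above, that proving a conjunct by a rule costs the conjunct's assumption cost times the sum of the rule's weights; the function name `cheaper_to_prove` is ours, and weights are written in hundredths.

```python
def cheaper_to_prove(weights_pct):
    """True when applying the rule costs less than assuming its conclusion,
    i.e. when the antecedent weights (in hundredths) total less than 1."""
    return sum(weights_pct) < 100

# duck(x) via the water-fowl sense: one antecedent with weight .9
print(cheaper_to_prove([90]))      # True: the system prefers to disambiguate
# he(x) via male(x) at .9 and child(x) at .2: total 1.1
print(cheaper_to_prove([90, 20]))  # False: leaving "he" vague is cheaper
```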
However, this alone is not enough. Consider the sentence
"The pen is in the box." By the rules above (and assuming
a box is defined as an enclosure) we could derive
three interpretations, where either a writing implement, 
a swan, or a fenced enclosure is inside a box. All three 
would get a cost of $20.8. To choose among these three, 
we would have to add knowledge about the likelihood of 
these three things being in boxes, or add knowledge about 
the relative frequencies of the three senses of "pen". For 
example, we could change the numbers as follows: 
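
\[
\begin{array}{l}
\mathit{writing-pen}(x)^{.9} \supset \mathit{pen}(x) \\
\mathit{enclosure}(x)^{.31} \wedge \mathit{fenced}(x)^{.31} \wedge \mathit{etc}_1(x)^{.31} \supset \mathit{pen}(x) \\
\mathit{female}(x)^{\ldots} \wedge \mathit{swan}(x)^{.9} \supset \mathit{pen}(x)
\end{array}
\]
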
This has the effect of making the writing implement 
sense slightly more likely than the fenced enclosure 
sense, and much more likely than the female swan sense.
These rules maintain the desirable property of commensurability,
but the numbers are now even more overloaded.
Hobbs et al. already are giving the numbers responsibility
for both "probabilities" and "semantic relatedness",
and now we have shown they must account for
word frequency information, and both the cost of assump- 
tions and the quality of the explanation, the two measures 
needed to control search. As our previous criticisms have 
shown, a single number cannot represent even the cost 
and quality of an explanation, much less these additional 
factors. 
Also note that to constrain search, it is important to
consider bottom-up clues, as in (Charniak 1986) and 
(Norvig 1987). It would be a mistake to use the rules 
given here in a strictly top-down manner, just because 
they are reminiscent of Prolog rules. 
(3) There is no notion of a "good" or "bad" interpre- 
tation, except as an epiphenomenon of the interpretation 
rules. In the "pen" example, the difference between failing
completely to understand "pen" and properly disambiguating
it to fenced-enclosure is less than 10% of the
total cost. The numbers in the rules could be changed
to increase this difference, but it would still be a quantitative
rather than qualitative difference. The problem is
that there are at least three reasons why we might want to
maintain ambiguity: because we are unsure of the cause
of an event, because it is so mundane as to not need an 
explanation, and because it is so unbelievable that there 
is no explanation. This theory does not distinguish these 
cases. The theory has no provision for saying "I don't 
understand--the only interpretation I can find is a faulty 
one," and then looking harder for a better interpretation. 
(4) There is no way to enforce a penalty worse than the
cost of an assumption. Consider the sentence "Mary said
she had killed herself." The logical form is something
like:

\[
\mathit{say}(\mathit{Mary}, x)^{\$3} \wedge x = \mathit{kill}(\mathit{Mary}, \mathit{Mary})^{\$3}
\]

Thus, for $6 we can just assume the logical form, without
noticing the inherent contradiction. Now let's consider
some rules. We've collapsed most of the interesting parts
of these rules into etc predicates, leaving just the parts
relevant to the contradiction:

\[
\begin{array}{l}
\mathit{alive}(p)^{.1} \wedge \mathit{etc}_2(p, x)^{.9} \supset \mathit{say}(p, x) \\
\neg\mathit{alive}(p)^{.5} \wedge \mathit{etc}_3(m, p)^{.5} \supset \mathit{kill}(m, p)
\end{array}
\]

We've ignored time here, but the intent is that the alive 
predicate is concerned with the time interval or situation
after the killing, including the time of the saying. Now,
an alternative interpretation of L is:

\[
\begin{array}{l}
\mathit{alive}(\mathit{Mary})^{\$.3} \wedge \neg\mathit{alive}(\mathit{Mary})^{\$1.5} \\
\quad \wedge\; \mathit{etc}_2(\mathit{Mary}, x)^{\$2.7} \wedge \mathit{etc}_3(\mathit{Mary}, \mathit{Mary})^{\$1.5}
\end{array}
\]

Presumably there should be some penalty (finite or infinite)
for deriving a contradiction, so this interpretation
will total more than $6. The problem is there is no way to
propagate this contradiction back up to the first interpretation,
where we just assumed both clauses. We would like
to penalize that interpretation, too, so that it costs more
than $6, but there is no way to do so.
A solution to this problem is to legislate that rather than 
finding a solution to the logical form of a sentence, L, the
hearer must find a solution to the larger set of propositions,
L', where L' is derived from L by some process of
direct, "obvious" inference. We do not want the full de- 
ductive closure from L, of course, but we want to allow 
for some amount of automatic forward chaining from the 
input. 
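A toy version of this proposal might look as follows. It is only a sketch: the two forward rules (that a speaker is alive, and that a victim is not) are hypothetical stand-ins for "obvious" inference and are not the abductive rules given above, time is ignored as before, and the names `forward_closure` and `has_contradiction` are invented for the example.

```python
# Hypothetical "obvious" forward rules; the text does not specify them.
def forward_closure(facts):
    """Return L': the input facts plus one round of obvious consequences."""
    derived = set(facts)
    for fact in facts:
        if fact[0] == "say":        # say(p, x): the speaker p is alive
            derived.add(("alive", fact[1]))
        elif fact[0] == "kill":     # kill(m, p): the victim p is not alive
            derived.add(("not-alive", fact[2]))
    return derived

def has_contradiction(facts):
    """True if some individual is derived to be both alive and not alive."""
    alive = {f[1] for f in facts if f[0] == "alive"}
    dead = {f[1] for f in facts if f[0] == "not-alive"}
    return bool(alive & dead)

L = {("say", "Mary", "x"), ("kill", "Mary", "Mary")}
print(has_contradiction(L))                   # False: assuming L alone hides the clash
print(has_contradiction(forward_closure(L)))  # True: L' exposes it
```

Once the clash is visible in L', a penalty can be attached to the interpretation that merely assumes both clauses.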
(5) We would like to be able to go on and find alternative
explanations, perhaps one where Mary is speaking
from the afterworld, or she is lying, or the speaker is ly- 
ing. One could imagine rules for truthful and untruthful 
saying, and such rules could be applied to Mary's speech 
act. However, since the goal of the interpretation process 
is "providing the best explanation of why the sentences 
would be true," it does not seem that we could use the 
rules to consider the possibility of the speaker being untruthful.
The truth of the text is assumed by the model,
and the speaker is not modeled. 
Probability Based Commensurability 
Charniak and Goldman (1988) started out with a model 
very similar to Hobbs et al., but became concerned with 
the lack of theoretical grounding for the numbers in rules,
much as we were. Charniak and Goldman (1989a, 1989b)
switched to a system based strictly on probabilities in
the world, combined by Bayesian probability theory. Although
this solves some problems, other problems remain,
and some new ones are introduced. For example:
(1) The approach in (1989a) is based on "events and 
objects in the real world". As the authors point out, it 
cannot deal with texts involving modal verbs, nor can it 
deal with speech acts by characters, or texts where the 
speaker is uncooperative. So problem (4) above remains. 
(2) Because the probabilities are based on events in the
real world, the basic system often failed to find stories as
coherent as they should be. For example, the text:
    Jack got a rope. He killed himself.
suggests suicide by hanging when interpreted as a text,
but when interpreted as a partial report of events in the
world, that interpretation is less compelling. (After all,
the killing may have taken place years after the getting.)
It is only when the two events are taken as a part of a
coherent text that we assume they are related, temporally
and causally. In Charniak and Goldman (1989a),
the coherence of stories is explained by a (probabilistic)
assumption of spatio-temporal locality between events
mentioned in adjacent sentences in the text. Thus the
story would be treated roughly as if it were:
    Jack got a rope. Soon after, nearby, a male was found
    to have killed himself.
The Bayesian networks compute a probability of hanging
of .3; this seems about right for the latter story, but too low
for the original version.
Perhaps anticipating some of these problems, Charniak
and Goldman (1989b) introduce an alternate approach involving
a parameter, E, which denotes the probability
that two arbitrary things are the same. They claim that
in stories this parameter should be set higher than in real
life, and that this will lead, for example, to a high probability
for the interpretation where the rope that Jack got
is the one he used for hanging. But E does a poor job of
capturing the notion of coherence. Consider:
    John picked an integer from one to ten. Mary did so
    too.
Here the probability that they picked the same number
should be .1, regardless of whether we are observing real
life or reading a story, and regardless of the value of E.
Charniak and Goldman (1989b) go on to propose a theory
of "mention" rather than a theory of coincidence, but
they do not develop this alternative.
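The figure of .1 in the integer example follows from simple counting, as the sketch below shows. It assumes, which the text leaves implicit, that John and Mary each pick uniformly and independently; the variable names are ours.

```python
from fractions import Fraction
from itertools import product

choices = range(1, 11)                      # the integers from one to ten
pairs = list(product(choices, repeat=2))    # (John's pick, Mary's pick)
same = sum(1 for john, mary in pairs if john == mary)
p_same = Fraction(same, len(pairs))
print(p_same)  # 1/10
```

Nothing in the count depends on whether the picks are observed or narrated, which is why a story-wide parameter such as E should not change the answer.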
(3) It seems that for many inferences, frequency in the 
world does not play an important role at all. Consider the 
text: 
    Jack wanted to tie a mattress on top of his car. He also
    felt like killing himself. He got some rope.
Now, the probability of getting a rope to hang oneself 
given suicidal feelings must be quite low, maybe .001, 
while the probability of getting a rope for tying given 
a desire to secure a mattress is much higher, maybe .5. 
Thus the Charniak-Goldman model would strongly pre- 
fer the latter interpretation. With the "mention" theory, 
it would like both interpretations. Yet a sample of informants
mostly found the text confusing--they reported
finding both interpretations, and were unable to choose
between them. It would be useful to find a better characterization
of when frequencies in the world are useful,
and when they appear to be ignored in favor of some more 
discrete notion of "reasonable connection." 
Problems With Both Models 
Neither model is completely explicit on how the final explanation
is constructed, or on what to do with the final
explanation. In a sense, Hobbs et al.'s system is
like a justification-based truth-maintenance system that 
searches for a single consistent state, possibly explor- 
ing other higher-cost states along the way. Charniak 
and Goldman's system is like an assumption-based truth-maintenance
system (ATMS) that keeps track of all possible
worlds in one grand model, but needs a separate interpretation
process to extract consistent solutions. Thus,
the system does not really do interpretation to the level
that could lead to decisions. Rather, it provides evidence
upon which decisions can be based. 
Both approaches are problematic. Imagine the situa- 
tion where a hearer is driving a car, and is about to enter 
an intersection when a traffic officer says "don't - stop". 
The hearer derives two possible interpretations, one cor- 
responding to "Don't stop." and the other correspond- 
ing to "Don't. Stop." Hobbs et al.'s system would assign 
costs and choose the one with the lower cost, no matter how
slight the difference. A more prudent course of action 
might be to recognize the ambiguity, and seek more infor- 
mation to decide what was intended. Charniak and Gold- 
man's system would assign probabilities to each proposi- 
tion, but would offer no assistance as to what to do. How- 
ever, if the model were extended from Bayesian networks 
to influence diagrams, then a decision could be made, and 
it would also be possible to direct search to the important 
parts of the network. 
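To make the influence-diagram suggestion concrete, here is a minimal sketch of a decision made by expected utility rather than by the most probable reading. Every number in it is invented for illustration, and the names `p`, `utility`, and `expected_utility` are ours; it is not a description of either system.

```python
# All numbers below are invented for illustration.
p = {"dont_stop": 0.6, "stop": 0.4}     # P(intended meaning)
utility = {                              # utility[action][intended meaning]
    "proceed": {"dont_stop": 1, "stop": -100},
    "halt": {"dont_stop": -1, "stop": 1},
}

def expected_utility(action):
    """Average the utility of an action over the possible meanings."""
    return sum(p[meaning] * utility[action][meaning] for meaning in p)

best = max(utility, key=expected_utility)
print(best)  # halt
```

Here "Don't stop." is the more probable reading, yet halting is the better action because a wrong guess in the other direction is so costly.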
Deliberate ambiguity is also a problematic area. In a 
pun, for example, the speaker intends that the hearer re- 
cover two distinct interpretations. Such subtlety would be 
lost on the models discussed here. This issue is discussed 
in more depth in Norvig (1988). 
A number of arguments show that strict maximization 
of probability (or minimization of cost) is a bad idea.
First, as we have seen, we must sometimes admit that 
an input is truly ambiguous (intentionally or unintention- 
ally). 
Second, there is the problem of computational com- 
plexity. Algorithms that guarantee a maximal solution 
take exponential time for the models discussed here. 
Thus, a large-scale system will be forced to make some 
sort of approximation, using a less costly algorithm. This 
is particularly true because we desire an on-line system-- 
one that computes a partial solution after each word is 
read, and updates the solution in a bounded period of 
time. 
Third, communication by language has the property 
that "the speaker is always right". In chess, if I play optimally
and my opponent plays sub-optimally, I win. But in
language understanding, if I abduce the "optimal" inter- 
pretation when the speaker had something else in mind, 
then we have failed to communicate, and I in effect lose. 
Put another way, there is a clear "evolutionary" advantage
for optimal chess strategies, but once language has
evolved to the point where communication is possible,
there is no point for a hearer to try to change his interpretation
strategy to derive what an optimal speaker would
have uttered to an optimal hearer--because there are no
such optimal speakers. Indeed, there is an advantage for
communication strategies that can be computed quickly,
allowing the participants to spend time on other tasks.
By the second point above, such a strategy must be sub- 
optimal. 
Earlier we said that Charniak and Goldman (1989b) introduced
the parameter E to account for the coherence of
stories. But they also provide a brief sketch of another ac- 
count, one where, in addition to deriving probabilities of 
events in the world, we also consider the probability that 
the speaker would mention a particular entity at all. Such 
a theory, if worked out, could account for the difficulty in 
processing speech acts that we have shown both models 
suffer from. 
However, a theory of "mention" alone is not enough.
We also need theories of representing, intending, believing,
directly implying, predicting, and acting. The chain
of reasoning and acting includes at least the following: 
H attends to utterance U by speaker S 
H infers "S said U to H" 
H infers "L represents U" 
H infers "L directly implies L'" 
H infers "S intended H to believe S believes L"
H infers "S intended H to believe L'" 
H believes a portion of L' compatible with H's beliefs 
H forms predictions about S's future speech acts 
H acts accordingly 
This still only covers the case of successful, cooperative 
communication, and it leaves out some steps. A successful
model should be able to deal with all these rules, when
necessary. However, the successful model should also be 
able to quickly bypass the rules in the default case. We 
believe that the coherence of stories stems primarily from 
the speaker presenting evidence to the hearer in a fashion 
that will lead the hearer to focus his attention on the evi- 
dence, and thereby derive the inferences intended by the 
speaker. Communication is possible because it consists 
primarily of building a single shared explanation. It is 
only in unusual cases where there are multiple possibili- 
ties that must be weighed against each other and carried 
forth. 
Both models seem to have difficulty distinguishing am- 
biguity from multiple explanations. This makes a differ- 
ence in cases like the following: 
    John was wondering about lunch when it started to
    rain. He ran into a restaurant.
Here there are two reasons why John would enter the
restaurant--to satisfy hunger and to avoid the rain. In
other words, there are two explanations, say, \(A \supset R\) and
\(B \supset R\), and we would like to combine them to yield
\(A \wedge B \supset R\). As we understand it, Hobbs et al. appear
to use "exclusive or" in all cases, so they would not find
this explanation. Charniak and Goldman allow competing
explanations to be joined by an "or" node, but require
competing lexical senses to be joined by "exclusive or"
nodes. So they would find \(A \vee B \supset R\). In other words,
they would find both explanations probable, which is not
quite the same thing as finding the conjunction probable.
Now consider: 
    He's a real sweetheart.
This has a straight and an ironic reading: \(\mathit{sweetheart}(x)\)
and \(\neg\mathit{sweetheart}(x)\). The disjunction is a tautology
and the conjunction is a contradiction, so in this case
the Hobbs approach of keeping the alternatives separate
scores better than allowing their disjunction. Finally, consider:
    Mary was herding water fowl while dodging hostile
    gunfire. John saw her duck.
Here we do not want to combine the two interpretations
into a single interpretation. If we amend a model to allow
multiple explanations, we must be careful that we don't
go too far.
Coherence Based Commensurability 
Much of the criticism above stems from the lack of a 
model of textual coherence. Intuitively, an explanation 
that makes the text cohere (by finding propositions that 
relate pieces of the text to other pieces) will be preferred 
to an equi-probable explanation that is not as coherent.
Ng and Mooney (1990) attempt to formally define a measure
of coherence. In their model the logical form of the
input text is taken as a set of propositions that they call ob- 
servations. The interpretation is the conjunction of these 
observations with another set of propositions called as- 
sumptions, where each assumption is introduced as a node 
in the proof graph of the observations. The most coherent
interpretation is the one that maximizes the number
of nodes that support (directly or indirectly) two observations.
Nodes are counted multiple times if they support
multiple pairs of observations. The coherence metric is 
normalized to the range 0-1 by dividing the count by the 
total number of possible connections. 
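The metric as described can be sketched as follows. This is our reading, not Ng and Mooney's code, and it rests on three assumptions: that a node supports itself as well as everything reachable from it, that the "total number of possible connections" is the number of nodes times the number of observation pairs, and that the proof graphs for their "John was happy. The exam was easy." text look as encoded here. The names `supports` and `coherence` and the predicate strings are invented for the example.

```python
from itertools import combinations

def supports(graph, node):
    """The node itself plus every node reachable along 'supports' edges."""
    seen, stack = {node}, [node]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def coherence(graph, observations):
    """Count (node, observation-pair) connections, normalized to 0-1."""
    nodes = set(graph) | set(observations)
    for targets in graph.values():
        nodes.update(targets)
    pairs = list(combinations(observations, 2))
    count = sum(1 for n in nodes for a, b in pairs
                if {a, b} <= supports(graph, n))
    return count / (len(nodes) * len(pairs))

obs = ["happy(john)", "exam(e)", "easy(e)"]
# Interpretation 1: John is an optimist, and optimists are happy.
optimist = {"optimist(john)": ["happy(john)"]}
# Interpretation 2: studying for and taking an easy exam leads to success,
# and success makes one happy.
exam_story = {
    "study(john,e)": ["succeed(john,e)"],
    "take(john,e)": ["succeed(john,e)"],
    "exam(e)": ["succeed(john,e)"],
    "easy(e)": ["succeed(john,e)"],
    "succeed(john,e)": ["happy(john)"],
}
print(coherence(optimist, obs))                               # 0.0
print(coherence(optimist, obs) < coherence(exam_story, obs))  # True
```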
Ng and Mooney give as an example the text "John was
happy. The exam was easy." They propose two interpretations.
The first relies on the (assumed) rule that optimists
are happy. Thus, by making the single assumption
that John is an optimist, one can explain the fact that
he is happy. This takes one assumption to make one explanation,
but it gets a coherence score of 0, because no
pairs of observations are tied together. The other interpretation
makes use of the (assumed) rules that succeeding
on something makes one happy, and that studying
for and taking an easy exam leads to succeeding on the
exam. This makes two assumptions (studying and taking
the exam) and again explains only one input (that John
was happy), so it fares worse on the ratio of explanations
to assumptions. However, it has a higher coherence
score, because it ties together the exam and the exam being
easy with John being happy. Therefore, they conclude
that coherence is more important than other metrics considered
here.
We tend to agree with this conclusion. While the other 
models may be able to duplicate this particular example, 
consider "John was happy. The winning lottery number 
was 174625." Here, we assert, the best interpretation is 
that John has number 174625 and has won the lottery. 
However, a probability-based model would have to put a 
very low probability on John having that particular number,
and would prefer some other explanation for his happiness.
However, we do not feel that coherence should be used 
alone without some notion of relative costs or probabili- 
ties, nor do we feel that Ng and Mooney have accurately 
captured the notion of coherence. There are several prob- 
lems with their metric. 
First, recall that if an assumption A supports two ob- 
servations, it adds to the coherence metric, and that any 
further assumption that supports A also adds to the metric.
Thus, Ng and Mooney prefer explanations with arbitrar- 
ily long backward chaining, no matter how improbable 
the connection. They have no way to cut off the explana- 
tion at an appropriate level of detail. 
Second, they do not attempt to choose between alternatives.
For example, in the "John was wondering about
lunch when it started to rain. He ran into a restaurant."
example, they would accept both explanations. Since two 
explanations are always more coherent than one, they can 
never reject a coherent explanation in favor of a better 
one. 
Third, they have no way to guide the search for coher- 
ence to the right places. Suppose the input consists of 
three observations, and that the first two have been tied 
together. It would be prudent to try to tie the third observation
in with the first two, but the coherence metric
gives just as many points for finding another explanation 
for the first two as for connecting the third to one of the 
others. To their credit, Ng and Mooney discuss a heuristic 
search mechanism that is guided by coherence. We feel 
this is the right idea, but not with their exact coherence 
metric. 
Finally, while coherence is important, it is not the only 
criterion by which texts are constructed. Consider "John
was 22 years old. He lives in California." This is per- 
fectly acceptable as the setting for a story to come. We 
would not want to try to explain this passage by assum- 
ing, for example, that John thought that California was a 
good place for 22-year-olds to live, and thus he moved
there. 
Conclusions 
Abduction is a good model for language interpretation, 
and commensurability is a vital component of an abduc- 
tion system. But the models discussed here have serious 
limitations, due to technical problems, and due to a failure
to embrace language as a complex activity, involving
actions, goals, beliefs, inferences, predictions, and
the like. We don't believe that knowledge of probabil- 
ity in the world, plus a few general principles (such as E) 
can lead to a viable theory of language use. This "com- 
plicated" side of language has been studied in depth for 
over a decade (a list very similar to our chain of reason- 
ing and acting appears in Morgan (1978)), so our task is 
clear: to marry these pre-theoretic "complicated" notions 
with the formal apparatus of commensurable abductive
interpretation schemes. 

References 

Charniak, E. (1986) A neat theory of marker passing, AAAI-86.

Charniak, E. and Goldman, R. (1988) A logic for se- 
mantic interpretation, Proc. of the 26th Meeting of 
the ACL. 

Charniak, E. and Goldman, R. (1989a) A semantics for
probabilistic quantifier-free first-order languages,
with particular application to story understanding,
IJCAI-89.

Charniak, E. and Goldman, R. (1989b) Plan recognition
in stories and in life, Uncertainty Workshop, IJCAI-89.

Hobbs, J. R., Stickel, M., Martin, P. and Edwards, D. 
(1988) Interpretation as abduction, Proc. of the 26th 
Meeting of the ACL. 

Mayfield, J. M. (1989) Goal analysis: Plan recognition
in dialogue systems, Univ. of Cal. Berkeley EECS 
Dept. Report No. UCB/CSD 89/521. 

McCarthy, J. (1986) Applications of circumscription to 
formalizing common-sense knowledge. Artificial 
Intelligence, 26(3). 

Morgan, J. L. (1978) Toward a rational model of dis- 
course comprehension. Theoretical Issues in Natu- 
ral Language Processing. 

Ng, H. T. and Mooney, R. J. (1990) The role of coher- 
ence in constructing and evaluating abductive expla- 
nations. Proceedings of the AAAI Spring Sympo- 
sium on Automated Abduction. 

Norvig, P. (1987) A Unified Theory of Inference for Text 
Understanding. Univ. of Cal. Berkeley EECS Dept. 
Report No. UCB/CSD 87/339. 

Norvig, P. (1988) Multiple simultaneous interpretations 
of ambiguous sentences. Proc. of the 10th Annual
Conference of the Cognitive Science Society. 

Norvig, P. (1989) Marker passing as a Weak Method for
Text Inferencing. Cognitive Science, 13, 4, 569-620. 

Peirce, C. S. (1955) Abduction and Induction. Dover,
NY. 
