USING LINGUISTIC, WORLD, AND CONTEXTUAL KNOWLEDGE 
IN A PLAN RECOGNITION MODEL OF DIALOGUE ~ 
LYNN LAMBERT and SANDRA CARBERRY 
Department of Computer and Information Sciences 
University of Delaware 
Newark, Delaware 19716, USA 
email : lambert@cia, udel, edu, earberry@cis, udel, edu 
Abstract 
This paper presents a plan-based model of dialogue 
that combines world, linguistic, and contextual knowl- 
edge in order to recognize complex communicative ac- 
tions such as expressing doubt. Linguistic knowledge 
suggests certain discourse acts, a speaker's beliefs, aud 
the strength of those beliefs; contextual knowledge sug- 
gests the most coherent continuation of the dialogue; 
and world knowledge provides evidence that the appli- 
cability conditions hold for those discourse acts that 
capture the relationship of the current utterance to the 
discourse as a whole. 
1 Introduction 
Recognizing the roles that utterances play in a 
dialogue and how the utterances should be interpreted 
in the context of preceding dialogue is a crucial part of 
a robust model of understanding. In order to perform 
this recognition, our tripartite plan-based model of di- 
alogue identifies not only domain and problem-solving 
actions but also discourse or communicative actions 
that determine how utterances relate to each other. 
For this communicative action recognition, we com- 
bine information gleaned from a variety of knowledge 
sources: contextual, linguistic, and world knowledge. 
The combination of these different knowledge sources 
enables the recognition of complex communicative ac- 
tions such as expressing donbt. Although our tripar- 
tite model recognizes three different kinds of actions 
(domain, problem-solving, and discourse), the focus of 
this paper will be the recognition of discourse actions 
and how a combination of knowledge sonrces enables 
us to perform this recognition. 
2 Our Tripartite Model 
A number of researchers have contended that a 
coherent discourse consists of segments that are related 
to one another through some type of structuring rela- 
tion \[Gri75, MT83\] or have modeled discourse based on 
the semantic relationship of individual clauses \[Pol86\] 
or groups of clauses \[Rei78\]. But all of these fail to 
capture the goal-oriented natnre of discoursc. Grosz 
and Sidner \[GS86\] argue that recognizing the struc- 
tural relationships among the intentions underlying a 
1 This work is being supported by the National Science Foun- 
dation trader Graait No. IR1-9122026. The ~overrlnlellt h&s cer- 
tMn rights in tiffs material. 
discourse is necessary to identify discourse structure; 
although they do not provide the details of a compu- 
tational mechanism for recognizing these relationships, 
they do argue convincingly that it requires multiple 
knowledge sources. We have developed a plan-based 
model of dialogue and have incorporated into our model 
Grosz and Sidner's claim that linguistic, contextual, 
and world knowledge should be combined in recogniz- 
ing the role of an utterance in a discourse. . 
We contend that at least three kinds of ac- 
tions, domain, problem-solving, and discourse, should 
be captured by a model of task-oriented dialogue. 
Many researchers \[A1179, Car87, LAB7, Sid85, GS86\] 
have demonstrated that recognition of domain ac- 
tions endows a system with the ability to success- 
fnlly address many important and difficult problems 
in understanding. Several researchers have also in- 
vestigated the recognition of problem-solving actions 
\[LA87, Ram9I, Wil81\]. For example, if a user wants to 
earn a degree, the user might perform problem-solving 
actions of 1) evaluating alternative degrees (i.e., the 
user might decide whether a BS or a BA is more desir- 
able), 2) instantiating the type of degree to be earned, 
and 3) building a plan for performing the domain ac- 
tion of earning the selected degree. 
Carberry \[Car89\] points out the importance of 
recognizing discourse actions, the communicative ac- 
tions that speakers perform iu making an utterance 
(e.g., asking a question, providing baekgrouml infor- 
mation, or expressing surprise). Discourse actions pro- 
vide expectations for subsequent utterances (e.g., when 
a question is asked, one expects the question to be 
accepted and eventually answered). Recognition of 
some discourse actions such ms Give-Background also 
explains the purpose of an utterance and bow it should 
be interpreted; rather than just a statement of fact, 
the utterance providing background information should 
be used by the system to fill in necessary background 
knowledge in order to fully understand related utter- 
antes. 
We capture these three types of actions in sep- 
arate levels of our disconrse model, which we refer to 
as a DM \[LC91\]. Within each of these levels, actions 
may contribute to other actions on the same level; for 
example, on the discourse level, providing background 
data, asking a question, and answering a question all 
can be part of obtaining information. Thus, actions 
at each level form a tree structure in which each node 
represents an action that a participant is performing 
and the children of a node represcnt actions pursued 
in order to perform the parent action, ttowever, dis- 
course, problem-solving, and domain actions are not 
ACTES DE COLING-92, NAMES, 23-28 AOI~'P 1992 3 1 0 PROC. OV COLING-92, NAI, ZrES. AUG. 23-28, 1992 
Domain Level r ........................ , 
I *rI'~+'Co~S I ' CS+60) I : 
. ......... -~- ............. , 
P t °- b- (e-m- -':s- -~Y-I 9tJ- -~Y-q. ....... ~ t ............................... , 
I """d'P~"~s_!: :S 2, Ta~-Co'"o~S 1, CS3~°!) \] 
IlmNlliale-V,ars(S l r $21Lea~lI-Mateflal(S \[ t CS3601 ..fac)l \[ 
I 
Discourse Level l ...................... - .............. i ......... , ..... 2",'. ........................ 
I Ob ,min+tnfo-RefCSI, $2, _fat, Teaches( fac, CS300)) \] 
Z 
A\[.~:uot(sl, ss, :.~,'r+~h¢~ t.¢, css6o)) 
Reque~l(Sl. S2, lnform+Ref(S2. \[st, f~, Teachex(_ fac, CS360)) \[ 
Sur face-Wn-Question(S 1, S2, I \[ fac, Teach~(_fac, C8360)) \[ 
\[ Answer-Ref(S2, Sl, jac, - '+ 
I TeachesL.fa~, CS36.O), Teaches~D6 Smith~ CS360 ) | 
\[ hlform(S2, Sl, Teaches(Dr. Snlim. CS360)) I ¢ 
\[*T'¢II(S;,'SI, Teaches(Dr. Smlth, lLq360))" } 
l 
I Surface-Inform(S2, Sl, Teaches(Ill. Smith, CS360)) I 
Figure 1: Dialogue Model for two utterances 
-- ~" = Emfi)le Arc 
= Subactlon Arc 
completely independent of one another; discourse ac- 
tions may be executed in order to obtain the informa- 
tion necessary for performing a problem-solving action 
and problem-solving actions may bc executed in order 
to construct a domain plan. Our model captures this 
interaction by allowing links between the actions at ad- 
jacent levels. Figure I contains a sample DM derived 
from two utterances, and section 3 describes how the 
DM in Figure 1 is constructed. 
3 Different Kinds of Knowledge 
The following dialogue will be used to demon- 
strate why our system needs world, contextual, and 
linguistic knowledge, and to show }low the combina- 
tion of these different knowledge sources enables the 
system to recognize implicit acceptance of previously 
commuuicated propositions and to identify the role of 
utterances that cannot be determined from one or two 
knowledge sources alone. The system is playing the 
role of $2, 
(1) SI: Who is teaching CS3607 
(2) $2: Dr, Smith is teaching CS360. 
(3) SI: Bul isn't CS360 an undergrad course? 
(d) $2: Yes. 
(5) Dr. Smith leaches gradnate 
and undergrad courses. 
(6) Sl: Who handles the CS'360 lab? 
3.1 World Knowledge 
In order to recognize how actions are intended 
to contribute to otlmr actions, our system needs knowl- 
edge abont how to perform actions. This world knowl- 
edge is I)rovided in the form of a library of discourse, 
problem-solving, and domain recipes \[Pol90\]. Our rep- 
resentation of a recipe includes a header giving the 
natoe of tile recipe and the action that it accomplishes, 
preconditions, applicability conditions, constraints, a 
body, effects, and a goal. Applicability conditions rep- 
resent conditions that nmst be satisfied ill order for 
the recipe to be reasonable to apply ill tile given situa- 
tion whereas constraints limit the allowable instantia- 
lion of variables in each of the components of a recipe 
\[LA87, Car87\]. Figure 2 contains a sample discourse 
recipe. 
Given the senlalltic representation of a new ut- 
terance, the system mnst assimilate tim utterance and 
produce an updated dialogue model (DM). Plan in- 
ference rules \[A1179\] and constraint satisfaction \[LA87, 
Car87\] suggest chains of higher level actions that an 
utterance may in! part of, and foetlsing heuristics 
\[Car87, SidSl\] order these inference paths according to 
coherence. For exanlple, the semantic rel)resentation 
of (1) is: 
Surface WH-Question(fil, $2, _fac, Teaches(_fac, 
csa@)) 
From this surface question, t)lan inference rules sug 
gest that (1) is executing a Rcquest action and that 
this Request action is part of an Ask-Re\] action which 
in turn is part of an Obtain-lnfl~-Re\] action since each 
of these actions is part of the body of a recipe that 
performs the higher level action As the system infers 
these actions, tile system also tentatively ascribes cer- 
tain beliet~ that must hold in order fl~r the agent to 
be pursuing these discourse actions l"or example, m 
order for (1) to lw part of ;m Obtaln-ln\]o-Rcf action, 
,ql must not know the answer to thv questh)n; if SI 
knew who was teaching CS360, this utterance might 
be part of a 7'est-L+slcncr action instead. These requi- 
site beliefs are captured ill tile applicability conditions 
of discourse recipes. As tile system inDrs actions, it 
must be plausibh~ that the applicability conditions are 
ACRES DE COLING-92, NANTES. 23-28 AO~r 1992 3 1 1 PRec. OF COLING-92, NANTES, AUG. 23-28. 1992 
Discourse Recipe-Cl: {_agent1 expresses doubt to _agent2 about _propl because _agent1 believes _prop2 to be true} 
Action: Express-Doubt(_agentl, _agent2, _propl, _prop2, _rule) 
App Cond: believe(_agentl, _prop2) 
believe(_agentl, befieve(_agent2, _propl)) 
believe(_agent 1, _rule) 
believe(_agentl, ((_prop2 A _rule) =¢, ~_propl)) 
in-focus(_propl) 
Body: Convey-Uncertain-Belief(_agentl, _agent2, .prop2) 
Address-Q-Acceptance(_agent2, _agent1, -prop2) 
Effects: believe(_agent2, ~beheve(_agentl, _propl)) 
beheve(_agent2, want(_agentl, Resolve-Conflict(_agent2, _agentl, _prop1, _prop2))) 
Goal: want(_agent2, Resolve-Conflict(_agent2, _agentl, _propl, _prop2)) 
Figure 2: Sample Discourse Recipe 
satisfied; otherwise, the inference is rejected. 
So, another part of our system's world knowl- 
edge is a model of the speaker's beliefs. Since our in- 
vestigation of naturally occurring dialogues indicates 
that people express shades of belief in propositions and 
expect others to recognize these beliefs, we maintain a 
multi-strength model of beliefs to represent an agent's 
varying degrees of belief in a proposition, ranging from 
having no idea whether a proposition is true (or false), 
to being certain that a proposition is true (or false). We 
also maintain a model of a stereotypical user whose be- 
liefs may be attributed to the speaker as appropriate 
during tim course of the conversation. 
After the system has inferred actions on the 
discourse level, it must identify how these relate to 
problem-solving and domain actions. This is accom- 
plished by chaining between actions on adjacent lev- 
els of the DM. For example, once the system infers 
that (1) contributes to an Obtain-lnfo-Ref action on 
the discourse level, plan inference rules suggest that S1 
wants the goal of the Obtain-lnfo-Ref action, namely 
knowref(S1, _fac, teaches(_fac, CS360)). Since knowing 
possible fillers for a parameter is a precondition to in- 
stantiating that parameter, the system infers that S1 
wants to know who is teaching CS360 in order to in- 
stantiate the instructor parameter in an action to learn 
the material for that course; that is, the system in- 
fers that S1 wants lnstantiate-Single-Var(S1, $2, _fac, 
Learn-Material(S1, CS360, _fac)). Since instautiating 
one paranteter m an action is part of a plan to instan- 
tiate all of the parameters in that action, the system 
infers that St wants lnslanlialc-Vars(S1, $2, Learn- 
Matenal(Sl, CS360, _fac)), and since this latter action 
is part of a recipe for building a plan, the system then 
infers the problem-solving action, Build-Plan(S1, $2, 
Take-Course(S1, CS360)). These instantiate-Single- 
Var, lnstantiate-Vars, and Build-Plan actions are en- 
tered into the problem-solving level of the DM. Build- 
ing a plan to perform some domain action is a precon- 
dition to doing that action (assuming agents are acting 
intentionally), so the system infers that Sl wants Take- 
Course(Sl, CS360), and this domain action is entered 
into the domain level of the DM. 
Once the system has assimilated an utterance 
into the DM, it nmst update its belief nmdel for the 
speaker to reflect the beliefs that were tentatively as- 
cribed to the speaker during the plan inference process. 
These beliefs can then be used in understanding sub- 
sequent utterances. 
3.2 Contextual Knowledge 
Each new utterance must be interpreted with 
respect to the existing dialogue context \[GS86, Car87\]. 
This process requires contextual knowledge, which our 
system captures with the use of the DM, a focus of 
attention which designates the most salient .action on 
each level of the DM, and focusing heuristics which sug- 
gest the most coherent continuation of the dialogue. 
For example, on the discourse level, utterances that 
contribute to the currently focused action are more ex- 
pected, and thus more coherent, than utterances that 
contribute to an ancestor that is further removed from 
the focus of attention. This contextual knowledge cre- 
ates expectations that help determine how to interpret 
new utterances. For example, after a qnestion is asked, 
the context suggests that acceptance of the question 
will be pursued (i.e., the listener will ensure that the 
question is understood, justified, and answerable); then 
it is expected that the question will be answered. 
Because our system does not yet include a plan- 
ner, we incorporate S2's utterances into the DM in the 
same way as Sl's have been. So, continuing with our 
example dialogue, we express the semantic representa- 
tion of (2) as: 
Surface-lnform(S2, $1~ Teacites(Dr. Smith, CS360)) 
Plan inference rules suggest that the surface inform 
might be part of a Tell action which might be part of an 
Inform action 2 which might he part of an Answer-Ref 
action which might in turn be part of an Obtain-lnfo- 
Rcf action since each of these actions is part of the 
body of a recipe that performs the higher le4el action. 
Contextual knowledge is then used to determine how 
to relate (2) to the previous dialogue. Focusing heuris- 
tics suggest that the best interpretation of (2) is that 
it is part of a plan for performing the Obtain-lnfo-Ref 
action that was an ancestor of the Request action of ut- 
terance (1). No new problem-solving or domain actious 
are inferred. 
Figure 1 gives the DM that our system builds 
from utterances (l) and (2) with the current focus of 
2We differentiate between telling a listener some string of 
words said informing a listener of a proposition, hi order to 
inform a listener of some proposition, the listener must first un- 
derstaald the content of the proposition; tiffs is the goal of the Tell 
action. The goal of the ln\]orm action is that the listener 
believe the COlmnunicated proposition. 
AcrEs DE COL1NG-92, NANTES, 23-28 AOI~T 1992 3 1 2 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
attention on each level marked with an asterisk. Thus, 
we have seen that both world knowledge (consisting of 
a plan library and beliefs about the speaker's beliefs) 
and contextual knowledge (consisting of the existing 
DM, the current focus of attention, and focusing heuris- 
tics) are required in order to determine what actions a 
speaker is performing and how these actions relate to 
the l)revious dialogue. 
3.3 Linguistic Knowledge 
Linguistic knowledge must also be taken into 
account in recognizing the actions that a speaker is 
performing. This knowledge inehldes the surface form 
of an utterance and clue words. The surface form of 
an utterance is one way that a speaker communicates 
varying degrees of belief in a proposition. Consider, for 
example, the following two utterances: 
(7) Is Dr. Smith teaching CS3107 
(8) Isn't Dr. Smith teaching CS3107 
The form of the utterance in (8) indicates an uncertain 
belief in tim proposition that Dr. Smith is teaching 
CS310; utterance (7), however, conveys only a lack of 
knowledge about the proposition. Similarly (3) is not 
merely a Yes/No question; instead, this surface form 
conveys that S1 thinks that CS360 is an undergrad 
course, but is not certain of it. Our system uses the 
form of the utterance to recognize the strength of a 
speaker's beliefs; these beliefs are then used to deter- 
mine whether the applicability conditions for the sug- 
gested discourse actions are satisfied. 
Tile second type of linguistic information that 
we use is clue words. These lingtfistic clues often sng- 
gest what type of discourse action the speaker might be 
pursuing\[LAB7, Hin89\]. a We use these linguistic clues 
as evidence for discourse actions. For example, utter- 
ance (3) contains the clue word "but," which suggests 
a non-acceptance discourse action. Thus, the linguistic 
information that our system captures includes knowl- 
edge about the surface form of an utterance and about 
clue words. 4 
4 Combining Knowledge for 
Discourse Act Recognition 
Because there are nlany ways that an utterance 
can continue a dialogue and because the correct inter- 
pretation is not always the one most strongly suggested 
by plan chaining and focusing heuristics, evidence from 
other knowledge sources is needed to identify the in- 
tended relationship between an utterance and the ex- 
isting dialogue context. For example, the interpreta- 
tion of (3) most strongly suggested by focusing heuris- 
tics is that of requesting clarification of (2) in order to 
nnderstand it. Intuitively, however, (3) seems to be e×- 
pressing doubt at S2's answer, not trying to understand 
3The surface form of some utterances may also serve this 
purpose. 
4The utter~ame itself in fact contains more ilfformation than 
just clue words, surface fornl, and propositional content, but our 
system uses only these three. Part of our future work includes 
incorporating other linguistic information such as tense. 
it. Plan inference rules and focasing heuristics then are 
not sufficient to determine what role (3) is serving in 
the discourse. More knowledge is needed from other 
sources. 
Intuitively, for each new utterance that relates 
to the previous dialogue on the discourse level, 5 there 
is some discourse action that captures this relationship 
and serves to describe the role of the new utterance 
with respect to the preceding dialogue. Since many 
such relationships are plausible (e.g., (3) could be in- 
terpreted as expressing doubt or as indicating a lack of 
understanding), we contend that evidence is required 
for recognizing the discourse action that identifies the 
correct relationship. 
In our model, this discourse action will he an el- 
ement of an inference path from the utterance to some 
action in the current tree structure on the discourse 
level. Furthermore, it will introduce new parameters 
which must be instantiated based on values from the 
DM in order for the path t ° terminate with an action 
already in the DM. By replacing these new parameters 
with values from the DM that are not present in the 
semantic representation of the utterance, wc are hy- 
pothesizing a relationship between the new utterance 
and the existing discourse level of tile DM. Thus, this 
action serves the aforementioned role of capturing how 
the new utterance relates to the current dialogue con- 
text. We will refer to such actions as e-actions since 
we contend that there must be evidence to support the 
inference of these actions. 
For example, suppose that the semantic repre- 
sentation of an utterance such as (8) is 
Surface-Neg-YN-Question(Sl, $2, PropA) 
and that chaining suggests that the utterance is part 
of a Convey-Uncertain.Belief action which is part of 
a recipe for the Express-Doubt action shown in Fig- 
ure 2. The parameters _agent;I, _agent2, and _p~rop2 
in the Express-Doubt action will be instantiatedwith 
the values $1, $2, and PropA that appear in the se- 
mantic representation of the utterance and propagate 
during chaining to the Convey Uncertain-Belief action 
and in turn to the Express-Doubt action, tlowever, the 
parameters _propl and Aru\].e are introduced for the 
first time in the Express-Doubt action and have many 
plausible instantiations. Continued chaining from the Express-Doubt 
discourse action could eventually lead 
to an Inform action, and we might equate this Inform 
with an Inform that already exists in the DM, thereby 
interpreting tile new utterance as related to this previ- 
ously identified action. Unifying the Inform action on 
tim inference path with an Inform action in the DM, 
and propagating the resultant substitutions back down 
the inference chain, will result in _propt anti _vu:l.e be- 
ing instantiated based on information from the DM. In 
particular, _propl will be instantiated with the propo- 
sition from the Inform action and _vu:l_e will be con- 
strained to a rule that SI might ttfink suggests that 
_prop2 and _prt~pl are inconsistent. 
llowever, it is not enough that these instantia- 
tions plausibly satisfy the applicability conditions for 
the Express-Doubt action. For example, consider tile 
following dialogue: 
5 Solne utterances, sllch a~ (1) and (6), do Ilot relate to previ- 
ous dialogue on the discourse level. 
ACRES DE COLING-92, NANq~S, 23-28 AO'/Zr 1992 3 1 3 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
t9t SI: Who is teaching CS3607 ( $2: Dr. Smith. 
(11) SI: Isn't Dr. Smith lhe professor who won 
the teaching award last year? 
While it is at least plausible that, in the mind of the 
speaker, there is a rule which makes winning a teaching 
award inconsistent with teaching CS360, interpreting 
(11) as expressing doubt at (10) seems incorrect. Thus, 
to prevent such erroneous interpretations, we contend 
that evidence is needed to recognize discourse actions 
that capture the relationship between a new utterance 
and the existing dialogue context. 
In our recognition algorithm, evidence may take 
two forms: 1) world knowledge indicating that tile ap- 
plicability conditions for an e-action are satisfied, and 
2) linguistic evidence from clue words suggesting a par- 
titular discourse action. If the system has evidence that 
the applicability conditions of an e-action are satisfied, 
tilen the system will use the knowledge as evidence 
that this may be the discourse action that the speaker 
is pursuing. On the other hand, if there is sufficient 
linguistic knowledge suggesting a particular discourse 
action, then these applicability conditions should be 
attributed to the speaker, as long as they are plausi- 
ble (i.e., if there is nothing in the system's model of 
the speaker's beliefs to suggest that the applicability 
conditions are not satisfied). So, if the clue word but 
is used, then a non-acceptance discourse action such 
as expressing doubt should be easier to recognize (i.e., 
silould require less evidence that the applicability con- 
ditions hold) than if the clue word is not present. 
For example, consider the following, in which 
there are no clue words in (14a) and (14b), but (14~) 
contains the clue word but. 
(12) Sl: Who is teaching CS3607 
(13) $2: Dr. Smith. 
(14,) SI: Isn't Dr. Smith on sabbatical next year? 
(14b) SI: Isn't Dr. Smith the professor who won 
the teaching award last year? 
(14c) SI: But isn't Dr. Smith the professor who 
won the teaching award last year? 
Chaining from (14~) could produce an infer- 
ence path containing an Express-Doubt discourse ac- 
tion. The surface form of (14,) establishes that S1 has 
an uncertain belief that Dr. Smith is on sabbatical, the 
first applicability condition for an Express-Doubt ac- 
tion (see Figure 2).s The effect of the question/answer 
pair in (12) and (13) is that SI believes that $2 thinks 
that Dr. Smith teaches CS360, tile second applicabil- 
ity condition in all Express-Doubt action. Tile stereo- 
type mmhl\[e contains the belief that professors on sab- 
batical do not teach, so the system can ascribe to SI 
the following belief: Vx,y (course(y) A professor(x) 
A on-sabbatical(x)) ~ ~teachcs(x, y). This belief sat- 
isfies tbe third applicability condition (the belief about 
a rule), and this belief, along with the belief that Dr. 
Smith is on sabbatical, implies that Dr. Smith is not 
teacbmg CS360, the fourth applicability condition. Fi- 
nally, a check of the DM indicates that the proposition 
eThe first applicability condition is actually an uncertain be- 
lief. \[LC92\] describes our fortnal belief model and how belief 
strengths are represented in recipes and in our belief model. 
that Dr. Smith teaches CS360 is in focus, the last ap- 
plicability condition. Therefore, since there is evidence 
for the applicability conditions of the Express-Doubt 
action, and since focusing heuristics suggest that this 
is a coherent discourse action (although not the most 
preferred), (14a) is recognized as expressing doubt at 
S2's answer, that Dr. Smith is teaching CS360. 
If the dialogue included (14b) or (14c) instead 
of (14a), some of the same evidence would exist. Ill 
both (14b) and (14c), the system believes l) that S1 
believes that Dr. Smitll won the teaching award last 
year (though S1 is not sure of this), 2) that S1 believes 
tlmt $2 thinks that Dr. Smith is teaching CS360, and 
3) that the proposition that Dr. Smith teaches CS360 is 
in focus. Thus, some of the applicability conditions for 
expressing doubt at Dr. Smith teaching CS360 hold. 
However, tim system has no knowledge for tile crucial 
implication that determines how this utterance relates 
to tile preceding dialogue; the system has no evidence 
that S1 believes that winning a teaching award implies 
that Dr. Smith is not teaching CS360. Therefore, (14b) 
is not interpreted as an expression of doubt at the re- 
sponse in (13), and other discourse acts are consid- 
ered. The presence of the clue word in (14e), however, 
strongly suggests an Express-Doubt discourse act and 
thus less evidence is needed to recognize it; that is, the 
system does not need explicit evidence that S1 holds 
the requisite beliefs but only needs to be able to plan- 
sibly ascribe them to SI. Therefore, since mlr model 
call plausibly ascribe to S1 belief in the implication 
that Dr. Smith winning a teaching award implies that 
Dr. Smith is not teaching CS360, the system will rec- 
ognize (14~) and (14c) as expressions of doubt, using 
evidence from linguistic knowledge for (14~) aml from 
world knowledge for (14o). 
Thus, we want our system to use lin- 
guistic knowledge when present, world knowledge when 
present, and both when possible. Onr algorithm for 
processing is tim following: from the semantic repre- 
sentation of the utterance, infer sequences of actions 
(inference paths) using plan inference roles. If the ap- 
plicability conditions for any of these actions are im- 
plausible, reject the inference. For actions which are 
not e-actions, tentatively ascribe the beliefs in the ap- 
plicability comlitions. For actions that are e-actions, 
determine how much evidence is available for tile ac- 
tion. If there is more than one e-action for which there 
is evidence from both linguistic and world knowledge, 
then choose the inference path closest to the focus of 
attention for which there is multiple evidence. If lin- 
guistic or world evidence is available for nmre than one 
e-action, then choose the inference path closest to the 
focus of attention with this single supporting piece of 
evidence. If there is no evidence for any e-action, then 
choose tim inference path which contains no e-actions 
and is closest to the focus of attention. 
5 Completing the Example 
We return briefly to the dialogue given in Sec- 
tion 3 to ilhlstrate how our process model uses the 
above algorithm to recognize complex communicative 
actions such as expressing doubt as well as implicit ac- 
ceptance of previous utterances. 
Utterances (1) and (2) were discussed earlier, 
ACTES DE COLING-92, NANTES, 23-28 Ao(rr 1992 3 1 4 PROC. OF COLING-92, NANTES, AUO. 23-28, 1992 
Dluoourse Level 
Dom~dn Level i..~..i 
p r ol?l.em -..,~ly} n~ .1: .re(el .... \] ........................... 
Ilnstantiate-V~ • .1 l 
:llnstantiate-Sin~le-Var \] I~,.,~si.N0v~ I ,. .... -:~- ................................ .-~- ":- ~ ~ .~ ,.. 
---~-- = Enable Arc 
= ,~tlbactloll Arc 
Obtain-hffo-Ref 
(1) (2) (3) (4) (6) 
(5) 
Figure 3: Dialogue Model for Example 
and the I}M for these utterances is given in Figure 1. 
Tile semantic representation of(3) is: Surface-Neg-YN- 
Question(Sl, $2, undergrad-eonrse(CS360)). 
The surface form of the utterance suggests that 
Sl thinks that CS360 is all undergrad course, hut is not 
certain of it. This belief is tentatively entered into the 
system's model of Sl's beliefs. Plan inference rules sug- 
gest that the surface request is an action in a Convey- 
Uncertain-Belief action. The linguistic clue but sug- 
gests that S1 is objecting to sornc previous utterance 
of S2's. One of the possible infereuees from the Convey- 
Uncertain-Belief action is an Ezpress-Doubt action. 
tlowever, since this is an e-action, evidence is neces- 
sary to infer this action. The linguistic clue provides 
one piece of evidence to support this inference, but the 
system also looks for evidence from world knowledge. 
It therefore cheeks to see if it is plausible that S1 's belief 
about CS360 being an undergrad course might mq)ty 
t, hat Dr. Smith is not teaching CS360. Although tbe 
system has no such belief explicitly represented in its 
model of S 1, there is also no evidence to suggest that S 1 
does not helieve that this implication might hold. Since 
there is no other e-action for which there is evidence 
and since tim applicability conditions for an Express- 
Doubt action are plausible, the system infers that the 
Convey-Uncertain-Belief action may be an action in 
an Express-Doubt action which may be an action in an 
Address-Unacceptance discourse action which may be 
an action in an Address-Believability discourse action 
which may be an action in an lnJorm action. Focusing 
heuristics suggest that this hfform is the same Inform 
that (2) is pursuing. Tiros, (3) is interpreted ms not 
accet)ting (2) by expressing doubt at it. v 
lnferencing for the rcmainder of the dialogue is 
similar to the first three utterances. The int~rence 
paths which result from utterances (4) and (5) are 
shown in Fig~lre 3 above the appropriate numbers. 
Finally, although the system initially a~tentpts 
to closely relate utterance (6) to other utterances at 
the discourse level, there is no evidence for any e-action 
that might link this utterancc to the existing context 
on tile discourse level. There arc no clue words to 
suggest a relationship, and there arc no e-actions for 
which there is evidence that the applicability conditions 
hoht. Therefore, a completely new discourse action of 
obtaining information is inferred, s Ttfis initiation of 
a new discourse action indicates implicit acceptance 
of tile previous discourse actions since if S1 did not 
accept $2'.~ answer, $1 would be required to indicate 
non-acceptance \[LC92\]. The DM for the entire dia- 
logue is shown in 1,'igure 3 (for space rea.sons, only tile 
action names are shown). Thus our model recognizes 
both acceptance and non-acceptance of communicated 
7Although not discussed, there must also be m~ c-actlon that 
relates (2) to tile DM. This action is tile Answer-J~e\]; evidence for 
the An.~wer-Re\] action is fi'om world knowledge, whic:h indicates 
that the applicat)ility conditions for this Answer- Re\] action hold. 
htferencing of (2) is then similar to that of (3). 
8Further discussion may determine that SI is trying to con- 
tinue the negotiation dialogue; however, that has not been com- 
municated by this uttera~lce. We a~'e ittvestigatlng how our sys- 
tern might modify the I)M that it has built if it discovers later 
that the structure built previously is incorrect. 
ACTES DE COLING-92, NANTES, 23-28 AOt~Tr 1992 3 1 5 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
propositions, including acceptance after negotiation of \[GS86\] 
conflicting beliefs. 
This example has illustrated: 1) how the struc- 
ture of a discourse is identified in our three level model, 
how our model recognizes the relationship of the cur- \[Rin89\] 
rent utterance to the existing context and to other ut- 
terances, and how the tripartite structure produces a 
richer model of discourse structure than previous mod- 
els; 2) how beliefs are communicated, recognized, and 
used in the identification of discourse actions and dis- 
course structure; and 3) how our process model uses \[LA87\] 
linguistic, world, and contextual knowledge together in 
order to recognize acceptance and non-acceptance of 
communicated propositions. 
6 Conclusions and Future Work \[LC91\] 
Our plan-based model of dialogue incorporates 
world, linguistic, and contextual knowledge sources 
into the recognition of communicative actions. Lin- \[LC92\] 
guistic knowledge suggests certain discourse acts, a 
speaker's beliefs, and the strength of those beliefs; con- 
textual knowledge suggests the most coherent contin- 
uations of the dialogue; and world knowledge provides 
evidence that the applicability conditions hold for those \[MT83\] 
discourse acts that identify the relationship of the cur- 
rent utterance to the discourse as a whole. By com- 
bining these different knowledge sources, we are able 
to recognize complex discourse acts such as express- \[Po186\] ing doubt, to identify the relationship of utterances to 
one another, and to capture the rich structure of task- 
oriented dialogue. 
Grosz and Sidner \[GS86\] claim that a robust 
model of understanding must use constraint satisfac- 
tion to interpret utterances; that is, when evidence is \[Pol90\] 
available from one source, less evidence is needed from 
other sources. We have partially included their sug- 
gestion by using world and linguistic knowledge when 
contextual knowledge is not sufficient to infer actions 
for which there must he some evidence. However, we 
would like to expand our notion of partial evidence \[Ram91\] 
to allow evidence from the three knowledge sources to 
be represented in terms of degree: thus, when world 
knowledge is overwhelmingly strong, no other knowl- 
edge is needed, but when it is very weak, knowledge 
from other sources will be needed to support the infer- 
ences not allowed by the weak world knowledge alone. 
AcrEs DE COLING-92, NANTES, 23-28 ^ot~r 1992 3 1 6 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 

References 

James F. Allen. A Plan-Based Approach to 
Speech Act Recognition. PhD thesis, Univer- 
sity of qbronto, Toronto, Ontario, Canada, 
1979. 

Sandra Carberry. Pragmatic Modeling: To- 
ward a Robust Natural Language Interface. 
Computational Intelligence, 3:117-136, 1987. 

Sandra Carberry. A Pragmatics-Based Ap- 
proach to Ellipsis Resolution. Computational Linguistics, 
15(2):75-96, 1989. 

Joseph E. Grimes. The Thread of Discourse. 
Mouton, The Hague, Paris, 1975. 

Barbara Grosz and Candaee Sidner. Atten- 
tion, Intention, and the 
Structure of Discourse. Computational Lin- 
guistics, 12(3):175-204, 1986. 

Elizabeth Hinkelman. Two Constraints on 
Speech Act Ambiguity. In Proceedings of 
the 27th Annual Meeting of the Association 
for Computational Linguistics, pages 212- 
219, Vancouver, Canada, 1989. 

Diane Litmus and James Alien. A Plan 
Recognition Model for Subdialogues in Con- 
versation. Cognitive Science, 11:163-200, 
1987. 

Lynn Lambert and Sandra Carberry. A Tri- 
partite Planobased Model of Dialogue. In Pro- ceedings of the 29th Annual Meeting of the 
ACL, pages 47-54, Berkeley, CA, June 1991. 

Lynn Lambert and Sandra Carberry. Model- 
ing Negotiation Dialogues. In Proceedings of 
the 30th Annual Meeting of the ACL, Newark, 
DE, June 1992. To appear. 

William C. Mann and Sandra A. Thompson. 
Relational Propositions in Discourse. Techni- 
cal Report ISI/RR-83-115, ISI/USC, Novem- 
ber 1983. 

Livia Polanyi. The Linguistics Discourse 
Model: Towards a Formal Theory of Dis- 
course Structure. Technical Report 6409, 
Bolt Beranek and Newman Laboratories Inc., 
Cambridge, Massachusetts, 1986. 

Martha Pollack. Plans as Complex Mental 
Attitudes. In Philip R. Cohen, Jerry Mor- 
gan, and Martha E. Pollack, editors, Inten- tions in Communication, 
pages 77-104. MIT 
Press, 1990. 

Lance A. Ramshaw. A Three-Level Model for 
Plan Exploration. In Proceedings of the 29th 
Annual Meeting of the Association for Com- 
putational Linguistics, pages 36-46, Berkeley, 
California, 1991. 

Rachel Reichman. Conversational Coherency. 
Cognitive Science, 2:283-327, 1978. 
Candace L. Sidner. Focusing for Interpreta- 
tion of Pronouns. American Journal of Com- 
putational Linuistics, 7(4):217-231, 1981. 

Candace L. Sidner. Plan Parsing for Intended 
Response Recognition in Discourse. Compu- 
tational Intelligence, 1:1-10, 1985. 

Robert Wilensky. Meta-Ptanning: Represent- 
ing and Using Knowledge About Planning in 
Problem Solving and Natural Language Un- 
derstanding. Cognitive Science, 5:197-233, 
1981. 
