A Bayesian Approach for User Modeling in Dialogue Systems 
AKIBA, Tomoyosi and TANAKA, Hozulni 
Department of Computer Science 
Tokyo Institute of Technology 
2-12-10okayama Meguro Tokyo 1,52 Japan 
{akiba, tanaka}@cs, t itech, ac. jp 
Abstract 
User modeling is an iml>ortant COlnponents of dia- 
log systems. Most previous approaches are rule-based 
methods, hi this paper, we proimse to represent user 
models through Bayesian networks. Some advantages 
of the Bayesian approach over the rule-based approach 
are as follows. First, rules for updating user models are 
not necessary because up<lating is directly performed 
by the ewduation of the network base<l on probal>ility 
theory; this provides us a more formal way of dealing 
with uncertainties. Second, the Bayesian network pro: 
rides more detailed information of users' knowledge, 
because the degree of belief on each concept is pro- 
vided in terms of prol~ability. We prove these advan- 
tages through a prelinfinary experiment. 
1 Introduction 
Recently many researchers have pointed out that user 
modeling is important in the study of (tiMog sys- 
tems. User n:o<h!ling does not just render a dialog 
syst(,nl more cooperative, lint constitutes an indis- 
1)ensable l)rerequisite fin" any flexible (lialog in a wider 
<tomain\[9\]. The user models interact closely with all 
other components of the system and often cannot eas- 
ily be separated from them. For examph,, the inl)ut 
anMysis component refers to tile user's knowledge to 
solve referentiM ambiguities, and tile output genera- 
tion component does the same for h,xical el,oices. 
The con<:epts are usually explained l>y showing their 
relations to the other known concepts. Thus, for the 
<lialog system it is important to guess what the user 
knows (user's knowledge) in order to explain new con- 
cel)ts in terms of know,t concepts. For examl/le , con: 
sider that tit(, system explains the location of a restau- 
rant to the user. It might 1)e useless to tell the. user the 
position in terms of the Mlsolute <:oordinate system, 
since the user's mental model is not based on the ab- 
solute coordinate. Therefore, the system should show 
the relative location frmn the lo(:ation tit(' user alrea(ly 
knows. It is difficult to predict which locations the 
user, who l)erhaps is a stranger to the system, knows. 
Though the syst:em <:ouhl atteml)t to a('quire the in- 
formation l/y asking the user al)out her k,towle(lge, too 
many questions may irritate the user. Such a system 
is considered mechanical and not helpful. Therefore, 
tit(" system is required to guess the user's knowledge 
by finding clues in the user's utterance and to refine 
the user's model incrementally. 
In the user modeling component of UC\[5\], several 
stereotyped user models which vary the user's level 
of expertise were prepared beforehand and the appro- 
priate model was selected based o1: the user's utter- 
ances. Ill the approach used by Wallis and Shortlifl'e 
\[12\], the expertise h,vel was assigned to all concepts in 
the user model. The system guessed the user's level, 
and the concepts with the expertise level lower than 
her level are considered to be known by her. This 
n:o(lel can deal with tit(.' level of expertise more appro- 
priately than UC, because the system does not have 
to prepare the nmltiple user nlodels for each expertise 
h, vel. 
The approach of pr<.'paring several user models and 
adoptit,g one, however, is an al>l>roximation of user 
modeling. The expertise level of tit(: user is continuous 
and, in general, the unique measuremelfl: of expertise 
level is not appropriate for some domMns, specifically 
the domain of town guidance consi<lere<l in this paper, 
because the areas that are known differ with the users. 
Another problem of user modeling is updating the 
nmdel as the (tialog progresses. At the beginning of the 
diMogue the system cannot expect the user nm<M to 
be accurate. As the diMogue progresses the. system can 
acquire clues of the user's knowledge fl'om his utter- 
anees. Also, the system can assume that the concepts 
mentioned are known to the user. Thus. updating the 
user model shouhl 1)e performed incrementally. 
One difficulty of updating user nmdels is dealing 
with uncertainties. The clues that can be obtained 
from the user's utterances are uncertain, the iltfol'nla- 
tiol( may conlli<:t with what has been hi,rained, and, as 
a result, the user mo<lel may be revised. The effects of 
the systtnn's explanation are also uncertain. Further- 
more, reasoning about the user's kuowledge must be 
performed Oil the basis of uncertainties. Most previous 
apl)roaches to this prolflem are rule-based metho(ts. 
Cawsey \[2\] sorted the update rules in order of their 
reliability and applied them in this order. In another 
approach, tit(., mechanisnl such as TMS\[6\] or nomnono- 
tonic logic\[l\], is used to maintain the consistency of 
7272 
|;he 211odcl. I(; SCCliIS that rule,-l),tse(\[ aLl)l)ro~t(:hes h~tve a 
pol;entiM defect for dealing with unt:ertMnties\[4\]. The 
Bayesian al)proa(:h ca, n (leM wil;h bol;h un(:erta.in (am- 
biguous) evidences and uncertain re~Lsoning sl;raight- 
forwardly. 
In this pat)or , wc t)roposc ;~ prol)nhilistic ~l)l/ro~tch 
for user modeling ill dialog systems. The Bayesian net- 
works ;tre Itsc(l to rel)re.sent the user's knowledge and 
(Ir~tw inferen(:es froni that, ~trt(l provide the fine-grahwxl 
solutioils to the ln'ol)lems l/reviously mcntiol,ed. In 
Sl)ite of the pol:entiM ;t(lwud;;tge of I;he Bayesi;Ln al)- 
I/ro~ch, l;her(~ are few attenq)ts to mnploy it in user 
modeling. 
The adva.nt;ages of the Bayesian ;q)l)roach over the 
rule-1);tsed ;q)l)roa(:h are ~ts follows. First, rules for 
updating nscr models are not necessary. C;twsey \[2\] 
1)oiuted out; there are four lmdn sources of informal;ion 
l;hat can be used to up(l;tte tim user model wh;~t, lahe 
user s;~ys ~tnd asks, what the .~ysl;em l;ells I;he user, 1,11('. 
leve.l of exl)ertise of the user, and rel:d;ionshit)s I)\[!tween 
con(:el)l;s in the domain. '\['hey c~tli l)(! incorl)oratt(~d 
ill the tel)resented.ion of \]~tyesian nel;works au(l can 
be used to Ul)(lal:e the user m.(lel 1)y (,v;duacting the 
networks. 
Second, the ll~yesian network t)rovidcs more de.. 
tailed infi)rmal;ion of users' knowledge, hi t,he (:;tse of 
l)imtry modeling of knowh~(lge, whe.reby (tither the user 
klmws or does llO~ kllow ~1, c(}iic(}p\[;~ i{; is too co3J',qc to 
judge the model under mlccrl:Mnl:y. Therefl)rc, usually, 
the degree of I)elief is ;tssigned t.o M1 (:on(:etyts iu the 
model. It is nol; (:leau' where the degree of belief comes 
from or wharf; it llIC;Lll.q. ()ll tim or:her h;~nd, how~,.ver, 
l.h(', lbLy(!sian ,tf)l)ro;~(:h provides I:he (lel~r(~(! of belie\[ for 
cle~u' semantics, which is 1)rohal~ility. 
The relnMnder of I;his pap(w is organized ill four se(:- 
ti(lltS. Section 2 is devoted to an oul.linc of l~a.ye,'d;m 
networks. ,qection 3, knowledge represental;iou in 
terms of \]btyesian uctworks is discussed. If the model 
is once represeul;e(l, then l;he upd;d;hl\[~ of t.he model 
will 1)(! taken (:are of t.hrough the ev;du;~tion of the net- 
work. ,qe(:tion 4, some exanllfles ;cre given Mon K with 
lilt (!xl)eriu~ent; to show the lt(lvlLill;~tge (if o/lr al)tlro~tch. 
Section 5 concludes this l);q)cr. 
2 Bayesian Networks 
//ea~soning based (m prol)ability t.hem'y requires prob- 
ahilisti(: models to bc specilled. In generM, a cora- 
l)lore lwol)M)ilistic model is sl)ecitied by the joinl: prob- 
;LI)ilities of all random wn'iM)h~s ill the domahl. Tim 
l)rol)lem is th~tl; the coral)let(: Sl)ecilic~tion of the .ioint 
prol)abilities r(.'(lllil'eS a.1)suM amounts nf mlmbe.rs. For 
ex;unl)h; , (:onsi(ler \[.he (:~tse where Ml l'3AldOnl V;kl'i- 
al)les are binary, having ~t wdlle 0 or l, the com- 
lllete t)rol)Mfilistic model is Sll(!(:iti(~(l by 2 '~ - 1 joint 
1)roba.bilities. (Assumiug "n bimrry random wtriables, 
a:\], x~ .... xn, the distribution is :;pecitied by tit(! prol);> 
I)ilitics, P(:*:I = 0, a:u = 0 ..... :,:. = 0), P(:r, = 1, ;ru = 
0, ..., :on = 0), ._, 1)(a:1 --- \],x2 = 1, ..., :l:,~ = 1), th~tt 
sum up to unit, y so one of them can be automatically 
g~dned.) Moreover, in l)racl;it:e it is difficult 1;o explic- 
itly specify the joint prol)Mfility. Concerning our pur- 
pose of modeling the user's knowledge, where a ran- 
dom variable corresponds 1;o a concept and whose value 
<:orresl>OlMS to the user's Mmwledge of the (:oncepl~, 
it is Mmost; imp<>ssit)le to specify MI joinl; probM>ili-. 
ties 1)ec~mse this involves cnumerat:ing all of the user's 
klmwledge t)~d;terus. 
llayesi;u, networks need fat\]: fewer \])robabilil;ies and 
CILI/ l)rovide the coinplete probabilistic luo(lels. The 
inform~fl:ion that (:Oml)ens~d;es \['or the g~t I) is qualit;> 
l:ive, which is obtMned I)y investigathlg the mtl:ure of 
I, he (loin;tin. The \]l~Ly('.sian neLwork h;ts both quali- 
t~ttive and qmrntit;d;ive (:h;~ra(:teristi(:s, l.h('r('.fore, we 
CaAl rel)resenl; the knowledge quMitatively ;utd reason 
al)oti{; t)rol)M)ility (luanl;il;atively. Formally, l/ayesi~ul 
networks m'e directed m:y(:lic graphs (DAG) with the 
nodes ret~re.qent;ing ;~ ramdoln wu'ial)le and the dire(:tcd 
arcs representing the dirccl, del)endent re.la~ion be- 
t:weet, t;he linked variables. It ;~ ;~rc goes from one nod(: 
to ;umther, we say l,hat the fornmr is a l);U'enl node of 
the. \[;tl;ter, and the btH;er is a (:hihl of l.hc former. The 
(list;ril)ut, ion on the network is specified to MI nodes :r 
its 1)rotlability t,(.:lp(.:)) (:on(lil;ioned by the set of its 
paren\[; lio(I(,.s p(x). The lio(l('.s without parents ~urc ~s- 
signed the l)rior 1)rob;d)ilities P(x). That is all |;h;d; is 
ne(:e,ssary for specifying ~ conll)lete t)robM)ilistic nm(lel \[:10\]. 
The reasoning \[m Bayesilm net:works (:orrespnn(ts 
to (.'valuating the posterior prol~al)ilit;y P(;r\[l¢) ml all 
nodes a: given lhe evidence. I'; that is Sl)ecilied hy pro- 
viding certain values t.o ;~ cert;ain sul)se.l; of lmdes in 
th(, networks (fo,: illS|;;tll(:(!, \]'\] = {y = 1, Z" -: 0} for 
some uodes y aud z). The cvMu;ttiOll of the nel,work is 
doue in generM by the st(ich;~st,ic simulation \[10\]. The 
upd;tl:ing of the u;;cr models are directly performed by 
ev;tllt~Ll;illg \[;he net;work once ghe. kn()wledgc of I;11(.' do- 
main has 1)<~en corre<:l:ly represented t)y the /Ltyesialt 
nctw<)rk. In the next section, we discuss knowledge 
rel)resent;ttion with g;ty('.silm networks. 
3 Knowledge Representation 
with Bayesian Networks 
3.1 DesigMng the Language 
We haw; said the nodes ill the \]l;tyesian network are 
F~Lntl{)lll v;triables that r~tltge over sol,le vahles. In ol'del' 
to represent knowledge in terms of the l~tyesi~m net> 
work, we must design the l~ulgllage for the seutt.'nt:es 
assigned to the nodes of the network. We th'st as- 
sume t.ha.t the v,u'iMfles haw'. two lmssible values, so 
\[:hat th¢'. sentt'.uces have truth wtlues, tlutt is, :1. (trllc) 
or (I (fMse). Note thud; this ~tssumption is not cruciM; 
we m~g ~tssign values such ~ts KNOWN, NOT-KNOW, 
NO-\[NI:()I{MNFION as hi UMFE \[11\]. 
1213 
The type of sentences may deI)end on tit(: applica- 
tion we pursue. For general explanation, it is impor- 
tant to make a (:lear distinction between tile two user's 
states; knowing tile name of a conceI)t and knowing the 
other attril>nte of tile coucel)t. For example, suppose 
the user asked the following: 
"Where is FRISCO ?" 
where FRISCO is the name of a record store. From 
this question, the system infers that the user knows 
the name of the store, but does not know its location. 
Now we will give a precise definition of our language. 
All the sentence, s in the language have the form 
( la, beI) : (co,,,t,.,.t) 
where ( label ) is one of PRE, POST, JUDGE, 
TOLD, and TELL, and ( content ) is represented 
by a term ef tile first-order predicate' logic. An 
object and an expertise field are represented by 
an atomic symbol, and an attribute of an object 
is represented by a fimction syml)ol. For exam- 
ple, store001(object), records_collector(expertise 
field), location(store001)(attributc), and so forth. 
The user's knowledge about an attribute is repre- 
sented by five sentences, all having the same (content) 
representing t.he attribute, and one of the five labels. 
The sentenees labeled PRE, express that the user 
knows the attrilLutc t)rior to the <lialogue session, while 
those labeled POST, express that the user has come 
to know it during the session. For instan<:e, PRE: lo- 
cation(store001) means that the user have ah'eady 
knows the h)catiou of store001 betorc the interaction 
starts, whih.' POST: location(store001) means the 
user has <:ome to know the location through the sys- 
tem's explanation. The sentences labeled JUDGE, 
express the user's (:urrent knowledge and is used 
to exploit tile user mo<lel by other coml><ments in 
the dialogue system. For instance, JUDGE: loca- 
tion(store001) means the use.r now knows tit(.' loca- 
tion of store001. The sentences labele<l TOLD an(l 
TELL, express the evi<le.nce, gained by the user's ut- 
terance and the system's explanation. F<Lr instance, 
TOLD: name(store001) means the user has iLL- 
dicated by the clues that she knows the name of 
store001, while TELL: name(store001) means the 
system has explai,m<t the name. For exception, in the 
case of location, the form TELL: location(X)(whcre 
X is some obje(:t \[l)) is not usc<l because a location 
is explained in terms of the relative h)cation of an- 
other object. Instead, the form TELL: relation(X, 
Y)(where X and Y are some ol)ject IDs) is used. 
Tit(.' sentences representing objects and exi)ertisc 
fields have only the label PRE. The sentence repre- 
senting an object (e.g. PRE: store001) means that 
the user knows the object, that is she knows ,nost of 
the attributes of the object. The sentence represent- 
ing an expertise rich\[ (e.g. PRE: records_collector) 
means thai: the user is an exl)ert of the field, that is 
she knows the objects related to the expertise field. 
3.2 Constructing the Networks 
As mentioned, arcs of the Bayesian network represent 
direct probablistic influence between linked variables. 
Tim directionality of the arcs is essential for rei)resent- 
ing nontransitive dependencies. In order to represent 
the knowledge in terms of Bayesian Network, we must 
interpret the qualitative relation betwee.n the sentences 
that are represented by our language as a directed arc 
or some such combination of arcs. 
In our ease, the network has two sub-networks. One 
represents the user's knowledge be.fore the dialog ses- 
sion, which is used to guess the user's model fronl her 
utterances . The sentences assigne<l to the nodes in 
this part have either the label PRE or TOLD. We 
call this subnetwork the prior part. The other sulmet- 
work in which the nodes have either the label POST 
oi' TELL is used to deal wil;h tit(', influence of the sys- 
tem's utterances. This sulmetwork we call the poste- 
rior part. It is important t;o make a clear distinction. 
Considering that the system explains a concept, it is 
not proper to assume that the user knows some other 
related concepts. For example, if tile user utters that 
she knows some location x then it can be inferred that 
she also knows locations that are (:los(; to x. But that 
is not true if the location x is explained by the system. 
The relations ill the prior part of the network are 
categorized into four types as follows: 
(1) tl,e relations between objects in an expertise field 
(2) the relations between attributes of obje(:ts 
(3) the relations lmtween an ol)je<-t and its attributes 
(4) the relations betwee.n an att,'ibute of an object 
and the evi<lence that the user knows it 
The relations (1) are (:oncerL,ed with the expertise 
fiehl. The objects ill the same expertise field are re- 
lated through the expertise field node. We introduce 
the arcs that go from the expertise tMd no<le to the ob- 
je<:t nodes belonging to that fiel(1. For example, ares go 
Dora the node of "records collector" to that of "Com- 
pact Disk","Tower Records" (name of a record store) 
and so on. The level of expertise can be controlled 
by the conditi<mal probal)ilities assigned to the object 
nodes conditioned by tile ext)ertise tMd node. In this 
framework, we can intro<hLce arbitrary numbers of ex- 
pertise fiekls, all of which can be assigned the level of 
expertise. 
'\]/he re.lations (2) are conce.rned with the <lolnain 
knowledge. In our domain, those are the relations be- 
tween the locations, whi<:h are based on the assump- 
tion that the user l)robably knows the locations close 
to the location she known. TILe relations are assunn.'d 
to be symmetric. A single directe<l arc of Bayesian 
networks does not represent a symmetric relation. In 
ordeL' to rel)resent a symmetric relation, we introduce a 
dummy evi(tence node, whereby two arcs go forth from 
the two location nodes as shown in figure 1. The prior 
1214 
O O "-,.®/ 
dlllnlfiy lit)do 
Figure 1: Symmetric rel~d;iolt 
conditional probabilities of l;hc dummy node lutve high 
wdue it' the two parent nodes h~tve the same wdue. 
The relations (3) are (:on(:erned with g(:ner~d knowl- 
edge, such ;ts knowing ;m obj(!ct well imt)li(:~d;cs know. 
ing its ;d;tril)utes. In order to rel)resent such kiltd of 
I'(!l;ttio,ls, WC ill\[to(hi(:(; the ~tl'(:s to go fl'Olll the ,lode of 
~m object to the nodes of its ;tttributcs. 
The arc ec)rresponding to the relation (4) is intro- 
du(:e(l, to go frmn the node of an al.trilmte of an ollj(~ct 
to an evidence node. The ;~ttribul.e nolle ~utd the ev- 
iden(:e node have the s~mm ('ontent, whih, they h;Lve 
the different bd~els, PRE and TOLD. 
Iu tim l)OSterior l)i~rt of the network, the.re ~tr(,. only 
;~rcs rci)resenting the relations (4). The ;d;tribul;e 
nodes ~md the evidence lmdes are lalmle(l POST ~md 
TELL. In a(hlition, tile TELL node. Ill;-ty ll.~tve lllOl'e. 
I;h;tn Ol,(! it;\[reid; ,lode \])(!CaAlS(~ th('. (!Xl)l}tll}ttiOll8 of 
the att|'ilmt(; are m;t(le l)y referring to the other at- 
tributes. Actually, ill ()Ill' towtl gllid~t,l(:(! (lonudn, 
the syst(;m explains the new ht(:~ttkm using |;Ill; lo- 
cations that the user already knows. Fro' instance, 
the nodes POST: h)cation(store001) and POST: 
location(store0()2) ~tre l)iU'ei,ts of the. llode TELL: 
relation(store001~ store002) whe.n the system ('.x- 
Ill;tin till! location of store001 by using the lo(:~tti(m of 
store002. The. more the system shows the l'el~d:ions, 
the deeper the user's un(lerst;ul(ting bc(:on~(~s. 
The ~unbiguous e.videnee (:~ul lm dealt with str~ight- 
forwardly ill tit(; tl;tyesi;ul al)l)ro~(:h. All evidence 
l,o(le Citll luwe lllore th~tll Ol,(! l)a,l'eltt llo(le, to re,1)r(> 
sent the ambiguity. F(lr exam,pie, when (le~ding with 
Sl)oken inputs, it might be ~md)iguous tit;d; the user 
said either "tower recor(ls" ()r "power records." If both 
r(.'cord stores exist, an evidence uode hd~c'le.d TOLD 
is intro(luced as ;~ oh|hi node for both no(les, PRE: 
name(tower) :rod PRE: name(power) (figure 2). 
Fimdly, wc introduce the ~u'(:s that conne(:t the two 
subnetworks. For each ~ttribute., there ~n'e three kinds 
of n(l(les lalleh,.(l PRE, POST, ltll(l JUDGE. The 
two arc are (lraw,t from the PRE node to the JUDGE 
node,rod the POST node to the JUDGE nolle. That 
means the user knows the attribute either 1)e.c~mse he 
alrea(ly knew it before the current (li~dogu(! sessi()n or 
because it has been exi)l~dned by the system during 
1;he session. 
Tim ex~mxI)le of the resulting network is shown ill 
tigure 3. 
PRE: name(tower) I'RE: name(power) 
© 0 "-,..®/ 
TOl,l): name(?ower) 
Figure 2: Ambiguous evidence. 
4 Examples 
Suppose the user ~tsks the sysLe, lll to show the w;ty to 
:-t record store l|~ulle, d FRISCO ill ,% towll (figure 4). 
The systmn uses the Imtwork ill ~igllr(! 3. The diM.gue 
st~u'ts with the user's reqllt!st. 
(1) user: Wht!re is FRISCO? 
in l)rat:tise, the input ~m~tlysis (:Omlmnent is needed 
to obt:-tin cvident:cs of the uctwork \['l'Oll\[ I;}l(! user's 
tlt~(!l'~tllC(!S, lint this 1)ro(:ess is b(!ymul the scope 
of this paper. By amdyzing the inlmt , the sys- 
tem obtains the inforuu~t;ion th;tt the user knows 
the ,l}l, llle Of a (:err&ill store~ \[)Ill; do(!s ilot klloW 
its loc~ttion, The. input;, i.e. the evidence, to 
the network is .E = {T()LD: name(frisco) = 
I, TOLD: location(frisc.o) = 0}. Evalu~tting the de- 
gree of belief of elt(:h con(:el)t :r by using the llOSl;erior 
1)rob~d)ility l)(:rl TOLD: llanle(frisco) = \], TOLD: 
location(frisco) -- 0) gives the resulting user model. 
Though this result (:;m bc directly obtaine(l by evalu.. 
ittiug the network, we will briefly tra.ce our reasoning 
for expl~m~tory l)urposes. (NoLe that tim actmd pro- 
(:ess is l,Ot (!3,sy to Cxl)lain ~ts all nodes of the netwm'k 
influence e.;L(:h other, th;d: is till; reason why simulation 
is nee(led for ('wduation.) 
The user knows th(; ,stole FRISCO, which l'('.p,'e- 
sents that she has the high expertise level f()r records 
colh;(:tors and r~dses the t)rob~d)ility of the node PRE: 
record.s_collector a,n(l ~tlso raises that of the node 
of other re<:l\[rd store.s, Tower R.ecords(Pl{E: tower), 
W~we Records(PRE: waw'.). These nodes then ~dI'e<:t 
thl'. n<)de <If their attributes, PRE: location(tower), 
PRE: name(tower), eRE: lot.at|on(wave), ~u,t 
s<) on. TluLt :';dses the 1)robal)ility of the l<)<:ation 
node HANDS l)ct)artment (PRE: bleat ion(hands)), 
whi(:h is close to the loc;d;io|t the user (l)rOb~dfly) 
knows, i.e. PRE: lo('ation(wave). 
Next, the systmn gene.r;ttes the answer by using tim 
resulting us(!r model. This |;ask is done 1)y at i)la,nner 
for utterance generation. The system nu~y (h~cidc to 
use the. h)(:~ttion of HANDS. 
(2) systmn: It is 300m to lhe smd:h frmn 
HANDS Delm, rt, nmnt. 
1215 
~ ~----,.¢ ~.~_ 
o 
........................................ 
0 H 
.~l Jf 0 ~ Iq /" .,q 
J ~ rn r4 
o ,,7 
Cu 0 
/ ~ - 
°,~ --j "~'- ° V" ,"< o -\ "~ -I u I \ 
~~ o ~ ~,~ ,, \ 
. ~ ~1~m / -~ ,, \. 
o ' 
? 
5 
o ~ o 
Figure 3: Examplc of a network 
1215 
I 
HANDS Department 
WAVE RECORDS ® 
l 
TOWER RECORDS 
N 
FRISCO 
(records store) 
l,'igure 'l: A 
After ut.l:ering the sent:torte, the syste:n adds the 
evidence, TELL: name(hands)= 1, TELL: rela- 
tion(hands, frisco):: 1, t;o the nel;work. Note t;h~t 
the e×planation of the location is made lty show- 
ing i~s l'eladon I~o oLher locations. That: nt*tkes l:he 
probMfility of 1.he node, POST; local;ion(frlsco), 
/'(POST: location(li'isco)ll'; ) raise, where 15' rel,re- 
sents all evidence obt;ai.ed. The .exl; utl,erance o\[ the 
ll.Sel' is: 
(3) user: 1 don'l; know whe:e \]\[ANI)S is. 
This input gives (;he sysl:em l.he evidence, TOLl): 
location(hands) := 0. After obtaining this evidence, 
l;he belief is revised. The probability of Lhc node PRE: 
location(hands) falls, which in turn causes l:he prol~- 
Mfility of the node PRI'?,: location(wave) to fMl. 
Next, the i)lammr ll\]~ty t;ry 1:o explain the loc~tl.ion of 
\]IANI)S, by using l:he, location of Tower I/e, cords whidt 
gives the evidence TEI, L: relat;ion(hands~tower)-~ 
1. 
(4) sysl:em: \[lANDS is l.wo blocks away t~o 
l;he wesl; fronl "Power llecords. 
This expla,ation not; only can influence t;he user's 
undersl;mMing of the lo<-al;ioll of IIANDS bul; also 
the local.ion of FI/ISCO, because the evidence raises 
the posterior prot)alfilit.y of the node POS'D: lo- 
eation(ti'isco) t.hrongh the. node POST: loca- 
tion(hands). 
\]i\]vMual;ilm resull;s of lhe above diMogue are shown 
in 'P~d~h! 1. 
lll~t\]\[) of a, tOWll 
5 Conclusion 
We, have prol)osed the \]:htyesian approadt for user 
modeling in dialogue syst;ems. The knowledge rcp- 
resenl:at.ion, in l;e, rms o\[ \]\]~wesian net:works, figs been 
tlist:uss(;d. Rcasoniltg would I)c aatl;mnatit:;dly ~uld (li- 
recl;ly t)erformed l)y ewtlu;d, ing tim network followed 
by sl:oeh~Lsl;ic simulation. 
Most exact, solutions for |;he inl:eresting problents 
in a.rt;ilicial intelligmlce are knowtl 1.o have NP-hard 
comput~d.i(ma\] complexil:y. '12hus, it luts beelL l'ecog- 
nized tfia.t solving t.hem by ;tic al)t)roximal.e method is 
a more realistic a.pproach. ~Phe \]\];tyesi~ul nel;works ;~rc 
(wMmd;cd l)y the stocha, si;ic sinmhd:iol h which is the 
ai)l)rOXilll~tt(: solut.iol, of probM)ilist;ic reasm,ing. The 
simuhd:iml cost, however, is still expellsive with the 
present COmlml:ing resources. The imr~dlel imphmmn- 
l;;tlion has relmrl;ed good performance resull:s \[7\]. 
After gaining l;hc' aecur;d;e expeetalions of user mod- 
els, a mechamism to ll.q(: t;helll for utterance genet'~tl;ion 
is required. This will be done by planners for uLt;erance 
/';e,eration, whM, try to ~chieve the system's goals, 
The In'ol~al*ilil;ie, s in the user model conla'ibute to mea- 
sure 1:o wh;tt exl, cnt the pl~ul will succeed. 
In the study of nat;urM lauguage processing, 
Bayesian ;tl~proatt:hes lmve bee, ;Ldolfl:ed in t.he field 
of t,hm recoglfidon \[3\] and lexical dis;unbiguation \[7\]. 
We have adopted tile \]l~tyesi;ul networks for user Inod- 
cling because we have pereeiw',d that user modeling is 
one of the core components of diMogue systems whose 
1)eh~wim" strongly iMluences t;he otl,e,' parts of the sys- 
\[;elll. We ende~tvor I;o eclnsl;rllct |;fie eXl)erilllellt~tl dig- 
logue syslmln I;hat accepts l;he users' inputs by speech 
recognit;ion\[8\]. Sl;;trting with user modeling, we' will ex- 
1217 
llode 
JUDGE:location(frisco) 
JUDGE:location(wave) 
J UD GE:location(t ower) 
JUD GE:location(hands) 
JUDGE:name(frisco) 
JUDGE:name(wave) 
JUDGE:name(tower) 
JUD GE:name(hands) 
PRE:records_collect or 
prior 
.51 
.48 
.51 
.48 
.47 
.4'/' 
.47 
.46 
.39 
probabilities after 
the utterance (n) 
(:) I (2) I (3) i (4) 
.21 .43 .43 .66 
.67 .67 .31 .31 
.64 .64 .58 .82 
.67 .76 .43 .74 
.86 .86 .80 .80 
.78 .77 .63 .63 
.78 .77 .64 .90 
.53 .87 .83 .83 
.85 .84 .64 .64 
Table 1: The result of ewtluation 
i)and th(; adoption of Bayesian al)l)roaches in most of 
the eomi)onents in the system. The al)l)roaches must 
be quite effective ill the other colni)onellts , and lead to 
a systeIn whose contl)onents closely interact with each 
other on the common basis of t)i'obability theory. 
References 
\[1\] Douglas E. ApI)elt and Kurt Konolige. A non- 
monotonic logic for reasoning about speech a(:ts 
and belief revision. In International Workshop on 
Nonmonotouie Reasoning, pp. 164 175, 1988. 
\[2\] A. Cawsey. Explanation and Interaction. MIT 
Press, 1993. 
\[3\] E. Charniak and R.P. Gohhnan. A I);~yesian 
model of I)lan recognition. Artificial Inte.lligenee, 
Vol. 64, No. 1, PI). 53 79, 1983. 
\]4\] Peter Cheeseman. In defcnce of l)rol)ability. In the 
Proeeedi'ngs of th.e International Joint Conference 
on Artfieial b~.telligence, pp. 1002-1009, 1985. 
\[51 David N. Chin. KNOME: Modeling what the user 
knows in UC. In A. Kobsa and W. Wahlster, 
editors, User Models in Dialog Systems, chal)ter 4, 
pp. 74 107. S1)ringer-Verlag , 1989. 
\[6\] J. Doyle. A truth maintenance system. Artificial 
\[ntellige~nce, Vol. 12, PI). 231-272, 1979. 
\[rl Leil;t M. \]/.. Eizirik, Valmir C. Babosa, and Sueli 
B. T, Mendes. A 1)ayesian-network approach 
to lexical disambiguation. Cognitive Science, 
Vol. 17, t)p. 257 283, 1993. 
\[8\] K. Itou, S. Hayamizu, and H. Tanaka. Continuous 
speech recognition by context-dependent phonetic 
tlMM and an efficient algorithm for finding n-I)est 
sentence hyl)otheses. In 5~. Proceedings of Lnterna- 
tional Coferenee on Acoustics, 5~meeh, and Signal 
PTvcessing, 1992. 
\[9\] A. Kobsa and W. Wahlstcr, editors. User Modds 
in Dialog Systems. Springer Verlag, 1989. 
\[10\] .1. Pearl. Probabilistic Reasoning in Intelligent 
Systems. Morgan Kauflnann, 1988. 
\[11\] D. Sh'(unan. UMFE: A user modelling fi'ont 
end subsystem. \]:a, ternational Journal of Man- 
Machine Studies, Vol. 23, 1)P. 71 88, 1985. 
\[12\] J.W. Wallis and E. II. Shortliffe. Customized 
explanations using causal knowh!dge. In B.G. 
Buchanan a,ld E.II. Shortliff(!, editors, Rule Based 
Expert Systems: Th, e MYC\[N experiments of 
the Stanford Heuristic P'rogrammi'ug Project, i)p. 
371 390. Addison Wesley, 1985. 
1218 
Reserve Papers 

