REPAIRING REFERENCE IDENTIFICATION FAILURES 
BY RELAXATION 
Bradley A. Goodman 
BBN Laboratories 
10 Moulton Street
Cambridge, Mass. 02238
ABSTRACT 
The goal of this work is the enrichment of
human-machine interactions in a natural language
environment.¹ We want to provide a framework less
restrictive than earlier ones by allowing a speaker
leeway in forming an utterance about a task and in
determining the conversational vehicle to deliver it. A
speaker and listener cannot be assured to have the
same beliefs, contexts, backgrounds or goals at each
point in a conversation. As a result, difficulties and
mistakes arise when a listener interprets a speaker's
utterance. These mistakes can lead to various kinds of
misunderstandings between speaker and listener,
including reference failures or failure to understand
the speaker's intention. We call these
misunderstandings miscommunication. Such mistakes
constitute a kind of "ill-formed" input that can slow
down and possibly break down communication. Our goal
is to recognize and isolate such miscommunications and
circumvent them. This paper will highlight a particular
class of miscommunication - reference problems - by
describing a case study, including techniques for
avoiding failures of reference.
1 Introduction
Cohen, Perrault and Allen showed in their paper
"Beyond Question Answering" \[8\] that "... users of
question-answering systems expect them to do more
than just answer isolated questions -- they expect
systems to engage in conversation. In doing so, the
system is expected to allow users to be less than
meticulously literal in conveying their intentions, and it
is expected to make linguistic and pragmatic use of the
previous discourse." Following in their footsteps, we
want to build robust natural language processing
systems that can detect and recover from
miscommunication. The development of such systems
requires a study of how people communicate and how
they recover from problems in communication. This
paper summarizes the results of a dissertation \[13\]
that investigates the kinds of miscommunication that
occur in human communication with a special emphasis
on reference problems, i.e., problems a listener has
determining whom or what a speaker is talking about.
We have written computer programs and algorithms that
demonstrate how one could handle such problems in
the context of a natural language understanding
system. The study of miscommunication is a necessary
task within such a context since any computer capable
of communicating with humans in natural language must
be tolerant of the imprecise, ill-devised or complex
utterances that people often use.
¹ This research was supported in part by the Defense
Advanced Research Projects Agency under contract
N00014-77-C-0378.
Our current research \[25, 26\] views most
dialogues as being cooperative and goal directed, i.e., a
speaker and listener work together to achieve a
common goal. The interpretation of an utterance
involves identifying the underlying plan or goal that
the utterance reflects \[5, 1, 23\]. This plan, however, is
rarely, if ever, obvious at the surface sentence level.
A central issue in the interpretation of utterances is
the transformation of sequences of imprecise,
ill-devised or complex utterances into well-specified plans
that might be carried out by dialogue participants.
Within this context, miscommunication can occur.
We are particularly concerned with cases of
miscommunication from the hearer's viewpoint, such as
when the hearer is inattentive to, confused about, or
misled about the intentions of the speaker. In
ordinary exchanges speakers usually make assumptions
regarding what their listeners know about a topic of
discussion. They will leave out details thought to be
superfluous \[2, 19\]. Since the speaker really does not
know exactly what a listener knows about a topic, it is
easy to make statements that can be misinterpreted or
not understood by the listener because not enough
details were presented. One principal source of trouble
is the description constructed by the speaker to refer
to an actual object in the world. The description can
be imprecise, confused, ambiguous or overly specific. It
might be interpreted under the wrong context. This
leads to difficulty for the listener when figuring out
what object is being described, that is, reference
identification errors. Such descriptions are
"ill-formed" input. The blame for ill-formedness may lie
partly with the speaker and partly with the listener.
The speaker may have been sloppy or not taken the
hearer into consideration; the listener may be either
remiss or unwilling to admit he can't understand the
speaker and to ask the speaker for clarification, or
may simply feel that he has understood when he in fact
has not.
This work is part of an on-going effort to
develop a reference identification and plan recognition
mechanism that can exhibit more "human-like"
tolerance of such utterances. Our goal is to build a
more robust system that can handle errorful
utterances, and that can be incorporated in existing
systems. As a start, we have concentrated on
reference identification. In conversation people use
imperfect descriptions to communicate about objects;
sometimes their partners succeed in understanding and
occasionally they fail. Any computer hoping to play the
part of a listener must be capable of taking what the
speaker says and either deleting, adapting or clarifying 
it. We are developing a theory of the use of 
extensional descriptions that will help explain how
people successfully use such imperfect descriptions.
We call this the theory of reference miscommunication.
Section 2 of this paper highlights some aspects of
normal communication and then provides a general
discussion of the types of miscommunication that occur
in conversation, concentrating primarily on reference
problems and motivating many of them with illustrative
protocols. Section 3 presents possible ways around
some of the problems of miscommunication in reference.
It also motivates a partial implementation of a
reference mechanism that attempts to overcome many
reference problems.
We are following the task-oriented paradigm of
Grosz \[14\] since it is easy to study (through
videotapes), it places the world in front of you (a
primarily extensional world), and it limits the
discussion while still providing a rich environment for
complex descriptions. The task chosen as the target
for the system is the assembly of a toy water pump.
The water pump is reasonably complex, containing four
subassemblies that are built from plastic tubes,
nozzles, valves, plungers, and caps that can be screwed
or pushed together. A large corpus of dialogues
concerning this task was collected by Cohen (see
\[7, 8, 9\]). These dialogues contained instructions from
an "expert" to an "apprentice" that explain the
assembly of the toy water pump. Both participants
were working to achieve a common goal - the
successful assembly of the pump. This domain is rich
in perceptual information, allowing for complex
descriptions of elements in it. The data provide
examples of imprecision, confusion, and ambiguity as
well as attempts to correct these problems.
The following exchange exemplifies one such
situation. Here A is instructing J to assemble part of
the water pump. Refer to Figure 1(a) for a picture of
the pump. A and J are communicating verbally but
neither can see the other. (The bracketed text in the
excerpt tells what was actually occurring while each
utterance was spoken.) Notice the complexity of the
speaker's descriptions and the resultant processing
required by the listener. This dialogue illustrates when
listeners repair the speaker's description in order to
find a referent, when they repair their initial reference
choice once they are given more information, and when
they fail to choose a proper referent. In Line 7, A
describes the two holes on the BASEVALVE as "the little
hole." J must repair the description, realizing that A
doesn't really mean "one" hole but is referring to the
"two" holes. J apparently does this since he doesn't
complain about A's description and correctly attaches
the BASEVALVE to the TUBEBASE. Figure 1(b) shows
the configuration of the pump after the TUBEBASE is
attached to the MAINTUBE in Line 10. In Line 13, J
interprets "a red plastic piece" to refer to the NOZZLE.
When A adds the relative clause "that has four gizmos
on it," J is forced to drop the NOZZLE as the referent
and to select the SLIDEVALVE. In Lines 17 and 18, A's
description "the other--the open part of the main
tube, the lower valve" is ambiguous, and J selects the
wrong site, namely the TUBEBASE, in which to insert
the SLIDEVALVE. Since the SLIDEVALVE fits, J doesn't
detect any trouble. Lines 20 and 21 keep J from
thinking that something is wrong because the part fits
loosely. In Lines 27 and 28, J indicates that A did not
give him enough information to perform the requested
action. In Line 30, J further compounds the error in
Line 18 by putting the SPOUT on the TUBEBASE.
Excerpt 1 (Telephone) 
A: 1. Now there's a blue cap
\[J grabs the TUBEBASE\]
2. that has two little teeth sticking
3. out of the bottom of it.
J: 4. Yeah.
A: 5. Okay. On that take the
6. bright shocking pink piece of plastic
\[J takes BASEVALVE\]
7. and stick the little hole over the teeth.
\[J starts to install the BASEVALVE, backs off, looks
at it again and then goes ahead and
installs it\]
J: 8. Okay.
A: 9. Now screw that blue cap onto
10. the bottom of the main tube.
\[J screws TUBEBASE onto MAINTUBE\]
J: 11. Okay.
A: 12. Now, there's a--
13. a red plastic piece
\[J starts for NOZZLE\]
14. that has four gizmos on it.
\[J switches to SLIDEVALVE\]
J: 15. Yes.
A: 16. Okay. Put the ungizmoed end in the uh
17. the other--the open
18. part of the main tube, the lower valve.
\[J puts SLIDEVALVE into hole in TUBEBASE, but A
meant OUTLET2 of MAINTUBE\]
J: 19. All right.
A: 20. It just fits loosely. It doesn't
21. have to fit right. Okay, then take
22. the clear plastic elbow joint.
\[J takes SPOUT\]
J: 23. All right.
A: 24. And put it over the bottom opening, too.
\[J tries installing SPOUT on TUBEBASE\]
J: 25. Okay.
A: 26. Okay. Now, take the--
J: 27. Which end am I supposed to put it over?
28. Do you know?
A: 29. Put the--put the--the big end--
30. the big end over it.
\[J pushes big end of SPOUT on TUBEBASE, twisting
it to force it on\]
Figure 1: The Toy Water Pump, panels (a) and (b)
2 Miscommunication 
People must and do manage to resolve lots of
(potential) miscommunication in everyday conversation.
Much of it is resolved subconsciously with the
listener unaware that anything is wrong. Other
miscommunication is resolved with the listener actively
deleting or replacing information in the speaker's
utterance until it fits the current context. Sometimes
this resolution is postponed until the questionable part
of the utterance is actually needed. Still, when all
these fail, the listener can ask the speaker to clarify
what was said.²
There are many aspects of an utterance that the
listener can become confused about and that can lead
to miscommunication. The listener can become
confused about what the speaker intends for the
referents, the actions, and the goals described by the
utterance. Confusions often appear to result from
conflict between the current state of the conversation,
the overall goal of the speaker, or the manner in which
the speaker presented the information. However, when
the listener steps back and is able to discover what
kind of confusion is occurring, then the confusion can
quite possibly be resolved.
2.1 Causes of miscommunication
This section attempts to motivate a paradigm for
the kinds of conversation that we studied and tries to
point out places in the paradigm that leave room for
miscommunication.
2.1.1 Effects of the structure of task-oriented
dialogues
Task-oriented conversations have a specific goal
to be achieved: the performance of a task (e.g., \[14\]).
The participants in the dialogue can have the same
skill level and they can simply work together to
accomplish the task; or one of them, the expert, could
know more and could direct the other, the apprentice,
to perform the task. We have concentrated primarily
on the latter case - due to the protocols that we
examined - but many of our observations can be
generalized to the former case, too. We will refer to
this as the apprentice-expert domain.
The viewpoints of the expert and apprentice differ
greatly in apprentice-expert exchanges. The expert,
having an understanding of the functionality of the
elements in the task, has more of a feel for how the
elements work together, how they go together, and how
the individual elements can be used. The apprentice
normally has no such knowledge and must base his
decisions on perceptual features such as shape \[15\].
The structure of the task affects the structure of
the dialogue \[14\], particularly through the center of
attention of the expert and apprentice. This is the
phenomenon called focus \[14, 20, 24\], which, in
task-oriented dialogues, is a very real and operational thing
(e.g., focus is used in resolving anaphoric references).
Shifts in focus correspond directly to the task, its
subtasks, the objects in a task and the subpieces of
each object. Focus and focus shifts are governed by
many rules \[14, 20, 24\]. Confusion may result when
expected shifts do not take place. For example, if the
expert changes focus to an object but never discusses
its subpieces (such as an obvious attachment surface)
or never bothers to talk about the object reasonably
soon after its introduction (i.e., between the time of its
introduction and its use, without digressing in a
well-structured way in between (see \[20\])), then the
apprentice may become confused, leaving him ripe for
miscommunication. The reverse influence between focus
and objects can lead to trouble, too. A shift in focus
by the expert that does not have a manifestation in
the apprentice's world will also perplex the apprentice.
Focus also influences how descriptions are
formed \[15, 2\]. The level of detail required in a
description depends directly on the elements currently
highlighted by the focus. If the object to be described
is similar to other elements in focus, the expert must
be more specific in the formulation of the description
or may consider shifting focus away from the possibly
ambiguous objects to one where the ambiguity won't
occur.
2.2 Consequences of miscommunication
In this section we will make it clear that people
do miscommunicate and yet they often manage to fix
things. We will look at specific forms of
miscommunication and describe ways to detect them.
We will highlight relationships between different
miscommunication problems but won't necessarily
demonstrate ways to resolve each of them.
² An analysis of clarification subdialogues can be found
in \[17\].
2.2.1 Instances of miscommunication
There are many ways hearers can get confused
during a conversation. Figure 2 outlines some of them
that were derived from analyzing the water pump
protocols. This section defines and illustrates many of
them through numerous excerpts. Each excerpt is
marked in parentheses to show what modality of
communication was used (see \[9\] for a description
of the collection of these excerpts). Each
bracketed portion of the excerpt explains what was
occurring at that point in the dialogue. The confusions
themselves, coupled with the description at the end of
this section on how to recognize when one of them is
occurring, provide motivation for the use of the
algorithm outlined in Section 3 as a means for
repairing communication problems. We will only discuss
referent confusion in this paper. The other forms of
confusion - Action, Goal, and Cognitive Load - are
described in \[11, 13\]. Another categorization of
confusions that lead to conversation failure can be
found in \[22\].
Figure 2: A taxonomy of confusions
Referent confusion occurs when the listener is
unable to correctly determine what the speaker is
referring to with a particular description. It occurs
when the descriptions in the utterance are ambiguous
or imprecise, when there is confusion between the
speaker and listener about what the current focus or
context is, or when the descriptions in the utterance
are either incorrect or incompatible with the current
or global context.
Erroneous Specificity 
Ambiguous (and, thus, imprecise) descriptions can
cause confusion about the referent. Excerpt 2 below
illustrates a case where the speaker's description is
underspecified - it does not provide enough detail to
prune the set of possible referents down to one.
Excerpt 2 (Face-to-Face)
S: 1. And now take the little red
2. peg,
\[P takes PLUG\]
3. Yes,
4. and place it in the hole at the
5. green end.
\[P starts to put PLUG into OUTLET2 of MAINTUBE\]
6. no
7. the--in the green thing
\[P puts PLUG into green part of PLUNGER\]
P: 8. Okay.
In Lines 4 and 5, S describes the location to place a peg
into a hole by giving spatial information. Since the 
location is given relative to another location by "in the 
hole at the green end", it defines a region where the 
peg might go instead of a specific location. In this 
particular case, there are three possible holes to 
choose from that are near the green end. The listener 
chooses one - the wrong one - and inserts the peg 
into it. Because this dialogue took place face to face, 
S is able to correct the ambiguity in Lines 6 and 7. 
A speaker's description can be imprecise in
several possible ways. (1) It may contain features that
do not readily apply in the domain. In Line 3, Excerpt
3, the feature "funny" has no relevance to the listener.
It is not until A provides a fuller description in Lines 5
to 8 that E is able to select the proper piece. (2) It
may use a vague head noun coupled with few or no
feature values (and context alone does not necessarily
suffice to distinguish the object). In Excerpt 4, Line 9,
"attachment" is vague because all objects in the
domain are attachable parts. The expert's use of
"attachment" was most likely to signal the action the
apprentice can expect to take next. The use of the
feature value "clear" provides little benefit either
because three clear, unused parts exist. The size
descriptor "little" prunes this set of possible referents
down to two contenders. (3) Enough feature values are
provided but at least one value is too vague, leading to
trouble. In Excerpt 5, Line 3, the use of the attribute
value "rounded" to describe the shape does not
sufficiently reduce the set of four possible referents
(though, in this particular instance, A correctly
identifies it) because the term is applicable to
numerous parts in the domain. A more precise shape
descriptor such as "bell-shaped" or "cylindrical" would
have been more beneficial to the listener.
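
The pruning described for "the clear little attachment" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the part names P1 to P5 and their feature values are invented so that the counts match the discussion above (three clear parts, two of them little), and the functions `prune` and `verdict` are hypothetical helpers, not part of any implemented system.

```python
from typing import Dict, List

Features = Dict[str, str]

# Hypothetical inventory of unused parts. The names and feature values are
# invented for illustration; they are not taken from the protocols.
UNUSED_PARTS: Dict[str, Features] = {
    "P1": {"color": "clear", "size": "little"},
    "P2": {"color": "clear", "size": "little"},
    "P3": {"color": "clear", "size": "big"},
    "P4": {"color": "red", "size": "little"},
    "P5": {"color": "blue", "size": "big"},
}


def prune(parts: Dict[str, Features], description: Features) -> List[List[str]]:
    """Apply each feature value in turn; record the survivors after each step."""
    remaining = list(parts)
    trace = []
    for feature, value in description.items():
        remaining = [p for p in remaining if parts[p].get(feature) == value]
        trace.append(list(remaining))
    return trace


def verdict(survivors: List[str]) -> str:
    """Label the outcome of pruning from the listener's point of view."""
    if len(survivors) == 1:
        return "unique referent"
    if not survivors:
        return "no referent"
    return "underspecified"


# "the clear little attachment": the vague head noun contributes nothing,
# so only the two feature values prune the candidate set.
trace = prune(UNUSED_PARTS, {"color": "clear", "size": "little"})
print(trace)               # [['P1', 'P2', 'P3'], ['P1', 'P2']]
print(verdict(trace[-1]))  # underspecified
```

Two candidates survive, so the listener must either guess or ask; a more discriminating feature value would be needed to reach a unique referent.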
Excerpt 3 (Telephone) 
E: 1. All right.
2. Now.
3. There's another funny little
4. red thing, a
\[A is confused, examines both NOZZLE and
SLIDEVALVE\]
5. little teeny red thing that's
6. some--should be somewhere on
7. the desk, that has um--there's
8. like teeth on one end.
\[E takes SLIDEVALVE\]
A: 9. Okay.
E: 10. It's a funny-loo--hollow,
11. hollow projection on one end
12. and then teeth on the other.
Excerpt 4 (Teletype) 
A: 1. take the red thing with the
2. prongs on it 
3. and fit it onto the other hole 
4. of the cylinder 
5. so that the prongs are 
6. sticking out 
R: 7. ok 
A: 8. now take the clear little 
9. attachment 
10. and put on the hole where you 
11. just put the red cap on 
12. make sure it points 
13. upward 
R: 14. ok 
Excerpt 5 (Teletype)
S: 1. Ok,
2. put the red nozzle on the outlet 
3. of the rounded clear chamber 
4. ok? 
A: 5. got it. 
Improper Focus 
Focus confusion can occur when the speaker sets 
up one focus and then proceeds with another one 
without letting the listener know of the switch (i.e., a 
focus shift occurs without any indication). An opposite 
phenomenon can also happen - the listener may feel 
that a focus shift has taken place when the speaker 
actually never intended one. These really are very 
similar - one is viewed more strongly from the
perspective of the speaker and the other from the 
listener. 
Excerpt 6 below illustrates an instance of the
first type of focus confusion. In the excerpt, the
speaker (S) shifts focus without notifying the listener
(P) of the switch. As the excerpt begins, P is holding
the TUBEBASE. S provides in Lines 1 to 16
instructions for P to attach the CAP and the SPOUT to
outlets OUTLET1 and OUTLET2, respectively, on the
MAINTUBE. Upon P's successful completion of these
attachments, S switches focus in Lines 17 to 20 to the
TUBEBASE assembly and requests P to screw it on to
the bottom of the MAINTUBE. While P completes the
task, S realizes she left out a step in the assembly -
the placement of the SLIDEVALVE into OUTLET2 of the
MAINTUBE before the SPOUT is placed over the same
outlet. S attempts to correct her mistake by
requesting P to remove "the plas"³ piece in Lines 22
and 23. Since S never indicated a shift in focus from
the TUBEBASE back to the SPOUT, P interprets "the
plas" to refer to the TUBEBASE.
Excerpt 6 (Face-to-Face) 
S: 1. And place
2. the blue cap that's left
\[P takes CAP\]
3. on the side holes that are
4. on the cylinder,
\[P lays down TUBEBASE\]
5. the side hole that is farthest
6. from the green end.
\[P puts CAP on OUTLET1 of MAINTUBE\]
P: 7. Okay.
S: 8. And take the nozzle-looking
9. piece,
\[P grabs NOZZLE\]
10. no
11. I mean the clear plastic one,
\[P takes SPOUT\]
12. and place it on the other hole
\[P identifies OUTLET2 of MAINTUBE\]
13. that's left,
14. so that nozzle points away
15. from the
\[P installs SPOUT on OUTLET2 of MAINTUBE\]
16. right.
P: 17. Okay.
S: 18. Now
19. take the
20. cap base thing
\[P takes TUBEBASE\]
21. and screw it onto the bottom,
\[P screws TUBEBASE on MAINTUBE\]
22. ooops,
\[S realizes she has forgotten to have P put
SLIDEVALVE into OUTLET2 of MAINTUBE\]
23. un-undo the plas
\[P starts to take TUBEBASE off MAINTUBE\]
24. no
25. the clear plastic thing that I
26. told you to put on
\[P removes SPOUT\]
27. sorry.
28. And place the little red thing
\[P takes SLIDEVALVE\]
29. in there first,
\[P inserts SLIDEVALVE into OUTLET2 of MAINTUBE\]
30. it fits loosely in there.
³ The whole word here is "plastic." People in general
tend to be good at proceeding before hearing the whole
utterance or even the whole word.
Excerpt 7 below demonstrates the latter type of
focus confusion that occurs when the speaker (S) sets
up one focus - the MAINTUBE, which is the correct
focus in this case - but then proceeds in such a
manner that the listener (J) thinks a focus shift to
another piece, the TUBEBASE, has occurred. Thus,
Line 15 refers to "the lower side hole in the
MAINTUBE" for S and "the hole in the TUBEBASE" for
J. J has no way of realizing that he has focused 
incorrectly unless the description as he interprets it 
doesn't have a real world correlate (here something 
does satisfy the description so J doesn't sense any 
problem) or if, later in the exchange, a conflict arises 
due to the mistake (e.g., a requested action cannot be
performed). In Line 31, J inserts a piece into the 
wrong hole because of the misunderstanding in Line 15. 
Line 31 hints that J may have become suspicious that 
an ambiguity existed but since the task was 
successfully completed (i.e., the red piece fit into the 
hole in the base), and since S did not provide any 
clarification, he assumed he was correct. 
Excerpt 7 (Telephone)
S: 1. Um now.
2. Now we're getting a little
3. more difficult.
J: 4. (laughs)
S: 5. Pick out the large air tube
\[J picks up STAND\]
6. that has the plunger in it.
\[J puts down STAND, takes PLUNGER/MAINTUBE
assembly\]
J: 7. Okay.
S: 8. And set it on its base,
\[J puts down MAINTUBE, standing vertically, on the
TABLE\]
9. which is blue now,
10. right?
\[J has shifted focus to the TUBEBASE\]
J: 11. Yeah.
S: 12. Base is blue.
13. Okay.
14. Now
15. You've got a bottom hole still
16. to be filled,
17. correct?
J: 18. Yeah.
\[J answers this with MAINTUBE still sitting on the
TABLE; he shows no indication of what
hole he thinks is meant - the one on
the MAINTUBE, OUTLET2, or the one in
the TUBEBASE\]
S: 19. Okay.
20. You have one red piece
21. remaining?
\[J picks up MAINTUBE assembly and looks at
TUBEBASE, rotating the MAINTUBE so
that TUBEBASE is pointed up, and
sees the hole in it; he then looks at
the SLIDEVALVE\]
J: 22. Yeah.
S: 23. Okay.
24. Take that red piece.
\[J takes SLIDEVALVE\]
25. It's got four little feet on
26. it?
J: 27. Yeah.
S: 28. And put the small end into
29. that hole on the air tube--
30. on the big tube.
J: 31. On the very bottom?
\[J starts to put it into the bottom hole of
TUBEBASE - though he indicates he is
unsure of himself\]
S: 32. On the bottom,
33. Yes.
Misfocus can also occur when the speaker
inadvertently fails to distinguish the proper focus
because he did not notice a possible ambiguity; or
when, through no fault of the speaker, the listener just
fails to recognize a switch in focus indicated by the
speaker. Excerpt 7 above is an example of the first
type because S failed to notice that an ambiguity
existed since he never explicitly brought the TUBEBASE
either into or out of focus. He just assumed that J
had the same perspective as him - a perspective in
which no ambiguity occurred.
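
The undetected misfocus of Excerpt 7 can be sketched as the same description resolved against two different focus sets. This is a minimal illustration, assuming a feature encoding that is not in the source: the `FEATURES` table, the name `TUBEBASE_HOLE`, and the function `resolve` are hypothetical.

```python
from typing import Dict, List, Set

Features = Dict[str, str]

# Assumed feature values for three holes mentioned in the excerpts.
# The encoding (and the name TUBEBASE_HOLE) is hypothetical.
FEATURES: Dict[str, Features] = {
    "OUTLET1": {"kind": "hole", "position": "upper"},
    "OUTLET2": {"kind": "hole", "position": "bottom"},
    "TUBEBASE_HOLE": {"kind": "hole", "position": "bottom"},
}


def resolve(description: Features, focus: Set[str]) -> List[str]:
    """Return the objects in the given focus that satisfy the description."""
    return sorted(
        name for name in focus
        if all(FEATURES[name].get(k) == v for k, v in description.items())
    )


speaker_focus = {"OUTLET1", "OUTLET2"}   # S is attending to the MAINTUBE
listener_focus = {"TUBEBASE_HOLE"}       # J has shifted to the TUBEBASE

bottom_hole = {"kind": "hole", "position": "bottom"}
for_speaker = resolve(bottom_hole, speaker_focus)
for_listener = resolve(bottom_hole, listener_focus)

# Each participant finds exactly one referent, so neither senses a problem,
# yet the two referents differ.
undetected = (
    len(for_speaker) == 1 and len(for_listener) == 1
    and for_speaker != for_listener
)
print(for_speaker, for_listener, undetected)
# ['OUTLET2'] ['TUBEBASE_HOLE'] True
```

Because the description has a real world correlate in both focus sets, the mismatch surfaces only if a later action conflicts with it.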
Wrong Context 
Context differs from focus. The context of a
portion of a conversation is concerned with the point
of the discussion in that fragment and with the set of
objects relevant to that discussion, though not
attended to currently. Focus pertains to the elements
which are currently being attended to in the context.
For example, two people can share the same context
but have different focus assignments within it - we're
both talking about the water pump but you're
describing the MAINTUBE and I'm describing the
AIRCHAMBER. Alternatively, we could just be using
different contexts - I think you're talking about taking
the pump apart but you're talking about replacing the
pump with new parts - in both cases we may be
sharing the same focus - the pump - but our contexts
are totally off from one another.⁴ The kinds of
misunderstandings that can occur because of context
problems are similar to those for focus problems: (1)
the speaker might set up or be in one context for a
discussion and then proceed in another one without
effectively letting the listener know of the change, (2)
the listener may feel a change in context has taken
place when in fact the speaker never intended one, or
(3) the listener fails to recognize an indicated context
switch by the speaker. Context affects reference
because it helps define the set of available objects that
are possible contenders for the referent of the
speaker's descriptions. If the contexts of the speaker
and listener differ, then misreference might result.
Bad Analogy
An analogy (see \[10\] for a discussion on
analogies) is a useful way to help describe an object by
attempting to be more precise by using shared past
experience and knowledge - especially shape and
functional information. If that past experience or
knowledge doesn't contain the information the speaker
assumes it does or isn't there, then trouble occurs.
Thus, one more way referent confusion can occur is by
describing an object using a poor analogy. An analogy
used to describe an object might not be specific
enough - confusing the listener because several pieces
might conform to the analogy or, in fact, none at all
appear to fit because discovering a mapping between
the analogous object and some piece in the
environment is too difficult. In Excerpt 8, J at first
has trouble correctly satisfying A's functional analogy
"stopper" in "the big blue stopper", but finally selects
what he considers to be the closest match to
"stopper".
⁴ Grosz \[14, 15\] would describe this as a difference in
"task plans" while Reichman \[20, 21\] would say that the
"communicative goals" differed.
Excerpt 8 (Telephone) 
A: 1. Okay. Now,
2. take the big blue
3. stopper that's laying around
\[J grabs AIRCHAMBER\]
4. ... and take the black
5. ring--
J: 6. The big blue stopper?
\[J is confused and tries to communicate it to A; he
is holding the AIRCHAMBER here\]
A: 7. Yeah,
8. the big blue stopper
9. and the black ring.
\[J drops AIRCHAMBER and takes the O-RING and
the TUBEBASE\]
In other cases it might be too specific -
confusing the listener because none of the available
referents appear to fit it. In Line 8 of Excerpt 6,
"nozzle-looking" forms a poor shape analogy because
the object being referred to actually is an
elbow-shaped spout. The "nozzle-looking" part of the
description convinced the listener that what he was
looking for was something specific like a nozzle (which
is a small spout). Sometimes, when an object is a clear
representative of a specified analogy class, the
apprentice may become confused, wondering why the
expert bothered to form an analogy instead of just
directly describing the object as a member of the class.
Hence, it would not be surprising if the apprentice
ignored the best representative of the class for some
less obvious exemplar. Thus, for example, it is better to
say "nozzle" instead of "nozzle-looking." In Excerpt 9,
the description "hippopotamus face shape" (a shape
analogy) in Lines 2 and 3, and "champagne top" (a
shape analogy) in Line 9, are too specific and the
listener is unable to easily find something close enough
to match either of them. He can't discover a mapping
between the object in the analogy and one in the real
world.
Excerpt 9 (Audiotape) 
M: 1. take the bright pink flat
2. piece of hippopotamus face
3. shape piece of plastic
4. and you notice that the two
5. holes on it
\[M is trying to refer to BASEVALVE\]
6. match
7. along with the two
8. peg holes on the
9. champagne top sort of
10. looking bottom that had
11. threads on it
\[M is trying to refer to TUBEBASE\]
Description incompatibility 
Incompatible descriptions can lead to confusion
also. A description is incompatible when (1) one or
more of the specified conditions, i.e., the feature
values, are not satisfied by any of the pieces; (2) one
or more specified constraints do not hold (e.g., saying
"the loose one" when all objects are tightly attached);
or (3) no one object satisfies all of the features
specified in the description. In Lines 7 and 8 of
Excerpt 9 above, M's use of "the two peg holes" leads
to bewilderment for the listener because the described
object has no holes in it. M actually meant "two pegs".
2.2.2 Detecting miscommunication
Part of our research has been to examine how a 
listener discovers the need for a repair of an 
utterance or a description during communication. The 
incompatibility of a referent or action is one signal of 
possible trouble. The appearance of an obstacle that 
blocks one from achieving a goal is another indication 
of a problem. 
Incompatibility 
Two kinds of incompatibility, action or referent, 
appear in the taxonomy of confusions. The strongest 
hint that there is a reference problem occurs when the 
listener finds no real world object to correspond to the 
speaker's description. This can occur when (1) one or 
more of the specified feature values in the description 
are not satisfied by any of the pieces (e.g., saying "the 
orange cap" when none of the objects are orange), (2) 
when one or more specified constraints do not hold 
(e.g., saying "the red plug that fits loosely" when all 
the red plugs attach tightly), or (3) if no one object 
satisfies all of the features specified in the description 
(i.e., there is, for each feature, an object that exhibits 
the specified feature value, but no one object exhibits 
all of the values). An action problem is likely if (1) the 
listener cannot perform the action specified by the 
speaker because of some obstacle; (2) the listener 
performs the action but does not arrive at its intended 
effect (i.e., a specified or default constraint isn't 
satisfied); or (3) the current action affects a previous 
action in an adverse way, yet the speaker has given no 
sign of any importance to this side-effect. 
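The referent-level conditions (1) and (3) above can be checked mechanically. The following is a minimal sketch, assuming that pieces and descriptions are flat feature-value dictionaries; the function name and the returned labels are hypothetical and do not come from the system described here. Condition (2), on constraints such as fit, is left out because it requires trying the piece in the world.

```python
from typing import Dict, List, Optional

Piece = Dict[str, str]  # feature name -> feature value (an assumed flat encoding)


def referent_incompatibility(description: Piece, pieces: List[Piece]) -> Optional[str]:
    """Return a label for the kind of incompatibility, or None if some piece matches."""
    # Condition (1): a specified feature value is exhibited by no piece at all.
    for feature, value in description.items():
        if not any(p.get(feature) == value for p in pieces):
            return f"no piece has {feature}={value}"
    # Condition (3): every value occurs somewhere, but never together on one piece.
    if not any(all(p.get(f) == v for f, v in description.items()) for p in pieces):
        return "no single piece has all the specified values"
    return None


pieces = [
    {"color": "red", "shape": "cap"},
    {"color": "blue", "shape": "plug"},
]
print(referent_incompatibility({"color": "orange", "shape": "cap"}, pieces))
print(referent_incompatibility({"color": "red", "shape": "plug"}, pieces))
```

The first call corresponds to "the orange cap" when nothing is orange; the second to a description whose values are each present but spread over different pieces.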
Goal obstacle 
A goal obstacle occurs when a goal (or subgoal) 
one is trying to achieve is blocked. This blockage can 
result in confusion for the listener because he did not 
expect the speaker to give him tasks that could not be 
achieved. Often, though, it points out for the listener 
that some miscommunication (such as misreference) has 
occurred. 
Goal redundancy 
Goal redundancy occurs when the requested goal 
(or subgoal) is already satisfied. In some sense, it is a 
special kind of goal obstacle where the goal to be 
fulfilled is blocked because it is already satisfied. It is 
a simple goal obstacle because nothing has to be done 
to get around it. However, it can lead to confusion on 
the part of listeners because they may suspect they 
misunderstood what the speaker has requested since 
they wouldn't expect a reasonable speaker to request 
the performance of an already completed action. It 
provides a hint that miscommunication has occurred. 
3 Repairing Reference Failures 
3.1 Introduction 
The previous section illustrated how task-oriented 
natural language interactions in the real 
world can induce contextually poor utterances. Given 
all the possibilities for confusion, when confusions do 
occur, they must be resolved if the task is to be 
performed. This section explores the problem of fixing 
reference failures. 
Reference identification is a search process where 
a listener looks for something in the world that 
satisfies a speaker's uttered description. A 
computational scheme for performing reference has 
evolved from work by other artificial intelligence 
researchers (e.g., see \[14\]). That traditional approach 
succeeds if a referent is found, or fails if no referent 
is found (see Figure 3(a)). However, a reference 
identification component must be more versatile than 
those constructed in the traditional manner. The 
excerpts provided in the previous section show that 
the traditional approach is wrong because people's real 
behavior is much more elaborate. In particular, 
listeners often find the correct referent even when the 
speaker's description does not describe any object in 
the world. For example, a speaker could describe a 
blue block as the "turquoise block." Most listeners 
would go ahead and assume that the blue block was the 
one the speaker meant. 
A key feature of reference identification is 
"negotiation." Negotiation in reference identification 
comes in two forms. First, it can occur between the 
listener and the speaker. The listener can step back, 
expand greatly on the speaker's description of a 
plausible referent, and ask for confirmation that he 
has indeed found the correct referent. For example, a 
listener could initiate negotiation with "I'm confused. 
Are you talking about the thing that is kind of flared 
at the top? Couple inches long. It's kind of blue." 
Second, negotiation can be with oneself. This type of 
negotiation, called self-negotiation, is the one that we 
are most concerned with in this research. The listener 
considers aspects of the speaker's description, the 
context of the communication, and the listener's own 
abilities. He then applies that deliberation to determine 
whether one referent candidate is better than another 
or, if no candidate is found, what are the most likely 
places for error or confusion. Such negotiation can 
result in the listener testing whether or not a 
particular referent works. For example, linguistic 
descriptions can influence a listener's perception of 
the world. The listener must ask himself whether he 
can perceive one of the objects in the world the way 
the speaker described it. In some cases, the listener's 
perception may overrule the description because the 
listener can't perceive it the way the speaker 
described it. 
To repair the traditional approach we have 
developed an algorithm that captures for certain cases 
the listener's ability to negotiate with himself for a 
referent. It can look for a referent and, if it doesn't 
find one, it can try to find possible referent candidates 
that might work, and then loosen the speaker's 
description using knowledge about the speaker, the 
conversation, and the listener himself. Thus, the 
reference process becomes multi-step and resumable. 
This computational model, which I call "FWIM" for "Find 
What I Mean", is more faithful to the data than the 
traditional model (see Figure 3(b)). 
Figure 3: Approaches to reference identification. 
(a) Traditional: the reference component ends in success or failure. 
(b) FWIM: a failure of the reference component is passed to the relaxation component. 
One means of making sense of an approximate 
description is to delete or replace portions of it that 
don't match objects in the hearer's world. In our 
program we are using "relaxation" techniques to 
capture this behavior. Our reference identification 
module treats descriptions as approximate. It relaxes 
a description in order to find a referent when the 
literal content of the description fails to provide the 
needed information. Relaxation, however, is not 
performed blindly on the description. We try to model 
a person's behavior by drawing on sources of 
knowledge used by people. We have developed a 
computational model that can relax aspects of a 
description using many of these sources of knowledge. 
Relaxation then becomes a form of communication 
repair \[4\] that hearers can use. 
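The difference between the traditional approach and the multi-step FWIM approach can be illustrated with a small sketch. It assumes flat feature-value dictionaries and uses only the simplest relaxation, ignoring a feature; the names `find_exact` and `fwim` and the fixed `relax_order` argument are hypothetical stand-ins for the knowledge-driven ordering discussed in Section 3.2.

```python
from typing import Dict, List, Optional, Tuple

Description = Dict[str, str]  # feature name -> feature value (assumed encoding)


def find_exact(description: Description, world: Dict[str, Description]) -> List[str]:
    """Traditional step: objects that exhibit every specified feature value."""
    return [name for name, obj in world.items()
            if all(obj.get(f) == v for f, v in description.items())]


def fwim(description: Description, world: Dict[str, Description],
         relax_order: List[str]) -> Tuple[Optional[str], List[str]]:
    """Try the literal description first; on failure, ignore features in relax_order."""
    current = dict(description)
    relaxed: List[str] = []
    matches = find_exact(current, world)
    for feature in relax_order:
        if matches:
            break  # stop relaxing as soon as something matches
        if feature in current:
            del current[feature]  # simplest relaxation: ignore the feature
            relaxed.append(feature)
            matches = find_exact(current, world)
    referent = matches[0] if len(matches) == 1 else None
    return referent, relaxed


world = {
    "block1": {"color": "blue", "shape": "block"},
    "ball1": {"color": "red", "shape": "ball"},
}
print(fwim({"color": "turquoise", "shape": "block"}, world, ["color", "shape"]))
```

With the "turquoise block" description, the literal search fails, the color feature is ignored, and the blue block is returned together with the list of relaxed features.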
3.2 The relaxation component 
When a description fails to denote a referent in 
the real world properly, it is possible to repair it by a 
relaxation process that ignores or modifies parts of the 
description. Since a description can specify many 
features of an object, the order in which parts of it 
are relaxed is crucial (i.e., relaxing in different orders 
could yield matches to different objects). There are 
several kinds of relaxation possible. One can ignore a 
constituent, replace it with something close, replace it 
with a related value, or change focus (i.e., consider a 
different group of objects). This section describes the 
overall relaxation component that draws on knowledge 
sources about descriptions and the real world as it 
tries to relax an errorful description to one for which 
a referent can be identified. 
3.2.1 Find a referent using a reference mechanism 
Identifying the referent of a description requires 
finding an element in the world that corresponds to the 
speaker's description (where every feature specified in 
the description is present in the element in the world 
but not necessarily vice versa). The initial task of our 
reference mechanism is to determine whether or not a 
search of the (taxonomic) knowledge base that we use 
to model the world is necessary. For example, the 
reference component should not bother searching - 
unless specifically requested to do so - for a referent 
for indefinite noun phrases (which usually describe new 
or hypothetical objects) or extremely vague 
descriptions (which do not clearly describe an object 
because they are composed of imprecise feature 
values). A number of aspects of discourse pragmatics 
can be used in that determination (e.g., the use of a 
deictic in a definite noun phrase, such as "this X" or 
"the last X", hints that the object was either mentioned 
previously or that it probably was evoked by some 
previous reference, and that it is searchable) but we 
will not examine them here. 
The knowledge base contains linguistic 
descriptions and a description of the listener's visual 
scene itself. In our implementation and algorithms, we 
assume it is represented in KL-One \[3\], a system for 
describing taxonomic knowledge. KL-One is composed 
of CONCEPTs, ROLEs on concepts, and links between 
them. A CONCEPT is like a set, representing those 
elements described by it. A SUPERC link ("==>") is 
used between concepts to show set inclusion. For 
example, consider Figure 4. The SuperC from Concept B 
to Concept A is like stating B ⊂ A for two sets A and 
B. An INDIVIDUAL CONCEPT is used to guarantee that the 
subset specified by a concept is unique. The Individual 
Concept D shown in the figure is defined to be a 
unique member of the subset specified by Concept 
C. ROLEs on concepts are like normal attributes and 
slot fillers in other knowledge representation 
languages. They define a functional relationship 
between the concept and other concepts. 
Figure 4: A KL-One Taxonomy (diagram labels: Concept, Individual Concept) 
Assuming that a search of the knowledge base is 
considered necessary, then a reference search 
mechanism is invoked. The search mechanism uses the 
KL-One Classifier \[16\] to search the knowledge base 
taxonomy. This search is constrained by a focus 
mechanism based on the one developed by Grosz \[14\]. 
The Classifier's purpose is to discover all appropriate 
subsumption relationships between a newly formed 
description and all other descriptions in a given 
taxonomy. With respect to reference, this means that 
all possible (descriptions of) referents of the 
description will be subsumed by it after it has been 
classified into the knowledge base taxonomy. If more 
than one candidate referent is below (when a 
description A is subsumed by B, we say A is "below" B) 
the classified description, then, unless a quantifier in 
the description specified more than one element, the 
speaker's description is ambiguous. If exactly one 
description is below it, then the intended referent is 
assumed to have been found. Finally, if no referent is 
found below the classified description, the relaxation 
component is invoked. We will only consider the last 
case in the rest of the paper. 
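The three outcomes of the search can be summarized in a short sketch. It is not the KL-One Classifier: subsumption is approximated here by a plain subset test over sets of (feature, value) pairs, and the names `below` and `reference_outcome` and the `plural` flag are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Feature = Tuple[str, str]  # (feature name, feature value)


def below(description: FrozenSet[Feature],
          individuals: Dict[str, FrozenSet[Feature]]) -> List[str]:
    """Individuals subsumed by the description: they have every feature it specifies."""
    return sorted(name for name, feats in individuals.items() if description <= feats)


def reference_outcome(description: FrozenSet[Feature],
                      individuals: Dict[str, FrozenSet[Feature]],
                      plural: bool = False) -> str:
    """Classify the search result as 'found', 'ambiguous', or 'relax'."""
    found = below(description, individuals)
    if not found:
        return "relax"      # nothing below the description: invoke relaxation
    if len(found) == 1 or plural:
        return "found"      # one referent, or a quantifier allowed several
    return "ambiguous"      # several referents for a singular description


individuals = {
    "tube1": frozenset({("color", "blue"), ("shape", "tube")}),
    "tube2": frozenset({("color", "red"), ("shape", "tube")}),
}
print(reference_outcome(frozenset({("color", "turquoise")}), individuals))
```

The subset test mirrors the requirement stated in Section 3.2.1 that every feature in the description be present in the element, but not necessarily vice versa.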
3.2.2 Collect votes for or against relaxing the 
description 
It is necessary to determine whether or not the 
lack of a referent for a description has to do with the 
description itself (i.e., reference failure) or outside 
forces that are causing reference confusion. For 
example, the problem may be with the flow of the 
conversation and the speaker's and listener's 
perspectives on it; it may be due to incorrect 
attachment of a modifier; it may be due to the action 
requested; and so on. Pragmatic rules are invoked to 
decide whether or not the description should be 
relaxed. These rules will not be discussed here so we 
will assume that the problem lies in the speaker's 
description. 
3.2.3 Perform the relaxation of the description 
If relaxation is demanded, then the system must 
(1) find potential referent candidates, (2) determine 
which features in the speaker's description to relax 
and in what order, and use those ordered features to 
order the potential candidates with respect to the 
preferred ordering of features, and (3) determine the 
proper relaxation techniques to use and apply them to 
the description. 
Find potential referent candidates 
Before relaxation can take place, potential 
candidates for referents (which denote elements in the 
listener's visual scene) must first be found. These 
candidates are discovered by performing a "walk" in 
the knowledge base taxonomy in the general vicinity of 
the speaker's classified description. A KL-One partial 
matcher is used to determine how close the candidate 
descriptions found during the walk are to the speaker's 
description. The partial matcher generates a numerical 
score to represent how well the descriptions match 
(after first generating scores at the feature level to 
help determine how the features are to be aligned and 
how well they match). This score is based on 
information about KL-One and does not take into 
account any information about the task domain. The 
ordering of features and candidates for relaxation 
described below takes into account the task domain. 
The set of best descriptions returned by the matcher 
(as determined by some cutoff score) are selected as 
referent candidates. 
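Candidate selection by score and cutoff can be sketched as follows. The scoring function is an assumption: it is simply the fraction of specified features that the candidate shares, which is much cruder than the KL-One partial matcher; the object names and the cutoff value are also hypothetical.

```python
from typing import Dict, List

Description = Dict[str, str]  # feature name -> feature value (assumed encoding)


def partial_match_score(description: Description, candidate: Description) -> float:
    """Fraction of the description's features whose values the candidate shares."""
    if not description:
        return 0.0
    hits = sum(1 for f, v in description.items() if candidate.get(f) == v)
    return hits / len(description)


def referent_candidates(description: Description, world: Dict[str, Description],
                        cutoff: float = 0.5) -> List[str]:
    """Names of objects scoring at or above the cutoff, best first (ties by name)."""
    scored = [(partial_match_score(description, obj), name) for name, obj in world.items()]
    kept = [(score, name) for score, name in scored if score >= cutoff]
    return [name for score, name in sorted(kept, key=lambda s: (-s[0], s[1]))]


description = {"transparency": "clear", "composition": "plastic",
               "shape": "round", "base_color": "turquoise"}
world = {
    "obj_a": {"transparency": "clear", "composition": "plastic",
              "shape": "round", "base_color": "violet"},
    "obj_b": {"transparency": "clear", "composition": "plastic",
              "shape": "round", "base_color": "blue"},
    "obj_c": {"transparency": "opaque", "composition": "metal",
              "shape": "square", "base_color": "red"},
}
print(referent_candidates(description, world))
```

As in the text, this score uses no task-domain knowledge; domain knowledge enters only when the features and candidates are ordered for relaxation.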
Order the features and candidates for relaxation 
At this point the reference system inspects the 
speaker's description and the candidates, decides which 
features to relax and in what order (see footnote 5), 
and generates a master ordering of features for 
relaxation. Once the feature order is created, the 
reference system uses that ordering to determine the 
order in which to try relaxing the candidates. 
5 Of course, once one particular candidate is selected, 
then deciding which features to relax is relatively trivial 
- one simply compares feature by feature between the 
candidate description (the target) and the speaker's 
description (the pattern) and notes any discrepancies. 
We draw primarily on sources of linguistic 
knowledge, pragmatic knowledge, discourse knowledge, 
domain knowledge, perceptual knowledge, hierarchical 
knowledge, and trial and error knowledge during this 
repair process. A detailed treatment of all of them can 
be found in \[12, 27, 13\]. These knowledge sources are 
consulted to determine the feature ordering for 
relaxation. We represent information from each 
knowledge source as a set of relaxation rules. These 
rules are written in a PROLOG-like language. Figure 5 
illustrates one such linguistic knowledge relaxation 
rule. This rule is motivated by the observation in the 
excerpts that speakers typically add more important 
information at the end of a description (where they are 
separated from the main part of the description and 
thus provided more emphasis). Since the syntactic 
constituents often at the end are relative clauses or 
predicate complements, we created this more specific 
relaxation rule. However, a more general and more 
applicable rule is that information presented at the 
end of a description is usually more prominent. 
Relax the features in the speaker's description in the 
order: adjectives, then prepositional phrases, and 
finally relative clauses and predicate complements. 
E.g., 
Relax-Feature-Before(v1, v2) 
    <- ObjectDescr(d), 
       FeatureDescriptor(v1), 
       FeatureDescriptor(v2), 
       FeatureInDescription(v1, d), 
       FeatureInDescription(v2, d), 
       Equal(syntactic-form(v1, d), "ADJ"), 
       Equal(syntactic-form(v2, d), "REL-CLS") 
Figure 5: A sample relaxation rule 
Each knowledge source produces its own partial 
ordering of features. The partial orderings are then 
integrated to form a directed graph. For example, 
perceptual knowledge may say to relax color. However, 
if the color value was asserted in a relative clause, 
linguistic knowledge would rank color lower, i.e., 
placing it later in the list of things to relax. 
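One way to picture the integration step is to treat each partial ordering as a list of "relax X before Y" pairs and to merge the pairs into a set of labelled edges. The sketch below makes that assumption; the representation and the names `merge_orderings` and `conflicts` are hypothetical, and the actual system may build its graph differently.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (relax this feature, before this one)


def merge_orderings(orderings: Dict[str, List[Edge]]) -> Dict[Edge, List[str]]:
    """Merge per-source orderings into edges labelled with the sources backing them."""
    edges: Dict[Edge, List[str]] = {}
    for source, pairs in orderings.items():
        for earlier, later in pairs:
            edges.setdefault((earlier, later), []).append(source)
    return edges


def conflicts(edges: Dict[Edge, List[str]]) -> List[Edge]:
    """Feature pairs that some source orders one way and another source the other way."""
    return sorted((a, b) for (a, b) in edges if (b, a) in edges and a < b)


orderings = {
    "perceptual": [("color", "shape")],                     # relax color early
    "linguistic": [("shape", "color"), ("shape", "fit")],   # color came in a relative clause
}
graph = merge_orderings(orderings)
print(conflicts(graph))
```

Here perceptual and linguistic knowledge disagree about color and shape, which is the kind of conflict the ordering of candidates has to resolve.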
Since different knowledge sources generally have 
different partial orderings of features, these 
differences can lead to a conflict over which features 
to relax. It is the job of the best candidate algorithm 
to resolve the disagreements among knowledge sources. 
Its goal is to order the referent candidates, Ci, so 
that relaxation is attempted on the best candidates 
first. Those candidates are the ones that conform best 
to a proposed feature ordering. To start, the algorithm 
examines pairs of candidates and the feature orderings 
from each knowledge source. For each candidate Ci, 
the algorithm scores the effect of relaxing the 
speaker's original description to Ci, using the feature 
ordering from one knowledge source. The score 
reflects the goal of minimizing the number of features 
relaxed while trying to relax the features that are 
"earliest" in the feature ordering. It repeats its 
scoring of Ci for each knowledge source, and sums up 
its scores to form Ci's total score. The Ci's are then 
ordered by that score. 
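The scoring loop can be sketched as below. The text does not give the scoring function, so the cost used here is an assumption: each feature that must be relaxed costs one plus its position in the source's ordering, so fewer relaxed features and earlier-ranked features both lower the cost. Candidates are then sorted by summed cost, lowest first. All names are hypothetical.

```python
from typing import Dict, List

Description = Dict[str, str]  # feature name -> feature value (assumed encoding)


def relaxation_cost(description: Description, candidate: Description,
                    ordering: List[str]) -> int:
    """Cost of relaxing the description to the candidate under one feature ordering."""
    cost = 0
    for feature, value in description.items():
        if candidate.get(feature) != value:
            # Features missing from the ordering are treated as the last to relax.
            rank = ordering.index(feature) if feature in ordering else len(ordering)
            cost += 1 + rank
    return cost


def order_candidates(description: Description, candidates: Dict[str, Description],
                     orderings: Dict[str, List[str]]) -> List[str]:
    """Sum each candidate's cost over all knowledge sources; lowest total first."""
    totals = {name: sum(relaxation_cost(description, obj, o) for o in orderings.values())
              for name, obj in candidates.items()}
    return sorted(totals, key=lambda n: (totals[n], n))


description = {"shape": "round", "color": "turquoise", "fit": "loose"}
candidates = {
    "c1": {"shape": "round", "color": "blue", "fit": "loose"},
    "c2": {"shape": "round", "color": "turquoise", "fit": "tight"},
    "c3": {"shape": "square", "color": "blue", "fit": "loose"},
}
orderings = {
    "linguistic": ["shape", "color", "fit"],
    "perceptual": ["color", "shape", "fit"],
}
print(order_candidates(description, candidates, orderings))
```

Candidate c1 needs only color relaxed, and color ranks early in both orderings, so it is tried first.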
Figure 6 provides a graphic description of this 
process. A set of objects in the real world are 
selected by the partial matcher as potential candidates 
for the referent. These candidates are shown across 
the top of the figure. The lines on the right side of 
each box correspond to the set of features that 
describe that object. The speaker's description is 
represented in the center of the figure. The set of 
specified features and their assigned feature value 
(e.g., the pair Color-Maroon) are also shown there. A 
set of partial orderings are generated that suggest 
which features in the speaker's description should be 
relaxed first - one ordering for each knowledge source 
(shown as "Linguistic," "Perceptual," and "Hierarchical" 
in the figure). These are put together to form a 
directed graph that represents the possible, reasonable 
ways to relax the features specified in the speaker's 
description. Finally, the referent candidates are 
reordered using the information expressed in the 
speaker's description and in the directed graph of 
features. 
Figure 6: Reordering referent candidates 
Once a set of ordered, potential candidates are 
selected, the relaxation mechanism begins step 3 of 
relaxation; it tries to find proper relaxation methods to 
relax the features that have just been ordered (success 
in finding such methods "justifies" relaxing the 
description). It stops at the first candidate which is 
reasonable. 
Determine which relaxation methods to apply 
Relaxation can take place with many aspects of a 
speaker's description: with complex relations specified 
in the description, with individual features of a 
referent specified by the description, and with the 
focus of attention in the real world where one attempts 
to find a match. Complex relations specified in a 
speaker's description include spatial relations (e.g., 
"the outlet near the top of the tube"), comparatives 
(e.g., "the larger tube") and superlatives (e.g., "the 
longest tube"). These can be relaxed. The simpler 
features of an object (such as size or color) that are 
specified in the speaker's description are also open to 
relaxation. 
Often the objects in focus in the real world 
implicitly cause other objects to be in focus \[14, 20\]. 
The subparts of an object in focus, for example, are 
reasonable candidates for the referent of a failing 
description and should be checked. At other times, the 
speaker might attribute features of a subpart of an 
object to the whole object (e.g., describing a plunger 
that is composed of a red handle, a metal rod, a blue 
cap, and a green cup as "the green plunger"). In 
these cases, the relaxation mechanism utilizes the 
part-whole relation in object descriptions to suggest a 
way to relax the speaker's description. 
Relaxation of a description has a few global 
strategies that can be followed for each part of the 
description: (1) drop the errorful feature value from 
the description altogether, (2) weaken or tighten the 
feature value but keep its new value close to the 
specified one, or (3) try some other feature value. 
These strategies are realized through a set of 
procedures (or relaxation methods) that are organized 
hierarchically. Each procedure is an expert at relaxing 
its particular type of feature. For example, a 
Generate-Similar-Feature-Values procedure is 
composed of procedures like Generate-Similar-Shape-Values, 
Generate-Similar-Color-Values and 
Generate-Similar-Size-Values. Each of those procedures is a 
specialist that attempts to first relax the feature value 
to one "near" the current one (e.g., one would prefer 
to first relax the color "red" to "pink" before relaxing 
it to "blue") and then, if that fails, to try relaxing it 
to any of the other possible values. If those fail, the 
feature would simply be ignored. 
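The "near values first, then any other value, then ignore" behavior of one such specialist can be sketched as follows. The color tables, the function names, and the dictionary of generators are illustrative assumptions and not the procedures of the system itself.

```python
from typing import Callable, Dict, List

Description = Dict[str, str]  # feature name -> feature value (assumed encoding)

# Hypothetical perceptual knowledge: which colors count as "near" a given color.
NEAR_COLORS: Dict[str, List[str]] = {"red": ["pink"], "turquoise": ["blue"]}
ALL_COLORS: List[str] = ["red", "pink", "blue", "turquoise", "violet"]


def generate_similar_color_values(value: str) -> List[str]:
    """Alternatives for a color: near values first, then every other color."""
    near = NEAR_COLORS.get(value, [])
    others = [c for c in ALL_COLORS if c != value and c not in near]
    return near + others


def relax_feature(description: Description, feature: str, candidate: Description,
                  generators: Dict[str, Callable[[str], List[str]]]) -> Description:
    """Replace the feature value with an alternative the candidate has, else drop it."""
    relaxed = dict(description)
    generator = generators.get(feature)
    if generator is not None:
        for alternative in generator(description[feature]):
            if candidate.get(feature) == alternative:
                relaxed[feature] = alternative
                return relaxed
    del relaxed[feature]  # last resort: ignore the feature altogether
    return relaxed


generators = {"color": generate_similar_color_values}
print(generate_similar_color_values("red"))
print(relax_feature({"color": "turquoise", "shape": "round"}, "color",
                    {"color": "blue", "shape": "round"}, generators))
```

A feature with no specialist registered, such as size in this sketch, falls straight through to the "ignore" strategy.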
3.3 An example on handling a misreference 
This section describes how a referent 
identification system can handle a misreference using 
the scheme outlined in the previous section. For the 
purposes of this example, assume that the water pump 
objects currently in focus include the CAP, the 
MAINTUBE, the AIR CHAMBER and the STAND (see Figure 
1(a) for a picture of these parts). Assume also that 
the speaker tries to describe two of the objects: 
"... two devices that are clear plastic. One of them has 
two openings on the outside with threads on the end, 
and it's about five inches long. The other one is a 
rounded piece with a turquoise base on it. Both are 
tubular. The rounded piece fits loosely over...". The 
reference system can find a unique referent for the 
first object but not for the second. The relaxation 
algorithm will be shown below to reduce the set of 
referent candidates for the second description down to 
two. It then requires the system/listener to try out 
those candidates to determine if one, or both, fits 
loosely. The protocols exhibit a similar result when the 
listener uses "fits loosely" to get the correct referent 
(e.g., Excerpt 6 exemplifies where the "fit" can confirm 
that the proper referent was found). 
Figure 7 provides a simplified and linearized view 
of the actual KL-One representation of the speaker's 
descriptions after they have been parsed and 
semantically interpreted. A representation of each of 
the water pump objects that are currently under 
consideration is presented in Figure 8. Each provides a 
physical description of the object - in terms of its 
dimensions, the basic 3-D shapes composing it, and its 
physical features - and a basic functional description 
of the object. The first entry in each representation 
in Figure 8 (that entry is shown in uppercase) defines 
the basic kind of entity being described (e.g., "TUBE" 
means that the object being described is some kind of 
tube). The words in mixed case refer to the names of 
features and the words in uppercase refer to possible 
fillers of those features from things in the water pump 
world. The "Subpart" feature provides a place for an 
embedded description of an object that is a subpart of 
a parent object. Such subparts can be referred to on 
their own or as part of the parent object. The 
"Orientation" feature, used in the representations in 
Figure 8, provides a rotation and translation of the 
object from some standard orientation to the object's 
current orientation in 3-D space. The standard 
orientation provides a way to define relative positions 
such as "top," "bottom," or "side." 
Descr1: 
(DEVICE (Transparency CLEAR) 
        (Composition PLASTIC) 
        (Subpart (OPENING)) 
        (Subpart (OPENING)) 
        (Subpart (THREADS (Rel-Position END))) 
        (Dimensions (Length 5.0)) 
        (Analogical-Shape TUBULAR)) 
Descr2: 
(FIT-INTO (Outer (DEVICE (Transparency CLEAR) 
                         (Composition PLASTIC) 
                         (Shape ROUND) 
                         (Analogical-Shape TUBULAR) 
                         (Subpart (BASE (Color TURQUOISE))))) 
          (Inner ) 
          (FitCondition LOOSE)) 
Figure 7: The speaker's descriptions 
The first step in the reference process is the 
actual search for a referent in the knowledge base. 
The reference identification process is incremental in 
nature, i.e., the listener can begin the search process 
before he hears the complete description. This was 
observed throughout the videotape excerpts and the 
algorithm presented here is actually designed to be 
incremental. The KL-One Classifier compares the 
features specified in the speaker's descriptions (Descr1 
and the "Outer" feature of Descr2 in Figure 7) with the 
features specified for each element in the KL-One 
taxonomy that corresponds to one of the current 
objects of interest in the real world. Notice that some 
features are directly comparable. For example, the 
"Transparency" feature of Descr1 and the 
"Transparency" feature of MAINTUBE are both equal to 
"CLEAR." Other features require further processing 
before they can be compared. The OPENING value of 
"Subpart" in Descr1 is thought of primarily as a 2-D 
cross-section (such as a "hole"), while two CYLINDER 
subparts of MAINTUBE are viewed as (3-D) cylinders 
that have the "Function" of being outlets, i.e., 
OUTLET-ATTACHMENT-POINTS. To compare OPENING and 
CYLINDER, the inference must be made that both things 
can describe the same thing (similar inferences are 
developed in \[18\]). One way this inference can occur 
is by recursively examining the subparts of MAINTUBE 
with the partial matcher until the cylinders are 
examined at the 2-D level. At that level, an end of the 
cylinder will be defined as an OPENING. With that 
examination, the MAINTUBE can be seen as described 
by Descr1. 
Descr2 presents different problems. Descr2 refers 
to an object that is supposed to have a subpart that is 
TURQUOISE. The Classifier determines that Descr2 could 
not describe either the CAP or STAND because both are 
BLUE. It also could not describe the MAINTUBE (see 
footnote 6) or AIR CHAMBER since each has subparts that 
are either VIOLET or BLUE. The Classifier places Descr2 
as best it can in the taxonomy, showing no connections 
between it and any of the objects currently in focus. At 
this point, a probable misreference is noted. The reference 
mechanism now tries to find potential referent 
candidates, using the taxonomy exploration routine 
described in Section 3.2.3, by examining the elements 
closest to Descr2 in the taxonomy and using the partial 
matcher to score how close each element is to Descr2 
(see footnote 7). The matcher determines MAINTUBE, 
STAND, and AIR CHAMBER as reasonable candidates by 
aligning and comparing their features to Descr2. 
6 Since Descr1 refers to MAINTUBE, MAINTUBE could be 
dropped as a potential referent candidate for Descr2. We 
will, however, leave it as a potential candidate to make 
this example more complex. 
7 The partial matcher scores are numerical scores computed 
from a set of role scores that indicate how well each 
feature of the two descriptions match. Those feature 
scores are represented on a scale: HIGHEST {+}, {>, <}, 
{~}, {?}, {-} LOWEST. 
Figure 8: The objects in focus. The figure lists the 
KL-One-style descriptions of CAP, MAINTUBE (a TUBE), 
AIR CHAMBER (a CONTAINER), and STAND (a TUBE). Each 
description gives feature values such as Color, 
Composition, Transparency, and Dimensions, followed by 
Subpart entries (e.g., Threads of MAINTUBE; ChamberTop, 
ChamberBody, ChamberBottom, and ChamberOutlet of AIR 
CHAMBER; Top and Base of STAND), each with its own 
Dimensions, Orientation (Rotation and Translation), and 
Function values. 
Scoring Descr2 to MAINTUBE: 
o a TUBE is a kind of DEVICE; (>) 
o the Transparency of each is CLEAR; (+) 
o the Composition of each is PLASTIC; (+) 
o a TUBE implies Analogical-Shape TUBULAR, 
which implies Shape CYLINDRICAL, which is a 
kind of Shape ROUND; (>) 
o the recursive partial matching of subparts: A 
BASE is viewed as a kind of BOTTOM. 
Therefore, BASE in Descr2 could match to the 
subpart in MAINTUBE that has a Translation 
of (0.0 0.0 0.0) - i.e., Threads of MAINTUBE. 
However, they mismatch since color 
TURQUOISE in Descr2 differs from color VIOLET 
of MAINTUBE. (-) 
Scoring Descr2 to STAND: 
o a TUBE is a kind of DEVICE; (>) 
o the Transparency of each is CLEAR; (+) 
o the Composition of each is PLASTIC; (+) 
o a TUBE implies Analogical-Shape TUBULAR, 
which implies Shape CYLINDRICAL, which is a 
kind of Shape ROUND; (>) 
o the recursive partial matching of subparts: 
BASE in Descr2 could match to the subpart in 
STAND that has a Translation of (0.0 0.0 0.0) 
- i.e., Base of STAND. However, they 
mismatch since color TURQUOISE in Descr2 
differs from color BLUE of STAND. (-) 
Scoring Descr2 to AIR CHAMBER: 
o a CONTAINER is a kind of DEVICE; (>) 
o the Transparency of Descr2, CLEAR, matches 
the Transparency of ChamberTop, 
ChamberOutlet and ChamberBody of AIR 
CHAMBER but mismatches the Transparency 
of ChamberBottom of AIR CHAMBER. 
Therefore, the partial match is uncertain; (?) 
o the Composition of each is PLASTIC; (+) 
o the subparts of AIR CHAMBER have Shape 
HEMISPHERICAL and CYLINDRICAL which are 
each a kind of Shape ROUND; (>) 
o the recursive partial matching of subparts: 
BASE in Descr2 could match to the subpart in 
AIR CHAMBER that has a Translation of (0.0 
0.0 0.0) - i.e., ChamberBottom of AIR 
CHAMBER. However, they mismatch since 
color TURQUOISE in Descr2 differs from color 
BLUE of AIR CHAMBER. (-) 
The above analysis using the partial matcher 
provides no clear winner since the differences are so 
close, causing the scores generated for the candidates 
to be almost exactly the same (i.e., the only difference 
was in the score for Transparency). All candidates, 
hence, will be retained for now. 
At this point, the knowledge sources and their 
associated rules that were mentioned earlier apply. 
These rules attempt to order the feature values in the 
speaker's description for relaxation. First, we'll order 
the features in Descr2 using linguistic knowledge. 
Linguistic analysis of Descr2, "... are clear plastic ... a 
rounded piece with a turquoise base ... Both are 
tubular ... fits loosely over ...," tells us that the 
features were specified using the following modifiers: 
o Adjective: (Shape ROUND) 
o Prepositional Phrase: (Subpart (BASE (Color 
TURQUOISE))) 
o Predicate Complement: (Transparency CLEAR), 
(Composition PLASTIC), (Analogical-Shape 
TUBULAR), (Fit LOOSE) 
Observations from the protocols (as described by the 
rules developed in [13]) have shown that people tend to 
relax first features specified as adjectives, then as 
prepositional phrases and finally as relative clauses or 
predicate complements. This suggests relaxation of 
Descr2 in the order: 
{Shape} < {Color, Subpart} 
< {Transparency, Composition, Analogical-Shape, Fit} 
The set of features on the left side of a "<" symbol is 
relaxed before the set on the right side. The order 
in which the features inside the braces, "{}", are relaxed 
is left unspecified (i.e., any order of relaxation is 
all right). Perceptual information about the domain also 
provides suggestions. Whenever a feature has feature 
values that are close, then one should be prepared to 
relax any of them to any of the others (we call this 
the "clustered feature value rule"). In this example, 
since the colors are all very close - BLUE, TURQUOISE, 
and VIOLET - Color may be a reasonable thing to 
relax. Hierarchical information about how closely 
related one feature value is to another can also be 
used to determine what to relax. The Shape values are 
a good example. A CYLINDRICAL shape is also a CONICAL 
shape, which is also a 3-D ROUND shape. Hence, it is 
very reasonable to match ROUNDED to CYLINDRICAL. All 
of these suggestions can be put together to form the 
order: 
{Shape, Color} < {Subpart} 
< {Transparency, Composition, 
Analogical-Shape, Fit}. 
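
The two orderings above can be reproduced with a short sketch. It is not the rule system of [13]; it only illustrates the idea under stated assumptions. The modifier ranks follow the protocol observation (adjectives first, then prepositional phrases, then relative clauses or predicate complements). Modeling the clustered feature value rule as "move the feature into the earliest group" is a simplification chosen for this example, and the names `MODIFIER_RANK`, `linguistic_order` and `promote` are hypothetical.

```python
# Lower rank = relaxed earlier, following the protocol observation in the text.
MODIFIER_RANK = {
    "adjective": 0,
    "prepositional_phrase": 1,
    "relative_clause": 2,
    "predicate_complement": 2,
}

# Features of Descr2 paired with the modifier that introduced them.
DESCR2 = [
    ("Shape", "adjective"),
    ("Color", "prepositional_phrase"),
    ("Subpart", "prepositional_phrase"),
    ("Transparency", "predicate_complement"),
    ("Composition", "predicate_complement"),
    ("Analogical-Shape", "predicate_complement"),
    ("Fit", "predicate_complement"),
]


def linguistic_order(features):
    """Group features by modifier rank; earlier groups are relaxed first."""
    groups = {}
    for name, modifier in features:
        groups.setdefault(MODIFIER_RANK[modifier], set()).add(name)
    return [groups[rank] for rank in sorted(groups)]


def promote(order, feature):
    """Move a feature into the first group (an assumed reading of the
    clustered feature value rule), dropping any group left empty."""
    rest = [group - {feature} for group in order]
    rest[0] = rest[0] | {feature}
    return [group for group in rest if group]


if __name__ == "__main__":
    first = linguistic_order(DESCR2)
    print(first)                    # linguistic ordering only
    print(promote(first, "Color"))  # after the perceptual suggestion
```

Each set plays the role of one braced group: members of a set may be relaxed in any order, and sets are processed left to right.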
The referent candidates MAINTUBE, STAND, and 
AIR CHAMBER can be examined and possibly ordered for 
relaxation using the above feature ordering. For this 
example, the relaxation of Descr2 to any of the 
candidates requires relaxing their SHAPE and COLOR 
features. Since they each require relaxing the same 
features, the candidates cannot be ordered with 
respect to each other (i.e., none of the possible feature 
orders is better for relaxing the candidates). Hence, 
no one candidate stands out as the most likely 
referent. 
While no ordering of the candidates was possible, 
the order generated to relax the features in the 
speaker's description can be used to guide the 
relaxation of each candidate. The relaxation methods 
mentioned at the end of the last section come into use 
here. Generate-Similar-Shape-Values can determine 
that the HEMISPHERICAL and CYLINDRICAL shapes of the AIR 
CHAMBER are close to the 3D-ROUND shape. This holds 
equally true for the cylindrical shapes of the 
MAINTUBE and the STAND. Generate-Similar-Color-Values 
next tries relaxing the Color TURQUOISE. It 
determines the colors BLUE and GREEN as the best 
alternates. Here only two clear winners exist - the 
AIR CHAMBER and the STAND - while the MAINTUBE is 
dropped as a candidate since it is reasonable to relax 
TURQUOISE to BLUE or to GREEN but not to VIOLET. 
Subpart, Transparency, Analogical-Shape, and 
Composition provide no further help (though the fact 
that the AIR CHAMBER has both CLEAR and OPAQUE 
subparts might put it slightly lower than the STAND, 
whose subparts are all CLEAR; this difference, 
however, is not significant). This leaves trial and 
error attempts to try to complete the FIT action. The 
one (if any) that fits - and fits loosely - is selected 
as the referent. The protocols showed that people 
often do just that - reducing their set of choices down 
as best they can and then taking each of the remaining 
choices and trying out the requested action on them. 
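
The color relaxation step described above can be illustrated with a minimal sketch. It is not the actual Generate-Similar-Color-Values method: the table of similar colors holds only the one relaxation the text reports (TURQUOISE to BLUE or GREEN), the candidate colors are the ones the example attributes to each object (VIOLET for MAINTUBE is inferred from the reason given for dropping it), and the names `SIMILAR_COLORS`, `CANDIDATE_COLORS` and `relax_color` are hypothetical.

```python
# Relaxations reported in the text: TURQUOISE may be relaxed to BLUE or GREEN.
SIMILAR_COLORS = {"TURQUOISE": {"BLUE", "GREEN"}}

# Color of the matching subpart of each candidate, as read from the example.
# VIOLET for MAINTUBE is an inference from the text, not a stated value.
CANDIDATE_COLORS = {
    "MAINTUBE": "VIOLET",
    "STAND": "BLUE",
    "AIR CHAMBER": "BLUE",
}


def relax_color(described, candidates, similar=SIMILAR_COLORS):
    """Keep the candidates whose color equals the described color or one of
    its acceptable relaxations; the rest are dropped."""
    acceptable = {described} | similar.get(described, set())
    return [name for name, color in candidates.items() if color in acceptable]


if __name__ == "__main__":
    print(relax_color("TURQUOISE", CANDIDATE_COLORS))
```

The candidates that survive this filter are the ones on which the FIT action would then be tried in turn.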
4 Conclusion 
Our goal in this work is to build robust natural 
language understanding systems, allowing them to 
detect and avoid miscommunication. The goal is not to 
make a perfect listener but a more tolerant one that 
could avoid many mistakes, though it may still be wrong 
on occasion. In Section 2, we introduced a taxonomy of 
miscommunication problems that occur in 
expert-apprentice dialogues. We showed that reference 
mistakes are one kind of obstacle to robust 
communication. To tackle reference problems, we 
described how to extend the succeed/fail paradigm 
followed by previous natural language researchers. 
We represented real world objects hierarchically 
in a knowledge base using a representation language, 
KL-One, that follows in the tradition of semantic 
networks and frames. In such a representation 
framework, the reference identification task looks for a 
referent by comparing the representation of the 
speaker's input to elements in the knowledge base by 
using a matching procedure. Failure to find a referent 
in previous reference identification systems resulted in 
the unsuccessful termination of the reference task. We 
claim that people behave better than this and explicitly 
illustrated such cases in an expert-apprentice domain 
about toy water pumps. 
We developed a theory of relaxation for 
recovering from reference failures that provides a 
much better model for human performance. When 
people are asked to identify objects, they go about it 
in a certain way: find candidates, adjust as necessary, 
re-try, and, if necessary, give up and ask for help. We 
claim that relaxation is an integral part of this process 
and that the particular parameters of relaxation differ 
from task to task and person to person. Our work 
models the relaxation process and provides a 
computational model for experimenting with the 
different parameters. The theory incorporates the 
same language and physical knowledge that people use 
in performing reference identification to guide the 
relaxation process. This knowledge is represented as a 
set of rules and as data in a hierarchical knowledge 
base. Rule-based relaxation provided a methodical way 
to use knowledge about language and the world to find 
a referent. The hierarchical representation made it 
possible to tackle issues of imprecision and 
over-specification in a speaker's description. It allows one 
to check the position of a description in the hierarchy 
and to use that position to judge imprecision and 
over-specification and to suggest possible repairs to 
the description. 
Interestingly, one would expect that "closest" 
match would suffice to solve the problem of finding a 
referent. We showed, however, that it doesn't usually 
provide you with the correct referent. Closest match 
isn't sufficient because there are many features 
associated with an object and, thus, determining which 
of those features to keep and which to drop is a 
difficult problem due to the combinatorics and the 
effects of context. The relaxation method described 
circumvents the problem by using the knowledge that 
people have about language and the physical world to 
prune down the search space. 
ACKNOWLEDGEMENTS 
I want to thank especially Candy Sidner for her 
insightful comments and suggestions during the course 
of this work. I'd also like to acknowledge the helpful 
comments of George Hadden, Diane Litman, Marc Vilain, 
Dave Waltz, Bonnie Webber and Bill Woods on this paper. 
Many thanks also to Phil Cohen, Scott Fertig and Kathy 
Starr for providing me with their water pump dialogues 
and for their invaluable observations on them. 
REFERENCES 
[1] Allen, James F. A Plan-Based Approach to Speech 
Act Recognition. Ph.D. Th., University of Toronto, 1979. 
[2] Appelt, Douglas E. Planning Natural Language 
Utterances to Satisfy Multiple Goals. Ph.D. Th., 
Stanford University, 1981. 
[3] Brachman, Ronald J. A Structural Paradigm for 
Representing Knowledge. Ph.D. Th., Harvard University, 
1977. Also, Technical Report No. 3605, Bolt Beranek 
and Newman Inc. 
[4] Brown, John Seely and Kurt VanLehn. "Repair 
Theory: A Generative Theory of Bugs in Procedural 
Skills." Cognitive Science 4, 4 (1980), 379-426. 
[5] Cohen, Philip R. On Knowing What to Say: 
Planning Speech Acts. Ph.D. Th., University of Toronto, 
1978. 
[6] Cohen, P., C. Perrault and J. Allen. Beyond 
Question Answering. In Knowledge Representation and 
Natural Language Processing, W. Lehnart and M. Ringle, 
Ed., Lawrence Erlbaum Associates, 1981. 
[7] Cohen, Philip R. The Need for Referent 
Identification as a Planned Action. Proceedings of 
IJCAI-81, Vancouver, B.C., Canada, August, 1981, pp. 
31-35. 
[8] Cohen, Philip R., Scott Fertig and Kathy Starr. 
Dependencies of Discourse Structure on the Modality of 
Communication: Telephone vs. Teletype. Proceedings of 
ACL, Toronto, Ont., Canada, June, 1982, pp. 28-35. 
[9] Cohen, Philip R. "The Pragmatics of Referring and 
the Modality of Communication." Computational 
Linguistics 10, 2 (April-June 1984), 97-146. 
[10] Gentner, Dedre. The Structure of Analogical 
Models in Science. Bolt Beranek and Newman Inc., July, 
1980. 
[11] Goodman, Bradley A. Miscommunication in 
Task-Oriented Dialogues. KRNL Group Working Paper, Bolt 
Beranek and Newman Inc., April 1982. 
[12] Goodman, Bradley A. Repairing Miscommunication: 
Relaxation in Reference. Proceedings of AAAI-83, 
Washington, D.C., August, 1983, pp. 134-138. 
[13] Goodman, Bradley A. Communication and 
Miscommunication. Ph.D. Th., University of Illinois, 
Urbana, 1984. 
[14] Grosz, Barbara J. The Representation and Use of 
Focus in Dialogue Understanding. Ph.D. Th., University 
of California, Berkeley, 1977. Also, Technical Note 151, 
Stanford Research Institute. 
[15] Grosz, Barbara J. Focusing and descriptions in 
natural language dialogues. In Elements of Discourse 
Understanding, Joshi, Webber and Sag, Ed., Cambridge 
University Press, 1981, pp. 84-105. 
[16] Lipkis, Thomas. A KL-ONE Classifier. Proceedings 
of the 1981 KL-One Workshop, June, 1982, pp. 128-145. 
Report No. 4842, Bolt Beranek and Newman Inc. Also 
Consul Note # 5, USC/Information Sciences Institute, 
October 1981. 
[17] Litman, Diane J. and James F. Allen. A Plan 
Recognition Model for Clarification Subdialogues. 
Proceedings of Coling84, Stanford University, Stanford, 
CA., July, 1984, pp. 302-311. 
[18] Mark, William. Realization. Proceedings of the 
1981 KL-One Workshop, June, 1982, pp. 78-89. Report 
No. 4842, Bolt Beranek and Newman Inc. 
[19] McKeown, Kathleen R. Recursion in Text and Its 
Use in Language Generation. Proceedings of AAAI-83, 
Washington, D.C., August, 1983, pp. 270-273. 
[20] Reichman, Rachel. "Conversational Coherency." 
Cognitive Science 2, 4 (1978), 283-327. 
[21] Reichman, Rachel. Plain Speaking: A Theory and 
Grammar of Spontaneous Discourse. Ph.D. Th., Harvard 
University, 1981. Also, Technical Report No. 4861, Bolt 
Beranek and Newman Inc. 
[22] Ringle, Martin and Bertram Bruce. Conversation 
Failure. In Knowledge Representation and Natural 
Language Processing, W. Lehnart and M. Ringle, 
Ed., Lawrence Erlbaum Associates, 1981. 
[23] Sidner, C. L., and Israel, D. J. Recognizing 
intended meaning and speaker's plans. Proceedings of 
the International Joint Conference in Artificial 
Intelligence, The International Joint Conferences on 
Artificial Intelligence, Vancouver, B.C., August, 1981, pp. 
203-208. 
[24] Sidner, Candace Lee. Towards a Computational 
Theory of Definite Anaphora Comprehension in English 
Discourse. Ph.D. Th., Massachusetts Institute of 
Technology, 1979. Also, Report No. TR-537, MIT AI Lab. 
[25] Sidner, C. L., M. Bates, R. J. Bobrow, 
R. J. Brachman, P. R. Cohen, D. J. Israel, J. Schmolze, 
B. L. Webber, W. A. Woods. Research in Knowledge 
Representation for Natural Language Understanding. 
Report No. 4785, Bolt Beranek and Newman Inc., 
November, 1981. 
[26] Sidner, C. L., Bates, M., Bobrow, R., Goodman, B., 
Haas, A., Ingria, R., Israel, D., McAllester, D., Moser, M., 
Schmolze, J., Vilain, M. Research in Knowledge 
Representation for Natural Language Understanding - 
Annual Report, 1 September 1982 - 31 August 1983. 
Technical Report 5421, BBN Laboratories, Cambridge, 
MA, 1983. 
[27] Sidner, C., Goodman, B., Haas, A., Moser, M., 
Stallard, D., Vilain, M. Research in Knowledge 
Representation for Natural Language Understanding - 
Annual Report, 1 September 1983 - 31 August 1984. 
Technical Report 5894, BBN Laboratories Inc., 
Cambridge, MA, 1984. 
[28] Webber, Bonnie Lynn. A Formal Approach to 
Discourse Anaphora. Ph.D. Th., Harvard University, 
1978. Also, Technical Report No. 3761, Bolt Beranek 
and Newman Inc. 
