<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2182">
  <Title>COLLABORATION ON REFERENCE TO OBJECTS THAT ARE NOT MUTUALLY KNOWN</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 B: Okay.
</SectionTitle>
    <Paragraph position="0"> In this dialogue, speaker B is not confident that he will be able to identify the intersection at Lowell Street, and so suggests that the intersection might be marked.</Paragraph>
    <Paragraph position="1"> Speaker A replies with an elaboration of the initial expression, and B finds that he is now confident, and so accepts the reference.</Paragraph>
    <Paragraph position="2"> This type of reference is different from the type that has been studied traditionally by researchers who have usually assumed that the agents have mutual knowledge of the referent (Appelt, 1985a; Appelt and Kronfeld, 1987; Clark and Wilkes-Gibbs, 1986; Heeman and Hirst, 1992; Searle, 1969), are copresent with the referent (Heeman and Hirst, 1992; Cohen, 1981), or have the referent in their focus of attention (Reiter and Dale, 1992). In these theories, the speaker has the intention that the hearer either know the referent or identify it immediately.</Paragraph>
    <Paragraph position="3"> Although the type of reference that we wish to model does not rely on these assumptions, we can nevertheless draw from these theories. Thus, we base our model on the work of Clark and Wilkes-Gibbs (1986), and Heeman and Hirst (1992) who both modeled (the first psychologically, and the second computationally) how people collaborate on reference to objects for which they have mutual knowledge. We will briefly discuss these models, before we describe our own.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="1118" type="metho">
    <SectionTitle>
2 COLLABORATION ON REFERENCE
</SectionTitle>
    <Paragraph position="0"> In their fundamental experiment, Clark and Wilkes-Gibbs (1986) demonstrated that conversants use a set of inherently collaborative procedures to establish the mutual belief that the hearer has understood a reference. In the experiment, two subjects were each given a set of hard-to-describe tangram figures that were kept hidden from the other. One subject was required to get the other subject to rearrange his set to match the ordering of her set, and to do so through conversation alone. Thus, the two subjects were obliged to collaborate on constructing descriptions of the figures that would allow them to be unambiguously identified; for example, the one that looks like an angel with a stick.</Paragraph>
    <Paragraph position="1"> Clark and Wilkes-Gibbs developed the following process model to explain their findings. To initiate the process, speaker A presents an initial version of a referring expression on which speaker B passes judgment. B can either accept it, reject it, or postpone his decision until later. If B rejects or postpones, then the expression must be refashioned by either A or B.</Paragraph>
    <Paragraph position="2"> Refashionings are accomplished in three main ways: repairing the expression by correcting speech errors, expanding the expression by adding more qualifications, or replacing part or all of the expression with new qualifications. Each judgment/refashioning pair operates on the current referring expression, replacing it with a new one. This process continues until the expression, kept in the participants' common ground, is mutually accepted.</Paragraph>
    <Paragraph position="3"> This excerpt from Clark and Wilkes-Gibbs's data illustrates rejection (line 2), replacement (line 2), and acceptance (lines 3 and 4): Example 3  1 A: Okay, and the next one is the person that looks like they're carrying something and it's sticking out to the left. It looks like a hat that's upside down.</Paragraph>
    <Paragraph position="4"> 2 B: The guy that's pointing to the left again? 3 A: Yeah, pointing to the left, that's it! /laughs/</Paragraph>
  </Section>
  <Section position="6" start_page="1118" end_page="1118" type="metho">
    <SectionTitle>
4 B: Okay.
</SectionTitle>
    <Paragraph position="0"> Heeman and Hirst (1992) rendered Clark and Wilkes-Gibbs's model computationally by casting it into the planning paradigm. Their model covers both the initiator of a referring action, and the recipient who tries to understand the reference. In this model, the initiator has the goal of having the recipient identify the referent, and so constructs a referring plan given a set of beliefs about what the recipient believes. The result of the initiator's plan is a set of surface speech actions, and hearing only these actions, the recipient tries to infer a plan in order to understand the reference. Thus, referring expressions are represented as plan derivations, and an unsuccessful referring expression is an invalid plan in whose repair the agents collaborate.</Paragraph>
    <Paragraph position="1"> An agent can infer a plan even if it is invalid in that agent's view (Pollack, 1990). The evaluation process attempts to find an instantiation of the variables such that all of the constraints are satisfied and the mental actions executable with respect to the hearer's beliefs about the speaker's beliefs.</Paragraph>
    <Paragraph position="2"> If the recipient finds the initial referring expression plan invalid, then the agents will collaborate in its repair. Heeman and Hirst used plan repair techniques to refashion an expression, and used discourse plans, or meta-plans, to communicate the changes to it. Thus, a collaborative dialogue is modeled in terms of the evolution of the referring plan.</Paragraph>
    <Paragraph position="3"> First, an agent must communicate that she has not understood a plan. Depending on how the referring plan constrains the choice of referent, she constructs an instance of either reject-plan or postpone-plan, whose resulting surface speech actions are s-reject and s-postpone respectively.</Paragraph>
    <Paragraph position="4"> Next, one agent or the other must refashion the referring expression plan in the context of the judgment by either replacing some of its actions (by using replace-plan) or by adding new actions to it (by using expand-plan). The result of both plans is the surface speech action s-actions.</Paragraph>
    <Paragraph position="5"> Because the model can play the role of both the initiator and the recipient, and because it can perform both plan construction and inference, two copies of the model can converse with one another, acting alternately as speaker and hearer. Acting as hearer, one copy of the system performs plan inference on each set of surface speech actions that it observes, and updates the state of the collaboration. It then switches roles to become the speaker, and looks for a goal to adopt, and constructs a plan that achieves it. After responding with the surface actions of the plan, it updates the state of the collaboration, presupposing that the other copy will accept the plan. The system repeats the process until it can find no more goals to adopt, at which time it switches back to being the hearer and waits for a response from the other copy.</Paragraph>
  </Section>
  <Section position="7" start_page="1118" end_page="1119" type="metho">
    <SectionTitle>
3 CONFIDENCE AND SALIENCE
</SectionTitle>
    <Paragraph position="0"> A crucial assumption of Clark and Wilkes-Gibbs's work, and of Heeman and Hirst's model, is that the recipient of the initial referring expression already has some knowledge of the referent in question. In Clark and Wilkes-Gibbs's experiments, for example, it is one of the tangram figures. In other words, the hearer can understand a referring expression if its content uniquely describes an object that he knows about.</Paragraph>
    <Paragraph position="1"> Obviously, an agent cannot use this criterion to understand the reference to the building in Example 1, because he has never heard of the building before. What criteria, then, does he base his understanding on? The basis of our model is that the hearer can accept a referring expression plan if (1) the plan contains a description that is useful for making an identification plan that the hearer can execute to identify the referent, and (2) the hearer is confident that the identification plan is adequate.</Paragraph>
    <Paragraph position="2"> The first condition, originally described by Appelt (1985b), is important because the success of the referring action depends on the hearer formulating a useful identification plan. We take the referring expression plan itself to be the identification plan. The mental actions in the plan will encode only useful descriptions. For the second condition to hold, the hearer must believe that the identification plan is good enough to uniquely identify the referent when it becomes visible. This involves giving enough information by using the most salient attributes of the referent.</Paragraph>
    <Paragraph position="3"> In our model, each agent associates a numeric confidence value with each of the attributes in the referring expression, and by composing these, computes a level of confidence in the adequacy of the complete referring expression plan that can be interpreted as ranging from low confidence to high confidence. The present composition function is simple addition, but one could envision more complex systems to compute confidence, such as an algebra of confidence or a non-numeric system. If the overall confidence value exceeds some set value, the agent's confidence threshold, then the agent believes the plan is adequate. That is, if the agent is the initiator, she believes that the other will be able to understand the reference; if the agent is the other, he believes that he has understood the reference.</Paragraph>
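    <Paragraph> The additive composition and the threshold test can be illustrated with a minimal Python sketch. It is not the Prolog implementation reported in this paper; the function name plan_is_adequate, the attribute key, and the threshold value are assumptions introduced only for illustration.

```python
# Illustrative sketch only: the name and the example values are assumptions.
def plan_is_adequate(attribute_confidences, threshold):
    """Compose per-attribute confidence by simple addition and
    report whether the total exceeds the agent's threshold."""
    total = sum(attribute_confidences.values())
    # "Exceeds" is read here as strictly greater than the threshold.
    return total > threshold, total


if __name__ == "__main__":
    # One attribute with confidence 4 against an assumed threshold of 3.
    print(plan_is_adequate({"architectural_style": 4}, threshold=3))
```

Swapping the call to sum for another composition function would be the place to experiment with the more complex schemes the text mentions.</Paragraph>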
    <Paragraph position="4"> Now, the confidence value of each attribute is equivalent to its salience within the context of the referring expression. Salience, for our purposes in direction-giving, is primarily visual prominence, but can also involve identifiability, familiarity, and functional importance (Devlin, 1976; Lynch, 1960). One approach is to encode the salient properties in a static hierarchy as Davis (1989), and Reiter and Dale (1992) have done (footnote 1). But, ideally, salience should depend on the context surrounding the referent. For example, the height of a tall building would normally be salient, but not if it were surrounded by other tall buildings.</Paragraph>
    <Paragraph position="5"> This computation would be quite complex, so we have adopted a middle ground between the simple context-independent approaches, and a full-blown contextual analysis. The middle ground involves taking the type of object into account when choosing attributes and landmarks that relate to it. For example, height and architectural style can be very salient features for describing a building, but not for describing an intersection, for which having a sign or traffic lights is important. This approach still allows us to encode salience in a hierarchy, but it is dependent on the referent.</Paragraph>
    <Paragraph position="6"> Table 1 shows an example of a simple salience hierarchy that an agent might have. The hierarchy is actually a set of partial orderings of attributes, represented by lambda expressions, indexed by object type. In the table, the confidence value of using architectural style to describe a building is 4. The confidence value of a tall building is 3, and so this attribute is less salient than architectural style. The other rows (for describing intersections) follow similarly (footnote 2). Each agent has his own beliefs about salience. It is the difference in their beliefs that leads to the necessity for collaboration on reference. Ideally, the initiator should construct referring expressions with the recipient's (believed) beliefs about salience in mind, but we have chosen to avoid this complexity by making the simplifying assumption that the initiator is an expert (and thus knows best what is salient).</Paragraph>
    <Paragraph position="7"> Footnote 1: These models assume that all agents have identical beliefs, which is clearly insufficient for modeling collaborative dialogue. Footnote 2: Given information about salience, we could construct such a hierarchy, but we do not presume that it would be easy to know what is salient.</Paragraph>
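    <Paragraph> A salience hierarchy indexed by object type could be represented roughly as follows. This Python sketch uses plain strings where the model uses lambda expressions. The two building values (4 and 3) come from the description of Table 1; the intersection entries and all identifiers are invented for illustration, since the table itself is not reproduced here.

```python
# Building values follow the text; the intersection values are hypothetical.
SALIENCE = {
    "building": {"architectural_style": 4, "tall": 3},
    "intersection": {"has_sign": 4, "has_traffic_lights": 3},
}


def ranked_attributes(object_type):
    """Return the attributes known for an object type, most salient first."""
    table = SALIENCE[object_type]
    return sorted(table, key=table.get, reverse=True)


if __name__ == "__main__":
    print(ranked_attributes("building"))
```

Because each agent holds his own beliefs about salience, each agent would carry a separate table of this kind.</Paragraph>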
  </Section>
  <Section position="8" start_page="1119" end_page="1120" type="metho">
    <SectionTitle>
4 PLANS FOR REFERRING
</SectionTitle>
    <Paragraph position="0"> An agent uses his salience hierarchy for two related purposes: the first to determine what is salient in a particular situation, and the second to determine the adequacy of a description. So, the hierarchy is accessed during both plan construction and plan inference.</Paragraph>
    <Paragraph position="1"> In plan construction, the hierarchy is used for constructing initial referring expression plans, and for elaborating on inadequate plans by allowing an agent to choose the most salient properties of the referent first. The agent constructs an initial referring expression plan in almost the same way as in Heeman and Hirst's system. Mental actions in the intermediate plans of a referring expression plan allow the speaker to choose the most salient attributes that have not yet been chosen, and constraints in the surface speech actions make sure the speaker believes that each attribute is true (footnote 3). Other mental actions in the intermediate plans add up the confidence values of the attributes, and a final constraint makes sure that the sum exceeds the agent's confidence threshold. So, for a referring plan to be valid, it must describe a unique object, and it must be adequate with respect to the speaker's beliefs.</Paragraph>
    <Paragraph position="2"> This means that attributes beyond those required for a unique description could be necessary. For example, to construct the reference to the building in Example 1, the speaker consulted her salience hierarchy (in Table 1) and determined that architectural style is salient. Hence, she described the building as funny-looking. This single attribute was enough to exceed her confidence threshold.</Paragraph>
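    <Paragraph> The selection of attributes during plan construction can be approximated by a greedy loop. The Python sketch below is a simplification under stated assumptions: it ignores the uniqueness constraint and the plan machinery entirely, and the function name, the attribute names, and the thresholds are hypothetical.

```python
# Simplified sketch: no plan derivation, no uniqueness check (both assumed away).
def choose_attributes(salience, believed_true, threshold):
    """Pick the most salient attributes the speaker believes true,
    stopping once the summed confidence exceeds the threshold."""
    chosen, total = [], 0
    for attr in sorted(salience, key=salience.get, reverse=True):
        if total > threshold:
            break
        if attr in believed_true:
            chosen.append(attr)
            total += salience[attr]
    # The second value plays the role of the final adequacy constraint.
    return chosen, total > threshold


if __name__ == "__main__":
    building = {"architectural_style": 4, "tall": 3}
    print(choose_attributes(building, {"architectural_style", "tall"}, 3))
```

With an assumed threshold of 3, a single attribute of confidence 4 is enough, which mirrors the funny-looking building example.</Paragraph>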
    <Paragraph position="3"> During plan inference, the salience hierarchy is used when evaluating a recognized plan. The mental actions in the intermediate plans determine the confidence values of each attribute (from the hearer's salience hierarchy), and add them up. The final constraint in the plan makes sure that the hearer's confidence threshold is exceeded. Thus, judging the adequacy of a referring expression plan falls out of the regular plan evaluation process. If the final constraint does not hold, then the invalidity is noted so that the plan can be operated on appropriately by the discourse plans.</Paragraph>
    <Paragraph position="4"> For example, after recognizing the reference in Example 1, the hearer evaluates the plan. Assuming he believes the salience information in Table 1, he computes a confidence value of 4. If this value exceeds his confidence threshold, then he will accept the plan.</Paragraph>
    <Paragraph position="5"> If not, he will believe that there is an error at the constraint that checks his confidence threshold.</Paragraph>
    <Paragraph position="6"> Footnote 3: In Heeman and Hirst's model, an attribute has to be mutually believed to be used. Here, mutual belief is not possible because the hearer has no knowledge of the referent, but mutual belief is an intended effect of using this plan.</Paragraph>
  </Section>
  <Section position="9" start_page="1120" end_page="1120" type="metho">
    <SectionTitle>
5 SUGGESTION AND ELABORATION
</SectionTitle>
    <Paragraph position="0"> If the recipient is not confident in the adequacy of the plan, he uses an instance of postpone-plan to inform the initiator that he is not confident of its adequacy, thereby causing the initiator to raise her own confidence threshold. Now, although he cannot refashion the expression himself, he does have the ability to help the initiator by suggesting a good way to expand it; suggestion is a conversational move in which an agent suggests a new attribute that he deems would increase his confidence in the expression's adequacy if the expression were expanded to include the attribute. Continuing with the example, if the hearer were not confident about the adequacy of the funny-looking building, he might suggest that the initiator use height (as well as architectural style), by asking Is it tall?. From this suggestion the initiator might expand her expression to the tall funny-looking building.</Paragraph>
    <Paragraph position="1"> So, in our sense, a suggestion is an illocutionary act of questioning; along with actually suggesting a way to expand a plan, the agent is asking whether or not the referent has the suggested attribute.</Paragraph>
    <Paragraph position="2"> To decide what suggestion to make, the agent uses an instance of suggest-expand-plan, which has a mental action in its decomposition that chooses the attribute that he believes is the most salient that has not been used already. The result of the plan is the surface speech action, s-suggest, that communicates the suggestion.</Paragraph>
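    <Paragraph> The choice made by the mental action in suggest-expand-plan can be sketched in Python as picking the most salient unused attribute from the hearer's own hierarchy. The function name and the data are illustrative assumptions; the actual plan operators are not shown.

```python
# Hypothetical helper standing in for the mental action in suggest-expand-plan.
def suggest_attribute(hearer_salience, used):
    """Return the most salient attribute not yet used, or None if none remain."""
    candidates = [attr for attr in hearer_salience if attr not in used]
    if not candidates:
        return None
    return max(candidates, key=hearer_salience.get)


if __name__ == "__main__":
    hearer = {"architectural_style": 4, "tall": 3}
    # The expression so far uses architectural style only.
    print(suggest_attribute(hearer, {"architectural_style"}))
```

In the running example this selects height, which corresponds to the hearer asking whether the building is tall.</Paragraph>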
    <Paragraph position="3"> However, only the initiator of the referring expression can actually elaborate a referring expression, because only she has the knowledge to do so. Depending on whether the hearer of the expression makes a suggestion or not, the initiator has two options when elaborating a plan. If no suggestion was made, then she can expand the plan according to her own beliefs about the referent's attributes and their salience. On the other hand, if a suggestion was made, she could instead attempt to expand the plan by affirming or denying the attribute suggested. If possible, she should use the suggestion to elaborate the plan, thus avoiding unwanted conversational implicature, but its use may not be enough to make the plan adequate.</Paragraph>
    <Paragraph position="4"> The decomposition of expand-plan calls the plan constructor with the goal of constructing a modifiers schema and with the suggested attribute as input, in a sense continuing the construction of the initial referring plan. The plan constructor attempts to find a plan with the surface speech actions for the suggested attribute in its yield, but this might not be possible. In any case, the speaker constructs an expansion that will make the plan adequate according to her beliefs. The response to a suggestion depends, obviously, on whether or not the suggestion was used to expand the plan. The speaker can (1) affirm that the plan was expanded with the suggestion by using the s-affirm speech act; (2) affirm that the suggestion was used, along with additional attributes that weren't suggested, by using s-affirm and s-actions; or (3) deny the suggestion with s-deny, and inform the other by s-actions as to how the plan was expanded.</Paragraph>
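    <Paragraph> The three possible responses to a suggestion can be summarized as a small decision function. In this Python sketch the speech act names are taken from the text, while the function name, its parameters, and the rule that a suggestion is denied exactly when the referent lacks the attribute are simplifying assumptions.

```python
# Assumed rule: the suggestion is usable only if the referent has the attribute.
def respond_to_suggestion(suggested, referent_attributes, extra_attributes):
    """Map the three response cases to lists of surface speech actions."""
    if suggested not in referent_attributes:
        # Case 3: deny, then say how the plan was actually expanded.
        return ["s-deny", "s-actions"]
    if extra_attributes:
        # Case 2: suggestion used together with unsuggested attributes.
        return ["s-affirm", "s-actions"]
    # Case 1: suggestion alone was used.
    return ["s-affirm"]


if __name__ == "__main__":
    print(respond_to_suggestion("tall", {"tall", "architectural_style"}, []))
```
</Paragraph>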
    <Paragraph position="5"> By repeatedly using the postponement, elaboration, and suggestion moves, the two agents collaborate through discourse on refashioning the referring expression until they mutually believe that the recipient is confident that it is adequate.</Paragraph>
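    <Paragraph> The repeated cycle of postponement, suggestion, and elaboration can be pictured as a loop that ends when the hearer's confidence threshold is exceeded. The Python sketch below is a heavy simplification, not the planning-based model: the labels s-accept and give-up, the function name, and the rule that the initiator simply affirms or denies each suggestion are assumptions made for illustration.

```python
# Toy loop only; "s-accept" and "give-up" are hypothetical labels.
def collaborate(initial_expression, referent_attributes, hearer_salience, hearer_threshold):
    """Alternate hearer judgment and initiator elaboration until the hearer is confident."""
    expression = list(initial_expression)
    dialogue, denied = [], set()
    while True:
        total = sum(hearer_salience.get(attr, 0) for attr in expression)
        if total > hearer_threshold:
            dialogue.append("s-accept")
            return expression, dialogue
        candidates = [a for a in hearer_salience
                      if a not in expression and a not in denied]
        if not candidates:
            dialogue.append("give-up")
            return expression, dialogue
        suggestion = max(candidates, key=hearer_salience.get)
        dialogue.extend(["s-postpone", "s-suggest " + suggestion])
        if suggestion in referent_attributes:
            expression.append(suggestion)
            dialogue.append("s-affirm")
        else:
            denied.add(suggestion)
            dialogue.append("s-deny")


if __name__ == "__main__":
    hearer = {"architectural_style": 4, "tall": 3}
    print(collaborate(["architectural_style"], {"architectural_style", "tall"}, hearer, 5))
```

Unlike the model described here, this loop never lets the initiator add unsuggested attributes after a denial, so it covers only part of the behaviour.</Paragraph>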
  </Section>
  <Section position="10" start_page="1120" end_page="1120" type="metho">
    <SectionTitle>
6 EXAMPLE
</SectionTitle>
    <Paragraph position="0"> We have implemented the model in Prolog. Table 2 shows the input/output of two copies of the system engaging in a simplified version of Example 2. Note that the system generates and understands utterances in the form of descriptions of the surface speech actions, not surface natural language forms. The existence of a parser and a generator that can map between the two forms is assumed. Complete details of this example and of the model are given by Edmonds (1993).</Paragraph>
  </Section>
</Paper>