File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/88/j88-3003_metho.xml
Size: 54,706 bytes
Last Modified: 2025-10-06 14:12:11
<?xml version="1.0" standalone="yes"?> <Paper uid="J88-3003"> <Title>MODELING THE USER'S PLANS AND GOALS</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 INFERRING AND MODELING THE TASK-RELATED PLAN </SectionTitle> <Paragraph position="0"> In order to reason about what the user wants to accomplish, the system must have knowledge about the goals that a user might pursue in a domain and plans for accomplishing these goals. We view a plan as the means by which an agent can accomplish a non-primitive task-related goal. Using an extension of the STRIPS formalism (Fikes et al. 1971), we represent a plan as a structure containing applicability conditions, preconditions, a plan body, and effects.</Paragraph> <Paragraph position="1"> Applicability conditions and preconditions both represent conditions that must exist before a plan can be executed. However, an agent can plan to satisfy preconditions, whereas it is generally anomalous to plan to satisfy applicability conditions; the latter determine whether it is reasonable to even consider a particular plan for achieving a desired goal. For example, suppose an agent wants to vacation on a particular island. If the island has an airport and the agent has money for a ticket, then the agent can plan to fly there. But the requirements that the island have an airport and that the agent have money for a ticket are intuitively different. If the agent does not have enough money for a ticket, he can plan to try and satisfy this requirement; but if the island does not have an airport, it is unreasonable for the agent to arrange for an airport to be built on the island so that he can fly there for a vacation.</Paragraph> <Paragraph position="2"> Of course, agents sometimes do unreasonable acts. If the agent in the above case is very wealthy, is adamant about vacationing on this particular island, and abhors boats, he may build an airport on the island and charter a plane to fly him there. Our plans are intended to represent normal plans that an agent might be expected to pursue, and the distinction between preconditions and applicability conditions is useful in preventing consideration of plans that would occur only in exceptional circumstances. How exceptional plans should be incorporated into a plan recognition system is an area for future work.</Paragraph> <Paragraph position="3"> Wilkins used preconditions similar to our applicability conditions in representing operators in the SIPE system (Wilkins 1984). What we call a precondition, Wilkins incorporated into the set of actions and goals comprising an operator. His reasons for having unplannable preconditions in his representation scheme were both to capture the appropriateness of applying an operator in a given situation and to connect different levels of detail in a hierarchical planner. A proposition, that at one level of abstraction was part of the specification of how an operator was to be performed, might appear at a lower level of abstraction as a precondition of an operator, indicating that further planning for the lower level operator is inappropriate unless the proposition is already satisfied. But mixing standard preconditions (conditions that must exist before an operator can be performed, but which can be planned for) with the set of goals and actions that constitute performing an operator fails to capture the intuitive difference between the two. 
For this reason, our representation scheme distinguishes among applicability conditions, preconditions that can be planned for, and how one goes about performing an action.</Paragraph> <Paragraph position="4"> The plan body contains a conjunction of subgoals, and the effects represent the results of successfully executing the plan. Arguments in plans are either constants, represented as uppercase strings, or typed variables, represented as lowercase strings preceded by the character &quot;_&quot; and followed by the characters &quot;:&&quot; and an uppercase string giving the variable's type. Figure 1 presents a sample plan used by TRACK. Its plan body states that in order to learn the material of a section of a course, an agent must both learn from the person teaching the section and study the text used in the section. TRACK's plans are hierarchical, since many of the subgoals in the bodies of plans and many preconditions are non-primitive and therefore have associated plans which may be substituted for them. Thus a plan can be expanded to any desired degree of detail by replacing non-primitive preconditions and subgoals with their associated plans.</Paragraph> <Paragraph position="5"> At the outset of an information-seeking dialog, the system has little knowledge about the information-seeker's (IS) purpose in requesting information. In most cases, a complete plan for IS cannot be constructed during the first part of a dialog. Instead, potential goals must be inferred from individual utterances and integrated into the overall plan structure, thereby incrementally expanding and instantiating the system's model of IS's plan as the dialog progresses.</Paragraph> <Paragraph position="6"> Oftentimes there will be many domain goals that a single utterance might address. For example, if a student asks what time Dr. Smith arrives in the morning, he may want to either visit Dr. Smith or call him in his office. Furthermore, even if we can identify a single domain goal addressed by an utterance, there may be many ways in which that goal could be incorporated into an overall plan. For example, if a student asks whether Political Science 210 is offered in the spring, we might infer that the student wants to take Political Science 210. But how should this goal be built into the student's overall plan? Perhaps he is considering taking Political Science 210 in order to satisfy a breadth requirement or perhaps he wants to major or minor in Political Science. So the issue that must be addressed in dynamically inferring IS's underlying task-related plan from an ongoing dialog is the following: how can we identify which of many candidate goals is the actual goal which IS is addressing with a particular utterance, and how can we determine where this particular goal fits into IS's overall plan? Two factors appear to provide a basis for a solution: 1) the organized nature of naturally occurring information-seeking dialogues, as exhibited in dialog transcripts, and 2) the assumption that IS and IP are working cooperatively to help IS achieve his plan construction goals. These two factors produce a structure in information-seeking dialogs.
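Before turning to the focusing heuristics that this structure makes possible, a minimal sketch of the extended-STRIPS plan representation described at the start of this section may be helpful. The class layout and the field contents of the Learn-Material example are illustrative reconstructions from the prose, not TRACK's actual encoding; the applicability condition and precondition shown are assumptions.

from dataclasses import dataclass

# Hypothetical rendering of the plan structure from Section 3: applicability
# conditions, preconditions, a plan body, and effects.  Propositions are kept
# as plain strings; variables follow the paper's "_name:&TYPE" convention and
# constants are uppercase.
@dataclass
class Plan:
    goal: str                       # non-primitive goal this plan achieves
    applicability_conditions: list  # must already hold; not planned for
    preconditions: list             # may themselves be planned for
    body: list                      # conjunction of subgoals
    effects: list                   # results of successful execution

# Approximation of the Figure 1 plan for learning the material of a section
# of a course (exact arguments and conditions are assumed, not quoted).
learn_material = Plan(
    goal="Learn-Material(_agent:&PERSON, _sect:&SECTIONS, _syl:&SYLLABI)",
    applicability_conditions=["Is-Section(_sect:&SECTIONS)"],
    preconditions=["Registered(_agent:&PERSON, _sect:&SECTIONS)"],
    body=[
        # Teaches(_fac:&FACULTY, _sect:&SECTIONS) acts as a constraint here.
        "Learn-From-Person(_agent:&PERSON, _sect:&SECTIONS, _fac:&FACULTY)",
        "Study-Text(_agent:&PERSON, _text:&TEXTS)",
    ],
    effects=["Knows-Material(_agent:&PERSON, _syl:&SYLLABI)"],
)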
As a result, we can formulate focusing heuristics that specify how individual utterances should be related to the existing dialog context, as represented by the plan inferred for IS and his current focus of attention in that plan.</Paragraph> <Paragraph position="7"> Thus our approach is the following: 1. hypothesize from an individual utterance a set of domain-dependent candidate focused goals that may represent the information-seeker's focus of attention in the task; and 2. use focusing heuristics to select the candidate focused goal most apropos to the existing dialogue context and incorporate it into the model of the information-seeker's plan.</Paragraph> <Paragraph position="8"> In some cases, several candidate focused goals may be equally likely, and alternative versions of the context model may need to be built. In a cooperative, coherent dialog, in which the information-seeker successfully communicates how his questions relate to what he wants to accomplish, subsequent utterances should enable the system to identify the particular context model that represents the user's plan. However, if the user asks a sequence of questions that have no definite relationship to one another, then we may have a computationally explosive situation. But this behavior violates our assumption of an overall cooperative, coherent dialog.</Paragraph> <Paragraph position="9"> We have given some preliminary consideration to a more robust process model containing a stack of disjoint contexts with potential relationships to one another.</Paragraph> <Paragraph position="10"> Further utterances could clarify these relationships and permit merger of the disjoint contexts into a single overall context model. Such a strategy would have the advantage of handling disconnected portions of dialogs without the computational explosion that can result from modeling all possible expanded contexts.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3.1 HYPOTHESIZING CANDIDATE FOCUSED GOALS AND PLANS </SectionTitle> <Paragraph position="0"> The first stage of processing analyzes an utterance without considering the preceding dialog. Plan-identification heuristics are used to hypothesize a set of domain-dependent goals and associated plans that might represent that aspect of the task on which IS's attention is currently focused. These heuristics are extensions of inference rules proposed by Allen (Allen et al. 1980).</Paragraph> <Paragraph position="1"> For example, if IS wants to know the values of an argument that cause a proposition to be true, then that proposition or its negation may be relevant to the plan that IS is considering. Therefore any goals whose plans might prompt such a request become candidate focused goals and their associated plans become candidate focused plans. Thus if IS asks, &quot;Who is teaching section 10 of French 112 in the spring of 1988?&quot; then IS wants to know the values of the argument _fac:&FACULTY such that the proposition Teaches(_fac:&FACULTY, FRENCH112-10-SPRING88) is true. The plan for learning the material of a section of a course (Figure 1) contains the proposition Teaches(_fac:&FACULTY, _sect:&SECTIONS) as a constraint on a subgoal in its body. Substituting</Paragraph> <Paragraph position="3"> FRENCH112-10-SPRING88 for _sect:&SECTIONS unifies this constraint with the current focus of IS's attention, believed to be the proposition Teaches(_fac:&FACULTY, FRENCH112-10-SPRING88) addressed by IS's utterance.
Making this substitution throughout the plan in Figure 1 produces a plan for learning the material of section 10 of French 112 in the spring of 1988. Therefore the goal Learn-Material(IS, FRENCH112-10-SPRING88, _syl:&SYLLABI) becomes a candidate focused goal, and the plan produced by substituting FRENCH112-10-SPRING88 for _sect:&SECTIONS in Figure 1 becomes a candidate focused plan. The goal Learn-From-Person(IS, FRENCH112-10-SPRING88,</Paragraph> <Paragraph position="5"> _fac:&FACULTY) is the most recently considered subgoal in this candidate focused plan; thus it provides the greatest expectations for future utterances.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 CONTEXT PROCESSING </SectionTitle> <Paragraph position="0"> The second stage relates an utterance to the context established by the preceding dialog. We use a tree structure called a context model to represent the task-related plan inferred for IS from the preceding dialog. Each node in this tree represents a goal that the system believes IS is considering achieving and, except for the root, is a descendant of a higher-level goal whose associated plan contains the subgoal represented by the child node. In Figure 2, for example, learning the material for section 10 of French 112 appears as a descendant of the higher-level goal of earning credit in section 10 of French 112, representing the belief that IS is considering how he would go about learning the material of the section as part of a plan for earning credit in the section.</Paragraph> <Paragraph position="1"> One node in the tree is marked as the current focus of attention and indicates that aspect of the task on which IS's attention is currently centered. The path from the root of the context model to the current focus of attention is called the active path and represents the global context, or sequence of progressively lower-level goals that led to the subgoal currently under consideration. Initially, there is no existing context; each candidate focused goal and its plan become the root of a context model and are marked as current focused goals and plans. If there is only one context model and its root goal appears as part of only one domain-dependent plan, then we have further knowledge about what IS wants to do and can add this higher-level plan as the new root of the context model, with the old root as its child. We continue expanding the tree upward until more than one higher-level plan is possible. For example, if IS's first utterance was &quot;Who is teaching section 10 of French 112 in the spring of 1988?&quot; then, as described in the previous section, Learn-Material(IS, FRENCH112-10-SPRING88, _syl:&SYLLABI) would become a candidate focused goal. This goal and its associated plan would be entered as the root of a context model and be marked as the current focus of attention. In addition, since only the plan for earning credit in section 10 of French 112 contains this goal, and since only the plan for earning credit in French 112 contains the goal of earning credit in a section of French 112, the context model is expanded upward to include these higher-level goals, producing the context model in Figure 2. Only that portion of the plan that the system believes IS intended it to recognize is built into the context model.
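A minimal sketch of the context-model tree just described, assuming simple Python classes; the names ContextNode, ContextModel, and the helper functions are illustrative, not the paper's implementation.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ContextNode:
    """One node of the context model: a goal the system believes IS is
    considering, together with the plan associated with that goal."""
    goal: str
    plan: object = None                      # e.g., the Plan sketch above
    children: List["ContextNode"] = field(default_factory=list)
    parent: Optional["ContextNode"] = None

    def add_subgoal(self, child: "ContextNode") -> "ContextNode":
        child.parent = self
        self.children.append(child)
        return child

@dataclass
class ContextModel:
    root: ContextNode
    current_focus: ContextNode               # marked with an asterisk in the figures

    def active_path(self) -> List[ContextNode]:
        """Goals from the root down to the current focus of attention."""
        path, node = [], self.current_focus
        while node is not None:
            path.append(node)
            node = node.parent
        return list(reversed(path))

# The upward expansion described above: while the root goal appears in only
# one higher-level domain-dependent plan, that plan's goal becomes the new root.
# unique_parent_goal is an assumed lookup returning that goal or None.
def expand_upward(model: ContextModel, unique_parent_goal) -> None:
    parent = unique_parent_goal(model.root.goal)
    while parent is not None:
        new_root = ContextNode(goal=parent)
        new_root.add_subgoal(model.root)
        model.root = new_root
        parent = unique_parent_goal(parent)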
Section 5 discusses more robust user modeling, in which default inference rules might be used to expand the system's model of the user's plan, and addresses the problem of detecting and recovering from errors that might be introduced into the model.</Paragraph> <Paragraph position="2"> As each new utterance occurs, it must be related to the established context. A set of focusing heuristics is used to determine the most likely relationship between one of the hypothesized candidate focused plans and the context model, and to expand the context model to include it. Grosz (1977) introduced the concept of focusing in her work on identifying the referents of definite noun phrases in apprentice-expert dialogs. She noted that the focus of the discourse followed the plan for performing the apprentice's task. Our information-seeking dialogs differ from apprentice-expert dialogs in that our dialogs are not constrained by the order of execution of the actions in the overall plans. However, we do find structure in the dialogs we are studying, and it is the basis for our focusing heuristics. This structure appears to be caused by two factors.</Paragraph> <Paragraph position="3"> The first is the organized nature of naturally occurring information-seeking dialogues. Dialogue transcripts indicate that humans generally ask all their questions that are relevant to a plan for one subgoal before moving on to ask questions about a plan for another subgoal of the overall task. One possible explanation for this behavior is that it may require less mental effort than switching back and forth among partially constructed plans for different subgoals.</Paragraph> <Paragraph position="4"> The second factor producing structure in our dialogs is their cooperative nature. Since the dialogs are cooperative and miscommunication can occur if both dialog participants are not focused on the same subset of knowledge (Grosz 1981), we expect IS to shift topic slowly between consecutive utterances and to adhere to the focusing constraints espoused by McKeown (1985).</Paragraph> <Paragraph position="5"> McKeown expanded on focus rules proposed by Sidner (1981) to explain how speakers should organize their utterances when faced with a choice of topic. In particular, McKeown claims that a speaker should move to a recently introduced topic if he has something further to say about it; otherwise he will have to reintroduce the topic at a later time. Similarly, the speaker should choose to finish discussion of the current topic before switching back to a previous one.</Paragraph> <Paragraph position="6"> Our focusing heuristics rely on these expectations about possible shifts in focus of attention in IS's underlying task-related plan to identify which candidate focused plan is most apropos to the established dialog context and to determine how it fits into the context model. The following ordered list gives the focusing heuristics' preferences on the relationship between a candidate focused plan and the context model. Each relationship is illustrated under the assumption that the tree shown in Figure 3a is the context model immediately preceding the utterance, with node C (marked by an asterisk) representing the current focused goal/plan and node G representing the most recently considered subgoal in the current focused plan.</Paragraph> <Paragraph position="7"> 1.
The candidate focused plan is part of the expansion of a plan for the most recently considered subgoal in the current focused plan; for example, if Figure 3b is an expansion of the context model shown in Figure 3a, then node C1 might represent such a candidate focused plan.</Paragraph> <Paragraph position="8"> 2. The candidate focused plan is part of an expansion of the current focused plan. For example, if Figure 3b is an expansion of the context model shown in Figure 3a, then node C2 might represent such a candidate focused plan (where C2 is part of a plan for the goal at node F, which is in turn part of a plan for the goal at node C).</Paragraph> <Paragraph position="9"> 3. The candidate focused plan is part of the expansion</Paragraph> <Paragraph position="11"> of a plan for a goal along the active path, with preference given to goals that are closest to the current focused goal on the active path. For example, if Figure 3b is an expansion of the context model shown in Figure 3a, then nodes C3 and C4 would both be part of the expansion of a plan for a goal along the active path; but if nodes C3 and C4 both represent candidate focused plans, then node C3 would be preferred, since it appears in an expansion of the plan associated with node B, which is closer to the current focused goal (represented by node C) than is node A.</Paragraph> <Paragraph position="12"> 4. The candidate focused plan is a plan whose expansion contains the goal associated with the root of the context model; Figure 3c illustrates such a relationship, where node C5 represents a candidate focused plan.</Paragraph> <Paragraph position="13"> 5. The candidate focused plan is part of the expansion of a higher-level plan, and this expansion also contains the goal associated with the root of the context model; Figure 3d illustrates such a relationship, where node C6 represents a candidate focused plan.</Paragraph> <Paragraph position="14"> In applying each rule, we use a breadth-first expansion of plans, so that the resulting shift in focus of attention will be as small as possible. For example, if nodes C7 and C8 in Figure 3e both represented candidate focused goals/plans, the second rule in the above list would prefer C7 to C8, since C7 is closer to the existing focus of attention in the dialog.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 AN EXAMPLE </SectionTitle> <Paragraph position="0"> To illustrate this plan inference process, let us consider a dialog segment containing four utterances by IS.</Paragraph> <Paragraph position="1"> Suppose IS begins with the statement &quot;I want to major in computer science.&quot; Since IS states that he wants to achieve a goal, majoring in CS, TRACK's plan identification heuristics hypothesize</Paragraph> <Paragraph position="3"> Satisfy-Major(IS, BA, CS) and Satisfy-Major(IS, BS, CS) as candidate focused goals and their associated plans as candidate focused plans. Since there is no way of choosing between these, two context models would be built, each with one of the candidate focused goal/plan pairs as its root (Figure 4).
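The example just begun exercises both stages described above. A rough sketch of the two stages follows; the plan-library lookup, the unifier, and the relation predicates are assumed helpers, not TRACK's code, and the Plan and ContextModel classes are the earlier sketches.

# Stage 1 (Section 3.1): any plan containing a proposition that unifies with
# the queried proposition yields a candidate focused goal/plan.
def hypothesize_candidates(queried_prop, plan_library, unify):
    candidates = []
    for goal, plan in plan_library.items():
        for prop in plan.preconditions + plan.body + plan.effects:
            binding = unify(queried_prop, prop)   # returns bindings or None
            if binding is not None:
                candidates.append((goal, plan, binding))
    return candidates

# Stage 2 (Section 3.2): try the focusing relationships in preference order
# (most expected shift first); within a rule, candidates are assumed to be
# supplied in breadth-first order so the smallest focus shift wins.
def relate_to_context(candidates, model, relations):
    """relations: ordered predicates, one per heuristic in the list above,
    e.g. [in_expansion_of_recent_subgoal,      # heuristic 1
          in_expansion_of_current_plan,        # heuristic 2
          in_expansion_of_active_path_goal,    # heuristic 3
          expansion_contains_root,             # heuristic 4
          sibling_under_higher_level_plan]     # heuristic 5"""
    for relation in relations:
        for cand in candidates:
            if relation(cand, model):
                return cand, relation
    return None      # no relationship found: an alternative context may be needed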
The resulting current focused plan in each context model is preceded by an asterisk.</Paragraph> <Paragraph position="4"> Suppose that IS's next utterance is the query &quot;What are the prerequisites for taking CS180?&quot; Since IS wants to know the preconditions for the plan associated with the goal of taking CS180 (the introductory course for majors and minors in computer science),</Paragraph> <Paragraph position="6"> TRACK hypothesizes the goal of earning credit in CS180 and its associated plan as the candidate focused goal/plan pair. The focusing heuristics must determine how this relates to the preceding dialog, as represented by the context model. The strongest expectation is that IS is continuing with some aspect of the current focused plan; since the preceding utterance did not address any particular goal in this plan, there is no most recently considered subgoal. Since taking CS180 appears in an expansion of the plans for majoring in computer science, TRACK expands the context models as shown in Figure 5 and marks the plan for earning credit in CS180 as the new current focus of attention.</Paragraph> <Paragraph position="7"> Suppose that IS's next query is, &quot;What courses must I take in order to satisfy the foreign language requirement?&quot; Since IS is asking about the argument (courses) of a subgoal (taking courses) that is part of a plan for achieving a second goal, TRACK hypothesizes the second goal, Satisfy-Language-Req(IS), and its associated plan as the candidate focused goal/plan pair. The focusing heuristics must now determine how this relates to the preceding dialog, as represented in the context model. The strongest expectation is that IS will continue with some aspect of the current focused plan. However, the candidate focused plan does not appear in an expansion of the current focused plan, indicating that IS has shifted focus to another aspect of the overall task. In fact, none of the first four focusing heuristics find a relationship between the candidate focused plan and the</Paragraph> <Paragraph position="8"> context model. However, the last focusing heuristic finds that there is a goal, Obtain-Degree(IS, BA) whose associated plan can be expanded to include both the candidate focused plan and the context model whose root is Satisfy-Major(IS, BA, CS), indicating that IS has shifted his attention to another subtask (satisfying the foreign language requirement) of a higher-level plan (obtaining a bachelor of arts degree), of which the old current focused plan (obtaining a computer science major) is also a part. Therefore this goal becomes the root of a new context model, as shown in Figure 6; Satisfy-Language-Req(IS) is marked as the new current focus of attention, as indicated by the asterisk preceding it. The other previous context model, whose root was Satisfy-Major(IS, BS, CS), is discarded, indicating that IS's third utterance has led us to deduce that he wants to pursue a bachelor of arts degree. Note that our plan inference process makes what Pollack (1987) terms the appropriate query assumption, namely, that IS does not ask queries that are inappropriate to his intended goal.
This aspect of our plan inference process will be discussed further in Section 5, where we present a more robust plan recognition paradigm.</Paragraph> <Paragraph position="9"> Suppose that IS's next query is, &quot;Who is teaching section 10 of French 112 in the spring of 1988?&quot; As described earlier, since IS is asking about the teacher of a particular section of a course, he may be considering the subgoal of learning from that teacher; this subgoal appears in a plan for learning the material of a course, and therefore TRACK hypothesizes the plan associated with the goal Learn-Material(IS, FRENCH112-10-SPRING88, _syl:&SYLLABI) as one of the candidate focused plans. The focusing heuristics find that this candidate focused plan appears in an expansion of the most recently considered subgoal (taking courses) in the current focused plan, and therefore it is selected as the new focus of attention and the context model is expanded to include it (Figure 7).</Paragraph> <Paragraph position="10"> In this manner, our plan inference process dynamically infers from an ongoing dialog the underlying task-related plan motivating an information-seeker's queries and tracks his focus of attention in this plan structure.</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 APPLICATION OF CONTEXT MODELS </SectionTitle> <Paragraph position="0"> The context model is one component of a comprehensive user model, representing the system's acquired beliefs about the plan an information-seeker is trying to construct. The possible expansions of this plan provide expectations about information that IS might want, and these expectations can often be used to repair and disambiguate IS's subsequent utterances. We have developed strategies that use our context model to handle two forms of problematic input: pragmatically ill-formed utterances and intersentential ellipsis. This section describes our approach to the first of these; our framework for handling ellipsis is described in Carberry (1985).</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 PRAGMATIC ILL-FORMEDNESS </SectionTitle> <Paragraph position="0"> An utterance can be syntactically and semantically well formed, yet violate the structural properties of the listener's world model. This is not to say that the speaker necessarily holds an incorrect view of the world, or even one that differs from the listener's view, but only that the semantic representation of the speaker's utterance does not conform to the listener's world model. We shall say that such an utterance is pragmatically ill-formed.</Paragraph> <Paragraph position="1"> Consider, for example, the query IS: &quot;What is the area of the special weapons magazine of the Alamo?&quot; that appears in a dialog transcript of an information-seeker attempting to load cargo onto ships using the REL natural language interface (Thompson 1980). A semantic representation of this query will contain the proposition Area(SPECIAL-WEAPONS-MAG, _areaval:&SQ-FT) The system was unable to understand this query, since its semantic representation erroneously presumed that storage locations had an area attribute in the associated data base.
If a human information-provider had a similar problem in understanding the utterance, or considered the meaning of &quot;area&quot; ambiguous, he might be able to use the context established by the preceding dialog to identify what the information-seeker really wanted to know. For example, if IS's goal was to load cargo of the appropriate type into the various cargo holds, then he probably wanted to know the remaining capacity of the Special Weapons Magazine. On the other hand, if his goal was to assign ships to routes in order to handle the expected cargo shipping requirements, then IS probably wanted to know the total capacity of the Special Weapons Magazine. Similarly, if his goal was to assign workers to fill the storage holds, with one worker assigned to handle all cargo holds located in the same section of the ship, then IS probably wanted to know the location of the Special Weapons Magazine.</Paragraph> <Paragraph position="2"> Another example of a pragmatically ill-formed query illustrates the missing joins problem.</Paragraph> <Paragraph position="3"> IS: &quot;Who is teaching section 10 of French 112 in the spring of 1988?&quot; IP: &quot;Dr. Walker.&quot; IS: &quot;When's Mitchel meet?&quot; A semantic representation of the last query contains the proposition Meeting-Time(MITCHEL, _tme:&MEETING-TIMES) Suppose that in the system's world model, faculty teach sections of courses, chair committees, and present colloquia, and each of these has a scheduled meeting time, but there is no direct relationship between faculty and times. Then the above query will appear pragmatically ill-formed. Although this utterance might be an abbreviated version of any of the queries &quot;When does the section of French 112 taught by Dr. Mitchel meet?&quot; &quot;When does the committee chaired by Dr. Mitchel meet?&quot; &quot;When does the colloquium given by Dr. Mitchel meet?&quot; a human information-provider would be likely to recognize from the above dialog that IS wants the meeting time of the section of French 112 taught by Dr. Mitchel, and respond accordingly.</Paragraph> </Section> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 4.2 UNDERSTANDING PRAGMATICALLY ILL-FORMED QUERIES </SectionTitle> <Paragraph position="0"> If a natural language system's communication is to be regarded as natural, the system must be able to handle the full spectrum of utterances that humans understand with relative ease. But our analysis of naturally occurring dialog indicates that human listeners understand many utterances that would appear pragmatically ill-formed to current natural language systems. A number of researchers have investigated the problem of handling pragmatically ill-formed queries (Sowa 1976, Chang 1978, Mays 1980, Kaplan 1982), but their strategies were deficient in that they considered the queries in isolation, without using a model of the preceding dialog to address the speaker's intentions.</Paragraph> <Paragraph position="1"> Grice's theory of meaning (Grice 1969, Grice 1957) and maxim of relation (Grice 1975) suggest that the listener's beliefs about what the speaker is trying to do should be used to recognize the intent behind an ill-formed query. According to Grice's theory, a listener should believe that the speaker believes the listener can infer the intended meaning of an utterance--otherwise the speaker would not have made it. So given a pragmatically ill-formed query, a cooperative listener should attempt to deduce these intentions.
Grice's maxim of relation suggests that the speaker's utterance is relevant to the existing dialog context, so the listener should use this context and the focus of attention immediately prior to the problematic utterance to attempt to deduce the speaker's intended meaning and enable the dialog to continue without interruption.</Paragraph> <Paragraph position="2"> Our strategy is based on this theory of meaning and intention. It uses the context model to suggest substitutions for the erroneous proposition appearing in the semantic representation of IS's pragmatically ill-formed query, thereby producing semantic representations for one or more revised queries, all of which are apropos to what IS is trying to accomplish. If more than one revised query is proposed, then it must be determined whether any of these is significantly more likely than the others to represent the speaker's intentions or satisfy his perceived needs. Two criteria appear appropriate for comparing suggested revised queries. The first is the relevance of the revised query to the current focus of attention in the dialog. Since we have contended that some shifts in focus of attention in the plan structure are more likely than others, it is reasonable to hypothesize that the more expected the shift in focus of attention that would result from a revised query, the more likely is that query to represent the speaker's intentions. The second criterion for comparing suggested revised queries is the similarity of a revised query to the speaker's actual utterance. For example, color has less semantic similarity to area than does remaining capacity. Therefore substituting &quot;color&quot; for &quot;area&quot; in the example query &quot;What is the area of the special weapons magazine of the Alamo?&quot; is a more significant alteration of the query than is substituting &quot;remaining capacity&quot; for &quot;area.&quot; As a result, the revised query &quot;What is the color of the special weapons magazine of the Alamo?&quot; is less similar to the speaker's actual query than is the revised query &quot;What is the remaining capacity of the special weapons magazine of the Alamo?&quot; Therefore our pragmatic ill-formedness processor contains a suggestion mechanism and a selection mechanism. The suggestion mechanism proposes revised queries, all of which are relevant to IS's underlying task-related plan, and the selection mechanism uses the criteria of relevance and semantic similarity to select, from among multiple suggestions, the revised query deemed most likely to represent the speaker's intentions or satisfy his perceived needs.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.3 REPAIR STRATEGY </SectionTitle> <Paragraph position="0"> The suggestion mechanism uses two sets of substitution heuristics, one for making simple substitutions of a property, relation, function, or object class for that used by the speaker, and a second set for expanding relational paths to handle the missing joins problem.</Paragraph> <Paragraph position="1"> As an example of a simple substitution, suppose the dialogue preceding the query &quot;What is the area of the special weapons magazine of the Alamo?&quot; indicates that IS's current focused goal within his overall plan is to load cargo of the appropriate type into the various cargo holds.
A subgoal in the plan associated with this goal would be specifying that the storage area must have room for the cargo item. The property substitution heuristic would examine this plan and suggest substituting either of two propositions, one specifying the cargo type and one the remaining capacity of the storage location, for the erroneous Area proposition appearing in the semantic representation of IS's query, producing suggested semantic representations equivalent to the two revised queries IS: &quot;What is the cargo type of the Special Weapons Magazine of the Alamo?&quot; IS: &quot;What is the remaining capacity of the Special Weapons Magazine of the Alamo?&quot; More formally, this heuristic is represented by the following rule: If IS's proposition erroneously presumes that a member Obj1 of CLASS1 has a property Att1, then replace property Att1 with property Att2 if the following conditions hold: 1. A proposition specifying property Att2 on a member Obj2 of CLASS1 appears in an expansion of IS's context model.</Paragraph> <Paragraph position="2"> 2. Obj1 and Obj2 unify (either Obj1 in IS's utterance or Obj2 in the plan proposition refers to a general member of CLASS1, or both refer to the same specific member of CLASS1).</Paragraph> <Paragraph position="3"> In the context of our student advisement dialogs, suppose a student wants to pursue an independent study project; such projects can be directed by full-time faculty but not by faculty who are extension or on sabbatical. The student might erroneously follow the utterance &quot;I want to take an independent study project.&quot; with the pragmatically ill-formed query &quot;What is the classification of Dr. Smith?&quot; In a university world model, only students have a classification attribute; this attribute can have values such as Arts&Science-1988, Engineering-1989, and Business-1990. Faculty have attributes such as rank, status, age, and salary. Pursuing an independent study project under the direction of Dr. Smith has the precondition that Dr. Smith's status be full-time or part-time. Our substitution mechanism would analyze the plan for taking an independent study course, and the property substitution rule would suggest substituting a proposition specifying Dr. Smith's status for the erroneous proposition Classification(DR. SMITH, _classval:&CLASSVALUES) appearing in the semantic representation of the student's query, resulting in a suggested revised semantic representation equivalent to the query &quot;What is the status of Dr. Smith?&quot; As an example of the second set of heuristics, the path expansion heuristics, consider again the query &quot;When's Mitchel meet?&quot; following the dialog that produced the context model shown in Figure 7. As mentioned earlier, the semantic representation of this query contains the erroneous proposition Meeting-Time(MITCHEL, _tme:&MEETING-TIMES) indicating a direct relationship between faculty and times. Our path expansion heuristics will analyze and expand the context model shown in Figure 7 and note that a plan for the goal Earn-Credit(IS, FRENCH112, SPRING88, _cr2:&CREDITS) can include a path containing the sequence of goals shown in Figure 8.
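Before looking at how the path expansion heuristics treat this example, here is a minimal sketch of the property substitution rule stated above. The Prop encoding, the helper names, and the object classes used in the Alamo illustration are assumptions, not the system's actual representations.

from dataclasses import dataclass
from typing import List

# Hypothetical encoding of a proposition such as Area(SPECIAL-WEAPONS-MAG, _areaval:&SQ-FT):
# an attribute of an object, where the object is a constant or a typed variable.
@dataclass(frozen=True)
class Prop:
    attribute: str    # e.g. "Area", "Remaining-Capacity", "Status"
    obj: str          # e.g. "SPECIAL-WEAPONS-MAG" or "_stor:&STORAGE-LOCS"
    obj_class: str    # e.g. "STORAGE-LOCS", "FACULTY"

def objects_unify(a: str, b: str) -> bool:
    """Condition 2 of the rule: either term is a general (typed-variable)
    member of the class, or both name the same specific member."""
    is_var = lambda t: t.startswith("_")
    return is_var(a) or is_var(b) or a == b

def property_substitutions(erroneous: Prop, plan_props: List[Prop]) -> List[Prop]:
    """Condition 1: propositions drawn from an expansion of IS's context model
    that mention some property of a member of the same object class; each such
    property is suggested in place of the erroneous one."""
    suggestions = []
    for p in plan_props:
        if p.obj_class == erroneous.obj_class and objects_unify(erroneous.obj, p.obj):
            suggestions.append(Prop(p.attribute, erroneous.obj, erroneous.obj_class))
    return suggestions

# The Alamo example: Area is not an attribute of storage locations, but the
# plan for loading cargo mentions the cargo type and remaining capacity of a
# general storage location, so both substitutions are suggested.
bad = Prop("Area", "SPECIAL-WEAPONS-MAG", "STORAGE-LOCS")
plan = [Prop("Cargo-Type", "_stor:&STORAGE-LOCS", "STORAGE-LOCS"),
        Prop("Remaining-Capacity", "_stor:&STORAGE-LOCS", "STORAGE-LOCS")]
print(property_substitutions(bad, plan))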
One path expansion heuristic notes that the propositions Teaches(_fac:&FACULTY, _sec1:&SECTIONS) Is-Meeting-Time(_sec1:&SECTIONS, _tme:&MEETING-TIMES) both appear on this path in the expanded plan, and suggests substituting the conjunction of the propositions Teaches(MITCHEL, _sec1:&SECTIONS) Is-Meeting-Time(_sec1:&SECTIONS, _tme:&MEETING-TIMES) for the erroneous proposition appearing in the semantic representation of IS's query, resulting in a revised semantic representation equivalent to the English query &quot;When do sections taught by Mitchel meet?&quot; The revised semantic representation no longer violates the system's world model. But it represents an incomplete query, in that it contains an ellipsis. Presumably the speaker wants to know only the sections of French 112 taught by Dr. Mitchel in the spring of 1988, not sections of any course taught by Dr. Mitchel during any semester. How the context model can be used to interpret elliptical utterances is discussed in Carberry (1985). Although we have only illustrated substituting a conjunction of two propositions for the erroneous proposition in the user's query, the path expansion heuristics can propose expansions of any length. Five other heuristics and other parts of the user's plan can suggest substitutions in addition to the ones shown in our examples. The important point is that all of the revised semantic representations resulting from these suggestions represent queries that are apropos to the plan that IS is constructing.</Paragraph> </Section> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 4.3.2 SELECTING THE APPROPRIATE REVISION </SectionTitle> <Paragraph position="0"> As mentioned earlier, relevance to the current focus of attention and similarity to the speaker's actual utterance are used to select from among multiple suggestions. We use focusing heuristics, similar to those used for constructing the context model, to measure relevance of a revised query to the current focus of attention in the dialog, and generalization hierarchies for properties, relations, functions, and object classes to measure the semantic similarity of a substituted term and the term that it replaces. In the example &quot;What is the area of the Special Weapons Magazine of the Alamo?&quot; both suggested revised queries have approximately the same relevance to the current dialog but, of the two properties cargo type and remaining capacity, remaining capacity is much closer semantically to the property area used by the speaker. Therefore our selection mechanism chooses the semantic representation equivalent to the query &quot;What is the remaining capacity of the Special Weapons Magazine of the Alamo?&quot; as the most appropriate interpretation representing IS's needs.</Paragraph> <Paragraph position="1"> Instead of computing semantic representations for all suggested revised queries and then selecting the best revision, we analyze nodes of the context model in order of decreasing relevance to the existing focus of attention, until a revision meeting an arbitrary level of acceptability is found. This acceptability level initially is set so that only revisions with extremely good evaluations will meet it, and it is steadily relaxed as larger parts of the context model are analyzed.
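A rough sketch of this search loop follows, assuming numeric relevance and similarity scores; nodes_by_relevance(), suggest_revisions(), score(), and the numeric thresholds are illustrative assumptions, not values from the paper.

# Selection mechanism: context-model nodes are examined in order of decreasing
# relevance to the current focus of attention, and the acceptability threshold
# is relaxed as the search widens over the expanded plan.
def select_revision(query, model, nodes_by_relevance, suggest_revisions, score,
                    initial_threshold=0.9, relax_step=0.1, floor=0.3):
    threshold = initial_threshold
    best, best_score = None, float("-inf")
    for node in nodes_by_relevance(model):            # closest to focus first
        for revision in suggest_revisions(query, node):
            s = score(revision, query, node)          # relevance + semantic similarity
            if s > best_score:
                best, best_score = revision, s
            if best_score >= threshold:               # good enough: stop early
                return best
        threshold = max(floor, threshold - relax_step)  # relax the acceptability level
    # If nothing ever reaches the floor, the system is justified in giving up.
    return best if best_score >= floor else None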
Since one factor used by the evaluation metric is relevance to the existing focus of attention in the dialog, scores for newly suggested revisions will, in most cases, be worse than the scores for revisions suggested much earlier.</Paragraph> <Paragraph position="2"> Thus as more of the context model is analyzed, a revision that previously did not receive a good enough evaluation to terminate processing may now appear more likely to represent the user's intentions. The relaxed acceptability level allows such a revision to be selected as the appropriate interpretation.</Paragraph> <Paragraph position="3"> This processing mechanism is efficient, since only a small part of the user's expanded plan will usually be analyzed. It also avoids the problem of computational explosion. If processing time exceeds a preset maximum or the acceptability level is relaxed to some preset minimum level of goodness, then the system can terminate its search for an interpretation and is justified in believing that its failure to understand the user's utterance is not unnatural behavior.</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> 4.3.3 COMPARISON TO OTHER STRATEGIES </SectionTitle> <Paragraph position="0"> This approach is superior to previous strategies because it uses a model of the speaker to identify and address his perceived intentions and needs in making an utterance.</Paragraph> <Paragraph position="1"> As such, it not only reasons on the context model to suggest possible interpretations relevant to the user's goals and plans, but it also limits consideration to those interpretations that are reasonable given the established dialog context.</Paragraph> </Section> <Section position="10" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 IMPROVING PLAN RECOGNITION </SectionTitle> <Paragraph position="0"> Our research has shown how an information-seeker's underlying task-related plan can be dynamically inferred from an ongoing dialog, and how the resulting context model can be used to achieve better communication. However, the kinds of cooperative information-seeking dialogs handled by current models of plan recognition indicate that four critical assumptions have been made; when these assumptions are violated, errors can be introduced into the context model.</Paragraph> <Paragraph position="1"> These assumptions represent unrealistic constraints on real-world dialogs and must be removed. The first assumption, called the valid plan assumption by Pollack (1987), limits the kinds of beliefs IS can already have about the domain--namely, it says that IS's knowledge may be incomplete but not erroneous. But IS is interacting with the system because IS does not know enough about the domain to construct his task-related plan by himself. Therefore, since IS is not an expert in the area, it is to be expected that some of his beliefs about the domain may be false, contradicting the first assumption. An implication of the valid plan assumption is what Pollack terms the appropriate query assumption--namely, that IS knows enough about how to solve his problem that he always asks relevant questions.</Paragraph> <Paragraph position="2"> The second assumption limits the questions IS can ask to those which the system can answer. But even an expert system has limited domain knowledge. Furthermore, in a rapidly changing world, knowledgeable users may have more accurate information about some aspects of the domain than does the system.
For example, a student advisement system may not be updated immediately when the teacher of a course changes. A cooperative system should recognize its limited knowledge and reason with it to provide whatever pertinent, helpful information it can.</Paragraph> <Paragraph position="3"> The third assumption restricts IS to utterances that are clear, precise, and accurate. For example, it eliminates the possibility that IS might say he is a junior, when in fact he is three credits short of junior standing, thereby leading the system to erroneously infer that IS is eligible for certain programs or awards. But human information-seekers are often imprecise, especially when they are not aware that small perturbations in the data can be significant.</Paragraph> <Paragraph position="4"> The fourth assumption says that the system never makes an error in inferring IS's plan. But even in the simplest cases, the system must hypothesize how individual utterances relate to one another. Such decisions select from among multiple possibilities and are a potential source of error.</Paragraph> <Paragraph position="5"> Pollack (1987) argues against plan inference systems making the first two assumptions, because they prevent the system from inferring plans which the user believes he can pursue but which are novel (to the system) or invalid. However, there is another implication of relaxing the appropriate query assumption that is not considered by Pollack: IS may ask an irrelevant question that seems perfectly reasonable to the system, thereby leading the system to develop incorrect beliefs about IS's objectives. Consider, for example, a student advisement system. If only B.A. degrees have a foreign language requirement, the query &quot;What courses must I take to satisfy the foreign language requirement in French?&quot; may lead the system to infer that IS is pursuing a bachelor of arts degree. If only B.S. degrees require a senior project, then a subsequent query such as &quot;How many credits of senior project are required?&quot; is problematic. Either the second query is inappropriate to IS's goal of obtaining a bachelor of arts degree (Pollack 1986), or the system's context model does not accurately reflect what IS wants to do. Note that, in either case, the user has a misconception; but in the latter case, the misconception went undetected and was allowed to introduce errors into the system's context model.</Paragraph> <Paragraph position="6"> Traditional natural language plan inference systems also make the third and fourth assumptions, which, together with the first two, guarantee that the underlying plan inferred by the system and the task-related plan under construction by IS are never at variance with one another. If we want systems capable of understanding and appropriately responding to naturally occurring dialog, natural language interfaces must be able to deal with situations where those assumptions are not true.</Paragraph> <Paragraph position="7"> Grosz (1981) claimed that miscommunication can occur if both dialog participants are not focused on the same subset of knowledge. Joshi (1982) contended that successful communication requires that the mutual beliefs of the dialog participants be consistent.
Extending this to inferred plans, we claim that a successful cooperative dialog requires that the system's beliefs about IS's plan be consistent with what IS is actually considering doing. But clearly it is unrealistic to expect that the system's model will always be correct, given the different knowledge bases of the two participants and the imperfections of communication via dialog.</Paragraph> <Paragraph position="8"> Thus we need a repair mechanism that attempts to detect inconsistencies in the models and repair them whenever possible. This view is supported by the work of Pollack, Hirschberg, and Webber (1982). They suggested that expert-novice dialogs could be viewed as a negotiation process, during which not only an acceptable solution is negotiated, but also understanding of the terminology and the beliefs of the participants. The context model is one component of the system's beliefs, as is its belief that this model accurately reflects the plan under construction by IS.</Paragraph> </Section> <Section position="12" start_page="0" end_page="0" type="metho"> <SectionTitle> 5.2 AN APPROACH TO ROBUST PLAN RECOGNITION </SectionTitle> <Paragraph position="0"> Our analysis of naturally occurring dialog suggests that a plan recognition framework for handling disparate plans should include four phases: 1. Detecting clues to possible disparity between the system's context model and the user's actual goals and plans for accomplishing them. For example, expressions of surprise at the system's response and what appear to be major unsignaled shifts in focus of attention should lead the system to suspect that its context model might be in error.</Paragraph> <Paragraph position="1"> 2. Reasoning on the system's context model and the system's domain knowledge to hypothesize the source of these disparities.</Paragraph> <Paragraph position="2"> 3. Negotiating with the user to isolate the errors. The negotiation phase should be guided by the system's hypothesis about the source of errors in the context model.</Paragraph> <Paragraph position="3"> 4. Appropriately repairing the context model, as indicated by the negotiation dialog.</Paragraph> <Paragraph position="4"> We believe that the knowledge acquired from the dialog and how it was used to construct the context model are important factors in hypothesizing the cause of disparity between the system's context model and the actual plan under construction by the information-seeker. Natural language systems must employ various techniques such as focusing heuristics and default rules for understanding and relating dialog in order to do the kind of inferencing exhibited in dialogue transcripts and provide the most helpful responses. But confidence in individual components of the resultant context model appears to be important in hypothesizing errors. 
We contend that the system's context model should be enriched, so that its representation of the plan inferred for the user differentiates among its components according to the support that the system accords each component as a correct and intended part of that plan.</Paragraph> <Paragraph position="5"> The system can then reason on this enriched context model to hypothesize the most likely sources of suspected disparities.</Paragraph> <Paragraph position="6"> For example, if the system believes that the information-seeker intends the system to recognize from his utterance that G is a component of his plan, then the system can confidently add G to its context model.</Paragraph> <Paragraph position="7"> Components that the system adds to the context model because of the system's domain knowledge should be less strongly believed. This distinction resembles intended recognition versus keyhole recognition (Cohen et al. 1981). Intended recognition is the inference of those goals and plans that an agent intends to convey.</Paragraph> <Paragraph position="8"> Keyhole recognition is the inference of an agent's goals and plans by unobtrusively observing the agent, as if through a keyhole. Intended recognition is essential in communicative situations (Cohen et al. 1981), since the listener must identify the intended meaning of a speaker's utterance.</Paragraph> <Paragraph position="9"> Our analysis of naturally occurring dialog suggests keyhole recognition is often critical for expanding beliefs about what the information-seeker is trying to do and how it should be done. For example, if CS180 is an introductory course restricted to majors in computer science and electrical engineering, then the system might infer from the utterance &quot;Can you tell me what time CS180 meets?&quot; not only that the user wants to know the meeting time for CS180, but also that the user is a computer science or electrical engineering major. The user may intend the system to recognize the first goal, but it is questionable whether the user actually intends the system to recognize that the user is pursuing a major in computer science or electrical engineering. This latter inference is based on the system's beliefs about who can take CS180--knowledge that the user may not have.</Paragraph> <Paragraph position="10"> Therefore, since the user may not have intended to communicate these components, they are more likely sources of error than components that the user intended the system to recognize.</Paragraph> <Paragraph position="11"> The particular rules used to add a component to the context model should affect the system's faith in that component as part of the information-seeker's overall plan.
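One simple way such an enriched context model might record this is sketched below; the particular support levels and the rule-name field are assumptions suggested by the discussion, not an inventory from the paper.

from dataclasses import dataclass
from enum import Enum

# Illustrative gradations of support for a component of the enriched context model.
class Support(Enum):
    INTENDED = 3    # the user intended the system to recognize this component
    KEYHOLE = 2     # inferred from the system's own domain knowledge
    HEURISTIC = 1   # chosen from among multiple possibilities (focusing, defaults)

@dataclass
class PlanComponent:
    goal: str
    added_by: str       # which rule or knowledge source introduced the component
    support: Support

    def likely_error_source(self) -> bool:
        # Weakly supported components are the first places to look when a
        # disparity between the context model and IS's actual plan is suspected.
        return self.support is not Support.INTENDED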
For example, since default inference rules and focusing heuristics select from among multiple possibilities, they add components that are likely sources of suspected errors.</Paragraph> <Paragraph position="12"> We believe that if a plan recognition system builds such an enriched context model, uses it to hypothesize the source of suspected errors in the model, and attempts to negotiate with the user to isolate and repair its model, the system will be able to handle a much larger set of dialogs than can current models of plan inference, and will be likely to produce responses resembling those found in transcripts of naturally occurring information-seeking dialogs.</Paragraph> </Section> <Section position="13" start_page="0" end_page="0" type="metho"> <SectionTitle> 6 CONCLUSIONS AND CURRENT RESEARCH </SectionTitle> <Paragraph position="0"> A cooperative natural language system must attempt to infer the underlying task-related plan motivating the information-seeker's queries and use this plan to provide cooperative, helpful responses. The system's model of this plan, which we call a context model, is one component of a user model. We have presented a strategy for dynamically inferring the context model from an ongoing dialog, and have shown how this model can be used to handle one class of problematic utterances--the set of utterances that violate the pragmatic rules of the system's world model. Our strategy, motivated by Grice's theory of meaning and maxim of</Paragraph> </Section> class="xml-element"></Paper>