<?xml version="1.0" standalone="yes"?> <Paper uid="P86-1033"> <Title>LINGUISTIC COHERENCE: A PLAN-BASED ALTERNATIVE</Title> <Section position="3" start_page="215" end_page="217" type="metho"> <SectionTitle> REPRESENTING COHERENCE USING DISCOURSE PLANS </SectionTitle> <Paragraph position="0"> In a plan-based approach to language understanding, an utterance is considered understood when it has been related to some underlying plan of the speaker.</Paragraph> <Paragraph position="1"> While previous works have explicitly represented and recognized the underlying task plans of a given domain (e.g., mount a tape) (Grosz \[5\], Allen and Perrault \[1\], Sidner and Israel \[21\], Carberry \[2\], Sidner \[24\]), the ways that utterances could be related to such plans were limited and not of particular concern. As a result, only dialogues exhibiting a very limited set of utterance relationships could be understood.</Paragraph> <Paragraph position="2"> In this work, a set of domain-independent plans about plans (i.e., meta-plans) called discourse plans is introduced to explicitly represent, reason about, and generalize such relationships. Discourse plans are recognized from every utterance and represent plan introduction, plan execution, plan specification, plan debugging, plan abandonment, and so on, independently of any domain.</Paragraph> <Paragraph position="3"> Although discourse plans can refer to both domain plans and other discourse plans, domain plans can only be accessed and manipulated via discourse plans. For example, in the tape excerpt above &quot;Could you mount a magtape for me?&quot; achieves a discourse plan to introduce a domain plan to mount a tape. &quot;It's tape 1&quot; then further specifies this domain plan.</Paragraph> <Paragraph position="4"> Except for the fact that they refer to other plans (i.e., they take other plans as arguments), the representation of discourse plans is identical to the usual representation of domain plans (Fikes and Nilsson \[4\], Sacerdoti \[18\]). Every plan has a header, a parameterized action description that names the plan. Action descriptions are represented as operators on a planner's world model and defined in terms of prerequisites, decompositions, and effects. Prerequisites are conditions that need to hold (or to be made to hold) in the world model before the action operator can be applied. Effects are statements that are asserted into the world model after the action has been successfully executed. Decompositions enable hierarchical planning. Although the action description of the header may be usefully thought of at one level of abstraction as a single action achieving a goal, such an action might not be executable, i.e., it might be an abstract as opposed to primitive action. Abstract actions are in actuality composed of primitive actions and possibly other abstract action descriptions (i.e., other plans).</Paragraph> <Paragraph position="5"> Finally, associated with each plan is a set of applicability conditions called constraints.3 These are similar to prerequisites, except that the planner never attempts to achieve a constraint if it is false. The plan recognizer will use such general plan descriptions to recognize the particular plan instantiations underlying an utterance.</Paragraph> [Footnote 3: These constraints should not be confused with the constraints of Stefik \[25\], which are dynamically formulated during hierarchical plan generation and represent the interactions between subproblems.]
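Since the rest of the paper manipulates these schemas, it may help to see the header/prerequisite/decomposition/effect/constraint format written out concretely. The following Python sketch is illustrative only: the class name, the string-based predicate notation, and the MOUNT-TAPE example plan are assumptions for exposition, not the paper's implementation.

```python
# A minimal sketch of the plan representation described above: every plan
# has a header plus prerequisites, decompositions, effects, and constraints.
# The string predicates and the MOUNT-TAPE example are hypothetical.
from dataclasses import dataclass, field

@dataclass
class PlanSchema:
    header: str                                        # parameterized action naming the plan
    prerequisites: list = field(default_factory=list)  # achieved if false
    decompositions: list = field(default_factory=list) # alternative expansions (hierarchical planning)
    effects: list = field(default_factory=list)        # asserted after successful execution
    constraints: list = field(default_factory=list)    # checked but never achieved

# A hypothetical domain plan in this format.
MOUNT_TAPE = PlanSchema(
    header="MOUNT-TAPE(agent, tape, drive)",
    prerequisites=["AT(agent, drive)"],
    decompositions=[["FETCH(agent, tape)", "LOAD(agent, tape, drive)"]],
    effects=["MOUNTED(tape, drive)"],
    constraints=["IS-OPERATOR(agent)"],
)
```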
<Paragraph position="6"> Figures 1 through 3 present three representative discourse plans (see Litman \[10\] for the complete set). The first discourse plan, INTRODUCE-PLAN, takes a plan of the speaker that involves the hearer and presents it to the hearer (who is assumed cooperative). The decomposition specifies a typical way to do this, via execution of the speech act (Searle \[19\]) REQUEST. The constraints use a vocabulary for referring to and describing plans and actions to specify that the only actions requested will be those that are in the plan and have the hearer as agent. Since the hearer is assumed cooperative, he or she will then adopt as a goal the joint plan containing the action (i.e., the first effect).</Paragraph> <Paragraph position="7"> The second effect states that the action requested will be the next action performed in the introduced plan.</Paragraph> <Paragraph position="8"> Note that since INTRODUCE-PLAN has no prerequisites it can occur in any discourse context, i.e., it does not need to be related to previous plans.</Paragraph> <Paragraph position="9"> INTRODUCE-PLAN thus allows the recognition of topic changes when a previous topic is completed, as well as recognition of interrupting topic changes (and, when not linguistically marked as such, of incoherency) at any point in the dialogue. It also captures the previously implicit knowledge that at the beginning of a dialogue an underlying plan needs to be recognized.</Paragraph> <Paragraph position="10"> The discourse plan in Figure 2, CONTINUE-PLAN, takes an already introduced plan as defined by the WANT prerequisite and moves execution to the next step, where the previously executed step is marked by the predicate LAST. One way of doing this is to request the hearer to perform the step that should occur after the previously executed step, assuming of course that the step is something the hearer actually can perform. This is captured by the decomposition together with the constraints. As above, the NEXT effect then updates the portion of the plan to be executed. This discourse plan captures the previously implicit relationship of coherent topic continuation in task-oriented dialogues (without interruptions), i.e., the fact that the discourse structure follows the task structure (Grosz \[5\]).</Paragraph> <Paragraph position="11"> Figure 3 presents CORRECT-PLAN, the last discourse plan to be discussed. CORRECT-PLAN inserts a repair step into a pre-existing plan that would otherwise fail. More specifically, CORRECT-PLAN takes a pre-existing plan having subparts that do not interact as expected during execution, and debugs the plan by adding a new goal to restore the expected interactions. The pre-existing plan has subparts laststep and nextstep, where laststep was supposed to enable the performance of nextstep, but in reality did not. The plan is corrected by adding newstep, which enables the performance of nextstep and thus of the rest of plan.</Paragraph> [Figure 3, reconstructed from the extraction; the grouping of predicates follows the surrounding prose: CORRECT-PLAN(speaker, hearer, laststep, newstep, nextstep, plan), with prerequisites WANT(hearer, plan) and LAST(laststep, plan); decompositions REQUEST(speaker, hearer, newstep) and REQUEST(speaker, hearer, nextstep); effects STEP(newstep, plan), AFTER(laststep, newstep, plan), AFTER(newstep, nextstep, plan), and NEXT(newstep, plan); and constraints including STEP(laststep, plan).] <Paragraph position="12"> The correction can be introduced by a REQUEST for either nextstep or newstep. When nextstep is requested, the hearer has to use the knowledge that nextstep cannot currently be performed to infer that a correction must be added to the plan. When newstep is requested, the speaker explicitly provides the correction.</Paragraph>
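Using the PlanSchema sketch above, the three discourse plans can be written down as follows. The INTRODUCE-PLAN and CORRECT-PLAN entries follow the prose and the reconstructed Figure 3; the CONTINUE-PLAN details, its exact constraints in particular, are a plausible reading of Figure 2 from the surrounding text rather than a verbatim copy.

```python
# Discourse plans take other plans as arguments but otherwise use the
# same schema format as domain plans (reuses PlanSchema from above).
INTRODUCE_PLAN = PlanSchema(
    header="INTRODUCE-PLAN(speaker, hearer, action, plan)",
    prerequisites=[],  # none, so it can occur in any discourse context
    decompositions=[["REQUEST(speaker, hearer, action)"]],
    effects=["WANT(hearer, plan)",   # the cooperative hearer adopts the plan
             "NEXT(action, plan)"],  # the requested action is performed next
    constraints=["STEP(action, plan)", "AGENT(action, hearer)"],
)

CONTINUE_PLAN = PlanSchema(
    header="CONTINUE-PLAN(speaker, hearer, laststep, nextstep, plan)",
    prerequisites=["WANT(hearer, plan)", "LAST(laststep, plan)"],
    decompositions=[["REQUEST(speaker, hearer, nextstep)"]],
    effects=["NEXT(nextstep, plan)"],
    constraints=["STEP(nextstep, plan)",             # assumed from the prose
                 "AFTER(laststep, nextstep, plan)",
                 "AGENT(nextstep, hearer)"],
)

CORRECT_PLAN = PlanSchema(
    header="CORRECT-PLAN(speaker, hearer, laststep, newstep, nextstep, plan)",
    prerequisites=["WANT(hearer, plan)", "LAST(laststep, plan)"],
    decompositions=[["REQUEST(speaker, hearer, newstep)"],    # explicit correction
                    ["REQUEST(speaker, hearer, nextstep)"]],  # hearer infers the repair
    effects=["STEP(newstep, plan)", "AFTER(laststep, newstep, plan)",
             "AFTER(newstep, nextstep, plan)", "NEXT(newstep, plan)"],
    constraints=["STEP(laststep, plan)"],  # plus the MODIFIES/ENABLES constraints discussed next
)
```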
<Paragraph position="13"> The effects and constraints capture the plan situation described above and should be self-explanatory with the exception of two new terms. MODIFIES(action2, action1) means that action2 is a variant of action1, for example, the same action with different parameters or a new action achieving the still required effects.</Paragraph> <Paragraph position="14"> ENABLES(action1, action2) means that false prerequisites of action2 are in the effects of action1.</Paragraph> <Paragraph position="15"> CORRECT-PLAN is an example of a topic interruption that relates to a previous topic. To illustrate how these discourse plans represent the relationships between utterances, consider a naturally-occurring protocol (Sidner \[22\]) in which a user interacts with a person simulating an editing system to manipulate network structures in a knowledge representation language: 1) User: Hi. Please show the concept Person.</Paragraph> <Paragraph position="16"> 2) System: Drawing...OK.</Paragraph> <Paragraph position="17"> 3) User: Add a role called hobby.</Paragraph> <Paragraph position="18"> 4) System: OK.</Paragraph> <Paragraph position="19"> 5) User: Make the vr be Pastime.</Paragraph> <Paragraph position="20"> Assume a typical task plan in this domain is to edit a structure by accessing the structure and then performing a sequence of editing actions. The user's first request thus introduces a plan to edit the concept Person. Each successive user utterance continues through the plan by requesting the system to perform the various editing actions. More specifically, the first utterance would correspond to INTRODUCE-PLAN(User, System, show the concept Person, edit plan). Since one of the effects of INTRODUCE-PLAN is that the system adopts the plan, the system responds by executing the next action in the plan, i.e., by showing the concept Person. The user's next utterance can then be recognized as CONTINUE-PLAN(User, System, show the concept Person, add hobby role to Person, edit plan), and so on.</Paragraph> <Paragraph position="21"> Now consider two variations of the above dialogue. For example, imagine replacing utterance (5) with the User's &quot;No, leave more room please.&quot; In this case, since the system has anticipated the requirements of future editing actions incorrectly, the user must interrupt execution of the editing task to correct the system, i.e., CORRECT-PLAN(User, System, add hobby role to Person, compress the concept Person, next edit step, edit plan). Finally, imagine that utterance (5) is again replaced, this time with &quot;Do you know if it's time for lunch yet?&quot; Since eating lunch cannot be related to the previous editing plan topic, the system recognizes the utterance as a total change of topic, i.e., INTRODUCE-PLAN(User, System, System tell User if time for lunch, eat lunch plan).</Paragraph>
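As a compact restatement of this walkthrough, the following hypothetical trace pairs each utterance of the editing protocol (and the two variations) with the discourse plan instantiation the text assigns to it; the shorthand action names are illustrative.

```python
# Hypothetical trace of the editing protocol above: each user utterance
# paired with the discourse plan instantiation described in the text.
trace = [
    ("Please show the concept Person.",
     "INTRODUCE-PLAN(User, System, show-Person, edit-plan)"),
    ("Add a role called hobby.",
     "CONTINUE-PLAN(User, System, show-Person, add-hobby-role, edit-plan)"),
    ("Make the vr be Pastime.",
     "CONTINUE-PLAN(User, System, add-hobby-role, make-vr-Pastime, edit-plan)"),
    # variation: a correction interrupts the edit plan
    ("No, leave more room please.",
     "CORRECT-PLAN(User, System, add-hobby-role, compress-Person, next-edit-step, edit-plan)"),
    # variation: an unrelated topic change introduces a new plan
    ("Do you know if it's time for lunch yet?",
     "INTRODUCE-PLAN(User, System, tell-if-lunch-time, eat-lunch-plan)"),
]
for utterance, plan in trace:
    print(f"{utterance!r} -> {plan}")
```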
</Section> <Section position="4" start_page="217" end_page="218" type="metho"> <SectionTitle> RECOGNIZING DISCOURSE PLANS </SectionTitle> <Paragraph position="0"> This section presents a computational algorithm for the recognition of discourse plans. Recall that the previous lack of such an algorithm was in fact a major force behind the last section's plan-based formalization of the linguistic relationships. Previous work in the area of domain plan recognition (Allen and Perrault \[1\], Sidner and Israel \[21\], Carberry \[2\], Sidner \[24\]) provides a partial solution to the recognition problem. For example, since discourse plans are represented identically to domain plans, the same process of plan recognition can apply to both. In particular, every plan is recognized by an incremental process of heuristic search. From an input, the plan recognizer tries to find a plan for which the input is a step,4 and then tries to find more abstract plans for which the postulated plan is a step, and so on. After every step of this chaining process, a set of heuristics prunes the candidate plan set based on assumptions regarding rational planning behavior. For example, as in Allen and Perrault \[1\], candidates whose effects are already true are eliminated, since achieving these plans would produce no change in the state of the world. As in Carberry \[2\] and Sidner and Israel \[21\], the plan recognition process is also incremental; if the heuristics cannot uniquely determine an underlying plan, chaining stops.</Paragraph> [Footnote 4: Plan chaining can also be done via effects and prerequisites. To keep the example in the next section simple, plans have been expressed so that chaining via decompositions is sufficient.] <Paragraph position="1"> As mentioned above, however, this is not a full solution. Since the plan recognizer is now recognizing discourse as well as domain plans from a single utterance, the set of recognition processes must be coordinated.5 An algorithm for coordinating the recognition of domain and discourse plans from a single utterance has been presented in Litman and Allen \[9,11\]. In brief, the plan recognizer recognizes a discourse plan from every utterance, then uses a process of constraint satisfaction to initiate recognition of the domain and any other discourse plans related to the utterance.</Paragraph> [Footnote 5: Although Wilensky \[26\] introduced meta-plans into a natural language system to handle a totally different issue, that of concurrent goal interaction, he does not address details of coordination.] <Paragraph position="2"> Furthermore, to record and monitor execution of the discourse and domain plans active at any point in a dialogue, a dialogue context in the form of a plan stack is built and maintained by the plan recognizer.</Paragraph> <Paragraph position="3"> Various models of discourse have argued that an ideal interrupting topic structure follows a stack-like discipline (Reichman \[17\], Polanyi and Scha \[15\], Grosz and Sidner \[7\]). The plan recognition algorithm will be reviewed when tracing through the example of the next section.</Paragraph> <Paragraph position="4"> Since discourse plans reflect linguistic relationships between utterances, the earlier work on domain plan recognition can also be augmented in several other ways. For example, the search process can be constrained by adding heuristics that prefer discourse plans corresponding to the most linguistically coherent continuations of the dialogue. More specifically, in the absence of any linguistic clues (as will be described below), the plan recognizer will prefer relationships that, in the following order: (1) continue a previous topic (e.g., CONTINUE-PLAN); (2) interrupt a topic for a semantically related topic (e.g., CORRECT-PLAN, other corrections and clarifications as in Litman \[10\]); (3) interrupt a topic for a totally unrelated topic (e.g., INTRODUCE-PLAN).</Paragraph> <Paragraph position="5"> Thus, while interruptions are not generally predicted, they can be handled when they do occur. The heuristics also follow the principle of Occam's razor, since they are ordered to introduce as few new plans as possible. If within one of these preferences there are still competing interpretations, the interpretation that most corresponds to a stack discipline is preferred. For example, a continuation resuming a recently interrupted topic is preferred to continuation of a topic interrupted earlier in the conversation. This preference-driven control structure is sketched below.</Paragraph>
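A minimal sketch of how these ordered preferences might drive the search, assuming hypothetical helper functions standing in for the chaining and pruning machinery; this illustrates the control structure, not the paper's implementation.

```python
# Preference-ordered plan recognition: try each coherence class in order
# and stop at the first class that yields a surviving candidate. The
# helpers are stubs for the chaining machinery described above.

def continuations(utterance, stack):          # preference (1), e.g. CONTINUE-PLAN
    return []                                  # stub: chain into the top stacked plan

def related_interruptions(utterance, stack):  # preference (2), e.g. CORRECT-PLAN
    return []                                  # stub: chain into any stacked plan via a repair

def new_topics(utterance, stack):             # preference (3): INTRODUCE-PLAN
    return [("INTRODUCE-PLAN", utterance)]     # always applicable: no prerequisites

def prune(candidates, world):
    # rationality heuristics, e.g. drop candidates whose effects already hold
    return [c for c in candidates if c not in world.get("already-true", [])]

def recognize(utterance, stack, world=None):
    world = world or {}
    for generate in (continuations, related_interruptions, new_topics):
        survivors = prune(generate(utterance, stack), world)
        if survivors:
            # within a class, candidates resuming recently stacked plans come first
            return survivors
    return []

print(recognize("It's snowing like crazy.", stack=["mount-tape plan"]))
```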
<Paragraph position="6"> Finally, since the plan recognizer now recognizes implicit relationships between utterances, linguistic clues signaling such relationships (Grosz \[5\], Reichman \[17\], Polanyi and Scha \[15\], Sidner \[24\], Cohen \[3\], Grosz and Sidner \[7\]) should be exploitable by the plan recognition algorithm. In other words, the plan recognizer should be aware of correlations between specific words and the discourse plans they typically signal. Clues can then be used both to reinforce and to overrule the preference ordering given above. In fact, in the latter case clues ease the recognition of topic relationships that would otherwise be difficult (if not impossible (Cohen \[3\], Grosz and Sidner \[7\], Sidner \[24\])) to understand. For example, consider recognizing the topic change in the tape variation earlier, repeated below for convenience: Could you mount a magtape for me? It's snowing like crazy.</Paragraph> <Paragraph position="7"> Using the coherence preferences, the plan recognizer first tries to interpret the second utterance as a continuation of the plan to mount a tape, then as a related interruption of this plan, and only when these efforts fail as an unrelated change of topic. This is because a topic change is least expected in the unmarked case. Now, imagine the speaker prefacing the second utterance with a clue such as &quot;incidentally,&quot; a word typically used to signal topic interruption.</Paragraph> <Paragraph position="8"> Since the plan recognizer knows that &quot;incidentally&quot; is a signal for an interruption, the search will not even attempt to satisfy the first preference heuristic, since a signal for the second or third is explicitly present.</Paragraph>
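One way to exploit such clue words, continuing the sketch above: a small (hypothetical) table maps clues to the coherence classes they signal, and the recognizer skips or reorders preference classes accordingly. The entries for &quot;no&quot; and &quot;now&quot; anticipate the example in the next section.

```python
# Hypothetical clue-word table: clues reinforce or overrule the default
# preference ordering before the search begins.
CLUE_SIGNALS = {
    "incidentally": "interruption",   # overrule: skip preference (1) entirely
    "no": "not-continuation",         # overrule: a continuation is not intended
    "now": "continuation",            # reinforce: preference (1) expected
}

def preference_order(first_word):
    classes = ["continue", "related-interruption", "new-topic"]
    signal = CLUE_SIGNALS.get(first_word.lower().strip(","))
    if signal in ("interruption", "not-continuation"):
        classes.remove("continue")    # unlikely inferences are never attempted
    return classes

print(preference_order("Incidentally"))  # ['related-interruption', 'new-topic']
```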
</Section> <Section position="5" start_page="218" end_page="221" type="metho"> <SectionTitle> EXAMPLE </SectionTitle> <Paragraph position="0"> This section uses the discourse plan representations and plan recognition algorithm of the previous sections to illustrate the processing of the following dialogue, a slightly modified portion of a scenario (Sidner and Bates \[23\]) developed from the set of protocols described above: User: Show me the generic concept called &quot;employee.&quot; System: OK. <system displays network> User: No, move the concept up. System: OK. <system redisplays network> User: Now, make an individual employee concept whose first name is &quot;Sam&quot; and whose last name is &quot;Jones.&quot;</Paragraph> <Paragraph position="1"> Although the behavior to be described is fully specified by the theory, the implementation corresponds only to the new model of plan recognition. All simulated computational processes have been implemented elsewhere, however. Litman \[10\] contains a full discussion of the implementation.</Paragraph> <Paragraph position="2"> Figure 4 presents the relevant domain plans for this domain, taken from Sidner and Israel \[21\] with minor modifications. ADD-DATA is a plan to add new data into a network, while EXAMINE is a plan to examine parts of a network. Both plans involve the subplan CONSIDER-ASPECT, in which the user considers some aspect of a network, for example by looking at it (the decomposition shown), listening to a description, or thinking about it.</Paragraph> [Figure 4, only partially recoverable from the extraction: HEADER: ADD-DATA(user, netpiece, data, ...).] <Paragraph position="3"> The processing begins with a speech act analysis of &quot;Show me the generic concept called 'employee,'&quot; yielding REQUEST(user, system, D1:DISPLAY(system, E1)), where E1 stands for &quot;the generic concept called 'employee.'&quot; As in Allen and Perrault \[1\], determination of such a literal6 speech act is fairly straightforward. Imperatives indicate REQUESTs, and the propositional content (e.g., DISPLAY) is determined via the standard syntactic and semantic analysis of most parsers.</Paragraph> [Footnote 6: See Litman \[10\] for a discussion of the treatment of indirect speech acts (Searle \[20\]).] <Paragraph position="4"> Since at the beginning of a dialogue there is no discourse context, the plan recognizer tries to introduce a plan (or plans) according to coherence preference (3). Using the plan schemas of the second section, the REQUEST above, and the process of forward chaining via plan decomposition, the system postulates that the utterance is the decomposition of INTRODUCE-PLAN(user, system, D1, ?plan), where STEP(D1, ?plan) and AGENT(D1, system). The hypothesis is then evaluated using the set of plan heuristics, e.g., the effects of the plan must not already be true and the constraints of every recognized plan must be satisfiable. To satisfy the STEP constraint, a plan containing D1 will be created. Nothing more needs to be done with respect to the second constraint since it is already satisfied. Finally, since INTRODUCE-PLAN is not a step in any other plan, further chaining stops.</Paragraph> <Paragraph position="5"> The system then expands the introduced plan containing D1, using an analogous plan recognition process. Since the display action could be a step of the CONSIDER-ASPECT plan, which itself could be a step of either the ADD-DATA or EXAMINE plans, the domain plan is ambiguous. Note that heuristics cannot eliminate either possibility, since at the beginning of the dialogue any domain plan is a reasonable expectation. Chaining halts at this branch point, and since no more plans are introduced the process of plan recognition also ends. The final hypothesis is that the user executed a discourse plan to introduce either the domain plan ADD-DATA or EXAMINE.</Paragraph> <Paragraph position="6"> Once the plan structures are recognized, their effects are asserted and the postulated plans are expanded top down to include any other steps (using the information in the plan descriptions). The plan recognizer then constructs a stack representing each hypothesis, as shown in Figure 5. The first stack has PLAN1 at the top, PLAN2 at the bottom, and encodes the information that PLAN1 was executed while PLAN2 will be executed upon completion of PLAN1.</Paragraph> <Paragraph position="7"> The second stack is analogous. Solid lines represent plan recognition inferences due to forward chaining, while dotted lines represent inferences due to later plan expansion. As desired, the plan recognizer has constructed a plan-based interpretation of the utterance in terms of expected discourse and domain plans, an interpretation which can then be used to construct and generate a response. For example, in either hypothesis the system can pop the completed plan introduction and execute D1, the next action in both domain plans. Since the higher level plan containing D1 is still ambiguous, deciding exactly what to do is an interesting plan generation issue.</Paragraph>
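A sketch of the two hypothesis stacks of Figure 5, using plain lists and dicts; the datatypes and status strings are illustrative, not the paper's. Each stack holds the completed plan introduction above the postulated domain plan, and popping a completed plan resumes the plan beneath it.

```python
# The two hypothesis stacks after the first utterance (top of stack first).
hypothesis_stacks = [
    [
        {"name": "PLAN1", "plan": "INTRODUCE-PLAN(user, system, D1, PLAN2)",
         "status": "executed"},
        {"name": "PLAN2", "plan": "ADD-DATA(user, netpiece, data, ...)",
         "status": "next: D1"},
    ],
    [
        {"name": "PLAN1a", "plan": "INTRODUCE-PLAN(user, system, D1, PLAN3)",
         "status": "executed"},
        {"name": "PLAN3", "plan": "EXAMINE(...)", "status": "next: D1"},
    ],
]
# Popping a completed plan resumes the plan below it in each hypothesis:
for stack in hypothesis_stacks:
    if stack[0]["status"] == "executed":
        stack.pop(0)
print([s[0]["name"] for s in hypothesis_stacks])  # ['PLAN2', 'PLAN3']
```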
<Paragraph position="8"> Unfortunately, the system chooses a display that does not allow room for the insertion of a new concept, leading to the user's response &quot;No, move the concept up.&quot; The utterance is parsed and input to the plan recognizer as the clue word &quot;no&quot; (using the plan recognizer's list of standard linguistic clues) followed by REQUEST(user, system, M1:MOVE(system, E1, up)) (assuming the resolution of &quot;the concept&quot; to E1). The plan recognition algorithm then proceeds in both contexts postulated above. Using the knowledge that &quot;no&quot; typically does not signal a topic continuation, the plan recognizer first modifies its default mode of processing, i.e., the assumption that the REQUEST is a CONTINUE-PLAN (preference 1) is overruled.</Paragraph> <Paragraph position="9"> Note, however, that even without such a linguistic clue, recognition of a plan continuation would have ultimately failed, since in both stacks CONTINUE-PLAN's constraint STEP(M1, PLAN2/PLAN3) would have failed. The clue thus allows the system to reach reasonable hypotheses more efficiently, since unlikely inferences are avoided.</Paragraph> <Paragraph position="10"> Proceeding with preference (2), the system postulates that either PLAN2 or PLAN3 is being corrected, i.e., a discourse plan correcting one of the stacked plans is hypothesized. Since the REQUEST matches both decompositions of CORRECT-PLAN, there are two possibilities: CORRECT-PLAN(user, system, ?laststep, M1, ?nextstep, ?plan) and CORRECT-PLAN(user, system, ?laststep, ?newstep, M1, ?plan), where the variables in each will be bound as a result of constraint and prerequisite satisfaction from application of the heuristics. For example, candidate plans are only reasonable if their prerequisites were true, i.e., (in both stacks and corrections) WANT(system, ?plan) and LAST(?laststep, ?plan). Assuming the plan was executed in the context of PLAN2 or PLAN3 (after PLAN1 or PLAN1a was popped and the DISPLAY performed), ?plan could only have been bound to PLAN2 or PLAN3, and ?laststep bound to D1. Satisfaction of the constraints eliminates the PLAN3 binding, since the constraints indicate at least two steps in the plan, while PLAN3 contains a single step described at different levels of abstraction. Satisfaction of the constraints also eliminates the second CORRECT-PLAN interpretation, since STEP(M1, PLAN2) is not true. Thus only the first correction on the first stack remains plausible, and in fact, using PLAN2 and the first correction, the rest of the constraints can be satisfied. In particular, the remaining bindings set ?nextstep to P1, where P1 stands for PUT(system, ?data, ?loc), resulting in the hypothesis CORRECT-PLAN(user, system, D1, M1, P1, PLAN2). Note that a final possible hypothesis for the REQUEST, e.g., introduction of a new plan, is discarded since it does not tie in with any of the expectations (i.e., a preference (2) choice is preferred over a preference (3) choice).</Paragraph>
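A toy sketch of the binding step just described, assuming a simple tabular world state (the datatypes and the hardcoded AFTER lookup are hypothetical): prerequisites are matched against the stacked plan's state to bind the variables of CORRECT-PLAN(user, system, ?laststep, M1, ?nextstep, ?plan).

```python
# State recoverable from the first stack after the DISPLAY was performed.
stack_state = {
    "WANT": [("system", "PLAN2")],                # system has adopted PLAN2
    "LAST": [("D1", "PLAN2")],                    # D1 was the last executed step
    "STEP": [("D1", "PLAN2"), ("P1", "PLAN2")],   # P1 = PUT(system, ?data, ?loc)
}

bindings = {"?newstep": "M1"}  # from the matched REQUEST decomposition
# prerequisite WANT(hearer, plan) binds ?plan; LAST(laststep, plan) binds ?laststep
bindings["?plan"] = stack_state["WANT"][0][1]
bindings["?laststep"] = next(s for s, p in stack_state["LAST"] if p == bindings["?plan"])
# constraint AFTER(newstep, nextstep, plan): the step D1 was supposed to enable
bindings["?nextstep"] = "P1"
print(bindings)
# {'?newstep': 'M1', '?plan': 'PLAN2', '?laststep': 'D1', '?nextstep': 'P1'}
```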
<Paragraph position="11"> The effects of CORRECT-PLAN are asserted (M1 is inserted into PLAN2 and marked as NEXT) and CORRECT-PLAN is pushed onto the stack, suspending the corrected plan, as shown in Figure 6.</Paragraph> <Paragraph position="12"> The system has thus recognized not only that an interruption of ADD-DATA has occurred, but also that the relationship of interruption is one of plan correction. Note that unlike the first utterance, the plan referred to by the second utterance is found in the stack rather than constructed. Using the updated stack, the system can then pop the completed correction and resume PLAN2 with the new (next) step M1. The system parses the user's next utterance (&quot;Now, make an individual employee concept whose first name is 'Sam' and whose last name is 'Jones'&quot;) and again picks up an initial clue word, this time one that explicitly marks the utterance as a continuation and thus reinforces coherence preference (1). The utterance can indeed be recognized as a continuation of PLAN2, e.g., CONTINUE-PLAN(user, system, M1, MAKE1, PLAN2), analogously to the above detailed explanations. M1 and PLAN2 are bound due to prerequisite satisfaction, and MAKE1 chained through P1 due to constraint satisfaction. The updated stack is shown in Figure 7. At this stage, it would then be appropriate for the system to pop the completed CONTINUE plan and resume execution of PLAN2 by performing MAKE1.</Paragraph> </Section> </Paper>