<?xml version="1.0" standalone="yes"?>
<Paper uid="P84-1030">
  <Title>Mays, Eric. &amp;quot;Correcting Misconceptions About Data</Title>
  <Section position="3" start_page="0" end_page="139" type="metho">
    <SectionTitle>
I PRAGMATIC OVERSHOOT AND PROBLEM LOCALIZATION
</SectionTitle>
    <Paragraph position="0"> Even if the syntactic and semantic content of a request is correct, so that a natural language front end can derive a coherent representation of its meaning, its pragmatic content or the structure of the underlying system may make any direct response to the request impossible or misleading. According to Sondheimer and Weischedel (Sondheimer, 1980), an input exhibits pragmatic overshoot if the representation of its meaning is beyond the capabilities of the underlying system. Kaplan (1979), Mays (1980a), and Carberry (1984) have each worked on strategies for dealing with particular classes of such pragmatic failures. This paper addresses the problem of identifying the most significant reason that a plan to achieve a user goal cannot be carried out.</Paragraph>
    <Paragraph position="1"> The approach to pragmatic failure taken in this paper is to use a planner to verify the presumptions in a request. The presumptions behind a request become the subgoals of a plan to fulfill the request. Using Mays' (1980a) example, the query &quot;Which faculty members take courses?&quot; is here handled as an instance of an IDENTIFY-SET-MEMBERS goal, and the pragmatics of the query are checked by looking for a plan to achieve that goal. Determining both that faculty members and courses do exist and that faculty members can take courses are subgoals within that plan. A presuppositional failure is noted if the planner is unable to complete a plan for the goal.</Paragraph>
    <Paragraph position="2"> * This material is based upon work supported by the National Science Foundation under grants IST-8009673 and IST-8311~00.</Paragraph>
    <Paragraph position="3"> ** Unix is a trademark of Bell Laboratories.</Paragraph>
    <Paragraph position="4"> Furthermore, information for recovery processing or explanatory responses can be derived directly from the failed plan by identifying whichever blocked goal in the planning tree of subgoals is most significant. Thus, in the example above, if the planner failed because it was unable to show that faculty can take courses, the helpful response would be to explain this presumption failure. We concentrate here on identifying the significant blocks rather than on generating natural language responses.</Paragraph>
    <Paragraph position="5"> The examples in this paper will be drawn from a planning system intended to function as the pragmatic overshoot component of a cooperative natural language interface to the Unix operating system.</Paragraph>
    <Paragraph position="6"> We chose Unix, much as Wilensky (1982) did for his Unix Consultant, as a familiar domain that was still complex enough to require interesting planning. In this system, the pragmatics of a user request are tested by building a tree of plan structures whose leaves are elementary facts available to the operating system. For instance, the following planning tree is built in response to the request to print a file: [planning tree figure not recoverable from the source] (AND children are marked by ampersands, and OR children by vertical bars. Initial question marks precede plan variables.) If a single node in this planning tree fails, say (IS-TEXT-FILE ?file), that information can be used in explaining the failure to the user.</Paragraph>
    <Paragraph position="7">  The failure of certain nodes could also trigger recovery processing, as in the following example, where the failure of (UP-AND-RUNNING ?device) triggers the suggestion of an alternative device: User: Please send the file to the laser printer.</Paragraph>
    <Paragraph position="8"> System: The laser printer is down. Is the line printer satisfactory? This planning scheme offers a way of recognizing and responding to such temporarily unfulfillable requests as well as to other pragmatic failures from requests unfulfillable in context, which is an important, though largely untouched, problem.</Paragraph>
    <Paragraph position="9"> A difficulty arises, however, when more than one of the planning tree precondition nodes fails. Even in a tree made up entirely of AND nodes, multiple failures would require either a list of responses or else some way of choosing which of the failures is most meaningful to report. In a plan tree containing OR nodes, where there are often many alternative ways of achieving particular goals, all of which have failed, it becomes even more important that the system be able to identify which of the failures is most significant. This process of identifying the significant failures is called &quot;problem localization&quot;, and this paper describes heuristics and strategies that can be used for problem localization in failed planning trees.</Paragraph>
  </Section>
  <Section position="4" start_page="139" end_page="139" type="metho">
    <SectionTitle>
II HEURISTICS FOR PROBLEM LOCALIZATION
</SectionTitle>
    <Paragraph position="0"> The basic heuristics for problem localization can be derived by considering how a human expert would respond to someone who was pursuing an impossible goal. Not finding any successful plan, the expert tries to explain the block by showing that every plan must fail. Thus, if more than one branch of an AND node in a plan fails, the most significant one to be reported is the one that the user is least likely to be able to change, since it makes the strongest case. (The planner must check all the branches of an AND node, even after one fails, to know which is most significant to report.) For instance, if all three of the children of PRINT-FILE in our example fail, (IS-TEXT-FILE ?file) is the one that should be reported, since it is least likely that the user can affect that node.</Paragraph>
    <Paragraph position="1"> If the READ-PERM failure were reported first, the user would waste time changing the read permission of a non-text file. Unix's actual behavior, which reports the first problem that it happens to discover in trying to execute the command, is often frustrating for exactly that reason. This heuristic of reporting the most serious failure at an AND node is closely related to ABSTRIPS' use of &quot;criticality&quot; numbers to divide a planner into levels of abstraction, so that the most critical features are dealt with first (Sacerdoti, 1974).</Paragraph>
    <Paragraph position="2"> The situation is different at OR nodes, where only a single child has to succeed. Here the most serious failure can safely be ignored, as long as some other branch can be repaired. Thus the most significant branch at an OR node should be the one the user is most likely to be able to affect. In our example, READ-PERM-USER should usually be reported rather than READ-PERM-SUPER-USER, if both have failed, since most users have more hope of changing the former than the latter. There is a duality here between the AND and OR node heuristics that is like the duality in the minimax evaluation of a move in a game tree, where one picks the best score at nodes where the choice is one's own, and the worst score at nodes where the opponent gets to choose.</Paragraph>
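    <Paragraph> The AND/OR duality described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Node class, the changeability values (a made-up estimate of how likely the user is to be able to repair a failed leaf), and the function names are hypothetical and do not come from the planner itself.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str = "LEAF"            # "AND", "OR", or "LEAF"
    failed: bool = False
    changeability: float = 0.0    # hypothetical: chance the user can repair this leaf
    children: list["Node"] = field(default_factory=list)


def tractability(node: Node) -> float:
    """Estimate how likely it is that the user can repair a failed node."""
    failed = [c for c in node.children if c.failed]
    if not failed:
        return node.changeability
    scores = [tractability(c) for c in failed]
    # AND: every failed child must be repaired, so the hardest one dominates.
    # OR: one repaired child suffices, so the easiest one dominates.
    return min(scores) if node.kind == "AND" else max(scores)


def most_significant_failure(node: Node) -> Node:
    """Follow the most significant failed child from node down to a leaf."""
    failed = [c for c in node.children if c.failed]
    if not failed:
        return node
    pick = min if node.kind == "AND" else max
    return most_significant_failure(pick(failed, key=tractability))


# Illustrative tree in which every precondition has failed (values are invented).
read_perm = Node("READ-PERM", "OR", True, children=[
    Node("READ-PERM-USER", failed=True, changeability=0.6),
    Node("READ-PERM-SUPER-USER", failed=True, changeability=0.1),
])
print_file = Node("PRINT-FILE", "AND", True, children=[
    Node("IS-TEXT-FILE", failed=True, changeability=0.05),
    read_perm,
    Node("UP-AND-RUNNING", failed=True, changeability=0.5),
])
print(most_significant_failure(print_file).name)
print(most_significant_failure(read_perm).name)
```

    With these invented values, the AND node PRINT-FILE reports the child the user can least affect, while the OR node READ-PERM reports the child the user can most easily affect, mirroring the minimax analogy.</Paragraph>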
  </Section>
  <Section position="5" start_page="139" end_page="139" type="metho">
    <SectionTitle>
III STRATEGIES FOR PROBLEM LOCALIZATION
</SectionTitle>
    <Paragraph position="0"> Identification of the most significant failure requires the addition to the planner of knowledge about significance to be used in problem localization. Many mechanisms are possible, ranging from a fixed, pre-set ordering of the children of nodes up through complex knowledge-based mechanisms that include knowledge about the user's probable goals.</Paragraph>
    <Paragraph position="1"> In this paper, we suggest a combination of statistical &quot;surprise scores&quot; and special-purpose rules.</Paragraph>
    <Section position="1" start_page="139" end_page="139" type="sub_section">
      <SectionTitle>
Statistical Problem Localization Using Surprise Scores
</SectionTitle>
      <Paragraph position="0"> This strategy relies on statistics that the system keeps dynamically on the number of times that each branch of each plan has succeeded or failed. These are used to define a success ratio for each branch. For example, the PRINT-FILE plan might be annotated as follows:</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="139" end_page="139" type="metho">
    <SectionTitle>
SUCCESSES RATIO
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="7" start_page="139" end_page="140" type="metho">
    <SectionTitle>
FAILURES
</SectionTitle>
    <Paragraph position="0"> From these ratios, we derive surprise scores to provide some measure of how usual or unusual it is for a particular node to have succeeded or failed in the context of the goal giving rise to the node. The surprise score of a successful node is defined as 1.0 minus the success ratio, so that the success of a node like IS-TEXT-FILE, which almost always succeeds, is less surprising than the success of UP-AND-RUNNING. Failed nodes get negative surprise scores, with the absolute value of the score again reflecting the amount of surprise.</Paragraph>
    <Paragraph position="1"> The surprise score of a failed node is set to the negative of the success ratio, so that the failure of IS-TEXT-FILE would be more surprising than that of UP-AND-RUNNING, and that would be reflected by a more strongly negative score.</Paragraph>
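    <Paragraph> The two definitions above translate directly into code. The sketch below assumes that the success ratio is the fraction of recorded attempts that succeeded, which the text implies but does not spell out; the function names and the counts are hypothetical.

```python
def success_ratio(successes: int, failures: int) -> float:
    """Fraction of recorded attempts at a branch that succeeded (assumed definition)."""
    total = successes + failures
    if total == 0:
        raise ValueError("no observations recorded for this branch")
    return successes / total


def surprise_score(ratio: float, succeeded: bool) -> float:
    """1.0 minus the ratio for a success, the negated ratio for a failure."""
    return 1.0 - ratio if succeeded else -ratio


# Invented counts: IS-TEXT-FILE almost always succeeds, UP-AND-RUNNING less often.
is_text_file = success_ratio(98, 2)
up_and_running = success_ratio(80, 20)

print(surprise_score(is_text_file, succeeded=True))     # small positive: expected success
print(surprise_score(is_text_file, succeeded=False))    # strongly negative: surprising failure
print(surprise_score(up_and_running, succeeded=False))  # less strongly negative
```

    Under these invented counts, the failure of IS-TEXT-FILE receives a more strongly negative score than the failure of UP-AND-RUNNING, as the text requires.</Paragraph>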
    <Paragraph position="2"> Here is an example of our PRINT-FILE plan instantiated for an unlucky user who has failed on all but two preconditions, with surprise scores added: [annotated plan figure not recoverable from the source] Note that the success of USER-READ-PERM-BIT-SET is not very surprising, since that node almost always succeeds; the failure of a node like READ-PERM-SUPER-USER, which seldom succeeds, is much less surprising than the failure of UP-AND-RUNNING.</Paragraph>
    <Paragraph position="3"> We suggest keeping statistics and deriving surprise scores because we believe that they provide a useful if imperfect handle on judging the significance of failed nodes. Regarding OR nodes, strongly negative surprise scores identify branches that in the past experience of the system have usually succeeded, and these are the best guesses to be likely to succeed again. Thus READ-PERM-USER, the child of READ-PERM with the most strongly negative score, turns out to be the most likely to be tractable. The negative surprise scores at a failed OR node give a profile of the typical success ratios; to select the nodes that are generally most likely to succeed, we pick the most surprising failures, those with the most strongly negative surprise scores.</Paragraph>
    <Paragraph position="4"> At AND nodes, on the other hand, the goal is to identify the branch that is most critical, that is, least likely to succeed. Surprisingly, we find that the most critical branch tends in this case also to be the most surprising failure. In our example, IS-TEXT-FILE, which the user can do nothing about, is the most surprising failure under PRINT-FILE, READ-PERM is next most surprising, and UP-AND-RUNNING, for which simply waiting often works, comes last. Therefore at AND nodes, as at OR nodes, we will report the child with the most negative surprise score; at AND nodes, this tends to identify the most critical failures, while at OR nodes, it tends to select the most hopeful. Note that the combined effect of the AND and OR strategies is to choose from among all the failed nodes those that were statistically most likely to succeed.</Paragraph>
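    <Paragraph> Because the same selection rule applies at AND and OR nodes, the statistical strategy reduces to repeatedly descending into the failed child with the most negative surprise score. The following Python sketch illustrates that descent; the PlanNode class, the function name, and all success ratios are assumptions made for the example rather than details taken from the system.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    name: str
    success_ratio: float          # hypothetical statistic kept for this branch
    succeeded: bool
    children: list["PlanNode"] = field(default_factory=list)

    @property
    def surprise(self) -> float:
        # 1.0 minus the ratio for a success, the negated ratio for a failure.
        return 1.0 - self.success_ratio if self.succeeded else -self.success_ratio


def localize_by_surprise(node: PlanNode) -> list[str]:
    """Return the path of most surprising failures from node down to a leaf."""
    path = [node.name]
    failed = [c for c in node.children if not c.succeeded]
    while failed:
        node = min(failed, key=lambda c: c.surprise)   # most negative score
        path.append(node.name)
        failed = [c for c in node.children if not c.succeeded]
    return path


# Invented ratios for a PRINT-FILE request in which every branch has failed.
read_perm = PlanNode("READ-PERM", 0.90, False, [
    PlanNode("READ-PERM-USER", 0.85, False),
    PlanNode("READ-PERM-GROUP", 0.30, False),
    PlanNode("READ-PERM-SUPER-USER", 0.02, False),
])
print_file = PlanNode("PRINT-FILE", 0.70, False, [
    PlanNode("IS-TEXT-FILE", 0.98, False),
    read_perm,
    PlanNode("UP-AND-RUNNING", 0.80, False),
])
print(localize_by_surprise(print_file))
print(localize_by_surprise(read_perm))
```

    With these ratios the descent from PRINT-FILE stops at IS-TEXT-FILE, and the descent from READ-PERM selects READ-PERM-USER, matching the choices discussed in the text.</Paragraph>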
    <Paragraph position="5"> The main advantage of the statistical surprise score strategy is its low cost, both to design and to execute. Another nice feature is the self-adjusting character of the surprise scores, based as they are on success statistics that the system updates on an ongoing basis. For example, the likelihood of GROUP-READ-PERM being reported would depend on how often that feature was used at a particular site. The main difficulty is that surprise scores are only a rough guide to the actual significance of a failed node. The true significance of a failure in the context of a particular command may depend on world knowledge that is beyond the grasp of the planning system (e.g., the laser printer is down for days this time rather than hours), or even on a part of the planning context itself that is not reflected in the statistical averages (e.g., READ-PERM-SUPER-USER is much more likely to succeed when READ-PERM is called as part of a system dump command than when it is called as part of PRINT-FILE). To get a more accurate grasp on the significance of particular failures, more knowledge-intensive strategies must be employed.</Paragraph>
    <Paragraph position="6"> Special-Purpose Problem Localization Rules. As a mechanism for adding extra knowledge, we propose supplementing the surprise scores with condition-action rules attached to particular nodes in the planning tree. The conditions in these rules can test the success or failure of other nodes in the tree or determine the higher-level planning context, while the actions alter the problem localization result by changing the surprise scores attached to the nodes.</Paragraph>
    <Paragraph position="7"> The special-purpose rules which we have found useful so far add information about the criticality of particular nodes. Consider the following planning tree, which is somewhat more successful than the previous one:</Paragraph>
  </Section>
  <Section position="8" start_page="140" end_page="141" type="metho">
    <SectionTitle>
SURPRISE
SUCCESS/FAILURE SCORE
</SectionTitle>
    <Paragraph position="0"/>
    <Paragraph position="2"> Relying on surprise scores alone, the most significant child of READ-PERM would be READ-PERM-USER, since its score is most strongly negative. However, since IS-OWNER has failed, a node which most users are powerless to change, it is clearly not helpful to choose READ-PERM-USER as the path to report. This is an example of the general rule that if we know that one child of an AND node is critical, we should include a rule to suppress that AND node whenever that child fails. Thus we attach a suppression rule [rule listing not recoverable from the source] that includes the amount by which the score should be reduced. The rule's effect is to change READ-PERM-USER's score to -.17, which prevents it from being selected.</Paragraph>
    <Paragraph position="3"> With READ-PERM-USER suppressed, the surprise scores would then select READ-PERM-GROUP, which is a reasonable choice, but probably not the best one.</Paragraph>
    <Paragraph position="4"> While the failure of IS-OWNER makes us less interested in READ-PERM-USER, the very surprising success of AUTHORIZED-SUPER-USER should draw the system's attention to the READ-PERM-SUPER-USER branch. We can arrange for this by attaching a rule to the appropriate node [rule listing not recoverable from the source]. This rule would change READ-PERM-SUPER-USER's score from -.02 to -.79, and thus cause it to be the branch of READ-PERM selected for reporting. While our current rules are all in these two forms, either suppressing or enhancing a parent's score on the basis of a critical child's failure or success, the mechanism of special-purpose rules could be expanded to handle more complex forms of deduction. For example, it might be useful to add rules that calculate a criticality score for each node, working upward from scores preassigned to the leaves. If the rules could access information about the state of the system, they could also use that in judging criticality, so that an UP-AND-RUNNING failure would be more critical if the device was expected to be down for a long time.</Paragraph>
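    <Paragraph> A suppress-or-enhance rule mechanism of this kind might be sketched as follows. The rule listings themselves did not survive in this text, so everything here is an assumption made for illustration: the Rule class, the adjustment amounts, and every starting score except the -.02 for READ-PERM-SUPER-USER. The amounts were chosen only so that the adjusted scores come out at the -.17 and -.79 quoted above.

```python
from dataclasses import dataclass
from typing import Callable

Scores = dict[str, float]   # node name -> surprise score (negative means failed)


@dataclass
class Rule:
    target: str                          # node whose score the rule adjusts
    condition: Callable[[Scores], bool]  # tests other nodes in the tree
    delta: float                         # amount added to the target's score


def apply_rules(scores: Scores, rules: list[Rule]) -> Scores:
    """Return a copy of scores with every applicable rule's delta applied."""
    adjusted = dict(scores)
    for rule in rules:
        if rule.condition(scores):       # conditions see the unadjusted scores
            adjusted[rule.target] += rule.delta
    return adjusted


def most_significant(scores: Scores, children: list[str]) -> str:
    """Pick the failed child with the most strongly negative score."""
    failed = [c for c in children if 0 > scores[c]]
    return min(failed, key=lambda c: scores[c])


# Hypothetical scores; only -0.02 for READ-PERM-SUPER-USER comes from the text.
scores = {
    "IS-OWNER": -0.40,
    "AUTHORIZED-SUPER-USER": 0.77,
    "READ-PERM-USER": -0.57,
    "READ-PERM-GROUP": -0.30,
    "READ-PERM-SUPER-USER": -0.02,
}
rules = [
    # Suppress the parent when its critical child IS-OWNER has failed.
    Rule("READ-PERM-USER", lambda s: 0 > s["IS-OWNER"], +0.40),
    # Enhance the parent when AUTHORIZED-SUPER-USER has surprisingly succeeded.
    Rule("READ-PERM-SUPER-USER", lambda s: s["AUTHORIZED-SUPER-USER"] > 0.5, -0.77),
]
read_perm_children = ["READ-PERM-USER", "READ-PERM-GROUP", "READ-PERM-SUPER-USER"]
before = most_significant(scores, read_perm_children)
after = most_significant(apply_rules(scores, rules), read_perm_children)
print(before, after)
```

    Applying only the suppression rule would leave READ-PERM-GROUP as the selected branch, while applying both rules shifts the selection to READ-PERM-SUPER-USER, which is the behavior the text describes.</Paragraph>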
    <Section position="1" start_page="141" end_page="141" type="sub_section">
      <SectionTitle>
Other Problem Localization Strategies
</SectionTitle>
      <Paragraph position="0"> While our system depends on surprise scores and rules, an entire range of strategies is possible. The simplest strategy would be to hand-code the problem localization into the plans themselves by the ordering of the branches. At AND nodes, the children that are more critical would be listed first, while at OR nodes, the less critical, more hopeful, children would come first. In such a blocked tree, the first failed child could be selected below each node. A form of this hand-coded strategy is in force in any planner that stops exploring an AND node when a single child blocks; that effectively selects the first child tested as the significant failure in every case, since the others are not even explored. Hand-coding is an alternative to surprise scores for providing an initial comparative ranking of the children at each node, but it also would need supplementing with a strategy that can take account of unusual situations, such as our special-purpose rules.</Paragraph>
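      <Paragraph> The hand-coded alternative needs no statistics at all, since the ordering of the children carries the ranking. A minimal Python sketch, assuming a hypothetical representation in which each node maps to its children in priority order and the set of failed nodes is known:

```python
Tree = dict[str, list[str]]   # node name -> children in hand-coded priority order


def first_failed_path(tree: Tree, failed: set[str], root: str) -> list[str]:
    """Descend from root, always taking the first failed child in listed order."""
    path = [root]
    node = root
    while True:
        nxt = next((c for c in tree.get(node, []) if c in failed), None)
        if nxt is None:
            return path
        path.append(nxt)
        node = nxt


# Illustrative ordering: critical children first at the AND node PRINT-FILE,
# hopeful children first at the OR node READ-PERM.
tree = {
    "PRINT-FILE": ["IS-TEXT-FILE", "READ-PERM", "UP-AND-RUNNING"],
    "READ-PERM": ["READ-PERM-USER", "READ-PERM-GROUP", "READ-PERM-SUPER-USER"],
}
failed = {
    "PRINT-FILE", "READ-PERM", "UP-AND-RUNNING",
    "READ-PERM-USER", "READ-PERM-GROUP", "READ-PERM-SUPER-USER",
}
print(first_failed_path(tree, failed, "PRINT-FILE"))
```

      In this invented case IS-TEXT-FILE has succeeded, so the path runs through READ-PERM to READ-PERM-USER. The fixed ordering cannot react to unusual situations, which is the limitation noted above.</Paragraph>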
      <Paragraph position="1"> It might be possible to improve the performance of a surprise score system without adding the complexity of special-purpose rules by using a formula that allows the surprising success or failure of a child to increase or decrease the chances of its parent being reported. While such a formula could perhaps do much of the work now done by special-purpose rules, it seems a harder approach to control, and one more likely to be sensitive to inaccuracies in the surprise scores themselves.</Paragraph>
      <Paragraph position="2"> Proper Level of Detail. One final question concerns identifying the proper level of detail for helpful responses. The strategies discussed so far have all focused on choosing which of multiple blocked children to report, so that they identify a path from the root to a leaf. Yet the leaves of the planning tree may well be too detailed to represent helpful responses. A selection strategy could report the node containing the appropriate level of detail for a given user. Modeling the expertise of a user and using that to select an appropriate description of the problem are significant problems in natural language generation which we have not addressed.</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="141" end_page="142" type="metho">
    <SectionTitle>
IV RELATED APPLICATION AREAS
</SectionTitle>
    <Paragraph position="0"> While developed here in the context of a pragmatics planner, strategies for problem localization could have wide applicability. For instance, the MYCIN-like &quot;How?&quot; and &quot;Why?&quot; questions (Shortliffe, 1976) used in the explanation components of many expert systems already use either the already-built successful proof tree or the portion currently being explored as a source of explanation. Swartout (1983) adds extra knowledge that allows the system to justify its answers in the user's terms, but the user must still direct the exploration. An effective problem localization facility would allow the system to answer the question &quot;Why not?&quot;; that is, the user could ask why a certain goal was not substantiated, and the system would reply by identifying the surprising nodes that are likely to be the significant causes of the failure. Such &quot;Why not?&quot; questions could be useful not only in explanation but also in debugging. In the same way, since the execution of a PROLOG program can be seen as the exploration of an AND-OR tree, effective problem localization techniques could be useful in debugging the failed trees that result from incorrect logic programs.</Paragraph>
    <Paragraph position="1"> Another example is recovery processing in top-down parsing, such as using augmented transition networks (Woods, 1970). When an ATN fails to parse a sentence, the blocked parse tree is quite similar to a blocked planning tree. Weischedel (1983) suggests an approach to understanding ill-formed input that makes use of meta-rules to relax some of the constraints on ATN arcs that blocked the original parse. Recovery processing in that model requires searching the blocked parse tree for nodes to which meta-rules can be applied. A problem localization strategy could be used to sort the list of blocked nodes, so that the most likely candidates would be tested first. The statistics of success ratios here would describe likely paths through the grammar. Nodes that exhibit surprising failure would be prime candidates for meta-rule processing. Before problem localization can be applied in these related areas, further work needs to be done to see how many of the heuristics and strategies that apply to problem localization in the planning context can be carried over. The larger and more complex trees of an ATN or PROLOG program may well require development of further strategies. However, the nature of the problem is such that even an imperfect result is likely to be useful.</Paragraph>
  </Section>
  <Section position="10" start_page="142" end_page="142" type="metho">
    <SectionTitle>
V IMPLEMENTATION DESCRIPTION
</SectionTitle>
    <Paragraph position="0"> The examples in this paper are taken from an Interlisp implementation of a planner which does pragmatics checking for a limited set of Unix domain requests. The problem localization component uses a combination of surprise scores and special-purpose rules, as described. The statistics were derived by running the planner on a test set of commands in a simulated Unix environment.</Paragraph>
  </Section>
  <Section position="11" start_page="142" end_page="142" type="metho">
    <SectionTitle>
VI CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"> In planning-based pragmatics processing, problem localization addresses the largely untouched problem of providing helpful responses to requests unfulfillable in context. Problem localization in the planning context requires identifying the most hopeful and tractable choice at OR nodes, but the most critical and problematic one at AND nodes.</Paragraph>
    <Paragraph position="1"> Statistical surprise scores provide a cheap but effective base strategy for problem localization, and condition-action rules are an appropriate mechanism for adding further sophistication. Further work should address (1) applying recovery strategies to the localized problem, if any recovery is appropriate; (2) investigating other applications, such as expert systems, backward-chaining inference, and top-down parsing; and (3) exploring natural language generation to report a block at an appropriate level of detail.</Paragraph>
  </Section>
</Paper>