<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2185"> <Title>An Interactive Domain Independent Approach to Robust Dialogue Interpretation</Title> <Section position="3" start_page="0" end_page="1131" type="metho"> <SectionTitle> 2 Interactive Repair In Depth </SectionTitle> <Paragraph position="0"> As mentioned above, ROSE repairs extragrammatical input in two phases. The first phase, Repair Hypothesis Formation, is responsible for assembling a ranked set of ten or fewer hypotheses about the meaning of the ungrammatical utterance expressed in the source language. This phase is itself divided into two stages, Partial Parsing and Combination.</Paragraph> <Paragraph position="1"> The Partial Parsing stage is similar to the concept of the listener &quot;casting his net&quot; for comprehensible fragments of speech. A robust skipping parser (Lavie, 1995) is used to obtain an analysis for islands of the speaker's sentence. In the Combination stage, the fragments from the partial parse are assembled into a ranked set of alternative meaning representation hypotheses. A genetic programming (Koza, 1992; Koza, 1994) approach is used to search for different ways to combine the fragments, in order to avoid requiring any hand-crafted repair rules. Our genetic programming approach has previously been shown to be orders of magnitude more efficient than the minimum distance parsing approach (Rosé and Lavie, 1997). In the second phase, Interaction with the user, the system generates a set of queries, negotiating with the speaker in order to narrow down to a single best meaning representation hypothesis. If it determines, based on the user's responses to its queries, that none of its hypotheses are acceptable, it instead requests a rephrase.</Paragraph> <Paragraph position="2"> Inspired by (Clark and Wilkes-Gibbs, 1986; Clark and Schaefer, 1989), the goal of the Interaction phase is to minimize collaborative effort between the system and the speaker while maintaining a high level of interpretation accuracy. It uses this principle in determining which portions of the speaker's utterance to question, focusing its interaction on those portions of the speaker's meaning about which it is particularly uncertain. In its questioning, it attempts to display the state of the system's understanding, acknowledging information conveyed by the speaker as it becomes clear. The interaction process can be summarized as follows: the system first assesses the state of its understanding of what the speaker has said by extracting features that distinguish the top set of hypotheses from one another.</Paragraph> <Paragraph position="3"> It then builds upon this understanding by cycling through the following four-step process: selecting a feature; generating a natural language query from this feature; updating its list of alternative hypotheses based on the user's answer; and finally updating its list of distinguishing features based on the remaining set of alternative hypotheses. A schematic sketch of this cycle appears at the start of the next subsection.</Paragraph> <Section position="1" start_page="1129" end_page="1130" type="sub_section"> <SectionTitle> 2.1 Extracting Distinguishing Features </SectionTitle> <Paragraph position="0"> In the example in Figure 1, the Hypothesis Formation phase produces three alternative hypotheses.</Paragraph> <Paragraph position="1"> The hypotheses are ranked using a trained evaluation function, but the hypothesis ranked first is not guaranteed to be best. In this case, the hypothesis ranked second is the best one.</Paragraph>
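<Paragraph> Before looking at each step in detail, the four-step cycle summarized above can be sketched as follows. This is a minimal, hypothetical Python sketch, not the actual ROSE implementation: all names are invented, hypotheses are simplified to flat dictionaries rather than full feature structures, and feature selection is reduced to taking the first candidate (the real selection criteria are described in section 2.2).

# Hypothetical sketch of the four-step interaction cycle.

def distinguishing_features(hypotheses):
    # Collect the keys whose values differ across the current hypotheses.
    keys = set()
    for h in hypotheses:
        keys.update(h.keys())
    return [k for k in sorted(keys)
            if len({str(h.get(k)) for h in hypotheses}) != 1]

def narrow_down(hypotheses, ask_user):
    features = distinguishing_features(hypotheses)
    while len(hypotheses) != 1 and features:
        feature = features[0]                        # 1. select a feature
        answer = ask_user(feature)                   # 2. query the user
        hypotheses = [h for h in hypotheses          # 3. update hypotheses
                      if str(h.get(feature)) == str(answer)]
        features = distinguishing_features(hypotheses)   # 4. update features
    # A single survivor is returned; None signals a rephrase request.
    return hypotheses[0] if len(hypotheses) == 1 else None
</Paragraph>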
<Paragraph> The hypotheses are expressed in a frame-based feature structure representation. Above each hypothesis, Figure 1 shows the corresponding text generated by the system for the associated feature structure.</Paragraph> <Paragraph position="2"> In order for the system to return the correct hypothesis, it must use interaction to narrow the list of alternatives down to the single best one. The first task of the Interaction Mechanism is to determine what the system knows about what the speaker has said and what it is not certain about. It does this by comparing the top set of repair hypotheses and extracting a set of features that distinguish them from one another. The set of distinguishing features corresponding to the example set of alternative hypotheses can be found in Figure 2.</Paragraph> <Paragraph position="3"> The meaning representation's recursive structure is made up of frames with slots that can be filled either with other frames or with atomic fillers. These compositional structures can be thought of as trees, with the top level frame being the root of the tree and branches attached through slots.</Paragraph> [Figure 1 sentence: &quot;What did you say 'bout what was your schedule for the twenty sixth of May?&quot;] <Paragraph position="5"> The features used in the system to distinguish alternative meaning representation structures from one another specify paths down this tree structure. Thus, the distinguishing features that are extracted are always anchored in a frame or atomic filler, marked by an f in Figure 2. Within a feature, a frame may be followed by a slot, marked by an s. A slot may in turn be followed by a frame or atomic filler, and so on.</Paragraph> <Paragraph position="6"> These features are generated by comparing the set of feature structures returned from the Hypothesis Formation phase. No knowledge about what the features mean is needed in order to generate or use them. Thus, the feature-based approach is completely domain independent: it can be used without modification with any frame-based meaning representation.</Paragraph> <Paragraph position="7"> When a feature is applied to a meaning representation structure, a value is obtained. Thus, features can be used to assign meaning representation structures to classes according to the value obtained when the feature is applied. For example, the feature ((f *schedule) (s who) (f *you)) distinguishes structures that contain the filler *you in the who slot of the *schedule frame from those that do not. When it is applied to structures that contain the specified filler in the specified slot, it returns true; when it is applied to structures that do not, it returns false. Thus, it groups the first and third hypotheses in one class, and the second hypothesis in another class. Because a feature that ends in a frame or atomic filler can only have true or false as its value, such features are called yes/no features. When a feature that ends in a slot, such as ((f *schedule) (s who)), is applied to a feature structure, the value is the filler in the specified slot. These features are called wh-features.</Paragraph> <Paragraph position="8"> Each feature is associated with a question that the system could ask the user. The purpose of the generated question is to determine what the value of the feature should be. The system can then keep those hypotheses that are consistent with that feature value and eliminate the rest from consideration.</Paragraph>
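<Paragraph> As an illustration of how such path features behave, the sketch below applies a feature to frame structures modeled as nested Python dictionaries. The frame and slot names are taken from the example above; the encoding itself is hypothetical and is only meant to show the yes/no versus wh-feature distinction.

# Frames as dicts: a "frame" name plus a "slots" mapping from slot
# names to fillers (nested frames).  Hypothetical encoding.
hyp1 = {"frame": "*schedule",
        "slots": {"who": {"frame": "*you"}}}
hyp2 = {"frame": "*schedule", "slots": {}}

def apply_feature(feature, structure):
    # A feature alternates frames and slots, e.g.
    # [("f", "*schedule"), ("s", "who"), ("f", "*you")] is a yes/no
    # feature and [("f", "*schedule"), ("s", "who")] is a wh-feature.
    node = structure
    for kind, name in feature:
        if kind == "f":
            if not isinstance(node, dict) or node.get("frame") != name:
                return False
        else:  # kind == "s": descend through the named slot
            node = (node or {}).get("slots", {}).get(name)
    # A feature ending in a slot returns that slot's filler (wh-feature);
    # one ending in a frame reports whether the path matched (yes/no).
    return node if feature[-1][0] == "s" else True

print(apply_feature([("f", "*schedule"), ("s", "who"), ("f", "*you")], hyp1))  # True
print(apply_feature([("f", "*schedule"), ("s", "who"), ("f", "*you")], hyp2))  # False
</Paragraph>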
<Paragraph> Generating a natural language query from a feature is discussed in section 2.3.</Paragraph> </Section> <Section position="2" start_page="1130" end_page="1131" type="sub_section"> <SectionTitle> 2.2 Selecting a Feature </SectionTitle> <Paragraph position="0"> Once a set of features is extracted, the system enters a loop in which it selects a feature from the list, generates a query, and then updates the list of alternative hypotheses and remaining distinguishing features based on the user's response. It attempts to ask the most informative questions first in order to limit the number of questions needed. It uses the following four criteria in making its selection: * Askable: Is it possible to ask a natural question from it? * Evaluatable: Does it ask about a single repair or a set of repairs that always occur together? * In Focus: Does it involve information from the common ground? * Most Informative: Is it likely to result in the greatest search space reduction? First, the set of features is narrowed down to those that represent askable questions. For example, it is not natural to ask about the filler of a particular slot in a particular frame if it is not known whether the ideal meaning representation structure contains that frame. Also, it is awkward to generate a wh-question based on a feature of length greater than two. For example, a question corresponding to ((f *how) (s what) (f *interval) (s end)) might be phrased something like &quot;How is the time ending when?&quot;. So even-length features more than two elements long are also eliminated at this stage.</Paragraph> <Paragraph position="1"> The next criterion considered by the Interaction phase is evaluatability. In order for a yes/no question to be evaluatable, it must confirm only a single repair action. Otherwise, if the user responds with &quot;No&quot;, it cannot be determined whether the user is rejecting both repair actions or only one of them. Next, the set of features is narrowed down to those that can easily be identified as being in focus. To do this, the system prefers features that overlap with structures that all of the alternative hypotheses have in common, thereby encoding as much common ground knowledge in each question as possible. The structures that all of the alternative hypotheses share are called non-controversial substructures. As the negotiation continues, these tend to be structures that have been confirmed through interaction. Including these substructures has the effect that questions tend to follow one another in a natural succession. It also has the desirable effect of indicating to the speaker the system's current state of understanding of the speaker's sentence.</Paragraph> <Paragraph position="2"> The final piece of information used in selecting among the remaining features is the expected search reduction, which indicates how much the search space can be expected to shrink once the answer to the corresponding question is obtained from the user. Equation 1 calculates $S_f$, the expected search reduction of feature $f$: $$S_f = \sum_{i=1}^{C_f} \frac{l_{f,i}}{L} \, (L - l_{f,i}) \qquad (1)$$ where $L$ is the number of alternative hypotheses and, as mentioned above, each feature can be used to assign the hypotheses to equivalence classes; $C_f$ is the number of equivalence classes induced by feature $f$, and $l_{f,i}$ is the number of alternative hypotheses in the $i$th equivalence class of feature $f$.</Paragraph>
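<Paragraph> In code, Equation 1 is only a few lines, continuing the hypothetical sketch from above (the apply_feature argument plays the role of the feature application described in section 2.1); the rationale behind the formula is spelled out next.

from collections import Counter

def expected_search_reduction(feature, hypotheses, apply_feature):
    # Group hypotheses into equivalence classes by feature value; the
    # class sizes are the l_{f,i} of Equation 1.
    L = len(hypotheses)
    class_sizes = Counter(str(apply_feature(feature, h)) for h in hypotheses)
    return sum((l_fi / L) * (L - l_fi) for l_fi in class_sizes.values())

Among the features passing the first three criteria, the one with the highest expected search reduction is asked about first.</Paragraph>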
<Paragraph> If the value for feature $f$ associated with the class of size $l_{f,i}$ is the correct value, $l_{f,i}$ will be the new size of the search space.</Paragraph> <Paragraph position="5"> In this case, the actual search reduction is the current number of hypotheses, $L$, minus the number of alternative hypotheses in the resulting set, $l_{f,i}$.</Paragraph> <Paragraph position="6"> Intuitively, the expected search reduction of a feature is the sum, over all of the feature's equivalence classes, of the proportion of hypotheses in that class times the reduction in the search space obtained if the value associated with that class is the correct one.</Paragraph> <Paragraph position="7"> The first three criteria select a subset of the current distinguishing features, which the final criterion then ranks. Note that all of these criteria can be evaluated without the system having any understanding of what the features actually mean.</Paragraph> </Section> <Section position="3" start_page="1131" end_page="1131" type="sub_section"> <SectionTitle> 2.3 Generating Query Text </SectionTitle> <Paragraph position="0"> The selected feature is used to generate a query for the user. First, a skeleton structure is built from the feature, with its top level frame equivalent to the frame at the root of the feature. Then the skeleton structure is filled out with the non-controversial substructures. If the question is a yes/no question, it includes all of the substructures that would be non-controversial assuming the answer to the question is &quot;Yes&quot;. Since information confirmed by the previous question is now considered non-controversial, the result of the previous interaction is made evident in how the current question is phrased. An example of a question generated with this process can be found in Figure 3.</Paragraph> <Paragraph position="1"> If the selected feature is a wh-feature, i.e., an even-length feature, the question is generated in the form of a wh-question. Otherwise, the text is generated declaratively and inserted into the template &quot;Was something like XXX part of what you meant?&quot;, where XXX is replaced by the generated text. The set of alternative answers based on the set of alternative hypotheses is presented to the user. For wh-questions, a final alternative, &quot;None of these alternatives are acceptable&quot;, is made available. Again, no particular domain knowledge is necessary for generating query text from features, since the system's sentence level generation component can be used as is.</Paragraph> </Section> <Section position="4" start_page="1131" end_page="1131" type="sub_section"> <SectionTitle> 2.4 Processing the User's Response </SectionTitle> <Paragraph position="0"> Once the user has responded with the correct value for the feature, only the alternative hypotheses that have that value for that feature are kept; the rest are eliminated. In the case of a wh-question, if the user selects &quot;None of these alternatives are acceptable&quot;, all of the alternative hypothesized structures are eliminated and a rephrase is requested. After this step, all of the features that no longer partition the search space into equivalence classes are also eliminated. In the example, assume the answer to the generated question in Figure 3 was &quot;Yes&quot;.</Paragraph>
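<Paragraph> Before returning to the example, the update step can be sketched in the same hypothetical style: keep only the hypotheses whose value for the asked feature matches the user's answer, then discard the features that no longer partition the remaining hypotheses.

def update(hypotheses, features, asked_feature, answer, apply_feature):
    kept = [h for h in hypotheses
            if str(apply_feature(asked_feature, h)) == str(answer)]
    if not kept:
        # e.g. the user chose "None of these alternatives are acceptable":
        # all hypotheses are eliminated and a rephrase is requested.
        return [], []
    still_partitioning = [f for f in features
                          if len({str(apply_feature(f, h)) for h in kept}) != 1]
    return kept, still_partitioning
</Paragraph>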
<Paragraph position="1"> In the example, two of the original three hypotheses then remain, as displayed in Figure 4, and the remaining set of features that still partition the search space can be found in Figure 5.</Paragraph> <Paragraph position="2"> If one or more distinguishing features remain, the cycle begins again by selecting a feature, generating a question, and so on, until the system narrows down to the final result. If the user does not answer positively to any of the system's questions by the time it runs out of distinguishing features for a particular sentence, the system loses confidence in its set of hypotheses and requests a rephrase.</Paragraph> </Section> </Section> <Section position="4" start_page="1131" end_page="1132" type="metho"> <SectionTitle> 3 Using Discourse Information </SectionTitle> <Paragraph position="0"> Though discourse processing is not essential to the ROSE approach, discourse information has been found to be useful in robust interpretation (Ramshaw, 1994; Smith, 1992). In this section we discuss how discourse information can be used to focus the interaction between system and user on the task level rather than on the literal meaning of the user's utterance.</Paragraph> <Paragraph position="1"> A plan-based discourse processor (Rosé et al., 1995) provides contextual expectations that guide the system in the manner in which it formulates queries to the user. By computing a structure for the dialogue, the discourse processor is able to identify the speech act performed by each sentence. Additionally, it augments temporal expressions from context. Based on this information, it computes the constraints on the speaker's schedule expressed by each sentence. Each constraint associates a status with a particular speaker's schedule for time slots within the time indicated by the temporal expression. There are seven possible statuses: accepted, suggested, preferred, neutral, dispreferred, busy, and rejected.</Paragraph> [Figure 6. Sentence: &quot;What about any time but the ten to twelve slot on Tuesday the thirtieth?&quot; Hypothesis 1: &quot;How about from ten o'clock till twelve o'clock&quot;. Query without discourse: &quot;Was something like 'how about from ten o'clock till twelve o'clock' part of what you meant?&quot; Query with discourse: &quot;Are you suggesting that Tuesday November the thirtieth from ten a.m. till twelve a.m. is a good time to meet?&quot;] <Paragraph position="2"> As discussed above, the Interaction Mechanism uses features that distinguish between alternative hypotheses to divide the set of alternative repair hypotheses into classes, where each member of the same class has the same value for the associated feature. By comparing the computed status and augmented temporal information for the alternative repair hypotheses within the same class, it is possible to determine what common implications for the task all, or most, of the members of the associated class share. Thus, it is possible to compute what implications for the task are associated with the corresponding value for the feature. By comparing this common information across classes, it is possible to determine whether the feature makes a consistent distinction on the task level.</Paragraph>
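<Paragraph> One way this test might be implemented is sketched below, under the assumption that the discourse processor can be queried for a schedule status per hypothesis; the discourse_status function is a stand-in for that processor, all names are hypothetical, and the sketch is stricter than the text above in requiring all members of a class, rather than most, to agree.

def makes_task_level_distinction(feature, hypotheses,
                                 apply_feature, discourse_status):
    # Group hypotheses into equivalence classes by feature value.
    classes = {}
    for h in hypotheses:
        classes.setdefault(str(apply_feature(feature, h)), []).append(h)
    # Each class must share a single status (e.g. "suggested"), and the
    # statuses must differ across classes for the distinction to be
    # consistent on the task level.
    shared = []
    for members in classes.values():
        statuses = {discourse_status(h) for h in members}
        if len(statuses) != 1:
            return False
        shared.append(statuses.pop())
    return len(set(shared)) == len(shared)
</Paragraph>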
<Paragraph position="3"> When a feature does make such a distinction, this distinguishing information can be used to refocus the associated question on the task level rather than on the level of the sentence's literal meaning.</Paragraph> <Paragraph position="4"> In the example in Figure 6, the parser is not able to correctly process the &quot;but&quot;, causing it to miss the fact that the speaker intended any time other than ten to twelve, rather than specifically ten to twelve. Two alternative hypotheses are constructed during the Hypothesis Formation phase. However, neither hypothesis correctly represents the meaning of the sentence. In this case, the purpose of the interaction is to indicate to the system that neither of the hypotheses is correct and that a rephrase is needed. This will be accomplished when the user answers negatively to the system's query, since the user will then not have responded positively to any of the system's queries regarding this sentence.</Paragraph> <Paragraph position="5"> The system selects the feature ((f *how) (s when) (f *interval)) to distinguish the two hypotheses from one another. Its generated query is thus &quot;Was something like 'how about from ten o'clock till twelve o'clock' part of what you meant?&quot;. The discourse processor returns a different result for each of these two representations. In particular, only the first hypothesis contains enough information for the discourse processor to compute any scheduling constraints, since it contains both a temporal expression and a top level semantic frame. It would create a constraint associating the status of suggested with a representation for Tuesday the thirtieth from ten o'clock till twelve o'clock. The other hypothesis contains date information but no status information. Based on this difference, the system can generate a query asking whether or not the user expressed this constraint. Its query is &quot;Are you suggesting that Tuesday, November the thirtieth from ten a.m. till twelve a.m. is a good time to meet?&quot; The suggested status is associated with a template that looks like &quot;Are you suggesting that XXX is a good time to meet?&quot;; the XXX is then filled in with the text generated from the temporal expression using the regular system generation grammar.</Paragraph> </Section> </Paper>