<?xml version="1.0" standalone="yes"?> <Paper uid="P04-2002"> <Title>Minimizing the Length of Non-Mixed Initiative Dialogs</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Non-mixed initiative dialogs </SectionTitle> <Paragraph position="0"> In recent years, dialog researchers have focused much attention on the study of mixed-initiative behaviors in natural language dialogs. In general, mixed initiative refers to the idea that control over the content and direction of a dialog may pass from one participant to another. Cohen et al. (1998) provide a good overview of the various definitions of dialog initiative that have been proposed. Our work adopts a definition similar to that of Guinn (1999), who posits that initiative attaches to specific dialog goals.</Paragraph> <Paragraph position="1"> This paper considers non-mixed-initiative dialogs, which we shall take to mean dialogs with the following characteristics: 1. The dialog has two participants, the leader and the follower, who are working cooperatively to achieve some mutually desired dialog goal.</Paragraph> <Paragraph position="2"> 2. The leader may request information from the follower, or may inform the follower that the dialog has succeeded or failed to achieve the dialog goal.</Paragraph> <Paragraph position="3"> 3. The follower may inform the leader of a fact in direct response to a request for information from the leader, or inform the leader that it cannot fulfill a particular request.</Paragraph> <Paragraph position="4"> The model assumes the leader knows sets of questions $S_i = \{q_{ij}\}$</Paragraph> <Paragraph position="6"> such that if all questions in any one set $S_i$ are answered successfully by the follower, the dialog goal will be satisfied. The sets will be referred to hereafter as rule sets. The leader's task is to find a rule set $S$ whose constituent questions can all be successfully answered. 
The method is to choose a sequence of questions $q_{i_1 j_1}, q_{i_2 j_2}, \ldots, q_{i_m j_m}$ which will lead to its discovery. For example, in a dialog in a customer service setting in which the leader attempts to locate the follower's account in a database, the leader might request the follower's name and account number, or might request the name and telephone number. The corresponding rule sets for such a dialog would be $\{\mathit{askName}, \mathit{askPhoneNo}\}$ and $\{\mathit{askName}, \mathit{askAcctNo}\}$. One complicating factor in the leader's task is that a question $q_{ij}$ in one rule set may occur in several other rule sets so that choosing to ask $q_{ij}$ can have ramifications for several sets.</Paragraph> <Paragraph position="7"> We assume that for every question $q_{ij}$ the leader knows an associated probability $p_{ij}$ that the follower has the knowledge necessary to answer $q_{ij}$ (see footnote 2). These probabilities enable us to compute an expected length for a dialog, measured by the number of questions asked by the leader. Our goal in selecting a sequence of questions will be to minimize the expected length of the dialog.</Paragraph> <Paragraph position="8"> The probabilities may be estimated by aggregating the results from all interactions, or a more sophisticated individualized model might be maintained for each participant. 
Some examples of how these probabilities might be estimated can be found in (Conati et al., 2002; Zukerman and Albrecht, 2001).</Paragraph> <Paragraph position="9"> Footnote 2: In addition to modeling the follower's knowledge, these probabilities can also model aspects of the dialog system's performance, such as the recognition rate of an automatic speech recognizer.</Paragraph> <Paragraph position="10"> Our model of dialog derives from rule-based theories of dialog structure, such as (Perrault and Allen, 1980; Grosz and Kraus, 1996; Lochbaum, 1998). In particular, this form of the problem models exactly the &quot;missing axiom theory&quot; of Smith and Hipp (1994; 1995), which proposes that dialog is aimed at proving the top-level goal in a theorem-proving tree and that &quot;missing axioms&quot; in the proof provide motivation for interactions with the dialog partner. The rule sets $S_i$ are sets of missing axioms that are sufficient to complete the proof of the top-level goal.</Paragraph> <Paragraph position="11"> Our format is quite general and can model other dialog systems as well. For example, a dialog system that is organized as a decision tree with a question at the root, with additional questions at successor branches, can be modeled by our format.</Paragraph> <Paragraph position="12"> As an example, suppose we have a top-level goal and these rules to prove it:</Paragraph> <Paragraph position="14"> The corresponding rule sets are</Paragraph> <Paragraph position="16"> If all of the questions in either $S_1$ or $S_2$ are satisfied, the top-level goal will be proven. If we have values for the probabilities $p_{11}$, $p_{12}$, and $p_{21}$, we can design an optimum ordering of the questions to minimize the expected length of dialogs. Thus if $p_{11}$ is much smaller than $p_{21}$, we would ask $q_{21}$ before asking $q_{11}$. 
The reader might try to decide when $q_{12}$ should be asked before any other questions in order to minimize the expected length of dialogs.</Paragraph> <Paragraph position="17"> The rest of the paper examines how the leader can select the questions which minimize the overall expected length of the dialog, as measured by the number of questions asked. Each question-response pair is considered to contribute equally to the length. Sections 3, 4, and 5 describe polynomial-time algorithms for finding the optimum order of questions in three special instances of the question ordering optimization problem.</Paragraph> <Paragraph position="18"> Section 6 gives a polynomial-time method to approximate optimum behavior in the general case of $n$ rule sets which may have many common questions.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Case: One rule set </SectionTitle> <Paragraph position="0"> Many dialog tasks can be modeled with a single rule set $S = \{q_1, q_2, \ldots, q_n\}$. For example, a leader might ask the follower to supply values for each field in a form. Here the optimum strategy is to ask first the questions that have the least probability of being successfully answered.</Paragraph> <Paragraph position="1"> Theorem 1. Given a rule set $S = \{q_1, \ldots, q_n\}$, asking the questions in the order of their probability of success (least first) results in the minimum expected dialog length; that is, for $i = 1, \ldots, n-1$, $p_i \le p_{i+1}$, where</Paragraph> <Paragraph position="3"> $p_i$ is the probability that the follower will answer question $q_i$ successfully. A formal proof is available in a longer version of this paper. 
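The ordering claimed in Theorem 1 can be checked numerically on a small case. The following is a minimal sketch, not the paper's own procedure: it assumes that a dialog over one rule set stops at the first failed question, so the expected number of questions asked is the sum, over positions, of the probability of reaching that position. The function names and the three probabilities are hypothetical.

```python
from itertools import permutations


def expected_length(probs):
    """Expected number of questions asked when they are asked in this order.

    A question is asked only if every earlier question succeeded, so each
    position contributes the product of the earlier success probabilities.
    """
    total = 0.0
    reach = 1.0  # probability that the dialog reaches the current question
    for p in probs:
        total += reach
        reach *= p
    return total


def best_order_by_search(probs):
    """Try every ordering and keep one with minimum expected length.

    Exhaustive search is only practical for a handful of questions.
    """
    return list(min(permutations(probs), key=expected_length))


# Hypothetical success probabilities for a three-question rule set.
probs = [0.9, 0.5, 0.7]
print("sorted least-first:", sorted(probs), expected_length(sorted(probs)))
print("exhaustive search: ", best_order_by_search(probs))
```

On this input the exhaustive search and the least-first sort pick the same ordering. Note that the last probability never enters the sum, which matches the informal argument that the expected length does not depend on the final question.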
Informally, we have two cases; the first assumes that all questions $q_i$ are answered successfully, leading to a dialog length of $n$, since $n$ questions will be asked and then answered.</Paragraph> <Paragraph position="4"> The second case assumes that some $q_i$ will not be answered successfully. The expected length increases as the probabilities of success of the questions asked increase. However, the expected length does not depend on the probability of success for the last question asked, since no questions follow it regardless of the outcome. Therefore, the question with the greatest probability of success appears at the end of the optimal ordering. Similarly, we can show that given the last question in the ordering, the expected length does not depend upon the probability of the second to last question in the ordering, and so on until all questions have been placed in the proper position. The optimal ordering is in order of increasing probability of success.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Case: Two independent rule sets </SectionTitle> <Paragraph position="0"> We now consider a dialog scenario in which the leader has two rule sets for completing the dialog task.</Paragraph> <Paragraph position="1"> Definition 4.1. Two rule sets $S_1$ and $S_2$ are independent if $S_1 \cap S_2 = \emptyset$. If $S_1 \cap S_2$ is non-empty, then the members of $S_1 \cap S_2$ are said to be common to $S_1$ and $S_2$. A question $q$ is unique to rule set $S$ if $q \in S$ and for all $S' \neq S$, $q \notin S'$. In a dialog scenario in which the leader has multiple, mutually independent rule sets for accomplishing the dialog goal, the result of asking a question contained in one rule set has no effect on the success or failure of the other rule sets known by the leader. 
Also, it can be shown that if the leader makes optimal decisions at each turn in the dialog, once the leader begins asking questions belonging to one rule set, it should continue to ask questions from the same rule set until the rule set either succeeds or fails. The problem of selecting the question that minimizes the expected dialog length $E(L)$ becomes the problem of selecting which rule set should be used first by the leader.</Paragraph> <Paragraph position="2"> Once the rule set has been selected, Theorem 1 shows how to select a question from the selected rule set that minimizes $E(L)$.</Paragraph> <Paragraph position="3"> By expected dialog length, we mean the usual definition of expectation, $E(L) = \sum_{x} \Pr(x) \cdot \mathrm{len}(x)$, where $x$ ranges over the possible outcomes of the dialog.</Paragraph> <Paragraph position="5"> Thus, to calculate the expected length of a dialog, we must be able to enumerate all of the possible outcomes of that dialog, along with the probability of each outcome occurring and the length associated with that outcome.</Paragraph> <Paragraph position="6"> Before we show how the leader should decide which rule set it should use first, we introduce some notation.</Paragraph> <Paragraph position="7"> The expected length in case of failure for an ordering $o = q_1, \ldots, q_n$ of the questions of a rule set $S$ is the expected length of the dialog that would result if $S$ were the only rule set available to the leader, the leader asked questions in the order given by $o$, and one of the questions in $S$ failed.</Paragraph> <Paragraph position="8"> The expected length in case of failure is $\frac{1}{1 - \prod_{i=1}^{n} p_i} \sum_{k=1}^{n} k \, (1 - p_k) \prod_{j=1}^{k-1} p_j$, where the factor $1 / (1 - \prod_{i=1}^{n} p_i)$ accounts for the fact that we are counting only cases in which the dialog fails. 
We will let $F_i$ represent the minimum expected length in case of failure for rule set $S_i$, obtained by ordering the questions of $S_i$ by increasing probability of success, as per Theorem 1.</Paragraph> <Paragraph position="9"> The probability of success $P$ of a rule set $S = \{q_1, \ldots, q_n\}$ is $P = \prod_{i=1}^{n} p_i$.</Paragraph> <Paragraph position="11"> This definition of probability of success of a rule set assumes that the probabilities of success for individual questions are mutually independent.</Paragraph> <Paragraph position="12"> Theorem 2. Let $\mathcal{S} = \{S_1, S_2\}$ be the set of mutually independent rule sets available to the leader for accomplishing the dialog goal. For a rule set $S_i$ in $\mathcal{S}$, let $P_i$ be its probability of success, $n_i$</Paragraph> <Paragraph position="14"> be the number of questions in $S_i$, and $F_i$ be the minimum expected length in case of failure. To minimize the expected length of the dialog, the leader should select the question with the least probability of success from the rule set $S_i$ with the least value of $n_i + F_i (1 - P_i) / P_i$.</Paragraph> <Paragraph position="16"> Proof: If the leader uses questions from $S_1$ first, the expected dialog length $E(L_1)$ is $E(L_1) = P_1 n_1 + (1 - P_1) P_2 (F_1 + n_2) + (1 - P_1)(1 - P_2)(F_1 + F_2)$.</Paragraph> <Paragraph position="18"> The first term, $P_1 n_1$, is the probability of success for $S_1$ times the length of $S_1$. The second term, $(1 - P_1) P_2 (F_1 + n_2)$,</Paragraph> <Paragraph position="20"> is the probability that $S_1$ will fail and $S_2$ will succeed times the length of that dialog. The third term, $(1 - P_1)(1 - P_2)(F_1 + F_2)$, is the probability that both $S_1$ and $S_2$ fail times the associated length. We can multiply out and rearrange terms to get</Paragraph> <Paragraph position="22"> common terms, we find that $(S_1, S_2)$ is the correct</Paragraph> <Paragraph position="24"> Thus, if the above inequality holds, then $E(L_1) > E(L_2)$, and the leader should ask questions from $S_2$ first. 
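The choice between two independent rule sets can be sketched numerically. This is an illustration under stated assumptions, not the paper's implementation: each rule set is a list of question success probabilities lying strictly between 0 and 1, questions inside a rule set are asked least-likely-first, and the sort key n + F(1 - P)/P is obtained here by comparing the three-term expected-length expression for the two possible orders. The function names and the sample probabilities are hypothetical.

```python
from math import prod


def failure_length(probs):
    """Expected number of questions asked, given that the rule set fails.

    Questions are asked least-likely-first, as Theorem 1 prescribes.
    Assumes every probability lies strictly between 0 and 1.
    """
    reach = 1.0      # probability that all earlier questions succeeded
    weighted = 0.0   # sum over k of k * Pr(first failure at question k)
    for k, p in enumerate(sorted(probs), start=1):
        weighted += k * reach * (1.0 - p)
        reach *= p
    return weighted / (1.0 - reach)  # condition on the rule set failing


def ordering_key(probs):
    """n + F * (1 - P) / P for one rule set; the smaller key goes first."""
    n, success = len(probs), prod(probs)
    return n + failure_length(probs) * (1.0 - success) / success


def expected_length(first, second):
    """Three-term expected length when `first` is tried before `second`."""
    n1, p1, f1 = len(first), prod(first), failure_length(first)
    n2, p2, f2 = len(second), prod(second), failure_length(second)
    return (p1 * n1
            + (1.0 - p1) * p2 * (f1 + n2)
            + (1.0 - p1) * (1.0 - p2) * (f1 + f2))


# Hypothetical rule sets, each a list of question success probabilities.
s1 = [0.5, 0.8]
s2 = [0.4, 0.9]
by_key = min((s1, s2), key=ordering_key)
by_search = min(((s1, s2), (s2, s1)), key=lambda pair: expected_length(*pair))[0]
print("first rule set by key:   ", by_key)
print("first rule set by search:", by_search)
```

Comparing the key against a direct evaluation of both orders is a cheap sanity check; with more than two rule sets the same key would give the sorted order that the text goes on to conjecture, which this sketch does not attempt to verify.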
We conjecture that in the general case of $m$ mutually independent rule sets, the proper ordering of rule sets is obtained by calculating $n_i + F_i (1 - P_i) / P_i$</Paragraph> <Paragraph position="26"> for each rule set $S_i$, and sorting the rule sets by those values. Preliminary experimental evidence supports this conjecture, but no formal proof has been derived yet.</Paragraph> <Paragraph position="27"> Note that calculating $P$ and $F$ for each rule set takes polynomial time, as does sorting the rule sets into their proper order and sorting the questions within each rule set. Thus the solution can be obtained in polynomial time.</Paragraph> <Paragraph position="28"> As an example, consider the rule sets $S_1 =$</Paragraph> <Paragraph position="30"> the same for both rule sets. However, $F_1 =$</Paragraph> <Paragraph position="32"> both rule sets, we discover that asking questions from $S_2$ first results in the minimum expected dialog length.</Paragraph> <Paragraph position="33"> 5 Case: Two rule sets, one common question We now examine the simplest case in which the rule sets are not mutually independent: the leader has two rule sets $S_1$ and $S_2$, and $S_1 \cap S_2 = \{q_c\}$. In this section, we will use $E(L_i)$ to denote the minimum expected length of the dialog (computed using Theorem 1) resulting from the leader using only $S_i$ to accomplish the dialog task. The notation</Paragraph> <Paragraph position="35"> $E(L_{i \setminus c})$ will denote the minimum expected length of the dialog resulting from the leader using only the rule set $S_i - \{q_c\}$ to accomplish the dialog task. For example, a rule set $S_1 = \{q_1, q_2, q_c\}$ with $p_1 =$</Paragraph> <Paragraph position="37"> Theorem 3. 
Given rule sets $S_1 = \{q_c, q_1, \ldots, q_n\}$ and $S_2$, such that $S_1 \cap S_2 = \{q_c\}$, if the leader asks questions from $S_1$ until $S_1$ either succeeds or fails before asking any questions unique to $S_2$, then the ordering of questions of $S_1$ that results in the minimum expected dialog length is given by ordering the questions $q_i$ by increasing $r_i$, where the quantities involved are defined in Section 5.</Paragraph> <Paragraph position="38"> We first show that the questions unique to $S_1$ must be ordered by increasing probability of success given that the position of $q_c$ is fixed. Then we show that, given the correct ordering of the unique questions of $S_1$, $q_c$ should appear in that ordering at the position where a threshold value $\theta$ falls in the corresponding sequence of question probabilities of success. Space considerations preclude a complete listing of the proof, but an outline follows.</Paragraph> <Paragraph position="39"> Figure 1 shows an expression for the expected dialog length for a dialog in which the leader asks questions from $S_1$ until $S_1$ either succeeds or fails before asking any questions unique to $S_2$. The expression assumes an arbitrary ordering $o$ of the questions of $S_1$. If $q_c$ itself fails, both rule sets fail and the dialog terminates. If a question occurring after $q_c$ fails, the rest of the dialog has minimum expected length $E(L_{2 \setminus c})$.</Paragraph> <Paragraph position="40"> If we fix the position of $q_c$, we can show that the questions unique to $S_1$ must be ordered by increasing probability of success in the optimal ordering. 
The proof proceeds by showing that, after switching the positions of any two unique questions $q_j$ and $q_k$ in an arbitrary ordering of the questions of $S_1$, where $q_j$ occurs before $q_k$ in the original ordering, the expected length for the new ordering is less than the expected length for the original ordering if and only if $p_k &lt; p_j$.</Paragraph> <Paragraph position="41"> After showing that the unique questions of $S_1$ must be ordered by increasing probability of success in the optimal ordering, we must then show how to find the position of $q_c$ in the optimal ordering. We say that $q_c$ occurs at position $j$ in ordering $o$ if $q_c$ immediately follows $q_j$ in the ordering. $E(L^{(j)})$ is the expected length for the ordering with $q_c$ at position $j$. We can show that if [...] by a process similar to that used in the proof of Theorem 2. Since the unique questions in $S_1$ are ordered by increasing probability of success, finding the optimal position of the common question $q_c$ in the ordering of the questions of $S_1$ corresponds to the problem of finding where the value $\theta$ falls in the sorted list of probabilities of success of the unique questions of $S_1$. If the value immediately precedes the value of $p_i$ in the list, then the common question should immediately precede $q_i$ in the optimal ordering of questions of $S_1$.</Paragraph> <Paragraph position="42"> Theorem 3 provides a method for obtaining the optimal ordering of questions in $S_1$, given that $S_1$ is selected first by the leader. The leader can use the same method to determine the optimal ordering of the questions of $S_2$ if $S_2$ is selected first. The two optimal orderings give rise to two different expected dialog lengths; the leader should select the rule set and ordering that leads to the minimal expected dialog length. The calculation can be done in polynomial time.</Paragraph> </Section> </Paper>