<?xml version="1.0" standalone="yes"?>
<Paper uid="H01-1015">
  <Title>DATE: A Dialogue Act Tagging Scheme for Evaluation of Spoken Dialogue Systems</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2. CONVERSATIONAL DOMAINS
</SectionTitle>
    <Paragraph position="0"> The CONVERSATIONAL-DOMAIN dimension characterizes each utterance as primarily belonging to one of three arenas of conversational action. The first arena is the domain task, which in this case is air travel booking, and which we refer to below as ABOUT-TASK. The second domain of conversational action is the management of the communication channel, which we refer to as ABOUT-COMMUNICATION. This distinction has been widely adopted [19, 2, 9]. In addition, we identify a third domain of talk that we refer to as ABOUT-SITUATION-FRAME. This domain is particularly relevant for distinguishing human-computer from human-human dialogues, and for distinguishing dialogue strategies across the 9 COMMUNICATOR systems. Each domain is described in this section.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 About-Task
</SectionTitle>
      <Paragraph position="0"> The ABOUT-TASK domain reflects the fact that many utterances in a task-oriented dialogue originate because the goal of the dialogue is to complete a particular task to the satisfaction of both participants. Typically an about-task utterance directly asks for or presents task-related information, or offers a solution to a task goal.</Paragraph>
      <Paragraph position="1"> As Figure 1 shows, most utterances are in the ABOUT-TASK dimension, reflecting the fact that the primary goal of the dialogue is to collaborate on the task of making travel arrangements. The task column of Figure 1 specifies the subtask that each task-related utterance contributes to. DATE includes a large inventory of sub-tasks in the task/subtask dimension in order to make fine-grained distinctions regarding the dialogue effort devoted to the task or its subcomponents. Section 4 will describe the task model in more detail.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 About-Communication
</SectionTitle>
      <Paragraph position="0"> The ABOUT-COMMUNICATION domain reflects the system goal of managing the verbal channel and providing evidence of what has been understood [29, 8, 25]. Although utterances of this type occur in human-human dialogue, they are more frequent in human-computer dialogue, where they are motivated by the need to avoid potentially costly errors arising from imperfect speech recognition.</Paragraph>
      <Paragraph position="1"> In the COMMUNICATOR corpus, many systems use a conservative strategy of providing feedback indicating the system's understanding of the information provided by the user after each user turn. A typical example is the repetition of the origin and destination cities in Figures 1 and 6. This type of repetition is the IMPLICIT-CONFIRMATION speech-act (see Section 3 below). Some systems used a variable confirmation strategy where some information items may be confirmed as they are understood, but the system requests explicit confirmation of all task parameters before searching the database for matching flights. An example is in Figure 2.</Paragraph>
      <Paragraph position="2"> Here the system asks for explicit confirmation in SYS3 before going to the database. This is the first opportunity that the user has for making a correction, which he does in USER3. The system then again asks for explicit confirmation of its new understanding, which the user provides in USER4. After the user informs the system that it is a one-way flight in USER6, the system accesses the database. These explicit confirmations have the goal of avoiding a costly database lookup, where the retrieval is conditioned on the wrong parameters.</Paragraph>
      <Paragraph position="3"> All implicit and explicit confirmation speech-acts are categorized as ABOUT-COMMUNICATION because they are motivated by the potential errors that the system might make in understanding the caller, or in diagnosing the causes of misunderstandings. In general, any utterance that reflects the system's understanding of something the user said is classified as ABOUT-COMMUNICATION.</Paragraph>
      <Paragraph position="4"> A second set of ABOUT-COMMUNICATION utterances consists of the APOLOGIES that the system makes for misunderstandings (see Section 3 below), i.e. utterances such as "I'm sorry. I'm having trouble understanding you.", "My mistake again. I didn't catch that." or "I can see you are having some problems."</Paragraph>
      <Paragraph position="5"> The last category of ABOUT-COMMUNICATION utterances consists of the OPENINGS/CLOSINGS by which the system greets or says good-bye to the caller. (Again, see Section 3 below.)</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 About Situation-Frame
</SectionTitle>
      <Paragraph position="0"> The SITUATION-FRAME domain pertains to the goal of managing the culturally relevant framing expectations. The term is inspired by Goffman's work on the organization and maintenance of social interaction [13, 14]. An obvious example of a framing assumption is that the language of the interaction will be English [13, 14]. Another is that there is an asymmetry between the knowledge and/or agency of the system (or human travel agent) and that of the user (or caller): the user cannot issue an airline ticket.</Paragraph>
      <Paragraph position="1"> In developing the DATE tagging scheme, we compared human-human travel planning dialogues collected by CMU with the human-machine dialogues of the June 2000 data collection and noticed a striking difference in the ABOUT-FRAME dimension. Namely, very few ABOUT-FRAME utterances occur in the human-human dialogues, whereas they occur frequently enough in human-computer dialogues that to ignore them is to risk obscuring significant differences in habitability of different systems. In other words, certain differences in dialogue strategies across sites could not be fully represented without such a distinction. Figure 3 provides examples motivating this dimension.</Paragraph>
      <Paragraph position="2"> Dialogue acts that are ABOUT-FRAME are cross-classified as one of three types of speech-acts: PRESENT-INFO, INSTRUCTION or APOLOGY. They are not classified as having a value on the TASK-SUBTASK dimension. Most of the ABOUT-FRAME dialogue acts fall into the speech-act category of INSTRUCTIONS, utterances directed at shaping the user's behavior and expectations about how to interact with a machine. Sites differ regarding how much instruction is provided up-front versus within the dialogue; most sites have different utterance strategies for dialogue-initial versus dialogue-medial instructions.</Paragraph>
      <Paragraph position="3"> "I heard you ask about fares. I can only price an itinerary. I cannot provide information on published fares for individual flights." INSTRUCTION: "First, always wait to hear the beep before you say anything." INSTRUCTION: "You can always start over again completely just by saying: start over."</Paragraph>
      <Paragraph position="4"> INSTRUCTION: "Before we begin, let's go over a few simple instructions." INSTRUCTION: "Please remember to speak after the tone. If you get confused at any point you can say start over to cancel your current itinerary."</Paragraph>
      <Paragraph position="5"> APOLOGY: "Sorry, an error has occurred. We'll have to start over." APOLOGY: "I am sorry I got confused. Thanks for your patience. Let us try again."</Paragraph>
      <Paragraph position="6"> APOLOGY: "Something is wrong with the flight retrieval."</Paragraph>
      <Paragraph position="7"> APOLOGY: "I have trouble with my script."</Paragraph>
      <Paragraph position="8"> One site gives up-front framing information; further, the same utterances that can occur up-front also occur dialogue-medially. A second site gives no up-front framing information, but it does provide framing information dialogue-medially. Yet a third site gives framing information dialogue-initially, but not dialogue-medially. The remaining sites provide different kinds of general instructions dialogue-initially, e.g. "Welcome. ... You may say repeat, help me out, start over, or, that's wrong, you can also correct and interrupt the system at any time." versus dialogue-medially: "Try changing your departure dates or times or a nearby city with a larger airport." This category also includes statements to the user about the system's capabilities. These occur in response to a specific question or task that the system cannot handle: "I cannot handle rental cars or hotels yet. Please restrict your requests to air travel." See Figure 3.</Paragraph>
      <Paragraph position="9"> Another type of ABOUT-FRAME utterance is the system's attempt to disambiguate the user's utterance; in response to the user specifying Springfield as a flight destination, the system indicates that this city name is ambiguous ("I know of three Springfields, in Missouri, Illinois and Ohio. Which one do you want?"). The system's utterance communicates to the user that Springfield is ambiguous, and goes further than a human would to clarify that there are only three known options. It is important for evaluation purposes to distinguish the question and the user's response from a simple question-answer sequence establishing a destination. A direct question, such as "What city are you flying to?", functions as a REQUEST-INFO speech act and solicits information about the task. The context here contrasts with a direct question in that the system has already asked for and understood a response from the caller about the destination city. Here, the function of the system turn is to remediate the caller's assumptions about the frame by indicating the system's confusion about the destination. Note that the question within this pattern could easily be reformulated as a more typical instruction statement, such as "Please specify which Springfield you mean" or "Please say Missouri, Illinois or Ohio."</Paragraph>
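The cross-classification constraints stated in this subsection can be written down as a small consistency check on labeled dialogue acts. The following is only a sketch under stated assumptions: the class, field and function names, and the task label "destination", are hypothetical and are not part of DATE or of any COMMUNICATOR tool. It encodes only the two rules given above, namely that an ABOUT-FRAME act takes one of three speech-acts and has no value on the TASK-SUBTASK dimension.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Domain(Enum):
    """The three conversational domains described in Section 2."""
    ABOUT_TASK = "about-task"
    ABOUT_COMMUNICATION = "about-communication"
    ABOUT_FRAME = "about-situation-frame"


# Speech-acts that the text allows for ABOUT-FRAME dialogue acts.
FRAME_SPEECH_ACTS = frozenset({"PRESENT-INFO", "INSTRUCTION", "APOLOGY"})


@dataclass(frozen=True)
class DialogueAct:
    """One labeled utterance (hypothetical representation)."""
    utterance: str
    domain: Domain
    speech_act: str
    task: Optional[str] = None  # value on the TASK-SUBTASK dimension, if any


def frame_violations(act: DialogueAct) -> list:
    """Return the ABOUT-FRAME constraints that this labeling breaks."""
    problems = []
    if act.domain is Domain.ABOUT_FRAME:
        if act.speech_act not in FRAME_SPEECH_ACTS:
            problems.append("speech-act not allowed for ABOUT-FRAME")
        if act.task is not None:
            problems.append("ABOUT-FRAME acts carry no TASK-SUBTASK value")
    return problems


# A labeling consistent with the examples in Figure 3.
ok = DialogueAct(
    "You can always start over again completely just by saying: start over.",
    Domain.ABOUT_FRAME,
    "INSTRUCTION",
)

# A deliberately wrong labeling: the text treats this direct question as a
# REQUEST-INFO act about the task, not as an ABOUT-FRAME act.
# The task label "destination" is an assumed name.
bad = DialogueAct(
    "What city are you flying to?",
    Domain.ABOUT_FRAME,
    "REQUEST-INFO",
    task="destination",
)

print(frame_violations(ok))
print(frame_violations(bad))
```

A check of this kind only catches labelings that contradict the stated rules; deciding whether an utterance such as the Springfield question is ABOUT-FRAME or ABOUT-TASK still depends on the dialogue context, as discussed above.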
    </Section>
  </Section>
</Paper>