<?xml version="1.0" standalone="yes"?>
<Paper uid="P01-1048">
  <Title>Predicting User Reactions to System Error</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Data
</SectionTitle>
    <Paragraph position="0"> The TOOT corpus was collected using an experimental SDS developed to compare dialogue strategies. The system provides access to train information over the phone and is implemented on an internal platform that combines ASR, text-to-speech, a phone interface, and modules for specifying a finite-state dialogue manager and application functions. Subjects performed four tasks with versions of TOOT that varied confirmation type and locus of initiative (system initiative with explicit system confirmation, user initiative with no system confirmation until the end of the task, mixed initiative with implicit system confirmation), as well as whether the user could change versions at will using voice commands. The subjects were 39 students: 20 native speakers of standard American English and 19 non-native speakers; 16 were female and 23 male. The exchanges were recorded, and system and user behavior was logged automatically. Dialogues were manually transcribed, and each user turn was automatically compared with the corresponding one-best ASR string to produce a word accuracy (WA) score for the turn.</Paragraph>
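The per-turn WA scoring described above can be sketched as follows. The source does not give the formula, so this assumes the common definition WA = 1 - (substitutions + deletions + insertions) / (number of reference words), computed from a word-level edit distance. The function name `word_accuracy` and the example strings are illustrative assumptions, not part of the TOOT system.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - (word-level edit distance / reference length)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 1.0 - prev[-1] / len(ref)


# Hypothetical transcript and one-best string: one substitution in six words.
print(word_accuracy("get me the train on sunday", "get me the train on monday"))
```

Under this assumed definition, WA can fall below zero when the recognized string contains many inserted words, so downstream code should not treat it as a probability.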
    <Paragraph position="1"> Each turn's concept accuracy (CA) was labeled by the experimenters from the dialogue recordings and the system log; if the recognizer correctly captured all the task-related information given in the user's original input (e.g. date, time, departure or arrival cities), the turn was given a CA score of 1, indicating a semantically correct recognition.</Paragraph>
    <Paragraph position="2"> Otherwise, the CA score reflected the percentage of correctly recognized task concepts in the turn.</Paragraph>
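The CA scoring rule above (1 when every task concept is captured, otherwise the share of correctly recognized concepts) can be sketched as below. In the study CA was labeled by hand from recordings and logs; the dictionary representation, the slot names, and the function name `concept_accuracy` are assumptions made only for illustration.

```python
def concept_accuracy(spoken: dict, recognized: dict) -> float:
    """Fraction of the user's task concepts that the recognizer captured correctly."""
    if not spoken:
        # The source does not say how turns without task concepts are scored.
        raise ValueError("turn contains no task concepts to score")
    correct = sum(1 for slot, value in spoken.items()
                  if recognized.get(slot) == value)
    return correct / len(spoken)


# Hypothetical turn: both cities are captured, date and time are not.
spoken = {"arrival": "New York City", "departure": "Baltimore",
          "date": "Sunday", "time": "8:30 pm"}
recognized = {"arrival": "New York City", "departure": "Baltimore",
              "date": "today", "time": "anytime"}
print(concept_accuracy(spoken, recognized))
```

A return value of 1.0 corresponds to a semantically correct recognition; any lower value marks a CA error for the turn.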
    <Paragraph position="3"> For the study described below, we examined 2328 user turns from 152 dialogues generated during these experiments. Of these turns, 194 were rejected by the system.</Paragraph>
    <Paragraph position="4"> To identify the different turn categories in the corpus, two authors independently labeled each turn as to whether or not it constituted a correction of a prior system failure (a CA error or a rejection) and what turn was being corrected, and whether or not it represented an aware site for a prior failure, and, if so, the turn which the system had failed on. Labeler disagreement was subsequently resolved by consensus. The fragment in Figure 1, produced with a version of TOOT in which the user has the initiative with no confirmation until the end of the task, illustrates these labels. This example illustrates cases of corraware, in which both the user's awareness and correction of a misrecognition occur in the same turn (e.g. turns 1159 and 1160, after system prompts for information already given in turn 1158). It also illustrates cases in which aware sites and corrections occur in different turns. For example, after the immediate explicit system confirmation of turn 1162, the user first becomes aware of the system errors (turn 1163), then separately corrects them (turn 1164); turn 1163 is thus an aware turn and turn 1164 a corr. When no immediate confirmation of an utterance occurs (as with turn 1158), it may take several turns before the user becomes aware of any misrecognition errors. For example, it is not until turn 1161 that the user first becomes aware of the error in date and time from 1158; the user then corrects the error in 1162. So, 1161 is classified as an aware and 1162 as a corr. Note that corr turns represent 13% of the turns in our corpus, awares represent 14%, corrawares account for 16%, and norm turns represent 57% of the turns in the corpus.

Figure 1 (dialogue fragment; columns: Turn | Turn ID | Aware of | Corr of | Type)
S: How may I help you?
U: Get me the train to New York City from Baltimore on Sunday at eight thirty pm | 1158 | NA | NA | NORM
S: Which city do you want to go to?
U: New York City | 1159 | 1158 | 1158 | CORRAWARE
S: Which city do you want to leave from?
U: Baltimore | 1160 | 1158 | 1158 | CORRAWARE
S: Do you want me to find the trains from Baltimore to New York City today at anytime now?
U: No | 1161 | 1158 | NA | AWARE
S: How may I help you?
U: Get me the train to New York City from Baltimore on Sunday at eight thirty pm | 1162 | NA | 1158 | CORR
S: Do you want me to find the trains from Baltimore to New York City today at anytime now?
U: No | 1163 | 1162 | NA | AWARE
S: How may I help you?
U: Get me the train on Sunday at eight thirty pm | 1164 | NA | 1162 | CORR
</Paragraph>
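The relation between the two annotations (aware of, corr of) and the four turn types can be sketched as below, using the turn IDs from the Figure 1 fragment. The mapping rule is inferred from that fragment, and the names `turn_type` and `FIGURE_1` are illustrative assumptions; the authors assigned the labels manually.

```python
from collections import Counter


def turn_type(aware_of, corr_of):
    """Derive the turn type from the two labels; None stands for NA."""
    if aware_of is not None and corr_of is not None:
        return "CORRAWARE"
    if aware_of is not None:
        return "AWARE"
    if corr_of is not None:
        return "CORR"
    return "NORM"


# (turn ID, aware of, corr of) as listed in the Figure 1 fragment.
FIGURE_1 = [
    (1158, None, None),
    (1159, 1158, 1158),
    (1160, 1158, 1158),
    (1161, 1158, None),
    (1162, None, 1158),
    (1163, 1162, None),
    (1164, None, 1162),
]

types = {turn_id: turn_type(aware_of, corr_of)
         for turn_id, aware_of, corr_of in FIGURE_1}
counts = Counter(types.values())

for label, count in sorted(counts.items()):
    print(f"{label}: {count} of {len(FIGURE_1)} turns")
```

Applied to the full set of labeled turns instead of this seven-turn fragment, the same counting would yield the corpus-level shares of corr, aware, corraware, and norm turns.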
  </Section>
</Paper>