<?xml version="1.0" standalone="yes"?> <Paper uid="W01-1610"> <Title>Labeling Corrections and Aware Sites in Spoken Dialogue Systems</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Compared to many other systems, spoken dialogue systems (SDS) tend to have more difficulties in correctly interpreting user input. Whereas a car will normally go left if the driver turns the steering wheel in that direction or a vacuum cleaner will start working if one pushes the on-button, interactions between a user and a spoken dialogue system are often hampered by mismatches between the action intended by the user and the action executed by the system. Such mismatches are mainly due to errors in the Automatic Speech Recognition (ASR) and/or the Natural Language Understanding (NLU) component of these systems. To solve these mismatches, users often have to put considerable effort in trying to make it clear to the system that there was a problem, and trying to correct it by re-entering misrecognized or misinterpreted information. Previous research has already brought to light that it is not always easy for users to determine whether their intended actions were carried out correctly or not, in particular when the dialogue system does not give appropriate feedback about its internal representation at the right moment.</Paragraph> <Paragraph position="1"> In addition, users' corrections may miss their goal, because corrections themselves are more difficult for the system to recognize and interpret correctly, which may lead to so-called cyclic (or spiral) errors. 
That corrections are difficult for ASR systems is generally explained by the fact that they tend to be hyperarticulated (higher, louder, longer ...) than other turns (Wade et al., 1992; Oviatt et al., 1996; Levow, 1998; Bell and Gustafson, 1999; Shimojima et al., 1999), where ASR models are not well adapted to handle this special speaking style.</Paragraph> <Paragraph position="2"> The current paper focuses on user corrections, and looks at places where people first become aware of a system problem (&quot;aware sites&quot;). In other papers (Swerts et al., 2000; Hirschberg et al., 2001; Litman et al., 2001), we have already given some descriptive statistics on corrections and aware sites and we have been looking at methods to automatically predict these two utterance categories. One of our major findings is that prosody, which had already been shown to be a good predictor of misrecognitions (Litman et al., 2000; Hirschberg et al., 2000), is also useful to correctly classify corrections and aware sites.</Paragraph> <Paragraph position="3"> In this paper, we will elaborate more on the exact labeling scheme we used, and add further descriptive statistics. More in particular, we address the question whether there is much variance in the way people react to system errors, and if so, to what extent this variance can be explained on the basis of particular properties of the dialogue system. In the following section we first provide details on the TOOT corpus that we used for our analyses.</Paragraph> <Paragraph position="4"> Then we give information on the labels for corrections and aware sites, and on the actual labeling procedure. The next section gives the results of some descriptive statistics on properties of corrections and aware sites and on their distributions. We will end the paper with a general discussion of our findings.</Paragraph> </Section> </Paper>