<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1044">
  <Title>Combining Acoustic and Pragmatic Features to Predict Recognition Performance in Spoken Dialogue Systems</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Relation to Previous Work
</SectionTitle>
    <Paragraph position="0"> (Litman et al., 2000) use acoustic-prosodic information extracted from speech waveforms, together with information derived from their speech recognizer, to automatically predict misrecognized turns in a corpus of train-timetable information dialogues.</Paragraph>
    <Paragraph position="1"> In our experiments, we also use recognizer confidence scores and a limited number of acoustic-prosodic features (e.g. amplitude in the speech signal) for hypothesis classification. (Walker et al., 2000) use a combination of features from the speech recognizer, natural language understanding, and dialogue manager/discourse history to classify hypotheses as correct, partially correct, or misrecognized. Our work is related to these experiments in that we also combine confidence scores and higher-level features for classification. However, both (Litman et al., 2000) and (Walker et al., 2000) consider only single-best recognition results and thus use their classifiers as &quot;filters&quot; to decide whether the best recognition hypothesis for a user utterance is correct or not. We go a step further in that we classify n-best hypotheses and then select among the alternatives. We also explore the use of more dialogue and task-oriented features (e.g. the dialogue move type of a recognition hypothesis) for classification.</Paragraph>
    <Paragraph position="2"> The main difference between our approach and work on hypothesis reordering (e.g. (Chotimongkol and Rudnicky, 2001)) is that we make a decision regarding whether a dialogue system should accept, clarify, reject, or ignore a user utterance. Furthermore, our approach is more generally applicable than preceding research, since we frame our methodology in the Information State Update (ISU) approach to dialogue management (Traum et al., 1999) and therefore expect it to be applicable to a range of related multimodal dialogue systems.</Paragraph>
  </Section>
</Paper>