<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1085">
  <Title>Automatic Detection and Correction of Repairs in Human-Computer Dialog*</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> Spontaneous spoken language often includes speech that is not intended by the speaker to be part of the content of the utterance. This speech must be detected and deleted in order to correctly identify the intended meaning. This broad class of disfluencies encompasses a number of phenomena, including word fragments, interjections, filled pauses, restarts, and repairs. We are analyzing the repairs in a large subset (over ten thousand sentences) of spontaneous speech data collected for the DARPA spoken language program. We have categorized these disfluencies as to type and frequency, and are investigating methods for their automatic detection and correction. Here we report promising results on detection and correction of repairs by combining pattern matching, syntactic and semantic analysis, and acoustics.</Paragraph>
    <Paragraph position="1"> The problem of disfluent speech for language understanding systems has been noted but has received limited attention. Hindle [5] attempts to delimit and correct repairs in spontaneous human-human dialog, based on transcripts containing an &quot;edit signal,&quot; or external and reliable marker at the &quot;expunction point,&quot; or point of interruption. Carbonell and Hayes [4] briefly describe recovery strategies for broken-off and restarted utterances in textual input. Ward [13] addresses repairs in spontaneous speech, but does not attempt to identify or correct them. Our approach is most similar to that of Hindle. It differs, however, in that we make no assumption about the existence of an explicit edit signal. As a reliable edit signal has yet to be found, we take it as our problem to find the site of the repair automatically.</Paragraph>
    <Paragraph position="2"> *This research was supported by the Defense Advanced Research Projects Agency under Contract ONR N00014-90-C-0085 with the Office of Naval Research. It was also supported by a Grant, NSF IRI-8905249, from the National Science Foundation. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency of the U.S. Government, or of the National Science Foundation.</Paragraph>
    <Paragraph position="3"> ¹Elizabeth Shriberg is also affiliated with the Department of Psychology at the University of California at Berkeley.</Paragraph>
    <Paragraph position="4"> It is the case, however, that cues to repair exist over a range of syllables. Research in speech production has shown that repairs tend to be marked prosodically [8] and there is perceptual evidence from work using lowpass-filtered speech that human listeners can detect the occurrence of a repair in the absence of segmental information [9].</Paragraph>
    <Paragraph position="5"> In the sections that follow, we describe in detail our corpus of spontaneous speech data and present an analysis of the repair phenomena observed. In addition, we describe ways in which pattern matching, syntactic and semantic analysis, and acoustic analysis can be helpful in detecting and correcting these repairs. We use pattern matching to determine an initial set of possible repairs; we then apply information from syntactic, semantic, and acoustic analyses to distinguish actual repairs from false positives.</Paragraph>
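    <Paragraph position="6"> The pattern-matching stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the specific pattern (a word sequence that recurs after a short gap), the names find_candidate_repairs and delete_first_copy, and the limits max_len and max_gap are all assumptions made for illustration. The sketch only proposes candidates; deciding which candidates are actual repairs would still require the syntactic, semantic, and acoustic information mentioned in the text.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    start: int   # index of the first word of the first copy
    length: int  # number of words in the repeated sequence
    gap: int     # number of words between the two copies


def find_candidate_repairs(words, max_len=3, max_gap=2):
    """Propose spans where a word sequence recurs after a short gap.

    Hypothetical pattern for illustration: M1 X M2, where M1 equals M2
    and X holds at most max_gap words. Matches are candidates only.
    """
    lowered = [w.lower() for w in words]
    found = []
    for start in range(len(lowered)):
        for length in range(1, max_len + 1):
            first = lowered[start:start + length]
            if len(first) != length:
                break  # ran off the end of the utterance
            for gap in range(max_gap + 1):
                second_start = start + length + gap
                second = lowered[second_start:second_start + length]
                if first == second:
                    found.append(Candidate(start, length, gap))
    return found


def delete_first_copy(words, cand):
    """Apply one candidate correction: drop the first copy and the gap."""
    cut_end = cand.start + cand.length + cand.gap
    return words[:cand.start] + words[cut_end:]


if __name__ == "__main__":
    utterance = "show me the the flights".split()
    for cand in find_candidate_repairs(utterance):
        print(cand, " ".join(delete_first_copy(utterance, cand)))
```

    Such a matcher also fires on fluent input: in "flights to boston to denver" the repeated "to" matches the pattern even though nothing may need deleting, which is why a later filtering step is needed to separate actual repairs from false positives.</Paragraph>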
  </Section>
</Paper>