<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2025">
  <Title>Minimizing Word Error Rate in Textual Summaries of Spoken Language</Title>
  <Section position="4" start_page="0" end_page="186" type="intro">
    <SectionTitle>
2 Related work
</SectionTitle>
    <Paragraph position="0"> (Waibel et al., 1998) report results of their summarization system on automatically transcribed SWITCHBOARD data (Godfrey et al., 1992), the word error rate being about 30%. In a question-answer test with summaries of five dialogues, subjects could identify most of the key concepts using a summary size of only five turns. However, the results vary widely across five different dialogues tested in this experiment (between 20% and 90% accuracy).</Paragraph>
    <Paragraph position="1"> (Valenza et al., 1999) went one step further and reported that they were able to reduce the word error rate in summaries (as opposed to full texts) by using speech recognizer confidence scores. They combined inverse frequency weights with confidence scores for each recognized word. Using summaries composed of one 30-gram per minute (approximately 15% of the length of the full text), the WER dropped from 25% for the full text to 10% for these summaries. They also conducted a qualitative study in which human subjects were given summaries of n-grams of different lengths as well as summaries with speaker utterances as minimal units, giving a high weight either to the inverse frequency scores or to the confidence scores. The utterance summaries were considered best, followed closely by the 30-gram summaries, both using high confidence score weights. This suggests that extracting passages that are more likely to be correctly recognized not only lowers the WER but also yields summaries that seem to be &amp;quot;better&amp;quot;.</Paragraph>
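The scoring scheme described above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the function names, the linear interpolation between the two terms, and the idf smoothing are all assumptions introduced here; (Valenza et al., 1999) only specify that inverse frequency weights are combined with per-word confidence scores and that candidate passages are ranked by the result.

```python
# Hedged sketch: rank candidate passages (n-grams or utterances) by a
# per-word score combining an inverse-frequency weight with the speech
# recognizer's confidence. The linear mix via conf_weight is an assumption.
import math
from collections import Counter


def idf_weights(vocabulary, documents):
    # Smoothed inverse document frequency over a background collection.
    df = Counter()
    for doc in documents:
        for w in set(doc):
            df[w] += 1
    n = len(documents)
    return {w: math.log((n + 1) / (df[w] + 1)) for w in vocabulary}


def passage_score(words, confidences, idf, conf_weight=0.5):
    # conf_weight trades off confidence against inverse frequency:
    # 1.0 ranks purely by recognizer confidence, 0.0 purely by idf.
    return sum(conf_weight * c + (1 - conf_weight) * idf.get(w, 0.0)
               for w, c in zip(words, confidences))


def summarize(passages, idf, k=3, conf_weight=0.5):
    # passages: list of (words, confidences); keep the k top-scoring ones.
    ranked = sorted(passages,
                    key=lambda p: passage_score(*p, idf, conf_weight),
                    reverse=True)
    return ranked[:k]
```

Raising `conf_weight` biases the summary toward passages the recognizer is sure about, which is the mechanism behind the WER reduction reported above.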
    <Paragraph position="2"> While the results of (Valenza et al., 1999) are indicative of the potential of their approach, we want to investigate the benefits of using speech recognizer confidence scores in more detail and, in particular, characterize the trade-off between WER and summarization accuracy as we vary the influence of the confidence scores. To our knowledge, this paper is the first to address this trade-off in a clear, numerically describable way. To obtain numerical values for summary accuracy, we had our corpus annotated for relevance (section 5) and devised an evaluation scheme that allows the calculation of summary accuracy for both human and machine generated transcripts (section 4).</Paragraph>
  </Section>
</Paper>