<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1047">
  <Title>Incorporating Speaker and Discourse Features into Speech Summarization</Title>
  <Section position="3" start_page="0" end_page="367" type="intro">
    <SectionTitle>
2. Previous Work
</SectionTitle>
    <Paragraph position="0"> In speech summarization generally, research on speech-specific characteristics has focused largely on prosodic features such as F0 mean and standard deviation, pause information, syllable duration, and energy. Koumpis and Renals [1] investigated prosodic features for summarizing voicemail messages so that the summaries could be sent to mobile devices. Hori et al. [6] developed an integrated speech summarization approach in which the recognition and summarization components are composed into a single finite state transducer, and reported results on a lecture summarization task. In the Broadcast News domain, Maskey and Hirschberg [7] found that the best summarization results were obtained with prosodic, lexical, and structural features, while Ohtake et al. [8] explored using only prosodic features for summarization. Maskey and Hirschberg similarly found that prosodic features alone resulted in good-quality summaries of Broadcast News.</Paragraph>
    <Paragraph position="1"> In the meetings domain (using the ICSI corpus), Murray et al. [2] compared text summarization approaches with feature-based approaches using prosodic features, with human judges favoring the feature-based approaches. Zechner [9] investigated summarizing several genres of speech, including spontaneous meeting speech. Though relevance detection in his work relied largely on tf.idf scores, Zechner also explored cross-speaker information linking and question/answer detection, so that utterances could be extracted not only according to high tf.idf scores, but also if they were linked to other informative utterances.</Paragraph>
    <Paragraph position="2"> Similarly, this work aims to detect important utterances that may not be detectable from lexical features or prosodic prominence, but that are nonetheless linked to high speaker activity, decision-making, or meeting structure.</Paragraph>
  </Section>
</Paper>