<?xml version="1.0" standalone="yes"?> <Paper uid="N06-1047"> <Title>Incorporating Speaker and Discourse Features into Speech Summarization</Title> <Section position="3" start_page="0" end_page="367" type="intro"> <SectionTitle> 2. Previous Work </SectionTitle> <Paragraph position="0"> In speech summarization generally, research investigating speech-specific characteristics has focused largely on prosodic features such as F0 mean and standard deviation, pause information, syllable duration and energy. Koumpis and Renals [1] investigated prosodic features for summarizing voicemail messages in order to send voicemail summaries to mobile devices. Hori et al. [6] developed an integrated speech summarization approach, based on finite state transducers, in which the recognition and summarization components are composed into a single finite state transducer, and reported results on a lecture summarization task. In the Broadcast News domain, Maskey and Hirschberg [7] found that the best summarization results were obtained by combining prosodic, lexical, and structural features, while Ohtake et al. [8] explored using only prosodic features for summarization. Maskey and Hirschberg similarly found that prosodic features alone resulted in good-quality summaries of Broadcast News.</Paragraph> <Paragraph position="1"> In the meetings domain (using the ICSI corpus), Murray et al. [2] compared text summarization approaches with feature-based approaches using prosodic features, with human judges favoring the feature-based approaches. Zechner [9] investigated summarizing several genres of speech, including spontaneous meeting speech. 
Though relevance detection in his work relied largely on tf.idf scores, Zechner also explored cross-speaker information linking and question/answer detection, so that utterances could be extracted not only according to high tf.idf scores, but also if they were linked to other informative utterances.</Paragraph> <Paragraph position="2"> Similarly, this work aims to detect important utterances that may not be detectable according to lexical features or prosodic prominence, but are nonetheless linked to high speaker activity, decision-making, or meeting structure.</Paragraph> </Section> </Paper>