<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2906">
  <Title>Assessing Prosodic and Text Features for Segmentation of Mandarin Broadcast News</Title>
  <Section position="3" start_page="0" end_page="0" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> Most prior research on automatic topic segmentation has been applied to clean text only and thus used textual features. Text-based segmentation approaches have utilized term-based similarity measures computed across candidate segments (Hearst, 1994) and also discourse markers to identify discourse structure (Marcu, 2000).</Paragraph>
    <Paragraph position="1"> The Topic Detection and Tracking (TDT) evaluations focused on segmentation of both text and speech sources.</Paragraph>
    <Paragraph position="2"> This framework introduced new challenges in dealing with errorful automatic transcriptions as well as new opportunities to exploit cues in the original speech. The most successful approach (Beeferman et al., 1999) produced automatic segmentations that yielded retrieval results comparable to those with manual segmentations, using text and silence features. Tur et al. (2001) applied both a prosody-only and a mixed text-prosody model to segmentation of TDT English broadcast news, with the best results obtained by combining text and prosodic features.</Paragraph>
    <Paragraph position="3"> Hirschberg and Nakatani (1998) also examined automatic topic segmentation based on prosodic cues, in the domain of English broadcast news.</Paragraph>
    <Paragraph position="4"> Work in discourse analysis (Nakatani et al., 1995; Swerts, 1997) in both English and Dutch has identified features such as changes in pitch range, intensity, and speaking rate associated with segment boundaries and with boundaries of different strengths.</Paragraph>
  </Section>
</Paper>