<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2318">
<Title>Prosodic Cues to Discourse Segment Boundaries in Human-Computer Dialogue</Title>
<Section position="4" start_page="0" end_page="0" type="relat">
<SectionTitle> 2 Related Work </SectionTitle>
<Paragraph position="0"> Cues to discourse structure and its automatic segmentation have been studied most extensively for written and spoken monologue. For written narrative, discourse segment boundaries have been identified on the basis of textual topic similarity, with a variety of approaches building on Hearst's TextTiling (Hearst, 1994). More complex rhetorical structure theory trees have also been extracted, relying heavily on cue phrases and discourse markers (Marcu, 2000).</Paragraph>
<Paragraph position="1"> In spoken monologue, prosodic cues to discourse structure and segmentation have been explored by (Nakatani et al., 1995; Swerts, 1997). Increases in pitch range, amplitude, and silence duration appear to signal discourse segment boundaries across different domains (voicemail, broadcast news, descriptive narrative) and across different languages, such as English and Dutch.</Paragraph>
<Paragraph position="2"> Comparable prosodic cues have been applied to the related task of news story segmentation, in conjunction with textual cues to topicality, by (Tur et al., 2001), where large pitch differences between pre- and post-boundary positions play the most significant role among the prosodic cues.</Paragraph>
<Paragraph position="3"> In spoken dialogue, research has focused on the identification of dialogue acts and dialogue games. The integration of textual and prosodic cues, such as particular pitch accent or contour types, has been found useful for identifying act type (Shriberg et al., 1998; Taylor et al., 1998). Specific classes of dialogue act, such as corrections (request repair), have received particular interest in work by (Levow, 1998; Swerts et al., 2000) in the context of human-computer error resolution. Recent work on the ICSI multi-party meeting recorder data has demonstrated some very preliminary results on multi-party segmentation (Galley et al., 2003); prosodic information in this case was limited to silence duration.</Paragraph>
<Paragraph position="4"> With the exception of work on error resolution, most work on dialogue has focused on human-human interaction and on the identification of particular act or game types. Here we concentrate on the general question of discourse segmentation in voice-only human-computer interaction. We ask whether the cues to segment structure identified for monologue are robust to the change in the number and type of conversational participants.</Paragraph>
</Section>
</Paper>