<?xml version="1.0" standalone="yes"?> <Paper uid="P03-1071"> <Title>Discourse Segmentation of Multi-Party Conversation</Title> <Section position="7" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Conclusions </SectionTitle> <Paragraph position="0"> We presented a domain-independent segmentation algorithm for multi-party conversation that integrates features based on content with features based on form. The learned combination of features results in a significant increase in accuracy over previous approaches to segmentation when applied to meetings.</Paragraph> <Paragraph position="1"> Features based on form that are likely to indicate topic shifts are automatically extracted from speech. Content-based features are computed by a segmentation algorithm that uses a metric of lexical cohesion and performs as well as state-of-the-art text-based segmentation techniques. It works with both written and spoken texts. The text-based segmentation approach alone, when applied to meetings, outperforms all other segmenters, although the difference is not statistically significant.</Paragraph> <Paragraph position="2"> In future work, we would like to investigate the effects of adding prosodic features, such as pitch ranges, to our segmenter, as well as the effect of using errorful speech recognition transcripts as opposed to manually transcribed utterances.</Paragraph> <Paragraph position="3"> An implementation of our lexical cohesion segmenter is freely available for educational or research purposes.</Paragraph> </Section> </Paper>