<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-0402">
  <Title>Mining Discourse Markers for Chinese Textual Summarization</Title>
  <Section position="9" start_page="17" end_page="18" type="concl">
    <SectionTitle>
8. Conclusion
</SectionTitle>
    <Paragraph position="0"> In this paper we discuss the use of discourse markers in Chinese text summarization. Discourse structure trees with nodes representing RST (Rhetorical Structure Theory) relations are built, and summarization is achieved by trimming unimportant sentences on the basis of the relative saliency or rhetorical relations. To study discourse markers for use in the automatic summarization of Chinese, we have designed and implemented the SIFAS system. We investigate the relationships between various linguistic features and different aspects of discourse marker usage in naturally occurring text. An encoding scheme that captures the essential features of discourse marker usage is introduced. A heuristic-based algorithm for automatic tagging of discourse markers is designed to alleviate the burden on a human encoder in developing a large corpus of encoded texts and to discover potential problems in automatic discourse marker tagging. A study on applying machine learning techniques to discourse marker disambiguation is also conducted, in which C4.5 is used to generate decision tree classifiers. Our results indicate that machine learning is a promising approach to improving the accuracy of discourse marker tagging.</Paragraph>
  </Section>
</Paper>