<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1204">
  <Title>Evaluation of Features for Sentence Extraction on Different Types of Corpora</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have presented evaluation results for our sentence extraction system and analyzed its features on corpora that differ in both language (Japanese and English) and type (newspaper articles and lectures). The system is based on four major features, and it achieved some of the top results at evaluation workshops in 2001 for summarizing Japanese newspaper articles (TSC) and English newspaper articles (DUC). For Japanese lectures, the system also obtained comparable results when sentence boundaries were given.</Paragraph>
    <Paragraph position="1"> Our analysis of the features used in this sentence extraction system has shown that they are not necessarily independent of one another, as indicated by their rank correlation coefficients. The analysis also indicated that categorized feature values match the distribution of key sentences better than sequential feature values do.</Paragraph>
    <Paragraph position="2"> Several features that were not described here, such as specific lexical expressions and syntactic information, are also used in sentence extraction systems. In future work, we will analyze and use these features to improve the performance of our sentence extraction system.</Paragraph>
  </Section>
</Paper>