<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1129">
  <Title>Syntactic Simplification for Improving Content Selection in Multi-Document Summarization</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Conclusions and Future Work
</SectionTitle>
    <Paragraph position="0"> We have demonstrated that simplifying news reports by removing parenthetical information results in better sentence clustering and consequently better summarization. We have further demonstrated that using a reference rewriting module to reintroduce parentheticals as a post-process does not significantly affect the score on an automated content-evaluation metric; indeed, we believe that a more sophisticated rewriting module might improve performance on content selection. In addition, the summaries produced by our summarizer closely resemble human summaries in surface features such as average sentence length and the distribution of relative clauses and appositives.</Paragraph>
    <Paragraph position="1"> The results in this paper might be useful to generative approaches to summarization. It is likely that the improved clustering will make operations like information fusion (Barzilay, 2003; Dalianis and Hovy, 1996) within clusters more reliable. We plan to examine whether this is indeed the case.</Paragraph>
    <Paragraph position="2"> We feel that the performance of our summarizer is encouraging (it performs at 90% of human performance as measured by Rouge), as it is conceptually very simple: it selects informative sentences from the largest clusters and does not contain any theoretically inelegant optimizations, such as excluding overly long or short sentences.</Paragraph>
    <Paragraph position="3"> Our approach of extracting parentheticals as a pre-process also provides a framework for reference rewriting, by allowing the summarizer to select background information independently of the main content. We believe that much research remains to be done on generating references in open domains, and we will address this issue in future work.</Paragraph>
  </Section>
</Paper>