<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-4001">
  <Title>Using N-Grams to Understand the Nature of Summaries</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Although single-document summarization is a well-studied task, the nature of multi-document summarization is only beginning to be studied in detail. While close attention has been paid to what technologies are necessary when moving from single to multi-document summarization, the properties of human-written multi-document summaries have not been quantified. In this paper, we empirically characterize human-written summaries provided in a widely used summarization corpus by attempting to answer the questions: Can multi-document summaries that are written by humans be characterized as extractive or generative? Are multi-document summaries less extractive than single-document summaries? Our results suggest that extraction-based techniques which have been successful for single-document summarization may not be sufficient when summarizing multiple documents.</Paragraph>
  </Section>
</Paper>