File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/n04-4001_abstr.xml
Size: 1,157 bytes
Last Modified: 2025-10-06 13:43:30
<?xml version="1.0" standalone="yes"?> <Paper uid="N04-4001"> <Title>Using N-Grams to Understand the Nature of Summaries</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> Although single-document summarization is a well-studied task, the nature of multi-document summarization is only beginning to be studied in detail. While close attention has been paid to what technologies are necessary when moving from single to multi-document summarization, the properties of human-written multi-document summaries have not been quantified. In this paper, we empirically characterize human-written summaries provided in a widely used summarization corpus by attempting to answer the questions: Can multi-document summaries that are written by humans be characterized as extractive or generative? Are multi-document summaries less extractive than single-document summaries? Our results suggest that extraction-based techniques which have been successful for single-document summarization may not be sufficient when summarizing multiple documents.</Paragraph> </Section> </Paper>