<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-4001">
  <Title>Using N-Grams to Understand the Nature of Summaries</Title>
  <Section position="3" start_page="0" end_page="2" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> Jing (2002) previously examined the degree to which single-document summaries can be characterized as extractive. Based on a manual inspection of 15 human-written summaries, she proposes that for the task of single-document summarization, human summarizers use a &quot;cut-and-paste&quot; approach in which six main operations are performed: sentence reduction, sentence combination, syntactic transformation, reordering, lexical paraphrasing, and generalization or specification.</Paragraph>
    <Paragraph position="1"> The first four operations are captured in a hidden Markov model (HMM) that can be used to decompose human summaries. According to this model, 81% of the summary sentences in a corpus of 300 human-written summaries of news articles on telecommunications were found to fit the cut-and-paste method, with the rest believed to have been composed from scratch.</Paragraph>
    <Paragraph position="2"> Another recent study (Lin and Hovy, 2003) investigated the extent to which extractive methods may be sufficient for summarization in the single-document case. By computing a performance upper bound for pure sentence extraction, they found that state-of-the-art extraction-based systems are still 15%-24% away from this limit, and 10% away from average human performance. While this sheds light on how much can be gained by optimizing sentence extraction methods for single-document summarization, to our knowledge no one has assessed the potential of extraction-based systems for summarizing multiple documents.</Paragraph>
  </Section>
</Paper>