<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2024">
  <Title>Cut and Paste Based Text Summarization</Title>
  <Section position="3" start_page="178" end_page="179" type="intro">
    <SectionTitle>
2 Cut and paste in summarization
2.1 Related work in professional summarizing
</SectionTitle>
    <Paragraph position="0"> Professionals take two opposite positions on whether a summary should be produced by cutting and pasting the original text. One school of scholars is opposed: &quot;(use) your own words... Do not keep too close to the words before you&quot;, states an early book on abstracting for American high school students (Thurber, 1924). Another study, however, shows that professional abstractors actually rely on cutting and pasting to produce summaries: &quot;Their professional role tells abstractors to avoid inventing anything. They follow the author as closely as possible and reintegrate the most important points of a document in a shorter text&quot; (Endres-Niggemeyer et al., 1998). Some studies fall somewhere in between: &quot;summary language may or may not follow that of author's&quot; (Fidel, 1986). Other guidelines or books on abstracting (ANSI, 1997; Cremmins, 1982) do not discuss the issue.</Paragraph>
    <Paragraph position="1"> Our cut and paste based summarization is a computational model; we make no claim that humans use the same cut and paste operations.</Paragraph>
    <Section position="1" start_page="178" end_page="179" type="sub_section">
      <SectionTitle>
2.2 Cut and paste operations
</SectionTitle>
      <Paragraph position="0"> We manually analyzed 30 articles and their corresponding human-written summaries. The articles and summaries come from different domains (15 general news reports, 5 from the medical domain, 10 from the legal domain), and the summaries were written by professionals from different organizations.</Paragraph>
      <Paragraph position="1"> We found that reusing article text for summarization is almost universal in the corpus we studied. We defined six operations that can be used alone, sequentially, or simultaneously to transform selected sentences from an article into the corresponding summary sentences in its human-written abstract: (1) sentence reduction Remove extraneous phrases from a selected sentence, as in the following example (footnote 1: all the examples in this section were produced by human professionals). The deleted material can be at any granularity: a word, a phrase, or a clause. Multiple components can be removed.</Paragraph>
      <Paragraph position="2"> (2) sentence combination Merge material from several sentences. It can be used together with sentence reduction, as illustrated in the following example, which also uses paraphrasing: Text Sentence 1: But it also raises serious questions about the privacy of such highly personal information wafting about the digital world.</Paragraph>
      <Paragraph position="3"> Text Sentence 2: The issue thus fits squarely into the broader debate about privacy and security on the internet, whether it involves protecting credit card number or keeping children from offensive information.</Paragraph>
      <Paragraph position="4"> Summary sentence: But it also raises the issue of privacy of such personal information and this issue hits the head on the nail in the broader debate about privacy and security on the internet.</Paragraph>
      <Paragraph position="5">  (3) syntactic transformation  In both sentence reduction and combination, syntactic transformations may be involved. For example, the position of the subject in a sentence may be moved from the end to the front.</Paragraph>
      <Paragraph position="6"> (4) lexical paraphrasing Replace phrases with their paraphrases. For instance, in the previous examples the summaries replaced &quot;point out&quot; with &quot;note&quot;, and &quot;fits squarely into&quot; with the more picturesque &quot;hits the head on the nail in&quot;.</Paragraph>
      <Paragraph position="7"> (5) generalization or specification Replace phrases or clauses with more general or specific descriptions. Examples of generalization and specification include: Generalization: &quot;a proposed new law that would require Web publishers to obtain parental consent before collecting personal information from children&quot; → &quot;legislation to protect children's privacy on-line&quot; Specification: &quot;the White House's top drug official&quot; → &quot;Gen. Barry R. McCaffrey, the White House's top drug official&quot;</Paragraph>
      <Paragraph position="8"> (6) reordering Change the order of extracted sentences. For instance, place an ending sentence in an article at the beginning of an abstract.</Paragraph>
      <Paragraph position="9"> In human-written abstracts there are, of course, sentences that are not based on cut and paste but are written entirely from scratch. We used our decomposition program to automatically analyze 300 human-written abstracts and found that 19% of the sentences in the abstracts were written from scratch. There are also other cut and paste operations that are not listed here because they occur infrequently.</Paragraph>
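      <Paragraph position="10"> Several of the operations listed above can be illustrated with a minimal Python sketch. This is not the system described in this paper: the function names (reduce_sentence, paraphrase, combine, reorder) are hypothetical, the spans to delete and the paraphrase pairs are supplied by hand rather than chosen automatically, and syntactic transformation and generalization or specification are left out because they need linguistic resources that a short example cannot supply. The demonstration reuses Text Sentence 2 from the sentence combination example.

```python
from typing import Dict, List, Sequence, Tuple


def reduce_sentence(tokens: Sequence[str],
                    spans_to_delete: Sequence[Tuple[int, int]]) -> List[str]:
    """Sentence reduction: drop tokens covered by half-open (start, end) spans."""
    drop = set()
    for start, end in spans_to_delete:
        drop.update(range(start, end))
    return [tok for i, tok in enumerate(tokens) if i not in drop]


def paraphrase(text: str, substitutions: Dict[str, str]) -> str:
    """Lexical paraphrasing: replace each listed phrase with its paraphrase."""
    for old, new in substitutions.items():
        text = text.replace(old, new)
    return text


def combine(parts: Sequence[str], connective: str = "and") -> str:
    """Sentence combination: join material from several sentences."""
    return f" {connective} ".join(p.rstrip(". ") for p in parts) + "."


def reorder(sentences: Sequence[str], new_order: Sequence[int]) -> List[str]:
    """Reordering: emit the extracted sentences in the given index order."""
    return [sentences[i] for i in new_order]


# Hand-specified demonstration on Text Sentence 2 (assumed whitespace tokens).
sentence_2 = ("The issue thus fits squarely into the broader debate about "
              "privacy and security on the internet, whether it involves "
              "protecting credit card number or keeping children from "
              "offensive information.")
tokens = sentence_2.split()
cut = tokens.index("whether")  # start of the clause that the summary omits
reduced = " ".join(reduce_sentence(tokens, [(cut, len(tokens))])).rstrip(",")
rewritten = paraphrase(
    reduced, {"fits squarely into": "hits the head on the nail in"})
print(rewritten)
```

Keeping each operation as a separate function mirrors the observation that the operations can be used alone, sequentially, or simultaneously: here reduction is applied first and paraphrasing second, and the result could then be passed to combine together with material from Text Sentence 1.</Paragraph>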
    </Section>
  </Section>
</Paper>