<?xml version="1.0" standalone="yes"?>
<Paper uid="P99-1072">
  <Title>Improving Summaries by Revising Them</Title>
  <Section position="3" start_page="558" end_page="559" type="metho">
    <SectionTitle>
2 The Revision Program
</SectionTitle>
    <Paragraph position="0"> The summary revision program takes as input a source document, a draft summary specification, and a target compression rate. Using revision rules, it generates a revised summary draft whose compression rate is no more than a fixed tolerance above the target compression rate. The initial draft summary (and background) are specified in terms of a task-dependent weighting function which indicates the relative importance of each of the source document sentences. The program repeatedly selects the highest-weighted sentence from the source and adds it to the initial draft until the given compression percentage of the source has been extracted, rounded to the nearest sentence. Next, for each rule in the sequence of revision rules, the program repeatedly applies the rule until it can no longer be applied. Each rule application results in a revised draft. The program selects sentences for rule application by giving preference to higher-weighted sentences.</Paragraph>
    <Paragraph position="1"> 1Note that professional abstractors do not attempt to fully &quot;understand&quot; the text, which is often extremely technical material, but use surface-level features as above as well as the overall discourse structure of the text (Cremmins 1996). 2However, recent progress on this problem (Marcu 1997) is encouraging.</Paragraph>
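    <Paragraph> The control loop described above can be sketched in Python as follows. This is a minimal illustration, not the program itself: the names initial_draft, revise, and drop_parentheticals are hypothetical, the compression rate is assumed to be counted in sentences, the draft is assumed to keep source order, and the preference for higher-weighted sentences during rule application is not modeled.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical rule type: returns a revised draft, or None when the rule
# can no longer be applied.
Rule = Callable[[List[str]], Optional[List[str]]]


def initial_draft(sentences: Sequence[str], weights: Sequence[float],
                  rate: float) -> List[str]:
    """Extract the highest weighted sentences up to the target rate."""
    # Assumption: the rate is measured in sentences, rounded half up.
    budget = int(rate * len(sentences) + 0.5)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: weights[i], reverse=True)
    # Assumption: the draft keeps the selected sentences in source order.
    return [sentences[i] for i in sorted(ranked[:budget])]


def revise(draft: List[str], rules: Sequence[Rule]) -> List[str]:
    """Apply each rule in sequence until it no longer applies."""
    for rule in rules:
        while True:
            revised = rule(draft)
            if revised is None:
                break
            draft = revised  # each application yields a revised draft
    return draft


def drop_parentheticals(draft: List[str]) -> Optional[List[str]]:
    """Toy unary rule: strip one parenthetical per application."""
    for k, sentence in enumerate(draft):
        start = sentence.find(" (")
        end = sentence.find(")", start + 1)
        if start >= 0 and end > start:
            shorter = sentence[:start] + sentence[end + 1:]
            return draft[:k] + [shorter] + draft[k + 1:]
    return None
```

 A call such as revise(initial_draft(sentences, weights, 0.5), [drop_parentheticals]) first builds the draft and then exhausts the single toy rule.</Paragraph>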
    <Paragraph position="2"> A unary rule applies to a single sentence. A binary rule applies to a pair of sentences, at least one of which must be in the draft, and where the first sentence precedes the second in the input.</Paragraph>
    <Paragraph position="3"> Control over sentence complexity is imposed by failing rule application when the draft sentence is too long, the parse tree is too deep 3, or more than two relative clauses would be stacked together. The program terminates when there are no more rules to apply or when the revised draft exceeds the required compression rate by more than 5.</Paragraph>
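    <Paragraph> One way to picture the complexity check is the Python sketch below. It is illustrative only: the tree encoding, the function names, and the values of MAX_WORDS and MAX_DEPTH are assumptions, since the text does not state the actual length and depth limits. Only the limit of two stacked relative clauses comes from the description above, and how stacking is counted is left to the caller.

```python
from typing import List, Union

# Hypothetical parse tree encoding: a leaf is a word (str); an internal node
# is a list whose first item is the label and whose other items are children.
Tree = Union[str, list]

# Placeholder limits: the actual thresholds are not given in the text.
MAX_WORDS = 40
MAX_DEPTH = 12
# From the text: fail when more than two relative clauses would be stacked.
MAX_STACKED_REL_CLAUSES = 2


def depth(tree: Tree) -> int:
    """Number of labeled levels above the deepest word."""
    if isinstance(tree, str):
        return 0
    return 1 + max((depth(child) for child in tree[1:]), default=0)


def words(tree: Tree) -> List[str]:
    """Words of the sentence in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]


def acceptable(tree: Tree, stacked_rel_clauses: int) -> bool:
    """Return False when a rule application should fail on complexity."""
    if len(words(tree)) > MAX_WORDS:
        return False
    if depth(tree) > MAX_DEPTH:
        return False
    return MAX_STACKED_REL_CLAUSES >= stacked_rel_clauses
```
</Paragraph>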
    <Paragraph position="4"> The syntactic structure of each source sentence is extracted using Apple Pie 7.2 (Sekine 1998), a statistical parser trained on Penn Treebank data. Sekine (1998) reports 79% F-score accuracy (parseval) for it on short sentences (fewer than 40 words) from the Treebank. An informal assessment we made of the parser's accuracy on our own data sets of news articles (based on intuitive judgments) suggests that about 66% of the parses were acceptable, with almost half of the remaining parsing errors being due to part-of-speech tagging errors, many of which could be fixed by preprocessing the text. To establish coreference between proper names, named entities are extracted from the document, along with coreference relations, using SRA's NameTag 2.0 (Krupka 1995), a MUC-6 fielded system. In addition, we implemented our own coreference extension: a singular definite NP (e.g., beginning with &quot;the&quot;, and not marked as a proper name) is marked by our program as coreferential (i.e., in the same coreference equivalence class) with the last singular definite or singular indefinite atomic NP with the same head, provided they are within a distance 7 of each other. On a corpus of 90 documents drawn from the TIPSTER evaluation, described in Section 4.1 below, this coreference extension scored 94% precision (470 valid coreference classes/501 total coreference classes) on definite NP coreference. Also, &quot;he&quot; (likewise &quot;she&quot;) is marked, subject to 7, as coreferential with the last person name mentioned, with gender agreement enforced when the person's first name's gender is known (from [Figure: rule-name rel-clause-intro-which-1; patterns]</Paragraph>
    <Paragraph position="6"> of the errors were caused by different sequences of words between the determiner and the noun phrase head word (e.g., &quot;the factory&quot; -- &quot;the cramped five-story pre-1915 factory&quot; is OK, but &quot;the virus program&quot; -- &quot;the graduate computer science program&quot; isn't).</Paragraph>
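    <Paragraph> The definite NP coreference extension described in this section can be sketched as follows. This is a simplified reading under stated assumptions: the NP record and the function link_definites are hypothetical, the distance unit is not specified in the text (a sentence index is assumed here), the list is assumed to be sorted by position, and the "atomic" restriction on antecedents is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NP:
    position: int          # assumed distance unit, e.g. a sentence index
    head: str              # head noun of the phrase
    definite: bool         # True for "the ...", False for an indefinite
    singular: bool = True
    proper: bool = False   # True when marked as a proper name


def link_definites(nps: List[NP],
                   max_distance: int) -> Dict[int, Optional[int]]:
    """Map each singular definite NP (by index) to its antecedent index."""
    links: Dict[int, Optional[int]] = {}
    for i, mention in enumerate(nps):
        if not (mention.definite and mention.singular) or mention.proper:
            continue
        links[i] = None
        # Scan backwards for the last singular NP with the same head.
        for j in range(i - 1, -1, -1):
            earlier = nps[j]
            if mention.position - earlier.position > max_distance:
                break  # everything further back is out of range
            if earlier.singular and earlier.head == mention.head:
                links[i] = j
                break
    return links
```

 Because only heads are compared, the sketch would also link "the virus program" to "the graduate computer science program", which is exactly the kind of error noted above.</Paragraph>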
  </Section>
  <Section position="4" start_page="559" end_page="559" type="metho">
    <SectionTitle>
3 Revision Rules
</SectionTitle>
    <Paragraph position="0"> The revision rules carry out three types of operations. Elimination operations eliminate constituents from a sentence. These include elimination of parentheticals, and of sentence-initial PPs and adverbial phrases satisfying lexical tests (such as &quot;In particular,&quot;, &quot;Accordingly,&quot;, &quot;In conclusion,&quot; etc.) 5.</Paragraph>
    <Paragraph position="1"> Aggregation operations combine constituents from two sentences, at least one of which must be a sentence in the draft, into a new constituent which is inserted into the draft sentence. The basis for combining sentences is that of referential identity: if there is an NP in sentence i which is coreferential with an NP in sentence j, then sentences i and j are candidates for aggregation. The most common form of aggregation is expressed as tree-adjunction (Joshi 1998) (Oras 1999). Figures 1 and 2 show a relative clause introduction rule which turns a VP of a (non-embedded) sentence whose</Paragraph>
    <Paragraph position="2"> our analysis because of a system bug. 5Such lexical tests help avoid misrepresenting the meaning of the sentence.</Paragraph>
    <Paragraph position="3"> subject is coreferential with an NP of an earlier (draft) sentence into a relative clause modifier of the draft sentence NP. Other appositive phrase insertion rules include copying and inserting nonrestrictive relative clause modifiers (e.g., &quot;Smith, who...,&quot;), appositive modifiers of proper names (e.g., &quot;Peter G. Neumann, a computer security expert familiar with the case,...&quot;), and proper name appositive modifiers of definite NPs (e.g., &quot;The network, named ARPANET, is operated by ...&quot;).</Paragraph>
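    <Paragraph> The referential identity test for aggregation can be illustrated with a short Python sketch. The function aggregation_candidates and the encoding of coreference as a map from sentence index to coreference class ids are assumptions made for illustration; the sketch only finds candidate pairs and does not perform the tree-adjunction itself.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def aggregation_candidates(coref: Dict[int, Set[int]],
                           draft: Set[int]) -> List[Tuple[int, int]]:
    """Return pairs (i, j), with i before j, that may be aggregated.

    coref maps a sentence index to the ids of the coreference classes
    mentioned in that sentence (a hypothetical encoding). A pair qualifies
    when at least one sentence is in the draft and the two sentences share
    a coreference class.
    """
    pairs = []
    for i, j in combinations(sorted(coref), 2):
        in_draft = i in draft or j in draft
        if in_draft and coref[i].intersection(coref[j]):
            pairs.append((i, j))
    return pairs
```
</Paragraph>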
    <Paragraph position="4"> Smoothing operations apply to a single sentence, performing transformations so as to arrive at more compact, stylistically preferred sentences. There are two types of smoothing.</Paragraph>
    <Paragraph position="5"> Reduction operations simplify coordinated constituents. Ellipsis rules include subject ellipsis, which lowers the coordination from a pair of clauses with coreferential subjects to their VPs (e.g., &quot;The rogue computer program destroyed files over a five month period and the program infected close to 100 computers at NASA facilities&quot; ⇒ &quot;The rogue computer program destroyed files over a five month period and infected close to 100 computers at NASA facilities&quot;). It usually applies to the result of an aggregation rule which conjoins clauses whose subjects are coreferential. Relative clause reduction includes rules which apply to clauses whose VPs begin with &quot;be&quot; (e.g., &quot;which is&quot; is deleted) or &quot;have&quot; (e.g., &quot;which have&quot; ⇒ &quot;with&quot;), as well as, for other verbs, a rule deleting the relative pronoun and replacing the verb with its present participle (i.e., &quot;which V&quot; ⇒ &quot;V+ing&quot;). Coordination rules include relative clause coordination. Reference Adjustment operations fix up the results of other revision operations in order to improve discourse-level coherence, and as a result, they are run last 6. They include substitution of a proper name with a name alias if the name is mentioned earlier, expansion of a pronoun with a coreferential proper name in a parenthetical (&quot;pronoun expansion&quot;), and replacement of a definite NP with a coreferential indefinite if the definite occurs without a prior indefinite (&quot;indefinitization&quot;).</Paragraph>
    <Paragraph position="6"> 6Such operations have been investigated earlier by</Paragraph>
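    <Paragraph> The "be" and "have" cases of relative clause reduction can be approximated at the string level, as in the Python sketch below. This is only an illustration: the rules described above operate on parse trees, the function reduce_relative_clause is hypothetical, and extending the patterns to other inflected forms ("are", "was", "were", "has", "had") is an assumption. The participle rule for other verbs is omitted because it needs morphological generation.

```python
import re


def reduce_relative_clause(sentence: str) -> str:
    """Apply the "be" or "have" reduction to the first matching clause."""
    # "which is X" becomes "X" (assumed to cover other forms of "be").
    reduced, count = re.subn(r"\bwhich (?:is|are|was|were) ", "",
                             sentence, count=1)
    if count:
        return reduced
    # "which have X" becomes "with X" (assumed to cover "has" and "had").
    return re.sub(r"\bwhich (?:have|has|had) ", "with ", sentence, count=1)
```
</Paragraph>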
  </Section>
</Paper>