<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1009">
  <Title>Hybrid Text Summarization: Combining External Relevance Measures with Structural Analysis</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 The PALSUMM System
</SectionTitle>
    <Paragraph position="0"> PALSUMM summarization algorithms operate on data structures generated by FX Palo Alto's Linguistic Discourse Analysis System (LIDAS).</Paragraph>
    <Paragraph position="1"> LIDAS is a computational discourse parser implementing the Unified Linguistic Discourse Model (U-LDM). A description of the LIDAS system and the U-LDM as well as a summary of an article from the New Yorker are described in earlier work (Polanyi et al, 2004a, b, Thione 2004). Due to space limitations we can only sketch the main points of the system here.</Paragraph>
    <Paragraph position="2"> The LIDAS parser itself is purely symbolic. It parses a text discourse segment by discourse segment to construct a tree that captures discourse continuity and accessibility relations between the segments. The tree identifies what discourse constituents are available for further development and what information given by discourse constituents is available to be referred to. We use the fact that the resulting tree encodes (semantic) accessibility relations between the segments, and not rhetorical relations, to guarantee that the pruning algorithm used to summarize preserve antecedents for anaphors thus fostering readability.</Paragraph>
    <Paragraph position="3"> The basic units of this theory (Basic Discourse Unit or BDUs) are the syntactic reflexes of linguistically realized minimal semantic unit of meaning1 or functions, interpreted relative to the context given by the preceding discourse. To identify the BDUs in a text, LIDAS relies on the Xerox Linguistic Environment to parse sentences from a text (Maxwell and Kaplan, 1989). After sentential parsing is complete, the XLE sentence parse trees are segmented into BDUs using a set of robust sentence and discourse level rules described in detail in Polanyi et al 2004a, b. After parsing, BDUs (which need not be contiguous) are recombined into one or more discourse trees corresponding to (parts of) the sentence, called BDUtrees. null For each BDU-tree, one BDU, normally the main clause of a sentence or a compound unit of discourse directly derived from it, is designated as the Main-BDU (M-BDU) and is represented by the root node of the BDU-tree. The entire BDU-tree is attached as a unit to the emerging Open Right Tree representation of the structure of the discourse by relating syntactic, semantic and lexical information in the M-BDU (and preposed adverbial modifiers, clauses and &amp;quot;cue&amp;quot; words) to information available in nodes along the right edge of the tree using formal linguistic discourse attachment rules involving relatio nships among semantic, syntactic and lexical information to compute both the site of attachment and the attachment relation.</Paragraph>
    <Paragraph position="4"> Although a full discussion of these rules lies beyond the scope of this paper, Table 1 sketches some simple principles which are both language and domain independent.2 These rules are weighted and ordered in application, and multiple rules may &amp;quot;vote&amp;quot; for the same or different attachment points and discourse relations. The precise relationships among the rules remains a subject for future research.</Paragraph>
    <Paragraph position="5"> The U-LDM is similar in form to RST, but its primitives are rather different. Whereas RST takes rhetorical relations as primitives, the LDM takes its primitives from syntactic structure. The ontology of LDM relations has three top relations: coordination, subordination and n-ary.</Paragraph>
    <Paragraph position="6"> 1 We understand a minimum unit of Meaning to communicate information about not more than one &amp;quot;event&amp;quot; or state of affairs in a &amp;quot;possible world&amp;quot; of some type (roughly event-type predicates); while a minimal Functional unit encodes information about how previously occurring (or possibly subsequent) utterances relate structurally, semantically, interactionally or rhetorically to other units in the discourse or context in which the discourse takes place (Greetings, discourse PUSH/POP markers, connectives etc. are all Functional segments).</Paragraph>
    <Paragraph position="7"> 2 One reviewer remarked, quite correctly: &amp;quot;how a sentence is attached to the emerging representation of the structure of the discourse ... is the heart of the algorithm&amp;quot;. This issue is discussed in detail in Polanyi et al., 2004a,b ; Thione et al. 2004.</Paragraph>
    <Paragraph position="8"> Evidence attachment is a subordination Syntactic promotion: If the subject of an M-BDU co-refers with the object of the AP.</Paragraph>
    <Paragraph position="9"> Sub-cases: If the subject of the M-BDU refers to a sub-case of the subject of the AP. Sub-cases include subsets (all children/some children), sub-types (people/children), etc.</Paragraph>
    <Paragraph position="10"> Verbal properties: If the tense, aspect, modality or genericity of the verbs are different.</Paragraph>
    <Paragraph position="11"> Evidence attachment is a coordination Narrative: If the verbs express events.</Paragraph>
    <Paragraph position="12"> Lists: If the subjects are synonyms/antonyms and/or the syntactic structures of M-BDU and AP are sufficiently similar.</Paragraph>
    <Paragraph position="13">  Coordinations express a symmetric relationship between the children, including: lists, narratives, etc. Subordinations express an asymmetric relationship between children, including: elaborations, interruptions, etc. Finally, n-aries include a number of cases where the structure is defined by specific language constructions. Note that these constructions are not arbitrary, and often follow from (sentence) syntactic constructions. Examples include scope setting operators and units (when john comes, he will be happy), and more or less fixed forms like greetings and question-answer pairs, etc. It is the practice to also consider genre-specific structures (e.g. &amp;quot;a paper consists of a title, an abstract, an introduction, some sections, a conclusion, and references&amp;quot;) to be n-aries.</Paragraph>
    <Paragraph position="14"> Because to characterize the large structure of the discourse we only need to refer to coordinations, subordinations and n-aries, it is often claimed that the number of relations in the LDM is much smaller than in RST, even though strictly speaking this depends on which versions of LDM and RST one compares. The real difference between the two theories lies in the rather different origins of the rules.</Paragraph>
    <Paragraph position="15"> All non-terminal nodes in U-LDM trees are first class citizens and contain, in addition to a node label, content and context information inherited from child nodes. Under RST only terminal nodes have content; non-terminal nodes that represent the relationships obtaining among spans of the text longer than one sentence are labeled for the relationship between daughter nodes only.</Paragraph>
    <Paragraph position="16"> As in Summarist (Hovy and Lin, 1999), once the source text has been parsed and a discourse tree incrementally constructed, text summarization algorithms are applied to the resulting tree. However, the difference between constructing a semantic rather than a rhetorical representation of the text accounts for how PALSUMM summaries preserve readability and reference resolution: because the entire analysis involves matching semantically defined contextual units to the appropriate contexts available on the tree, nodes that structurally dominate other nodes necessarily contain the information needed to contextually interpret the dominated units.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Pruning PALSUMM Trees
</SectionTitle>
    <Paragraph position="0"> Summarization methods based on discourse structure all rely on assigning a numeric value to all intermediate and leaf nodes encoding their importance , based on the labels at the nodes. The difference between different methods orig inates in the different ways this importance measure is calculated. Because RST (Marcu, 2000) and U-LDM trees differ, there are key differences between the simple pruning methods applied to U-LDM trees as opposed to RST trees.</Paragraph>
    <Paragraph position="1"> Under the U-LDM theory of discourse, the asymmetric relationship expressed by subordinations implicitly encodes a notion of importance. The subordinated child elaborates or further qualifies the head, or temporarily interrupts the flow of discourse. Subordinated material is almost always less important to the main line of the text than subordinating material: the level of embedding thus gives a first rough measure of importance of a unit of discourse.</Paragraph>
    <Paragraph position="2"> Our original summarization algorithm, Sym-Trim, used the level of embedding directly. It pruned the tree at a given level of embedding, and generated a summary based on the span of the remaining tree. The number of possible summary lengths, however, was restricted to the number of embedding levels, resulting in a discreet number of summaries of a fixed length, often ones longer than desired. This led to a need for more subtle pruning alg orithms.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Solving the SymTrim Restriction
</SectionTitle>
      <Paragraph position="0"> There are two theoretic problems that underlie the practical problems of SymTrim. First, across the board pruning at a fixed level is of limited utility.</Paragraph>
      <Paragraph position="1"> If two sections of a document differ significantly in size, the larger section will have more space for deeper sub-trees. Consequently, units of equal importance may occur at deeper levels of larger sub-trees.</Paragraph>
      <Paragraph position="2"> Secondly, no method that relies solely on purely structural information can determine what parts of the document contain important information. For this an approximation the meaning of the units is needed. A description of the relationships among them does not suffice.</Paragraph>
      <Paragraph position="3"> We address the first issue by not trimming the tree at an absolute level, but at a level relative to the depth of the sub-branch in which a node is found. We address the second issue by skewing the pruning level using statistical methods3 as an oracle to indicate relative importance.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Score Adjustment and Percolation
</SectionTitle>
      <Paragraph position="0"> We assign every node a relative depth T(l), based on the local and global structure of the tree branch to which it belongs, calculated as follows: (1) establish the absolute depth D(l) of each node, (2) calculate an embedding branch weight W(l) by percolating the value of D(l) up from the leaves according to the percolation algorithm outlined in Figure 1, (3) assign each node a relative depth T(l) = 1 - (D(l) - 1) / W(l).4 We also compute a statistical score that approximates the &amp;quot;semantic importance&amp;quot; of every node. To do so, we begin by seeding every leaf node l with a statistical seed S(l) using the MEAD statistical summarizer. Each segment is scored by MEAD in the context of the full document, with a score that mirrors its judgment of the relevance of that segment for a summary. MEAD's metrics include: TF/IDF cosine similarity between a segment and the document - optionally skewed towards a query entered by the user, the relative position of a segment within the document, an adverse score against segments deemed as too similar to the current summary, and our own implementation of a feature concerning the presence of certain cue words (Hirschberg and Litman, 1993). After scoring, the values are percolated up through the tree, as before. During percolation of both structurally and statistically obtained scores, the new value of a node that receives a higher score from a child node is percolated downwards through all non-subordinated children. Children of 3 We use the publicly available MEAD (Radev et al. 2003). Adopting a sentence extraction approach, it is capable of assigning scores to each and every sentence. PALSUMM does its own discourse segmentation and sends the segments to MEAD as if they were sentences. This allows us to assign independent scores to discourse segments, thus enabling sub-sentential summarization (segment-extraction vs. sentence extraction) and yielding more compressed yet still highly readable summaries.</Paragraph>
      <Paragraph position="1"> 4 The expression for T(l) was chosen to assign the top node relative depth 1. coordinations and n-aries are considered equally relevant and scored equally, whereas subordinated children are less relevant then subordinating ones.5 After percolation we normalize the statistical scores, dividing by the maximum occurring value.</Paragraph>
      <Paragraph position="2"> Different summarization algorithms result from the choice of seeding algorithms and methods of combining scores. Note that the percolation algorithm in Figure 1 respects structural embedding by always assigning lower or equal scores to subordinated nodes.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Pruning Algorithms
</SectionTitle>
      <Paragraph position="0"> In order for summaries to maintain textual coherence and readability, constituents that contain contextual or referential information necessary to interpreting other constituents selected for the summary must be marked for inclusion. For any node, this information is available in nodes that are siblings of the same coordination or n-ary, and in nodes that dominate it through subordinationtype relations. As long as the score assigned to nodes respects subordinations as in Figure 1, any pruning of the tree that excludes constituents whose final relevance score is smaller than a chosen value is guaranteed to preserve the antecedents for the anaphora in the text, preserving well-formedness of the resulting tree and the readability of the summary it yields.</Paragraph>
      <Paragraph position="1"> In Table 2 we list four different final score assignments, based on the embedding level of the nodes (L), their percolated statistical score (S) and the percolated relative depth score (T).</Paragraph>
      <Paragraph position="2">  5 In a modified percolation scheme, downward percolation is restricted to preceding siblings in discourse-level coordination nodes. This is a result of the fact that contextual information necessary to preserve readability and referential integrity must appear before access.</Paragraph>
      <Paragraph position="3"> After scores are calculated and combined, a relative threshold is computed by sorting the set of constituent by final score and identifying the cut-off value that more closely approximates the request of the user in terms of desired summary length. Note that the root node will always have norma lized score 1 and will therefore always be included in a full summary. 6</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Evaluating PALSUMM
</SectionTitle>
    <Paragraph position="0"> The PALSUMM corpus contains over 300 FXPAL Technical Reports in a wide range of domains. The Reports vary in size from 10 to 30 pages. To evaluate the readability of summaries and create a baseline for evaluating the SymTrim-R and HybRduce-R algorithms, we conducted a small pilot study on five documents selected from the corpus. The documents were hand-annotated with their U-LDM discourse structures. The Sym-Trim-R and HybReduce-R variants were then automatically applied to these discourse structures, and the summaries submitted to a panel of 12 non-experts. The panelists were asked to judge the summaries on a 6-point scale for readability by answering a set of questions including &amp;quot;How readable is this summary?&amp;quot; and &amp;quot;Did you get confused at any point in the summary?&amp;quot; The initial results suggest that the discourse algorithms produced readable summaries and that the relative effectiveness of the discourse algorithms varies according to some still to be determined property of the documents.</Paragraph>
  </Section>
class="xml-element"></Paper>