<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-1012">
  <Title>The effects of analysing cohesion on document summarisation</Title>
  <Section position="7" start_page="80" end_page="81" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> Starting from a class of problems inherent to summarization by sentence extraction, we have proposed a strategy for alleviating some of the particularly jarring end-user effects in the summaries, which are due to coherence degradation, readability deterioration, and topical under-representation. Our approach is to aim for more cohesive summaries, by leveraging the lexical cohesion factors in the source document texts. As an initial experiment, we have looked at one particular factor, lexical repetition, and have developed a framework for integrating a discourse segmentation component capable of detecting shifts in topic, with a linguistically-aware summarizer which utilizes notions of salience and dynamically-adjustable size of the resulting summaries. By analyzing cohesion indicators in the discourse, segmentation identifies points in the narrative where sub-stories alternate; the summarization function uses the resulting set of discourse segments to derive more complete, informative and faithful summaries than ones extracted solely on the basis of sentence salience (calculated with respect to a background document collection).</Paragraph>
    <Paragraph position="1"> A comparative evaluation of summarization with, and without, segmentation analysis shows that under certain conditions, segmentation-enhanced summarization is better than the base segmentation technology. Some of these conditions can be expressed as a function of the original document length, and the document-to-summary ratio; thus, of particular interest is the fact that the optimal strategy for combining the two technologies can be selected 'on the fly', depending on the type of input to be summarized.</Paragraph>
    <Paragraph position="2"> Furthermore, having access to a segmentation component makes it possible to alleviate a serious shortcoming of summarizers like ours, which crucially depend on the statistics of a background collection: in situations where background collection-based salience calculation is impossible, or impractical, it is realistic to deliver summaries--of comparable quality, yet considerably cheaper to generate--derived by access to discourse segmentation information alone.</Paragraph>
    <Paragraph position="3"> The research reported here is part of a larger effort focused on leveraging elements of the discourse structure for a variety of content characterisation tasks. Overall, we aim to build an infrastructure for recognizing and using a broad range of cohesive devices in text. Document summarization is just one application in the larger space of document content management; our long-term goal is to develop a framework where summarization and other applications would be enabled by a rich substrate of linguistic analysis of lexical cohesion.</Paragraph>
  </Section>
</Paper>