File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/06/p06-3007_intro.xml

Size: 3,859 bytes

Last Modified: 2025-10-06 14:03:47

<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-3007">
  <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. Investigations on Event-Based Summarization</Title>
  <Section position="3" start_page="0" end_page="37" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> With the growth of online information, it is inefficient for a computer user to browse a large number of individual news documents. Automatic summarization is a powerful way to overcome this difficulty. However, the research literature shows that machine summaries still need further improvement.</Paragraph>
    <Paragraph position="1"> Research on text summarization dates back to Luhn (1958) and Edmundson (1969). Since then, some researchers have focused on extraction-based summarization, as it is effective and simple. Others have tried to generate abstracts, but this work is either highly domain-dependent or limited to preliminary investigations. Recently, query-based summarization has received much attention. However, it is closely related to information retrieval, a separate research subject. In this paper, we focus on generic summarization.</Paragraph>
    <Paragraph position="2"> News reports are crucial to our daily life. In this paper, we focus on effective summarization approaches for news reports.</Paragraph>
    <Paragraph position="3"> Extractive summarization has been widely investigated in the past. It extracts parts of one or more documents based on a weighting scheme that exploits features such as position in the document, term frequency, and key phrases. Recent extraction approaches may also employ machine learning techniques to decide which sentences or phrases should be extracted. They have achieved preliminary success in different applications but leave room for further improvement.</Paragraph>
    <Paragraph position="4"> Previous extractive approaches identify important content mainly on the basis of terms. A bag of words is not a good representation for specifying an event, because the same collection of words admits multiple possible interpretations. A predefined template is a better choice for representing an event.</Paragraph>
    <Paragraph position="5"> However, such a template is domain-dependent and takes much effort to create and fill. This tension motivates us to seek a balance between effective implementation and deep understanding.</Paragraph>
    <Paragraph position="6"> Following related work (Filatova and Hatzivassiloglou, 2004; Vanderwende et al., 2004), we assume that the event may be a natural unit for conveying the meaning of documents. In this paper, an event is defined as the collection of event terms and associated event elements at the clause level. Event terms express the meaning of the actions themselves, such as &quot;incorporate&quot;. In addition to verbs, action nouns can also express the meaning of actions and should be regarded as event terms.</Paragraph>
    <Paragraph position="7"> For example, &quot;incorporation&quot; is an action noun. Event elements include named entities, such as person names, organization names, locations, and times.</Paragraph>
    <Paragraph position="8"> These named entities are tagged with GATE (Cunningham et al., 2002). Based on our event definition, we investigate two event-based approaches in this research, independent and relevant. Experiments show that both achieve encouraging results.</Paragraph>
    <Paragraph position="9"> Related work is discussed in Section 2.</Paragraph>
    <Paragraph position="10"> The independent event-based summarization approach is described in Section 3, and the relevant event-based summarization approach in Section 4. Section 5 presents the experiments and evaluations. The strengths and limitations of our approaches are then discussed in Section 6. Finally, we conclude the work in Section 7.</Paragraph>
  </Section>
</Paper>