File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/02/c02-1053_intro.xml

Size: 2,435 bytes

Last Modified: 2025-10-06 14:01:25

<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1053">
  <Title>Extracting Important Sentences with Support Vector Machines</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Extracting important sentences means extracting from a document only those sentences that contain important information. Since some sentences are lost, the result may lack coherence, but important sentence extraction is one of the basic technologies for generating summaries that are useful for humans to browse. Therefore, this technique plays an important role in automatic text summarization.</Paragraph>
    <Paragraph position="1"> Many researchers have studied important sentence extraction since the late 1950s (Luhn, 1958). Conventional methods focus on sentence features and define significance scores.</Paragraph>
    <Paragraph position="2"> The features include key words, sentence position, and certain linguistic clues. Edmundson (1969) and Nobata et al. (2001) have proposed scoring functions to integrate heterogeneous features. However, we cannot tune the parameter values by hand when the number of features is large.</Paragraph>
    <Paragraph position="3"> When a large quantity of training data is available, tuning can be effectively realized by machine learning. In recent years, machine learning has attracted attention in the field of automatic text summarization.</Paragraph>
    <Paragraph position="4"> Aone et al. (1998) and Kupiec et al. (1995) employed Bayesian classifiers; Mani et al. (1998), Nomoto et al. (1997), Lin (1999), and Okumura et al. (1999) used decision tree learning. However, most machine learning methods overfit the training data when many features are given.</Paragraph>
    <Paragraph position="5"> Therefore, we need to select features carefully.</Paragraph>
    <Paragraph position="6"> Support Vector Machines (SVMs) (Vapnik, 1995) are robust even when the number of features is large. Therefore, SVMs have shown good performance for text categorization (Joachims, 1998), chunking (Kudo and Matsumoto, 2001), and dependency structure analysis (Kudo and Matsumoto, 2000).</Paragraph>
    <Paragraph position="7"> In this paper, we present an important sentence extraction technique based on SVMs. We verified the technique against the Text Summarization Challenge (TSC) (Fukushima and Okumura, 2001) corpus.</Paragraph>
  </Section>
</Paper>