File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/98/p98-1009_intro.xml
Size: 1,919 bytes
Last Modified: 2025-10-06 14:06:35
<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1009"> <Title>Trainable, Scalable Summarization Using Robust NLP and Machine Learning*</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Frequency-based (Edmundson, 1969; Kupiec, Pedersen, and Chen, 1995; Brandow, Mitze, and Rau, 1995), knowledge-based (Reimer and Hahn, 1988; McKeown and Radev, 1995), and discourse-based (Johnson et al., 1993; Miike et al., 1994; Jones, 1995) approaches to automated summarization correspond to a continuum of increasing understanding of the text and increasing complexity in text processing. Given the goal of machine-generated summaries, these approaches attempt to answer three central questions: * How does the system count words to calculate worthiness for summarization? * How does the system incorporate the knowledge of the domain represented in the text? * How does the system create a coherent and cohesive summary? Our work builds on research in these three approaches and attempts to remedy some of the difficulties encountered in each by applying a combination of information retrieval, information extraction, and NLP techniques and on-line resources with machine learning to generate summaries.</Paragraph> <Paragraph position="1"> Our DimSum system follows a common paradigm of sentence extraction, but automates the acquisition of candidate knowledge and learns what knowledge is necessary to summarize. We present how we automatically acquire candidate features in Section 2. Section 3 describes our training methodology for combining features to generate summaries, and discusses evaluation results of both batch and machine learning methods. Section 4 reports our task-based evaluation. * We would like to thank Jamie Callan for his help with the INQUERY experiments.</Paragraph> </Section> </Paper>