<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1010"> <Title>Template-Filtered Headline Summarization</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Producing headline-length summaries is a challenging summarization problem. Every word becomes important. But the need for grammaticality--or at least intelligibility--sometimes requires the inclusion of non-content words. Forgoing grammaticality, one might compose a &quot;headline&quot; summary by simply listing the most important noun phrases one after another. At the other extreme, one might pick just one fairly indicative sentence of appropriate length, ignoring all other material. Ideally, we want to find a balance between including raw information and supporting intelligibility.</Paragraph> <Paragraph position="1"> We experimented with methods that integrate content-based and form-based criteria. The process consists of two phases. The keyword-clustering component finds headline phrases at the beginning of the text using a list of globally selected keywords. The template filter then uses a collection of pre-specified headline templates and subsequently populates them with headline phrases to produce the resulting headline.</Paragraph> <Paragraph position="2"> Section 2 describes previous work. Section 3 describes a study on the use of headline templates. Section 4 discusses the process of selecting and expanding key headline phrases. Section 5 returns to the idea of templates, this time with the help of headline phrases.</Paragraph> <Paragraph position="3"> Future work is discussed in Section 6.</Paragraph> </Section> </Paper>