File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/06/p06-1039_intro.xml
Size: 2,210 bytes
Last Modified: 2025-10-06 14:03:38
<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1039"> <Title>Sydney, July 2006. (c) 2006 Association for Computational Linguistics. Bayesian Query-Focused Summarization</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> We describe BAYESUM, an algorithm for performing query-focused summarization in the common case that there are many relevant documents for a given query. Given a query and a collection of relevant documents, our algorithm asks the following question: what is it about these relevant documents that differentiates them from the non-relevant documents? BAYESUM can be seen as providing a statistical formulation of this exact question.</Paragraph> <Paragraph position="1"> The key requirement of BAYESUM is that multiple relevant documents are known for the query in question. This is not a severe limitation. In two well-studied problems, it is the de facto standard.</Paragraph> <Paragraph position="2"> In standard multidocument summarization (with or without a query), we have access to known relevant documents for some user need. Similarly, in the case of a web-search application, an underlying IR engine will retrieve multiple (presumably) relevant documents for a given query. For both of these tasks, BAYESUM performs well, even when the underlying retrieval model is noisy.</Paragraph> <Paragraph position="3"> The idea of leveraging known relevant documents is known as query expansion in the information retrieval community, where it has been shown to be successful in ad hoc retrieval tasks. Viewed from the perspective of IR, our work can be interpreted in two ways. 
First, it can be seen as an application of query expansion to the summarization task (or, in IR terminology, passage retrieval); see (Liu and Croft, 2002; Murdock and Croft, 2005).</Paragraph> <Paragraph position="4"> Second, and more importantly, it can be seen as a method for query expansion in a non-ad-hoc manner. That is, BAYESUM is a statistically justified query expansion method in the language modeling for IR framework (Ponte and Croft, 1998).</Paragraph> </Section> </Paper>