<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1033">
  <Title>An Analysis of the AskMSR Question-Answering System</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Question answering has recently received attention from the information retrieval, information extraction, machine learning, and natural language processing communities (AAAI, 2002; ACL-ECL, 2002; Voorhees and Harman, 2000, 2001). The goal of a question answering system is to retrieve answers to questions rather than full documents or best-matching passages, as most information retrieval systems currently do. The TREC Question Answering Track, which has motivated much of the recent work in the field, focuses on fact-based, short-answer questions such as &quot;Who killed Abraham Lincoln?&quot; or &quot;How tall is Mount Everest?&quot; In this paper we describe our approach to short-answer tasks like these, although the techniques we propose are more broadly applicable.</Paragraph>
    <Paragraph position="1"> Most question answering systems use a variety of linguistic resources to help in understanding the user's query and matching sections in documents. The most common linguistic resources include part-of-speech tagging, parsing, named entity extraction, semantic relations, dictionaries, WordNet, etc. (e.g., Abney et al., 2000; Chen et al., 2000; Harabagiu et al., 2000; Hovy et al., 2000; Pasca et al., 2001; Prager et al., 2000).</Paragraph>
    <Paragraph position="2"> We chose instead to focus on the Web as a gigantic data repository with tremendous redundancy that can be exploited for question answering. We view our approach as complementary to more linguistic approaches, but have chosen to see how far we can get initially by focusing on data per se as a key resource available to drive our system design. Recently, other researchers have also looked to the web as a resource for question answering (Buchholtz, 2001; Clarke et al., 2001; Kwok et al., 2001). These systems typically perform complex parsing and entity extraction for both queries and best-matching Web pages, and maintain local caches of pages or term weights. Our approach is distinguished from these in its simplicity and efficiency in the use of the Web as a large data resource. Automatic QA from a single, small information source is extremely challenging, since there is likely to be only one answer in the source to any user's question. Given a source, such as the TREC corpus, that contains only a relatively small number of formulations of answers to a query, we may be faced with the difficult task of mapping questions to answers by way of uncovering complex lexical, syntactic, or semantic relationships between question string and answer string. The need for anaphor resolution and synonymy, the presence of alternate syntactic formulations, and indirect answers all make answer finding a potentially challenging task. However, the greater the answer redundancy in the source data collection, the more likely it is that we can find an answer that occurs in a simple relation to the question, and therefore the less likely it is that we will need to solve the aforementioned difficulties facing natural language processing systems.</Paragraph>
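The redundancy argument above can be illustrated with a minimal sketch (this is not the AskMSR implementation, and the snippets below are hypothetical stand-ins for retrieved web results): when many independent snippets restate an answer, simply counting repeated n-grams across them surfaces the answer string without any parsing or anaphor resolution.

```python
# Illustrative sketch of redundancy-based answer extraction: the most
# frequent n-gram across retrieved snippets is the top answer candidate.
# Stopword list and tokenization are simplified assumptions, not the
# paper's actual filtering.
import string
from collections import Counter

STOPWORDS = frozenset({"the", "a", "an", "is", "was", "of", "by", "in"})

def candidate_answers(snippets, n=3):
    """Rank n-grams across snippets by frequency (most common first)."""
    counts = Counter()
    for snippet in snippets:
        tokens = [t.strip(string.punctuation) for t in snippet.lower().split()]
        tokens = [t for t in tokens if t and t not in STOPWORDS]
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common()

# Hypothetical search snippets for "Who killed Abraham Lincoln?"
snippets = [
    "Abraham Lincoln was shot by John Wilkes Booth in 1865.",
    "John Wilkes Booth assassinated President Lincoln.",
    "Lincoln's assassin, John Wilkes Booth, fled the theater.",
]
print(candidate_answers(snippets)[0])  # -> ('john wilkes booth', 3)
```

Each snippet phrases the fact differently, yet the answer trigram recurs verbatim, so frequency alone recovers it; with a single-document source, no such repetition would be available and deeper linguistic analysis would be required.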
    <Paragraph position="3"> In this paper, we describe the architecture of the AskMSR Question Answering System and evaluate contributions of different system components to accuracy. Because a wrong answer is often worse than no answer, we also explore strategies for predicting when the question answering system is likely to give an incorrect answer.</Paragraph>
  </Section>
</Paper>