<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-1008">
  <Title>Automatic Question Answering: Beyond the Factoid</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The Question Answering (QA) task has received a great deal of attention from the Computational Linguistics research community in the last few years (e.g., Text REtrieval Conference TREC 2001-2003). The definition of the task, however, is generally restricted to answering factoid questions: questions for which a complete answer can be given in 50 bytes or less, which is roughly a few words. Even with this limitation in place, factoid question answering is by no means an easy task. The challenges posed by answering factoid questions have been addressed using a large variety of techniques, such as question parsing (Hovy et al., 2001; Moldovan et al., 2002), question-type determination (Brill et al., 2001; Ittycheraih and Roukos, 2002; Hovy et al., 2001; Moldovan et al., 2002), WordNet exploitation (Hovy et al., 2001; Pasca and Harabagiu, 2001; Prager et al., 2001), Web exploitation (Brill et al., 2001; Kwok et al., 2001), noisy-channel transformations (Echihabi and Marcu, 2003), semantic analysis (Xu et al., 2002; Hovy et al., 2001; Moldovan et al., 2002), and inferencing (Moldovan et al., 2002).</Paragraph>
    <Paragraph position="1"> The obvious limitation of any factoid QA system is that many questions people want answered are not factoid questions. It is also frequently the case that non-factoid questions are precisely the ones whose answers cannot readily be found by simply using a good search engine. It follows that there is a strong economic incentive for moving the QA task to a more general level: a system able to answer the complex questions people generally and/or frequently ask has greater potential impact than one restricted to answering only factoid questions. A natural move is to recast the question answering task as handling the questions people frequently ask or want answers for, as found in Frequently Asked Questions (FAQ) lists. These questions are sometimes factoid questions (such as, &amp;quot;What is Scotland's national costume?&amp;quot;), but in general are more complex questions (such as, &amp;quot;How does a film qualify for an Academy Award?&amp;quot;, which requires an answer along the following lines: &amp;quot;A feature film must screen in a Los Angeles County theater in 35 or 70mm or in a 24-frame progressive scan digital format suitable for exhibiting in existing commercial digital cinema sites for paid admission for seven consecutive days. The seven day run must begin before midnight, December 31, of the qualifying year. [...]&amp;quot;).</Paragraph>
    <Paragraph position="2"> In this paper, we make a first attempt towards solving a QA problem more generic than factoid QA, for which there are no restrictions on the type of questions that are handled, and there is no assumption that the answers to be provided are factoids. In our solution to this problem we employ learning mechanisms for question-answer transformations (Agichtein et al., 2001; Radev et al., 2001), and also exploit large document collections such as the Web for finding answers (Brill et al., 2001; Kwok et al., 2001). We build our QA system around a noisy-channel architecture which exploits both a language model for answers and a transformation model for answer/question terms, trained on a corpus of 1 million question/answer pairs collected from the Web. Our evaluations show that our system achieves reasonable performance in terms of answer accuracy for a large variety of complex, non-factoid questions.</Paragraph>
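The noisy-channel ranking sketched above (a language model over answers combined with a transformation model between answer and question terms) can be illustrated as follows. This is a minimal toy sketch, not the paper's actual system: the `lm` and `trans` probability tables, the smoothing constant, and the uniform marginalization over answer terms are all illustrative assumptions.

```python
import math

def channel_score(question_terms, answer_terms, lm, trans):
    """Noisy-channel score: log P(A) + log P(Q | A).

    lm:    assumed unigram answer language model, dict term -> probability
    trans: assumed transformation table, dict (q_term, a_term) -> probability
    """
    # Language-model term: how fluent/plausible is the answer on its own?
    lm_logp = sum(math.log(lm.get(a, 1e-6)) for a in answer_terms)

    # Channel term: how likely is each question term to arise from some
    # answer term? Here we marginalize uniformly over answer terms.
    ch_logp = 0.0
    for q in question_terms:
        p = sum(trans.get((q, a), 1e-6) for a in answer_terms)
        ch_logp += math.log(p / max(len(answer_terms), 1))
    return lm_logp + ch_logp

def best_answer(question_terms, candidates, lm, trans):
    """Return the candidate answer maximizing the noisy-channel score."""
    return max(candidates,
               key=lambda a: channel_score(question_terms, a, lm, trans))
```

In the real system the two component models would be trained on the corpus of question/answer pairs; the toy tables here merely show how the decoder combines them at ranking time.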
  </Section>
</Paper>