<?xml version="1.0" standalone="yes"?>
<Paper uid="P02-1054">
  <Title>Is It the Right Answer? Exploiting Web Redundancy for Answer Validation</Title>
  <Section position="7" start_page="0" end_page="0" type="relat">
    <SectionTitle>
6 Related Work
</SectionTitle>
    <Paragraph position="0"> Although there is some recent work addressing the evaluation of QA systems, it seems that the idea of using a fully automatic approach to answer validation has still not been explored. For instance, the approach presented in (Breck et al., 2000) is semiautomatic. The proposed methodology for answer validation relies on computing the overlapping between the system response to a question and the stemmed content words of an answer key. All the answer keys corresponding to the 198 TREC-8 questions have been manually constructed by human annotators using the TREC corpus and external resources like the Web.</Paragraph>
    <Paragraph position="1"> The idea of using the Web as a corpus is an emerging topic of interest among the computational linguists community. The TREC-2001 QA track demonstrated that Web redundancy can be exploited at different levels in the process of finding answers to natural language questions. Several studies (e.g.</Paragraph>
    <Paragraph position="2"> (Clarke et al., 2001) (Brill et al., 2001)) suggest that the application of Web search can improve the precision of a QA system by 25-30%. A common feature of these approaches is the use of the Web to introduce data redundancy for a more reliable answer extraction from local text collections. (Radev et al., 2001) suggests a probabilistic algorithm that learns the best query paraphrase of a question searching the Web. Other approaches suggest training a question-answering system on the Web (Mann, 2001).</Paragraph>
    <Paragraph position="3"> The Web-mining algorithm presented in this paper is similar to the PMI-IR (Pointwise Mutual Information - Information Retrieval) described in (Turney, 2001). Turney uses PMI and Web retrieval to decide which word in a list of candidates is the best synonym with respect to a target word. However, the answer validity task poses different peculiarities. We search how the occurrence of the question words influence the appearance of answer words. Therefore, we introduce additional linguistic techniques for pattern and query formulation, such as keyword extraction, answer type extraction, named entities recognition and pattern relaxation.</Paragraph>
  </Section>
class="xml-element"></Paper>