<?xml version="1.0" standalone="yes"?>
<Paper uid="H05-1040">
  <Title>Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 315-322, Vancouver, October 2005. ©2005 Association for Computational Linguistics. Enhanced Answer Type Inference from Questions using Sequential Models</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Question classification is an important step in factual question answering (QA) and other dialog systems. Several attempts have been made to apply statistical machine learning approaches, including Support Vector Machines (SVMs) with sophisticated features and kernels. Curiously, the payoff beyond a simple bag-of-words representation has been small. We show that most questions reveal their class through a short contiguous token subsequence, which we call its informer span. Perfect knowledge of informer spans can enhance accuracy from 79.4% to 88% using linear SVMs on standard benchmarks. In contrast, standard heuristics based on shallow pattern-matching give only a 3% improvement, showing that the notion of an informer is non-trivial. Using a novel multi-resolution encoding of the question's parse tree, we induce a Conditional Random Field (CRF) to identify informer spans with about 85% accuracy.</Paragraph>
    <Paragraph position="1"> Then we build a meta-classifier using a linear SVM on the CRF output, enhancing accuracy to 86.2%, which is better than all published numbers.</Paragraph>
  </Section>
</Paper>
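The abstract's central idea, that a short informer span carries most of the class signal, can be illustrated with a minimal feature-extraction sketch. It is an assumption-laden illustration, not the authors' encoding: the function name `question_features`, the `bow=` and `inf=` feature prefixes, the example question, and the hand-picked span are all hypothetical. In the paper the span is predicted by a CRF and the resulting features feed a linear SVM; neither model is shown here.

```python
from collections import Counter


def question_features(tokens, informer_span=None):
    """Build a sparse feature map for one question.

    tokens: list of word tokens.
    informer_span: optional (start, end) token offsets, end exclusive,
        marking a hypothetical informer span.
    """
    # Plain bag-of-words features over the whole question.
    feats = Counter("bow=" + t.lower() for t in tokens)
    if informer_span is not None:
        start, end = informer_span
        # Informer tokens get a second, separately named feature so a
        # linear classifier can weight them independently of the rest.
        for t in tokens[start:end]:
            feats["inf=" + t.lower()] += 1
    return dict(feats)


# Hand-chosen example: the span (3, 4) covers the single token "capital".
tokens = "What is the capital of France ?".split()
feats = question_features(tokens, informer_span=(3, 4))
```

Keeping the informer tokens under their own prefix, rather than simply duplicating the bag-of-words entries, lets a linear model assign a different weight to a word when it occurs inside the span.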