<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3251">
  <Title>Instance-Based Question Answering: A Data-Driven Approach</Title>
  <Section position="8" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
6 Results
</SectionTitle>
    <Paragraph position="0"> The most important step in our instance-based approach is identifying clusters of questions. Figure 4 shows the question distribution in terms of number of clusters. For example, 30 questions belong to exactly 3 clusters. The number of clusters corresponding to a question can be seen as a measure of how common the question is: the more clusters a question has, the more likely it is to have a dense neighborhood.</Paragraph>
    <Paragraph position="1"> The resulting MRR is 0.447, and 61.5% of questions have correct answers among the first five proposed answers. This translates into results consistently above the sixth highest score at each of TREC 9-12. Our results were compared directly to the top performing systems' results on the same temporal question test set.</Paragraph>
    <Paragraph position="2"> [Figure caption fragment; the beginning was lost in extraction: &quot;... correct answers to the final answer set - only the top 10 answers were considered for each question.&quot;]</Paragraph>
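The two headline metrics above (MRR and the fraction of questions answered within the first five proposals) can be illustrated with a minimal sketch. The cutoff of five, the exact-match test against a gold answer set, and all function names are assumptions made for illustration; this is not the evaluation script used for the reported numbers.

```python
def reciprocal_rank(ranked_answers, gold_answers, cutoff=5):
    """Return 1/rank of the first correct answer within the cutoff, else 0.0."""
    for rank, answer in enumerate(ranked_answers[:cutoff], start=1):
        if answer in gold_answers:  # assumption: exact string match
            return 1.0 / rank
    return 0.0


def mrr_and_top_k(results, cutoff=5):
    """results: list of (ranked_answers, gold_answers) pairs, one per question.

    Returns (mean reciprocal rank, fraction of questions with a correct
    answer among the first `cutoff` proposals).
    """
    if not results:
        return 0.0, 0.0
    scores = [reciprocal_rank(ranked, gold, cutoff) for ranked, gold in results]
    mrr = sum(scores) / len(scores)
    top_k = sum(1 for s in scores if s > 0) / len(scores)
    return mrr, top_k


if __name__ == "__main__":
    # Toy data, not taken from the paper.
    toy = [
        (["1969", "1970"], {"1969"}),          # correct at rank 1
        (["1801", "1802", "1803"], {"1803"}),  # correct at rank 3
        (["1900"], {"1901"}),                  # no correct answer
    ]
    print(mrr_and_top_k(toy))
```

Both metrics come from the same per-question reciprocal ranks, so a question counts toward the top-five fraction exactly when its reciprocal rank is non-zero.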
    <Paragraph position="3"> Figure 5 shows the degree to which clusters produce correct answers to test questions. Very often, more than one cluster contributes to the final answer set, which suggests that there is a benefit in clustering the neighborhood according to different similarity features and granularity.</Paragraph>
    <Paragraph position="4"> It is not surprising that cluster size is not correlated with performance (Figure 6). The overall strategy learned from the cluster &quot;When did &lt;NP&gt; die?&quot; corresponds to an MRR of 0.79, while the strategy learned from the cluster &quot;How &lt;Q&gt;?&quot; corresponds to an MRR of 0.13. Although the two clusters generate strategies with radically different performance, they have the same size: 10 questions are covered by each cluster.</Paragraph>
    <Paragraph position="5"> [Figure caption fragments; their beginnings were lost in extraction: &quot;... with answer confidence scores.&quot;; &quot;... in the feature space, cluster size is not well correlated with performance. A specific cardinality may represent a small and dense part cluster, or a large and sparse cluster.&quot;; &quot;... different thresholds for minimum cluster size.&quot;]</Paragraph>
    <Paragraph position="6"> The higher the confidence threshold, the higher the precision (MRR) of the predicted answers. When small, unstable clusters are ignored, the predicted MRR improves considerably. Small clusters tend to produce unstable strategies and have extremely low performance. Often, structurally different but semantically equivalent clusters have a higher cardinality and much better performance. For example, the cluster &quot;What year did &lt;NP&gt; die?&quot; has cardinality 2 and a corresponding MRR of zero. However, as seen previously, the cluster &quot;When did &lt;NP&gt; die?&quot; has cardinality 10 and a corresponding MRR of 0.79.</Paragraph>
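The effect of ignoring small clusters can be sketched as a simple filter on cluster cardinality. The cardinalities and MRR values below are the three quoted in this section; the `Cluster` type, the function names, and the unweighted mean used as a summary are hypothetical and stand in for whatever aggregation the system actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    pattern: str      # question template (angle brackets omitted here)
    cardinality: int  # number of training questions covered
    mrr: float        # MRR of the strategy learned from this cluster


# Values quoted in the text for three clusters.
CLUSTERS = [
    Cluster("When did NP die?", 10, 0.79),
    Cluster("How Q?", 10, 0.13),
    Cluster("What year did NP die?", 2, 0.0),
]


def keep_large_clusters(clusters, min_size):
    """Drop clusters whose cardinality is below the minimum size."""
    return [c for c in clusters if c.cardinality >= min_size]


def mean_mrr(clusters):
    """Unweighted mean MRR over clusters (illustrative summary only)."""
    if not clusters:
        return 0.0
    return sum(c.mrr for c in clusters) / len(clusters)


if __name__ == "__main__":
    for threshold in (1, 3):
        kept = keep_large_clusters(CLUSTERS, threshold)
        print(threshold, [c.pattern for c in kept], mean_mrr(kept))
```

The filter removes the cardinality-2 cluster but cannot separate the two cardinality-10 clusters, which matches the observation that size alone does not predict performance.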
    <Paragraph position="7"> Table 2 presents an intuitive cluster and the top n-grams and paraphrases with the most information content. Each feature also has a corresponding average mutual information score. These particular content features are intuitive and highly indicative of a correct answer. However, in sparse clusters, the content features have less information content and are more vague. For example, the very sparse cluster &quot;When was &lt;Q&gt;?&quot; yields content features such as &quot;April&quot;, &quot;May&quot;, &quot;in the spring of&quot;, and &quot;back in&quot;, which only suggest broad temporal expressions.</Paragraph>
    <Paragraph position="8"> [Table caption fragment; the beginning was lost in extraction: &quot;... paraphrases for class &quot;When did &lt;NP&gt; die?&quot;, where &lt;Q&gt; refers to a phrase in the original question.&quot;]</Paragraph>
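As a rough illustration of how an average mutual information score could be attached to a content feature, the sketch below computes the mutual information (in bits) between a binary feature (the n-gram or paraphrase occurs in a passage or not) and a binary label (the passage contains a correct answer or not). The binary formulation, the contingency-count interface, and the function name are assumptions; the section does not spell out the exact estimator.

```python
import math


def average_mutual_information(n11, n10, n01, n00):
    """Mutual information in bits from a 2x2 contingency table.

    n11: feature present, answer correct
    n10: feature present, answer incorrect
    n01: feature absent,  answer correct
    n00: feature absent,  answer incorrect
    """
    total = n11 + n10 + n01 + n00
    present, absent = n11 + n10, n01 + n00
    correct, incorrect = n11 + n01, n10 + n00
    cells = (
        (n11, present, correct),
        (n10, present, incorrect),
        (n01, absent, correct),
        (n00, absent, incorrect),
    )
    mi = 0.0
    for joint, row, col in cells:
        if joint > 0:  # empty cells contribute nothing to the sum
            mi += (joint / total) * math.log2(joint * total / (row * col))
    return mi


if __name__ == "__main__":
    # Toy counts, not taken from the paper.
    print(average_mutual_information(40, 10, 5, 45))
```

A feature that co-occurs perfectly with correct answers scores one bit under this formulation, while a feature that is independent of correctness scores zero, which fits the contrast drawn between specific features and broad temporal expressions.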
  </Section>
</Paper>