<?xml version="1.0" standalone="yes"?> <Paper uid="N03-2004"> <Title>Exploiting Diversity for Answering Questions</Title> <Section position="6" start_page="0" end_page="0" type="concl"> <SectionTitle> 5 Discussion and Conclusions </SectionTitle> <Paragraph position="0"> The disparity between the dynamic range of these techniques on the development and test datasets suggests that the development set sample size of 100 (6700 proposed answers and NILs) may be too small to draw conclusions about the relative quality of the selection techniques. Still, the consistent rank ordering of the selection techniques across the two datasets suggests that these methods of system combination are effective.</Paragraph> <Paragraph position="1"> None of our combinators did as well as the best TREC system on the test dataset. It is important to note, however, that in these experiments we did not have access to several useful evidence sources. First, this year's submissions included system estimates of answer confidence, if only implicitly. The selection mechanism could take advantage of these estimates by weighting each submitted answer string accordingly. Second, past TRECs show that some systems are reliably more accurate than others. If each answer string were labeled with a system ID, even an anonymized one, we could use system-level features in the selector, such as a simple prior. Given sufficient training data, we might even take question features into account, learning that certain systems are better at certain types of questions. We would like to pursue the use of these and other evidence sources in the future.</Paragraph> </Section> </Paper>