File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/03/n03-2037_concl.xml
Size: 1,109 bytes
Last Modified: 2025-10-06 13:53:37
<?xml version="1.0" standalone="yes"?> <Paper uid="N03-2037"> <Title>Evaluating Answers to Definition Questions</Title> <Section position="4" start_page="0" end_page="0" type="concl"> <SectionTitle> 3 Conclusion </SectionTitle> <Paragraph position="0"> The AQUAINT pilot evaluations are designed to explore the issues surrounding new evaluation methodologies for question answering systems using a small set of systems.</Paragraph> <Paragraph position="1"> If a pilot is successful, the evaluation will be transferred to the much larger TREC QA track. The definition pilot demonstrated that relative F scores based on concept recall and adjusted response length are stable when computed using different human assessor judgments, and reflect intuitive judgments of quality. The main measure used in the pilot strongly emphasized recall, but varying the F measure's β parameter allows different user preferences to be accommodated as expected. Definition questions will be included as a part of the TREC 2003 QA track, where they will be evaluated using this methodology.</Paragraph> </Section> </Paper>