<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1415"> <Title>Generating Intelligent Numerical Answers in a Question-Answering System</Title> <Section position="9" start_page="108" end_page="109" type="evalu"> <SectionTitle> 5 Evaluation </SectionTitle> <Paragraph position="0"> In this section, we present some elements of the evaluation of our system with 15 end-users (subjects were between 20 and 35 years old and accustomed to using search engines).</Paragraph> <Paragraph position="1"> We first evaluated how users behave when faced with different candidate answers to a question. We presented each user with 5 numerical questions and their candidate answers, which vary according to time or restrictions, and asked them to produce their own answer from the candidate answers. For numerical answers varying according to restrictions, 93% of subjects produced answers explaining the different numerical values for each restriction. For numerical answers varying over time, 80% of subjects produced answers giving the most recent information (20% of subjects produced an answer that is a summary of all candidate values). This validates the hypothesis presented in Section 4.1.1.</Paragraph> <Paragraph position="2"> The second point we evaluated is the answer order. Our system produces answers in the form of a direct answer, then an explanation and, if necessary, a justification (page extract). We presented users with answers in which these three parts were arranged randomly.</Paragraph> <Paragraph position="3"> In contrast to the approach of (Yu et al., 2005), which proposes first an overview and then a zoom on interesting phenomena, 73% of subjects preferred the order proposed by our system, perhaps because, in QA systems, users want a direct answer to their question before any explanation.</Paragraph> <Paragraph position="4"> The last point we evaluated is the quality of the system answers.
For this purpose, we asked subjects to choose, for 5 questions, which answer they preferred among the system answer, an average, an interval and a disjunction of all candidate answers. 91% of subjects preferred the system answer. 75% of subjects found the explanation produced useful, and only 31% of subjects consulted the Web page extract (28% of these found it useful).</Paragraph> </Section> </Paper>