<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1027">
  <Title>An Empirical Study of Information Synthesis Tasks</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> In this paper, we have reported an empirical study of the &amp;quot;Information Synthesis&amp;quot; task: given a complex information need, extracting, organizing and relating the pieces of information contained in a set of relevant documents, in order to obtain a comprehensive, non-redundant report that satisfies the information need.</Paragraph>
    <Paragraph position="1"> We have obtained two main results. The first is the creation of an Information Synthesis testbed (ISCORPUS) with 72 reports manually generated by 9 subjects for 8 complex topics, each with 100 relevant documents.</Paragraph>
    <Paragraph position="2"> The second is an empirical comparison of candidate metrics for estimating the similarity between reports.</Paragraph>
    <Paragraph position="3"> Our empirical comparison uses a quantitative criterion (the QARLA estimation) based on the hypothesis that a good similarity metric will be able to distinguish between manual and automatic reports.</Paragraph>
    <Paragraph position="4"> According to this measure, we have found evidence that the Information Synthesis task is not a standard multi-document summarization problem: state-of-the-art similarity metrics for summaries do not perform equally well with the reports in our testbed.</Paragraph>
    <Paragraph position="5"> Our most interesting finding is that manually generated reports tend to share the same key concepts: a similarity metric based on overlapping key concepts (NICOS) gives significantly better results than metrics based on language models, n-gram co-occurrence and sentence overlapping. This is an indication that detecting relevant key concepts is a promising strategy in the process of generating reports. Our results, however, also have some intrinsic limitations. Firstly, the manually generated summaries are extractive, which is good for comparison purposes but does not faithfully reflect a natural process of human information synthesis. Another weakness is the maximum time allowed per report: 30 minutes seems too little to examine 100 documents and extract a decent report, but allowing more time would have caused excessive fatigue for users. Our volunteers, however, reported medium to high satisfaction with the results of their work, and on some occasions finished their task without reaching the time limit.</Paragraph>
  </Section>
</Paper>