<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1035">
  <Title>Toward a Task-based Gold Standard for Evaluation of NP Chunks and Technical Terms</Title>
  <Section position="6" start_page="3" end_page="3" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> In this paper we have reported on a rigorous experimental technique for black-box evaluation of the usefulness of NP chunks and technical terms in an information access task. Our experiment shows that it is possible to reliably identify human preferences for sets of terms.</Paragraph>
    <Paragraph position="1"> The set of human terms created for use in a back-of-the-book index serves as a gold standard. An advantage of the task-based evaluation is that a set of terms could outperform the gold standard; any system that could do this would be a good system indeed.</Paragraph>
    <Paragraph position="2"> The two automatic methods that we evaluated performed considerably worse than the terms created by the human indexer. We plan to evaluate additional term identification techniques in the hope of finding automatic methods whose index terms people prefer over the human terms. We also plan to prepare test materials in different domains and to assess in greater depth the properties of the terms that our experimental subjects preferred; our goal is to develop practical guidelines for identifying and selecting technical terms that are optimal for human users. We will also study the impact of semantic differences between terms on user preferences and investigate whether terms that are preferred for information access are equally suitable for other NLP tasks.</Paragraph>
  </Section>
</Paper>