<?xml version="1.0" standalone="yes"?>
<Paper uid="X96-1046">
  <Title>The Text REtrieval Conferences (TRECs)</Title>
  <Section position="2" start_page="373" end_page="374" type="abstr">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> Phase two of the TIPSTER project included two workshops for evaluating document detection (information retrieval) projects: the third and fourth Text REtrieval Conferences (TRECs). These workshops were held at the National Institute of Standards and Technology (NIST) in November of 1994 and 1995, respectively. The conferences included evaluation not only of the TIPSTER contractors, but also of many information retrieval groups outside of the TIPSTER project. The conferences were run as workshops that provided a forum for participating groups to discuss their system results on the retrieval tasks done using the TIPSTER/TREC collection. As with the first two TRECs, the goals of these workshops were:
    * To encourage research in text retrieval based on large-scale test collections
    * To increase communication among industry, academia, and government by creating an open forum for exchange of research ideas
    * To speed the transfer of technology from research labs into commercial products by demonstrating substantial improvements in retrieval methodologies on real-world problems
    * To increase the availability of appropriate evaluation techniques for use by industry and academia, including development of new evaluation techniques more applicable to current systems
    * To serve as a showcase for state-of-the-art retrieval systems for DARPA and its clients.</Paragraph>
    <Paragraph position="1"> The number of participating systems has grown from 25 in TREC-1 to 32 in TREC-3 (see Table 1) and to 36 in TREC-4 (see Table 2). These systems include most of the major text retrieval software companies and most of the universities doing research in text retrieval. Note that whereas the universities tend to participate every year, the companies often skip years because of the amount of effort required to run the TREC tests.</Paragraph>
    <Paragraph position="2"> By opening the evaluation to all interested groups, TIPSTER has ensured that TREC represents many different approaches to text retrieval. The emphasis on diverse experiments evaluated within a common setting has proven to be a major strength of TREC.</Paragraph>
    <Paragraph position="3"> The research done by the participating groups in the four TREC conferences has varied, but has followed a general pattern. TREC-1 (1992) required significant system rebuilding by most groups, due to the huge increase in the size of the document collection from a traditional test collection of several megabytes in size to the 2 gigabyte TIPSTER collection. The second TREC conference (TREC-2) occurred in August of 1993, less than 10 months after the first conference. The results (using new test topics) showed significant improvements over the TREC-1 results, but should be viewed as an appropriate baseline representing the 1993 state-of-the-art retrieval techniques as scaled up to handling a 2 gigabyte collection.</Paragraph>
    <Paragraph position="4"> TREC-3 provided an opportunity for complex experimentation. The experiments included the development of automatic query expansion techniques, the use of passages or subdocuments to increase the precision of retrieval results, and the use of training information to help systems select only the best terms for queries.</Paragraph>
    <Paragraph position="5"> Some groups explored hybrid approaches (such as the use of the Rocchio methodology in systems not using a vector space model), and others tried approaches that were radically different from their original approaches. For example, experiments in manual query expansion were done by the University of California at Berkeley, and experiments in combining information from three very different retrieval techniques were done by the Swiss Federal Institute of Technology (ETH). For more details on the specific system approaches, see the complete overview of the TREC-3 conference, including papers from the participating groups [1].</Paragraph>
    <Paragraph position="6"> TREC-4 presented a continuation of many of these complex experiments, and also included a set of five focussed tasks, called tracks. Both of the main tasks were more difficult: the test topics were much shorter, and the test documents were harder to retrieve. Several groups made major changes in their retrieval algorithms, and all groups had difficulty working with the very short topics. Many interesting experiments were done in the tracks, including 10 groups that worked with Spanish data, and 11 groups that ran extensive experiments in interactive retrieval. Details of specific system approaches are in the proceedings of the TREC-4 conference [2].</Paragraph>
  </Section>
</Paper>