<?xml version="1.0" standalone="yes"?>
<Paper uid="X98-1017">
  <Title>The Smart/Empire TIPSTER IR System</Title>
  <Section position="10" start_page="118" end_page="119" type="concl">
    <SectionTitle>
7 SUMMARY
</SectionTitle>
    <Paragraph position="0"> In summary, we have developed supporting technology for improving end-user efficiency of information retrieval (IR) systems. We have made progress in three related application areas: high precision information retrieval, near-duplicate document detection, and context-dependent document summarization. Our research aims to increase end-user efficiency in each of the above tasks by reducing the amount of text that the user must peruse in order to get the desired useful information.</Paragraph>
    <Paragraph position="1"> As the underlying technology for the above applications, we use a novel combination of statistical and linguistic techniques. The proposed statistical approaches extend existing methods in IR by performing statistical computations within the context of another query or document. The proposed linguistic approaches build on existing work in information extraction and rely on a new technique for trainable partial parsing. The goal of the integrated approach is to identify selected relationships among important terms in a query or text and use the extracted relationships: (1) to discard or reorder retrieved texts, (2) to locate redundant information, and (3) to generate coherent query-dependent summaries. We believe that the integrated approach offers an innovative and promising solution to problems in end-user efficiency for a number of reasons: * Unlike previous attempts to combine natural language understanding and information retrieval, our approach always performs linguistic analysis relative to another document or query.</Paragraph>
    <Paragraph position="2"> * End-user effectiveness will not be significantly compromised in the face of errors by the Smart/Empire system.</Paragraph>
    <Paragraph position="3"> * The partial parser is a trainable system that can be tuned to recognize those linguistic relationships that are most important for the larger IR task.</Paragraph>
    <Paragraph position="4"> In addition, we have developed TRUESmart, a Toolbox for Research in User Efficiency. TRUESmart is a set of tools and data supporting researchers in the development of methods for improving user efficiency for state-of-the-art information retrieval systems. It also includes a simple graphical user interface that aids system evaluation and analysis by highlighting important term relationships identified by the underlying statistical and linguistic language processing algorithms. To date, we have used TRUESmart to integrate and evaluate system components in high-precision retrieval and context-dependent summarization.</Paragraph>
    <Paragraph position="5"> In conclusion, we believe that our statistical-linguistic approach to automated text retrieval has shown promising results and has simultaneously addressed four important goals for the TIPSTER program: the need for increased accuracy in detection systems, increased portability and applicability of extraction systems, better summarization of free text, and increased communication across detection and extraction systems.</Paragraph>
  </Section>
</Paper>