<?xml version="1.0" standalone="yes"?>
<Paper uid="W93-0302">
  <Title>ROBUST TEXT PROCESSING IN AUTOMATED INFORMATION RETRIEVAL</Title>
  <Section position="13" start_page="17" end_page="17" type="concl">
    <SectionTitle>
CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"> We presented in some detail a natural language information retrieval system consisting of an advanced NLP module and a 'pure' statistical core engine. While many problems remain to be resolved, including the question of adequacy of term-based representation of document contents, we attempted to demonstrate that the architecture described here is nonetheless viable. In particular, we demonstrated that natural language processing can now be done on a fairly large scale and that its speed and robustness can match those of traditional statistical programs such as key-word indexing or statistical phrase extraction. We suggest, with some caution until more experiments are run, that natural language processing can be very effective in creating appropriate search queries out of user's initial specifications which can be frequently imprecise or vague.</Paragraph>
    <Paragraph position="1"> On the other hand, we must be aware of the limits of NLP technologies at our disposal. While part-of-speech tagging, lexicon-based stemming, and parsing can be done on large amounts of text (hundreds of millions of words and more), other, more advanced processing involving conceptual structuring, logical forms, etc., is still beyond reach, compurationally. It may be assumed that these superadvanced techniques will prove even more effective, since they address the problem of representationlevel limits, however the experimental evidence is sparse and necessarily limited to rather small scale tests (e.g., Mauldin, 1991).</Paragraph>
  </Section>
class="xml-element"></Paper>