<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0309">
  <Title>Biomedical Text Retrieval in Languages with a Complex Morphology</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>
5 Conclusions
</SectionTitle>
    <Paragraph position="0"> There has been some controversy, at least for simple stemmers (Lovins, 1968; Porter, 1980), about the effectiveness of morphological analysis for document retrieval (Harman, 1991; Krovetz, 1993; Hull, 1996). The key to improving retrieval quality seems to lie mainly in the presence or absence of some form of dictionary. Empirical evidence has been presented that inflectional and/or derivational stemmers augmented by dictionaries indeed perform substantially better than those without access to such lexical repositories (Krovetz, 1993; Kraaij and Pohlmann, 1996; Tzoukermann et al., 1997).</Paragraph>
    <Paragraph position="1"> This result holds particularly for natural languages with a rich morphology, both in terms of derivation and (single-word) composition. Document retrieval in these languages suffers from serious performance degradation under the stemming-only paradigm of matching query terms to text words. We have proposed here a dictionary-based approach in which morphologically complex word forms, whether they appear in queries or in documents, are segmented into relevant subwords, and these subwords are subsequently submitted to the matching procedure. This way, the impact of word form alterations can be eliminated from the retrieval procedure.</Paragraph>
    <Paragraph position="2"> We evaluated our hypothesis on a large biomedical document collection. Our experiments lent support, in part statistically significant, to the subword hypothesis. The gain from subword indexing was slightly more pronounced with layman queries, probably because of a greater vocabulary mismatch.</Paragraph>
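    <Paragraph position="3"> The idea of segmenting query and document word forms into subwords before matching can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon SUBWORDS, the greedy longest-match strategy, and the function names segment, index_terms and matches are hypothetical and are not the dictionary or the segmentation procedure of the system evaluated here.

```python
# Illustrative sketch only. The lexicon entries and the greedy longest-match
# strategy are assumptions, not the resources or algorithm of the paper.
SUBWORDS = {"gastr", "gastro", "enter", "itis", "hepat"}  # hypothetical entries


def segment(word, lexicon=SUBWORDS):
    """Split a word form into lexicon subwords by greedy longest match.

    Returns the list of subwords, or [word] (lowercased) if the greedy
    pass cannot cover the whole word. There is no backtracking.
    """
    word = word.lower()
    parts = []
    start = 0
    while start != len(word):
        # Try the longest remaining candidate first, then shorter ones.
        for end in range(len(word), start, -1):
            if word[start:end] in lexicon:
                parts.append(word[start:end])
                start = end
                break
        else:
            # No lexicon entry starts at this position: keep the word whole.
            return [word]
    return parts


def index_terms(text, lexicon=SUBWORDS):
    """Collect the subwords of every whitespace-separated token in text."""
    terms = set()
    for token in text.split():
        terms.update(segment(token, lexicon))
    return terms


def matches(query, document, lexicon=SUBWORDS):
    """True if every subword of the query also occurs in the document."""
    return index_terms(query, lexicon).issubset(index_terms(document, lexicon))


if __name__ == "__main__":
    print(segment("Gastroenteritis"))
    print(matches("enteritis", "acute gastroenteritis"))
```

    In this sketch the query "enteritis" matches a document containing "gastroenteritis" because both share the subwords "enter" and "itis", whereas matching on whole word forms would find no overlap. A realistic system would need a curated subword lexicon and a segmentation strategy that can recover from greedy mistakes.</Paragraph>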
  </Section>
</Paper>