<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1010">
  <Title>Homonymy and Polysemy in Information Retrieval</Title>
  <Section position="6" start_page="77" end_page="78" type="concl">
    <SectionTitle>
4 Conclusion
</SectionTitle>
    <Paragraph position="0"> Most research on lexical ambiguity has not been done in the context of an application. We have conducted experiments with hundreds of unique query words and tens of thousands of word occurrences, which makes the research described in this paper one of the largest studies of its kind. We have examined the lexicon as a whole and focused on the distinction between homonymy and polysemy. Other research on resolving lexical ambiguity for IR (e.g., (Sanderson 94) and (Voorhees 93)) does not take this distinction into account.</Paragraph>
    <Paragraph position="1"> Our research supports the argument that it is important to distinguish homonymy and polysemy. We have shown that natural language processing improves retrieval performance (by grouping related morphological variants), and our experiments suggest where further improvements can be made. We have also provided an explanation for the performance of the Porter stemmer, and shown that it is surprisingly effective at distinguishing variant word forms that are unrelated in meaning. The experiment with part-of-speech tagging also highlighted the importance of polysemy: more than half of all words in the dictionary that differ in part of speech are also related in meaning. Our experiment with open/closed compounds indicated that these forms are almost always related in meaning. Finally, our experiments with lexical phrases show that it is crucial to assign partial credit to the component words of a phrase.</Paragraph>
    <Paragraph position="2"> The experiment with part-of-speech tagging indicated that taggers make a number of errors, and our current work is concerned with identifying those words in which a difference in part of speech is associated with a difference in meaning (e.g., train as a noun and as a verb). The words that exhibit such differences are likely to affect retrieval performance. We are also examining lexical phrases to decide how to assign partial credit to the component words.</Paragraph>
    <Paragraph position="3"> This work will give us a better idea of how language processing can provide further improvements in IR, and a better understanding of language in general.</Paragraph>
  </Section>
</Paper>