<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1162">
  <Title>PageRank on Semantic Networks, with Application to Word Sense Disambiguation</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Open Text Word Sense Disambiguation
</SectionTitle>
    <Paragraph position="0"> The task of word sense disambiguation consists of assigning the most appropriate meaning to a polysemous word within a given context. Applications such as machine translation, knowledge acquisition, and common sense reasoning require knowledge about word meanings, and word sense disambiguation is considered essential for all of them. Most efforts to solve this problem have so far concentrated on targeted supervised learning, where each sense-tagged occurrence of a particular word is transformed into a feature vector, which is then used in an automatic learning process.</Paragraph>
    <Paragraph position="1"> The applicability of such supervised algorithms is, however, limited to those few words for which sense-tagged data is available, and their accuracy depends strongly on the amount of labeled data available.</Paragraph>
    <Paragraph position="2"> In contrast, open-text knowledge-based approaches have received significantly less attention (see footnote 1). While the performance of such methods is usually exceeded by that of their supervised corpus-based alternatives, they have the advantage of providing larger coverage.</Paragraph>
    <Paragraph position="3"> Footnote 1: We use the term knowledge-based to denote methods that involve logical inferences and derivation of global properties that extend the data in a dictionary and/or a corpus with new knowledge. In our definition of knowledge-based approaches, the use of a corpus is not excluded.</Paragraph>
    <Paragraph position="4"> Knowledge-based methods for word sense disambiguation are usually applicable to all words in open text, while supervised corpus-based techniques target only a few selected words for which large corpora are available. Four main types of knowledge-based methods have been developed so far for word sense disambiguation.</Paragraph>
    <Paragraph position="5"> Lesk algorithms. First introduced by Lesk (1986), these algorithms attempt to identify the most likely meanings for the words in a given context based on a measure of contextual overlap between the dictionary definitions of the ambiguous words, or between the current context and dictionary definitions provided for a given target word.</Paragraph>
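The second variant described above (overlap between the current context and the dictionary definitions of a target word) can be sketched as follows. This is a minimal illustration, not the original Lesk implementation: the function names, the stopword list, and the two glosses for "bank" are hypothetical and chosen only for the example.

```python
# Minimal sketch of a simplified Lesk-style overlap measure.
# All names, the stopword list, and the glosses are illustrative assumptions.

STOPWORDS = {"a", "an", "the", "of", "in", "to", "or", "and",
             "that", "for", "is", "on", "with", "as", "such"}


def tokenize(text):
    """Lowercase the text, strip simple punctuation, and drop stopwords."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def simplified_lesk(context, sense_glosses):
    """Return the sense whose gloss shares the most words with the context."""
    context_words = tokenize(context)

    def overlap(sense):
        return len(context_words.intersection(tokenize(sense_glosses[sense])))

    return max(sense_glosses, key=overlap)


# Hypothetical glosses for two senses of "bank".
GLOSSES = {
    "bank_river": "sloping land beside a body of water such as a river",
    "bank_finance": "a financial institution that accepts deposits and lends money",
}

chosen = simplified_lesk(
    "he sat on the bank of the river and watched the water", GLOSSES
)
```

Here the river gloss shares the words "river" and "water" with the context, while the financial gloss shares none, so the river sense is selected. Ties are resolved by dictionary order, which a real system would replace with a more principled fallback.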
    <Paragraph position="6"> Semantic similarity. These methods rely on measures of semantic similarity computed on semantic networks (Rada et al., 1989). Depending on the size of the context they span, these measures are in turn divided into two main categories: (1) Local context - where the semantic measures are used to disambiguate words additionally connected by syntactic relations (Stetina et al., 1998).</Paragraph>
    <Paragraph position="7"> (2) Global context - where the semantic measures are employed to derive lexical chains, which are threads of meaning often drawn throughout an entire text (Morris and Hirst, 1991).</Paragraph>
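One common way to compute similarity on a semantic network is to use the length of the shortest path between two concepts. The sketch below illustrates that idea on a hypothetical toy hierarchy; the edge list, the function names, and the 1 / (1 + distance) scoring are assumptions made for the example and are not taken from the cited work or from WordNet.

```python
from collections import deque

# Hypothetical toy is-a hierarchy, treated as an undirected graph.
EDGES = [
    ("entity", "animal"),
    ("animal", "dog"),
    ("animal", "cat"),
    ("entity", "artifact"),
    ("artifact", "car"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def path_length(graph, start, goal):
    """Breadth-first search; returns the number of edges on the shortest
    path, or None if the two concepts are not connected."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def similarity(graph, a, b):
    """Map path length to a score in (0, 1]; unconnected concepts score 0."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1 + dist)


GRAPH = build_graph(EDGES)
```

In this toy hierarchy "dog" and "cat" are two edges apart, while "dog" and "car" are four edges apart, so the first pair receives the higher similarity score.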
    <Paragraph position="8"> Selectional preferences. Automatically or semi-automatically acquired selectional preferences serve as a means of constraining the number of possible senses that a word might have, based on the relation it has with other words in context (Resnik, 1997).</Paragraph>
    <Paragraph position="9"> Heuristic-based methods. These methods consist of simple rules that can reliably assign a sense to certain word categories: one sense per collocation (Yarowsky, 1993), and one sense per discourse (Gale et al., 1992).</Paragraph>
    <Paragraph position="10"> In this paper, we propose a new open-text disambiguation algorithm that combines information drawn from a semantic network (WordNet) with graph-based ranking algorithms (PageRank). We compare our method with other open-text word sense disambiguation algorithms, and show that the accuracy achieved through our new PageRank-based method exceeds the performance obtained by other knowledge-based methods.</Paragraph>
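To make the general idea of graph-based ranking over word senses concrete, the following sketch runs a basic iterative PageRank over a small graph of candidate senses and picks the highest-ranked sense for an ambiguous word. It is an illustration under stated assumptions, not the construction used in this paper: the sense labels, the edges, the damping factor of 0.85, and the fixed iteration count are all hypothetical choices for the example.

```python
# Minimal sketch: PageRank over a toy graph of word senses.
# Sense labels and edges are hypothetical, not drawn from WordNet.

def pagerank(graph, damping=0.85, iterations=50):
    """Iterative PageRank on an adjacency map {node: set of neighbors}.

    Nodes without outgoing links keep only the baseline score; their
    rank is not redistributed, which is a simplification.
    """
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            incoming = sum(
                rank[m] / len(graph[m]) for m in nodes if n in graph[m]
            )
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


# Symmetric links stand in for semantic relations between senses.
SENSE_GRAPH = {
    "bank#river": {"river#stream", "water#liquid"},
    "river#stream": {"bank#river", "water#liquid"},
    "water#liquid": {"bank#river", "river#stream"},
    "bank#finance": set(),
}

ranks = pagerank(SENSE_GRAPH)
best_sense = max(["bank#river", "bank#finance"], key=ranks.get)
```

Because "bank#river" is linked to the senses of the surrounding words while "bank#finance" is isolated, the river sense receives the higher rank and is selected.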
  </Section>
</Paper>