File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/97/w97-0321_intro.xml
Size: 5,716 bytes
Last Modified: 2025-10-06 14:06:20
<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0321"> <Title>Word Sense Disambiguation Based on Structured Semantic Space*</Title> <Section position="2" start_page="0" end_page="187" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Word sense disambiguation has long been one of the major concerns in natural language processing area (e.g., Bruce et al., 1994; Choueka et al., 1985; Gale et al., 1993; McRoy, 1992; Yarowsky 1992, 1994, 1995), whose aim is to identify the correct sense of a word in a particular context, among all of its senses defined in a dictionary or a thesaurus.</Paragraph> <Paragraph position="1"> Undoubtedly, effective disambiguation techniques are of great use in many natural language processing tasks, e.g., machine translation and information retrieving (Allen, 1995; Ng and Lee, 1996; Resnik, 1995), etc.</Paragraph> <Paragraph position="2"> Previous strategies for word sense disambiguation mainly fall into two categories: statistics-based method and exemplar-based method. Statistics-based method often requires large-scale corpora (e.g., Hirst, 1987; Luk, 1995), sense-tagging or not, monolingual or aligned bilingual, as training data to specify significant clues for each word sense. The method generally suffers from the problem of data sparseness. Moreover, huge corpora, especially sense-tagged or aligned ones, are not generally available in all domains for all languages.</Paragraph> <Paragraph position="3"> Exemplar-based method makes use of typical contexts (exemplars) of a word sense, e.g., verb-noun collocations or adjective-noun collocations, and identifies the correct sense of a word in a particular context by comparing the context with the exemplars (Ng and Lee, 1996). Recently, some kinds of learning techniques have been applied to cumulatively acquire exemplars form large corpora (Yarowsky, 1994, 1995). But ideal resources from which to learn exemplars are not generally available for any languages. Moreover, the effectiveness of this method on disambiguating words in large-scale corpora into fine-grained sense distinctions needs to be further investigated (Ng and Lee, 1996).</Paragraph> <Paragraph position="4"> * The work is supported by National Science Foundation of China. A common assumption held by both approaches is that neighboring words provide strong and consistent clues for the correct sense of a target word in some context. In this paper, we also hold the same assumption, but start from a different point. We see the senses of all words in a particular language as forming a space, which we call semantic space, for any word of the language, each of its senses is regarded as a point in the space. So the task of disambiguating a word in a particular context is to locate an appropriate point in the space based on the context.</Paragraph> <Paragraph position="5"> Now that word senses can be generally suggested by their distributional contexts, we model senses with their contexts. In this paper, we formalize the contexts as a kind of multidimensional real-valued vectors, so the semantic space can be seen as a vector space. The similar idea about representing contexts with vectors has been proposed by Schuetze (1993), but what his work focuses on is the contexts of words, while what we concern is the contexts of word senses. Furthermore, his formulation of contexts is based on word frequencies, while we formalize them with semantic codes given in a thesaurus and their salience with respect to senses.</Paragraph> <Paragraph position="6"> It seems that we should first have a large-scale sense-tagged corpus in order to build semantic space, but establishing such a corpus is obviously too timeconsuming. To simplify it, we only try to outline the semantic space by locating the mono-sense words in the space, rather than build it completely by spotting all word senses in the space.</Paragraph> <Paragraph position="7"> Now that we don't try to specify all word senses in the semantic space, for a word in a particular context, it may be the case that we cannot directly spot its correct sense in the space, because the space may not contain the sense at all. But we could locate some senses in the space which are similar with it according to their contexts, and based on their definitions given in a dictionary, we could make out the correct sense of the word in the context.</Paragraph> <Paragraph position="8"> In our implementation, we first build the semantic space based on the contexts of the mono-sense words, and structure the senses in the space as a dendrogram, which we call structured semantic space. Then we make use of an heuristic method to determine some nodes in the dendrogram which correspond with sets of similar senses, which we call sense clusters. Finally, given a target word in a particular context, some clusters in the dendrogram can be activated by the context, then we can make use of the definitions of the target word and the words ~ in the clusters to determine its correct sense in the context.</Paragraph> <Paragraph position="9"> The remainder of the paper is organized as follows: Section 2 defines the notion of semantic space, and discuss how to outline it by establishing the context vectors for mono-sense words. Section 3 examines the structure of the semantic space, and introduces algorithms to merge the senses into a dendrogram and specify the nodes in it which correspond with sets of similar senses. Section 4 discusses the disambiguation procedure based on the contexts. Section 5 describes some experiments and their results. Section 6 presents some conclusions and discusses the future work</Paragraph> </Section> class="xml-element"></Paper>