<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2119"> <Title>Word Sense Disambiguation using lexical cohesion in the context</Title> <Section position="4" start_page="929" end_page="929" type="metho"> <SectionTitle> 3 Selection of knowledge bases </SectionTitle> <Paragraph position="0"> WordNet (Fellbaum, 1998) provides a fine-grained enumerative semantic net that is commonly used in the SENSEVAL tasks to tag instances of English target words with their senses (WordNet synset numbers). WordNet groups related concepts into synsets and links them through IS-A and PART-OF links, emphasizing the vertical, largely paradigmatic interaction between concepts.</Paragraph> <Paragraph position="1"> Although WordNet captures the fine-grained paradigmatic relations of words, it neglects another typical word relationship, syntagmatic connectedness. The syntagmatic relationship, which often holds between words with different POS tags and occurs frequently in corpora and in human cognition, plays a critical part in cross-connecting words from different domains or POS tags.</Paragraph> <Paragraph position="2"> It should be noted that WordNet 2.0 makes some effort to interrelate nouns and verbs through their derived lexical forms, placing associated words under the same domain. Although some verbs have derived noun forms that can be mapped onto the noun taxonomy, this mapping relates only the morphological forms of verbs and still lacks syntagmatic links between words.</Paragraph> <Paragraph position="3"> The interrelationship of the noun and verb hierarchies is far from complete and is only a supplement to the primary IS-A and PART-OF taxonomies in WordNet. 
Moreover, as WordNet is generally concerned with paradigmatic relations (Fellbaum, 1998), we have to seek other lexical knowledge sources to compensate for the shortcomings of WordNet in WSD.</Paragraph> <Section position="1" start_page="929" end_page="929" type="sub_section"> <SectionTitle> The Edinburgh Association Thesaurus </SectionTitle> <Paragraph position="0"/> </Section> </Section> <Section position="5" start_page="929" end_page="930" type="metho"> <SectionTitle> (EAT) </SectionTitle> <Paragraph position="0"> EAT provides an associative network that accounts for word relationships in human cognition; it was built by collecting the first response words to a list of stimulus words (Kiss et al., 1973). Take the words eat and food as an example. There is no direct path between the concepts of these two words in the taxonomy of WordNet (as either noun or verb); the only connection is through the glosses of the first and third senses of eat, 'take in solid food' and 'take in food', and glosses are not regularly or carefully organized in WordNet. In EAT, however, eat is strongly associated with food: when eat was given as a stimulus word, 45 out of 100 subjects gave food as the first response.</Paragraph> <Paragraph position="1"> Yarowsky (1993) indicated that the objects of verbs play a more dominant role than their subjects in WSD, and that nouns acquire more stable disambiguating information from their noun or adjective modifiers.</Paragraph> <Paragraph position="2"> In verb association tests, it is also reported that more than half of the response words to verb stimuli are syntagmatically related (Fellbaum, 1998). In experiments examining the psychological plausibility of WordNet relationships, Chaffin et al. 
(1994) stated that only 30.4% of the responses to 75 verb stimuli were verbs, and that more than half of the responses were nouns, of which nearly 90% were categorized as arguments of the verbs.</Paragraph> <Paragraph position="3"> Sinopalnikova (2004) also reported that multiple relationships are found in word association thesauri, such as syntagmatic and paradigmatic relations and domain information.</Paragraph> <Paragraph position="4"> In this paper we use only the plain forms of context words, setting aside the effect of syntactic dependence on WSD. To enrich word linkage in WSD, we retrieve lexical knowledge from both WordNet and EAT. We first explore the function of the semantic hierarchies of WordNet in WSD, and then transform the context words with EAT to investigate whether other relationships can improve WSD.</Paragraph> </Section> <Section position="6" start_page="930" end_page="932" type="metho"> <SectionTitle> 4 System design </SectionTitle> <Paragraph position="0"> In order to find semantically related words that cohesively form lexical hubs, we first employ the two word similarity algorithms of Yang and Powers (2005; 2006), which use WordNet to compute noun similarity and verb similarity respectively. 
We next construct a lexical hub for each target sense, which assembles the similarity scores between the target and its context words.</Paragraph> <Paragraph position="1"> The maximum score among these lexical hubs predicts the real sense of the target, and also implicitly captures the cohesion and real meaning of the word in its context.</Paragraph> <Section position="1" start_page="930" end_page="930" type="sub_section"> <SectionTitle> 4.1 Similarity metrics on nouns </SectionTitle> <Paragraph position="0"> Yang and Powers (2005) designed a metric that utilizes both the IS-A and PART-OF taxonomies of WordNet to measure noun similarity, and they argued that the similarity of two nouns is the maximum of all their concept similarities. They defined the similarity (Sim) of two concepts (c1 and c2) in terms of a link type factor (α_t) to specify the weights of different link types (t) (syn/antonym, hyper/hyponym, and holo/meronym) in WordNet, a path type factor (β_t) to reduce the uniform distance of a single link, and a depth factor (l) to restrict the maximum search distance between concepts. Since their noun similarity metric is significantly better than some popular measures, and even outperforms some human subjects on a standard data set, we selected it as the noun similarity measure in our WSD task.</Paragraph> </Section> <Section position="2" start_page="930" end_page="930" type="sub_section"> <SectionTitle> 4.2 Similarity metrics on verbs </SectionTitle> <Paragraph position="0"> Yang and Powers (2006) also redesigned their noun model to accommodate the verb case, which is harder to deal with given the shallow and incomplete taxonomy of verbs in WordNet. 
As an enhancement specific to verb similarity, they also introduce fall-back factors: α_str is normally 1, but successively falls back to: * α_stm: the verb stem polysemy, ignoring sense and form * α_der: the cognate noun hierarchy of the verb * α_gls: the definition of the verb They also defined two alternative search protocols: rich hierarchy exploration (RHE) with no more than six links and shallow hierarchy exploration (SHE) with no more than two links. One minor improvement to the verb model in their system comes from comparing verbs and nouns by applying the noun model metric to the derived noun form of the verb. This allows us to compare nouns and verbs and removes the restriction that both words have the same POS tag.</Paragraph> </Section> <Section position="3" start_page="930" end_page="931" type="sub_section"> <SectionTitle> 4.3 Depth in WordNet </SectionTitle> <Paragraph position="0"> Yang and Powers fine-tuned the parameters of the noun and verb similarity models, finding them relatively insensitive to the precise values, and we have elected to use their recommended values for the WSD task. It is worth mentioning, however, that their optimal models were obtained on purely verbal data sets, i.e. the similarity scores are context-free.</Paragraph> <Paragraph position="1"> In their models, the depth in WordNet, i.e. the distance between the synsets of words (l), is in fact an external factor that limits the search scope in order to bound the cost of computation, and it depends on the application. If we tuned it on the SENSEVAL-2 training data, we would probably assign different values and might achieve better results. 
Note that for both nouns and verbs we employ RHE (rich hierarchy exploration) with l = 2, making full use of the taxonomy of WordNet and making no use of glosses.</Paragraph> </Section> <Section position="4" start_page="931" end_page="932" type="sub_section"> <SectionTitle> 4.4 How to set up the selection standard for the senses </SectionTitle> <Paragraph position="0"> Rather than maximizing WSD results, our main motive in this paper is to explore how much accuracy semantic relationships alone can achieve, and to establish the contribution of this single attribute to WSD, an approach encouraged by SENSEVAL in order to gain further benefits in this field (Kilgarriff and Palmer, 2000). We make no use of definitions, whose role was previously studied by Lesk (1986) and Pedersen et al. (2003): we screen off the definition factor in the verb similarity metric, with the intention of focusing on the taxonomies of WordNet. Assuming that the lexical hub for the right sense maximizes the cohesion with the other words in the discourse, we design six different strategies to calculate the lexical hub in its unordered contextual surroundings.</Paragraph> <Paragraph position="1"> We first put forward three metrics to measure the similarity between the senses of the target and the context word: * The maximized sense similarity</Paragraph> <Paragraph position="3"> where T denotes the target and T_k is the kth sense of the target; C_i is the ith context word in a fixed-size window around the target, and C_{i,j} is the jth sense of C_i. 
Note that T and C can be any noun or verb, and Sim is the corresponding metric of Yang and Powers.</Paragraph> <Paragraph position="5"> We can then define six distinct heuristics to score the lexical hub, as follows:</Paragraph> <Paragraph position="7"> Taking into account all of the links between the target and its context words, the correct sense of the target is:</Paragraph> <Paragraph position="9"> The most straightforward way to output the correct sense of the target in the discourse is to choose the sense with the maximum number of context words whose similarity scores with the target are larger than zero:</Paragraph> <Paragraph position="11"> Whatever the kinds of relation between the target and its context, the sense of the target that is related to the maximum number of senses of all its context words is scored as the right meaning:</Paragraph> <Paragraph position="13"> The lexical hub of each sense of the target therefore relies only on the interaction between the target and each of its context words, not on interactions among the context words. The implication is that the lexical hub disambiguates only the real sense of the target, not the real meanings of the context words; the maximum scores or link numbers (at the level of words or senses) in the six heuristics suggest that the correct sense of the target should cohere with as many words, or their senses, as practicable in the discourse.</Paragraph> <Paragraph position="14"> When similarity scores are tied, we output all of the tied word senses rather than guess. Some WSD systems in SENSEVAL handle tied scores by simply taking the first sense (in WordNet) of the target as the real sense. 
There is no doubt that the skewed distribution of word senses in corpora (the first sense is often the dominant sense) can benefit the performance of such systems, but at the same time it would obscure the contribution of the semantic hierarchy to WSD in our system.</Paragraph> </Section> </Section> </Paper>