<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2113">
  <Title>WORD SENSE AMBIGUATION: CLUSTERING RELATED SENSES</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> The problem of word sense disambiguation is one which has received increased attention in recent work on Natural Language Processing (NLP) and Information Retrieval (IR). Given an occurrence of a polysemous word in running text, the task as it is generally formulated involves examining a set of senses, defined by a machine-readable dictionary (MRD) or hand-constructed lexicon, and examining contextual cues to discover which of these is the intended one. This paper considers a problem with the standard approach to handling polysemy, arguing that in many cases this kind of "forced-choice" approach to disambiguation leads to arbitrary decisions which have negative consequences for NLP systems. In particular, we show that a great deal of potentially useful information about a word's meaning may be missed if the task involves isolating a single "correct" sense. We describe an approach to the construction of an MRD-derived lexical database that helps overcome some of these difficulties.</Paragraph>
    <Paragraph position="1"> We begin by reviewing two difficulties with the standard approach, then go on to suggest our own approach to solving these difficulties in creating a large MRD-derived lexical database. Our method might be termed "ambiguation", because it involves blurring the boundaries between closely related word senses.</Paragraph>
    <Paragraph position="2"> After describing the algorithm which accomplishes this task, we go on to briefly discuss its results.</Paragraph>
    <Paragraph position="3"> Finally, we describe the implications this work has for the task of merging multiple.</Paragraph>
    <Paragraph position="4"> The arbitrariness of sense divisions. The division of word meanings into distinct dictionary senses and entries is frequently arbitrary (Atkins and Levin, 1988; Atkins, 1991), as a comparison of any two dictionaries quickly makes clear. For example, consider the verb "mo(u)lt", whose single sense in the American Heritage Dictionary, Third Edition (AHD3) corresponds to two senses in Longman's Dictionary of Contemporary English (LDOCE):</Paragraph>
    <Paragraph position="5"> AHD3: "... part or all of a coat or ... covering, such as feathers, cuticle or skin"</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="712" type="metho">
    <SectionTitle>
LDOCE
</SectionTitle>
    <Paragraph position="0"> (1) "(of a bird) to lose or throw off (feathers) at the season when new feathers grow" (2) "(of an animal, esp. a dog or cat) to lose or throw off (hair or fur)" The arbitrary nature of such divisions is compounded by the fact that dictionaries typically provide no information about how the different senses of a polysemous headword might be related. Examination of dictionary entries shows that these interrelationships are often highly complex, encompassing senses which differ only in some slight shade of meaning, those which are historically but not synchronically related, those which are linked through some more or less opaque process of metaphor or metonymy, and finally, those which appear to be completely unrelated.</Paragraph>
    <Paragraph position="1"> A typical case is the entry for the noun "crank", which includes one "apparatus" sense (1) and two "person" senses (2) and (3). Nothing in this entry indicates that (2) and (3) are more closely related to one another than either is to (1).</Paragraph>
    <Paragraph position="2"> 1 I would like to extend my thanks to Robert Dale and Lisa Braden-Harder, as well as the members of the Microsoft NLP group: George Heidorn, Karen Jensen, Joseph Pentheroudakis, Diana Peterson, Steve Richardson, and Lucy Vanderwende.</Paragraph>
    <Paragraph position="3"> (1) "an apparatus for changing movement in a straight line into circular movement..." (2) "a person with strange, odd, or peculiar ideas" (3) "a nasty bad-tempered person" Using MRDs for Sense Disambiguation. Atkins (1991) argues that dictionary-derived lexical databases will be capable of supporting high-quality NLP only if they contain highly detailed taxonomic descriptions of the interrelationships among word senses. These relationships are often systematic (see Atkins, 1991), and it is possible to imagine strategies for automatically or at least semi-automatically identifying them. One such proposal is due to Chodorow (1990), who notes 10 recurring types of inter-sense relationships in Webster's 7th, including PROCESS/RESULT, FOOD/PLANT, and CONTAINER/VOLUME, and suggests that some instances of these relationships might be automatically identified. Ideally, such a strategy might allow the automated construction of lexical databases which explicitly characterize how individual senses of a headword are related, with these interrelationships described by a fixed, general set of semantic associations which hold between words throughout the lexicon.</Paragraph>
    <Paragraph position="4"> In practice, however, attempts to automatically identify systematic polysemy in MRDs will capture only a small subset of the cases in which word senses overlap semantically. Often, distinctions among a word's senses are so fine or so idiosyncratic that they simply cannot be characterized in a general way. For instance, while the two LDOCE senses of "moult" are closely related, the fine distinction they reflect between "bird" and "animal" behavior is not one which recurs systematically throughout the English lexicon.</Paragraph>
    <Paragraph position="5"> In short, the task of identifying and attaching a meaningful label to each of the links among related word senses in a large lexical database is a daunting one, and one that will ultimately require a great deal of hand-coding. Perhaps for these reasons, we know of no large-scale attempts to automatically create labeled links among senses of polysemous words.</Paragraph>
    <Paragraph position="6"> Moreover, it is not clear that attaching a meaningful label to the relationship between two semantically related senses of a word will necessarily aid in performing NLP tasks. Krovetz and Croft (1992) suggest just the opposite, claiming that in many cases, dictionary entries for polysemous words encode fine-grain semantic distinctions that are unlikely to be of practical value for specific applications. Our experience suggests a similar conclusion. Consider, for instance, the following pair of senses for the noun "stalk": (1) "the main upright part of a plant (not a tree)" (Ex: a beanstalk) (2) "a long narrow part of a plant supporting one or more leaves, fruits, or flowers; stem" The differences between these two senses are subtle enough that for many tasks, including sense disambiguation in running text, the two are likely to be indistinguishable from one another. In a sentence like "The stalks remained in the farmer's field long after summer", for instance, the choice of some particular sense of "stalk" as the "correct" one will be essentially arbitrary.</Paragraph>
    <Paragraph position="7"> Sense Disambiguation versus Information Loss. Sense disambiguation algorithms are frequently faced with multiple "correct" choices, a situation which increases their odds of choosing a reasonable sense, but which also has hidden negative consequences for semantic processing. First of all, the task of discriminating between two or more extremely similar senses can waste processing resources while providing no obvious benefit.</Paragraph>
    <Paragraph position="8"> However, there are more problematic effects of combining a lexicon which makes unnecessarily fine distinctions between word senses with a disambiguation algorithm which sets up the artificial task of choosing a single "correct" sense for a word. The problem is that this strategy means that the amount of semantic information retrieved for a word will always be limited to just that which is available in some individual sense, and valuable background information about a word's meaning may be ignored.</Paragraph>
    <Paragraph position="9"> In the case of "stalk", for instance, choosing the first sense will mean losing the fact that "stalks" are "stems", that they are "narrow", and that they "support leaves, fruits, or flowers". Choosing the second sense, on the other hand, will mean losing the fact that stalks are upright, that one example of a stalk is a "beanstalk", and that the main upright part of a "tree" cannot be called a "stalk". Human dictionary users never encounter this problem. The reason is that instead of treating the entry for a word like "stalk" as a pair of entirely discrete senses, a human looking this word up would typically arrive at a more abstract notion of its meaning, one which encompasses information from both senses. How can we reformulate the problem of sense disambiguation in a computational context so that semantic processing can do a better job of mimicking the human user? Our solution involves encoding in our LDOCE-derived lexical database information about how a word's senses overlap semantically.</Paragraph>
  </Section>
  <Section position="5" start_page="712" end_page="713" type="metho">
    <SectionTitle>
2. Identifying Semantically Similar Senses
</SectionTitle>
    <Paragraph position="0"> The remainder of the paper describes a heuristic-based algorithm which automatically determines which senses of a given LDOCE headword are closely related to one another vs. those which appear to represent fundamentally different senses of the word. While no attempt is made to explicitly identify the nature of these links, our program has the advantage of generality: no hand-coding is required, and the techniques we describe can thus be applied (with some modification) to on-line dictionaries other than LDOCE. This work has an important effect on the formulation of the sense disambiguation task: by encoding information of this kind in our LDOCE-derived lexical database, we can now permit the sense disambiguation component of our system to return a merged representation of the semantic information contained in multiple senses of a word like "stalk". Making available more background information about a word's meaning increases the likelihood of correctly interpreting sentences which contain this word.</Paragraph>
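    <Paragraph> As a rough illustration of what returning a merged representation could look like, the following Python sketch unions the attribute values of two clustered senses. The record layout, the attribute names and the `merge_senses` helper are assumptions made for this example; they are not the representation used in the LDOCE-derived database.

```python
from collections import defaultdict

# Hypothetical attribute/value records for the two LDOCE senses of "stalk".
# The attribute names are illustrative, not the system's actual inventory.
STALK_1 = {"Hypernym": {"part"}, "PartOf": {"plant"}, "Property": {"main", "upright"}}
STALK_2 = {"Hypernym": {"part", "stem"}, "PartOf": {"plant"}, "Property": {"long", "narrow"}}


def merge_senses(*senses):
    """Union the values of every attribute across a cluster of related senses."""
    merged = defaultdict(set)
    for sense in senses:
        for attribute, values in sense.items():
            merged[attribute] |= values
    return dict(merged)


merged_stalk = merge_senses(STALK_1, STALK_2)
```

Under these assumptions the merged record keeps both "upright" and "narrow", so neither sense's background information is discarded when the two senses fall in the same cluster.</Paragraph>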
    <Paragraph position="1"> Our method involves performing an exhaustive set of pairwise comparisons of the different senses of a polysemous word with one another, with the aim of discovering which pairs show a higher degree of semantic similarity. Comparisons are not limited by part of speech; for example, noun and verb senses are compared to one another. A variety of types of information about a sense's meaning are exploited by this comparison step, including:</Paragraph>
  </Section>
  <Section position="6" start_page="713" end_page="713" type="metho">
    <SectionTitle>
* LDOCE Syntactic Subcategorization Codes
* LDOCE Boxcodes
</SectionTitle>
    <Paragraph position="0"> The program uses a taxonomic classification of these codes based on Bruce and Guthrie (1992) to allow partial matches between senses with non-identical but related Boxcodes. In addition, certain Boxcode specifications (e.g., [plant]) match against sets of keywords in definition strings (e.g., {plant,</Paragraph>
    <Paragraph position="2"> A taxonomic classification of the 124 Domain codes like that in Slator (1988) is used to identify cases in which two senses have similar but non-identical codes. As with the Boxcodes, certain Domain specifications (e.g., [BB, "baseball"]) match against sets of keywords in definition strings (e.g., {baseball, ball, sports}).</Paragraph>
    <Paragraph position="3"> * Features Abstracted from LDOCE Definitions: A number of binary features, including [locative] and [human], have been automatically assigned to LDOCE senses, based on syntactic and lexical properties of their definitions. Matches between these features increase the likelihood that two senses are semantically related.</Paragraph>
  </Section>
  <Section position="7" start_page="713" end_page="715" type="metho">
    <SectionTitle>
* Semantic Relations
</SectionTitle>
    <Paragraph position="0"> The most important source of evidence about the interrelationships among senses has been automatically derived from LDOCE definition sentences. The program consults a lexical database which contains approximately 150,000 semantic associations between word senses, the result of automatically parsing the definition text of each noun and verb sense in LDOCE and then applying a set of heuristic rules which automatically attempt to identify any systematic semantic relationships holding between a headword and the (base forms of) words used to define it (Jensen &amp; Binot, 1987; Montemagni and Vanderwende, 1992).</Paragraph>
    <Paragraph position="1"> Approximately 25 types of semantic relations are currently identified, including Hypernym (genus term), Location, Manner, Purpose, Has Part, TypicalSubject, and Possessor.</Paragraph>
    <Paragraph position="2"> Finally, each of these links is automatically sense-disambiguated.</Paragraph>
    <Paragraph position="3"> The resulting associations are modeled as labeled edges in a directed cyclic graph whose nodes correspond to individual word senses (Dolan et al., 1993; Pentheroudakis and Vanderwende, 1993).</Paragraph>
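    <Paragraph> The data structure described above can be pictured with a small Python sketch: sense nodes joined by labeled, directed edges, with cycles allowed. The `SenseGraph` class, the sense identifiers such as `crawl_v1`, and the sample edges are hypothetical and serve only to illustrate the idea; the actual database layout is not specified here.

```python
from collections import defaultdict


class SenseGraph:
    """Directed graph whose nodes are word senses and whose edges carry relation labels."""

    def __init__(self):
        # source sense -> relation label -> set of target senses
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        self._edges[source][relation].add(target)

    def values(self, source, relation):
        """Return a copy of the senses reached from `source` by edges labeled `relation`."""
        return set(self._edges.get(source, {}).get(relation, set()))


graph = SenseGraph()
# Sense identifiers below are invented for the example.
graph.add("crawl_v1", "Manner", "slowly_adv1")
graph.add("crawl_v2", "Manner", "slowly_adv1")
graph.add("insect_n1", "Hypernym", "creature_n1")
graph.add("creature_n1", "Hypernym", "animal_n1")
graph.add("animal_n1", "Hypernym", "creature_n1")  # cycles are permitted

shared_manner = graph.values("crawl_v1", "Manner").intersection(
    graph.values("crawl_v2", "Manner")
)
```

Keeping the relation label as a key makes it cheap to ask for all values of one attribute of a sense, which is the lookup that sense-to-sense comparison relies on.</Paragraph>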
    <Paragraph position="4"> Matching two senses involves comparing any values which have been identified for each of the semantic relation types. One of the most important comparisons is of Hypernyms, which have been identified for the vast majority of noun and verb senses. An exact Hypernym match generally signals a close semantic relationship between two senses, as in the following senses of the noun "cat": (1) "... with soft fur and sharp teeth and claws (nails), often kept as a pet ..." (2) "... related to this, such as the lion or tiger ..." Comparisons are not limited to Hypernyms, of course: in comparing two senses, the program attempts to identify shared values for each of the different semantic attributes present in a word's lexical representation. For instance, in each of the following verb senses of "crawl", the word "slowly" has been automatically identified as the value of a Manner attribute.</Paragraph>
    <Paragraph position="5"> (1) "to move slowly with the body close to the ground or floor or on the hands and knees" (2) "to go very slowly" Each time an identical value is found for a given semantic attribute, the algorithm increments the correlation score for that pair of senses. If no exact match is found, the program checks whether the values for this attribute in the two senses have a hypernym or hyponym in common. The following senses of the noun "insect", for example, are linked through the Hypernyms "creature" and "animal": (1) "a small creature with no bones and a hard outer covering..." (2) "a very small animal that creeps along the ground, such as a spider or worm" According to the network implicit in LDOCE, "creature" is a hyponym of "animal", while "animal" is a hyponym of "creature". (For discussion of this kind of circularity in dictionary definitions, see Calzolari, 1977.) In addition to such straightforward comparisons, a number of "scrambled" comparisons are attempted. For instance, any value for the IngredientOf attribute is automatically compared to the Hypernym value(s) for each of the other senses. This comparison reflects the fact that many nouns are both the name for a substance and for something which is made from that substance. An example of this is the noun "coffee": in one sense, "coffee" is an IngredientOf a "drink", while in another sense the noun "drink" has been identified as its Hypernym.</Paragraph>
    <Paragraph position="6"> (1) "a brown powder made by crushing coffee beans, used for making drinks" (2) "(a cupful of) a hot brown drink made by adding hot water and/or milk to this powder" 3. Discussion and Evaluation. The sense clustering program was run over the set of 33,000 single word noun definitions and 12,000 single word verb definitions in LDOCE (45,000 total) in a process that took approximately 20 hours on a 486/50 PC. Given a set of senses for a polysemous word such as "crank", the result of the exhaustive pairwise comparisons performed by the program is a (symmetrical) matrix of correlation scores: Since our comparisons are heuristic in nature, the relative rankings of the pairwise comparisons for a polysemous word's senses are the relevant measure of semantic similarity, rather than any absolute threshold. In the case of "crank", Clustering has correctly indicated a high correlation between the two "human" senses of the noun "crank", and a high correlation between the two verbal subsenses and the "apparatus" noun sense. Moreover, the two "human" noun senses are not semantically correlated with any of the three "apparatus" senses.</Paragraph>
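    <Paragraph> The matrix construction and the reliance on relative rankings described above can be sketched in Python as follows. The scores are invented placeholders, not the values the program produced for "crank", and the `score_pair` argument stands in for the heuristic comparison; the sense identifiers are likewise hypothetical.

```python
from itertools import combinations


def correlation_matrix(senses, score_pair):
    """Compare every unordered pair of senses once and mirror the score."""
    matrix = {a: {b: 0 for b in senses if b != a} for a in senses}
    for a, b in combinations(senses, 2):
        score = score_pair(a, b)
        matrix[a][b] = score
        matrix[b][a] = score
    return matrix


def ranked_pairs(matrix):
    """List each pair once, from most to least correlated."""
    pairs = {frozenset((a, b)): s for a, row in matrix.items() for b, s in row.items()}
    return sorted(pairs.items(), key=lambda item: item[1], reverse=True)


# Invented scores for three noun senses of "crank"; not taken from the paper.
TOY_SCORES = {
    frozenset(("crank_n1", "crank_n2")): -2,
    frozenset(("crank_n1", "crank_n3")): -2,
    frozenset(("crank_n2", "crank_n3")): 14,
}
senses = ["crank_n1", "crank_n2", "crank_n3"]
matrix = correlation_matrix(senses, lambda a, b: TOY_SCORES[frozenset((a, b))])
best_pair, best_score = ranked_pairs(matrix)[0]
```

Sorting the pairs rather than applying a cutoff mirrors the point that only the ordering of scores within one headword is meaningful.</Paragraph>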
    <Paragraph position="7"> Negative scores are also common, reflecting certain kinds of incompatibilities between senses (e.g., one sense is [+animate] while the other is [-animate]). As a rule of thumb, however, it is much easier to identify commonalities between senses than to identify definite mismatches.</Paragraph>
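    <Paragraph> The way positive and negative evidence might combine into a single correlation score can be sketched as follows. The weights, the `HYPERNYMS` table, the record layout and the simplified `related` test (one value is a listed hypernym of the other, or the two share a hypernym) are all assumptions chosen for illustration; the actual weights and rules are not given in the text.

```python
# Toy hypernym table echoing the "creature"/"animal" circularity in LDOCE.
HYPERNYMS = {"creature": {"animal"}, "animal": {"creature"}, "spider": {"animal"}}

EXACT, RELATED, CLASH = 2, 1, -3  # hypothetical weights


def related(x, y):
    """True if one value is a listed hypernym of the other or they share a hypernym."""
    hx, hy = HYPERNYMS.get(x, set()), HYPERNYMS.get(y, set())
    return x in hy or y in hx or not hx.isdisjoint(hy)


def score_pair(sense_a, sense_b):
    """Sum evidence for and against a semantic link between two senses."""
    score = 0
    shared_attributes = set(sense_a["relations"]).intersection(sense_b["relations"])
    for attribute in shared_attributes:
        values_a = sense_a["relations"][attribute]
        values_b = sense_b["relations"][attribute]
        exact = values_a.intersection(values_b)
        if exact:
            score += EXACT * len(exact)
        elif any(related(x, y) for x in values_a for y in values_b):
            score += RELATED
    # Opposite values for the same binary feature count against the pair.
    for feature, value in sense_a["features"].items():
        if sense_b["features"].get(feature, value) != value:
            score += CLASH
    return score


insect_1 = {"relations": {"Hypernym": {"creature"}}, "features": {"animate": True}}
insect_2 = {"relations": {"Hypernym": {"animal"}}, "features": {"animate": True}}
crank_1 = {"relations": {"Hypernym": {"apparatus"}}, "features": {"animate": False}}
crank_2 = {"relations": {"Hypernym": {"person"}}, "features": {"animate": True}}

insect_score = score_pair(insect_1, insect_2)  # related Hypernyms, no clash
crank_score = score_pair(crank_1, crank_2)     # unrelated Hypernyms, clashing feature
```

In this sketch a feature mismatch is penalized only when both senses carry the feature, which reflects the observation that definite mismatches are harder to establish than commonalities.</Paragraph>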
    <Section position="1" start_page="714" end_page="714" type="sub_section">
      <SectionTitle>
Zero Derivation
</SectionTitle>
      <Paragraph position="0"> One of the most useful products of clustering is the identification of many cases of zero-derived noun/verb pairs. For instance, the comparison of the various senses of the word "cook" shows the verb sense "to prepare (food) for eating..." to be highly correlated with the noun sense "a person who prepares and cooks food". This kind of cross-classification, which dictionaries generally fail to provide, has interesting implications for normalizing the semantics of superficially very different sentences. For example, a concept which is expressed verbally in one sentence can now be related to the same general concept expressed nominally in another, even if LDOCE does not explicitly link the definitions for the two parts of speech. (Pentheroudakis &amp; Vanderwende (1993) describe a general approach to identifying semantic links among morphologically-related words.)</Paragraph>
    </Section>
    <Section position="2" start_page="714" end_page="715" type="sub_section">
      <SectionTitle>
Metaphor
</SectionTitle>
      <Paragraph position="0"> Interestingly, the fact that many conventional metaphors are lexicalized in dictionary definitions can lead to difficulties with our strategy of comparing different definitions to one another. Consider the following senses of the noun "mouth": (1) "the opening on the face through which an animal or human being may take food..." (2) "an opening, entrance, or way out ..." (Ex: "mouth of a cave") In considering these two senses, Clustering returned a correlation score of 26, suggesting a reasonably close semantic relationship between them. From one perspective this is simply wrong: a human or animal "mouth" is fundamentally different from a cave "mouth", and we would like our MRD-derived lexicon to indicate this fact. Once the obvious metaphorical association between these two senses of "mouth" is noted, however, the reason for the clustering program's result becomes clear: both senses are defined as kinds of "openings". The case for treating the two senses as semantically similar is strengthened by other evidence: one sense of "entrance" (which is the Hypernym of the second sense) has "opening" as its own Hypernym: "a gate, door, or other opening by which one enters".</Paragraph>
      <Paragraph position="1"> Such metaphorical associations between word senses add a considerable degree of complexity to disambiguation and other kinds of reasoning processes that operate by identifying semantic relationships between different words. More work aimed at identifying the systematic nature of such relationships will be required before metaphor-based confusions of the kind described above can be automatically resolved.</Paragraph>
    </Section>
  </Section>
</Paper>