<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0505"> <Title>Towards a Meaning-Full Comparison of Lexical Resources</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> The mapping from WordNet to Hector senses in Senseval provides a &quot;gold standard&quot; against which to judge our ability to compare lexical resources. The &quot;gold standard&quot; is provided through a word overlap analysis (with and without a stop list) for this mapping, achieving at most a 36 percent correct mapping (inflated by 9 percent from &quot;empty&quot; assignments). An alternative componential analysis of the definitions, using syntactic, collocational, and semantic component and relation identification (through the use of defining patterns integrated seamlessly into the parsing dictionary), provides an almost 41 percent correct mapping, with an additional 4 percent by recognizing semantic components not used in the Senseval mapping. Definition sets of the Senseval words from three published dictionaries and Dorr's lexical knowledge base were added to WordNet and the Hector database to examine the nature of the mapping process between definition sets of more and less scope. The techniques described here constitute only an initial implementation of the componential analysis approach and suggest that considerable further improvements can be achieved. Introduction The difficulty of comparing lexical resources, long a significant challenge in computational linguistics (Atkins, 1991), came to the fore in the recent Senseval competition (Kilgarriff, 1998), when some systems that relied heavily on the WordNet (Miller et al., 1990) sense inventory were faced with the necessity of using another sense inventory (Hector). A hasty solution to the problem was the development of a map between the two inventories, but some participants expressed concerns that use of this map may have degraded their performance to an unknown 
degree. Although there were disclaimers about the WordNet-Hector map, it nonetheless stands as a usable gold standard for efforts to compare lexical resources. Moreover, we have a usable baseline (a word overlap method suggested in (Lesk, 1986)) against which to compare whether we are able to make improvements in the mapping (since this method has been shown to perform not as well as expected (Krovetz, 1992)). We first describe the lexical resources used in the study (Hector, WordNet, other dictionaries, and a lexical knowledge base), first characterizing them in terms of polysemy and the types of lexical information each contains (syntactic properties and features, semantic components and relations, and collocational properties). We then present results of performing the word overlap analysis of the 18 verbs used in Senseval, analyzing the definitions in WordNet and Hector. We then expand our analysis to include other dictionaries. We describe our methods of analysis, particularly the methods of parsing definitions and identifying semantic relations (semrels) based on defining patterns, essentially taking first steps in implementing the program described by Atkins and focusing on the use of &quot;meaning&quot;-full information rather than statistical information. We identify the results that have been achieved thus far and outline further steps that may add more &quot;meaning&quot; to the analysis. 1 All analyses described in this paper were performed automatically using functionality incorporated in</Paragraph> </Section> </Paper>
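<!--
Illustrative note on the word overlap baseline mentioned in the abstract. The sketch below is one possible reading of a Lesk-style overlap mapping, run with and without a stop list; it is not the paper's implementation. The stop list, the sense identifiers, the toy definitions, the gold map, and all function names are assumptions invented for illustration; none of them come from WordNet, Hector, or Senseval.

```python
import re

# Hypothetical stop list; the paper does not say which list it used.
STOP_WORDS = frozenset(
    {"a", "an", "the", "of", "to", "in", "or", "and", "by", "for", "with"}
)


def content_words(definition, stop_words=frozenset()):
    """Lowercase a definition, split it into word tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", definition.lower())
    return {t for t in tokens if t not in stop_words}


def overlap(def_a, def_b, stop_words=frozenset()):
    """Count the distinct words shared by two definitions."""
    words_a = content_words(def_a, stop_words)
    words_b = content_words(def_b, stop_words)
    return len(words_a & words_b)


def map_senses(source, target, stop_words=frozenset()):
    """Map each source sense to the target sense with the largest overlap.

    A source sense that shares no word with any target sense is mapped
    to None, which plays the role of an "empty" assignment here.
    Ties are broken in favour of the first target sense encountered.
    """
    mapping = {}
    for s_id, s_def in source.items():
        best_id, best_score = None, 0
        for t_id, t_def in target.items():
            score = overlap(s_def, t_def, stop_words)
            if score > best_score:
                best_id, best_score = t_id, score
        mapping[s_id] = best_id
    return mapping


def percent_correct(mapping, gold):
    """Percentage of source senses whose mapping agrees with the gold map."""
    if not mapping:
        return 0.0
    hits = sum(1 for s_id, t_id in mapping.items() if gold.get(s_id) == t_id)
    return 100.0 * hits / len(mapping)


# Toy data, made up for this sketch (not real dictionary entries).
source = {
    "wn1": "to move quickly on foot",
    "wn2": "to operate or control a machine",
    "wn3": "a score in cricket",
}
target = {
    "h1": "move fast using the legs",
    "h2": "control the operation of a machine or business",
}
gold = {"wn1": "h1", "wn2": "h2", "wn3": None}

plain_mapping = map_senses(source, target)
stop_mapping = map_senses(source, target, STOP_WORDS)
print(plain_mapping, percent_correct(plain_mapping, gold))
print(stop_mapping, percent_correct(stop_mapping, gold))
```

In this toy data the sense "wn3" is attached to "h2" only through the shared function word "a" when no stop list is applied, and it becomes an empty assignment once the stop list is used. This is meant to show why the two variants can score differently and why empty assignments that match the gold map can raise the percentage.
-->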