<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2411">
  <Title>Calculating Semantic Distance between Word Sense Probability Distributions</Title>
  <Section position="7" start_page="3" end_page="3" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have proposed a new method for comparing WordNet probability distributions, which we call sense profile distance (SPD). Given any pair of probability distributions over WordNet (each of which we call a sense profile), SPD captures in a single measure the aggregate semantic distance between their component nodes, weighted by their probability.</Paragraph>
    <Paragraph position="1"> The method addresses conceptual problems of an earlier measure proposed by McCarthy (2000), which was limited to tree cut models (Li and Abe, 1998) and failed to distinguish detailed semantic differences between them.</Paragraph>
    <Paragraph position="2"> Our approach is more general, since it can operate on the output of any model that populates WordNet with probability scores. Moreover, the integration of a WordNet distance measure into the formula enables it to take semantic distances directly into account and better capture meaningful distinctions between the distributions.</Paragraph>
    <Paragraph position="3"> We have shown that SPD yields practical advantages as well: it improves the detection of a verb alternation through comparison of the sense profiles of potentially alternating slots. SPD achieves a best performance of 70% accuracy (baseline 50%) on unseen test verbs, and no other measure we tested performed as consistently well. By comparison, McCarthy (2000) attained 73% accuracy on her set of hand-selected test verbs in a similar task; however, when applied to our randomly selected verbs, our replication of her method achieved an overall performance of 67%, and performed very poorly on low frequency verbs.</Paragraph>
    <Paragraph position="4"> In our on-going work, we are exploring other applications of SPD, such as assessing document collection similarity, in which such an aggregate semantic distance measure has the potential to reveal meaningful distinctions. In this type of task, sense profiles over other, more domain-specific, ontologies may prove to be useful. In our presentation here, we have described SPD as a measure over sense profiles in WordNet, but clearly the method is general enough to apply to any hierarchical ontology. Indeed, a sense profile (a set of scores over the hierarchy) need not even form a probability distribution.</Paragraph>
    <Paragraph position="5"> The only requirements for the method are that a meaningful distance measure be definable over nodes in the hierarchy, and that any two profiles being compared have scores that sum to the same total (the latter being trivially true for probability distributions, which sum to 1).</Paragraph>
    <Paragraph position="6"> Footnote 6: Another method is to use some type of "expected distance" as a normalizing factor (Paola Merlo, p.c.). However, it is as yet unclear how we would calculate this number.</Paragraph>
  </Section>
</Paper>