<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2605"> <Title>Using Selectional Profile Distance to Detect Verb Alternations</Title> <Section position="7" start_page="5" end_page="5" type="concl"> <SectionTitle> 6 Conclusions </SectionTitle> <Paragraph position="0"> We have proposed a new method for comparing WordNet probability distributions, which we call selectional profile distance (SPD). Given any pair of probability distributions over WordNet (each of which we call a selectional profile), SPD captures in a single measure the aggregate semantic distance of the component nodes, weighted by their probability. The method addresses conceptual problems of an earlier measure proposed by McCarthy (2000), which was limited to tree cut models (Li and Abe, 1998) and failed to distinguish detailed semantic differences between them. Our approach is more general, since it can work on the result of any model that populates WordNet with probability scores. Moreover, the integration of a WordNet distance measure into the formula enables it to take semantic distances directly into account and better capture meaningful distinctions between the distributions. We have shown that SPD yields practical advantages as well: it improves the detection of a verb alternation through comparison of the selectional profiles of potentially alternating slots. SPD achieves a best performance of 70% accuracy (baseline 50%) on unseen test verbs. No other measure we tested performed as consistently well: SPD achieved the best performance (alone or tied) in 9 of 12 development experiments, and the best or second best in all three test scenarios. By comparison, McCarthy (2000) attained 73% accuracy on her set of hand-selected test verbs in a similar task; however, when applied to our various sets of randomly selected verbs, our replication of her method performed very poorly, rarely rising above chance performance. 
We believe that the randomly selected verbs in our experiments may vary more widely than hand-selected verbs in whether and how much they alternate, and thus constitute a more difficult but more realistic scenario for testing the usefulness of these measures in practice.</Paragraph> <Paragraph position="1"> Interestingly, we found that separating verbs into low and high frequency bands improved performance, and our best performance of 70% in fact results from an average of SPD results on the individual frequency bands.</Paragraph> <Paragraph position="2"> Perhaps even more interesting is the underlying reason for this: causative verbs in the low frequency band show greater similarity (lower SPD scores) across the slots than those in the high frequency band. In ongoing work, we are extending our experiments to a larger corpus (the BNC), so that we can investigate a larger range and number of verbs, which will enable us to better elucidate the reasons for this interaction.</Paragraph> </Section> </Paper>