<?xml version="1.0" standalone="yes"?> <Paper uid="E06-3009"> <Title>Example-Based Metonymy Recognition for Proper Nouns</Title> <Section position="2" start_page="0" end_page="71" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Metonymy is a figure of speech that uses &quot;one entity to refer to another that is related to it&quot; (Lakoff and Johnson, 1980, p.35). In example (1), for instance, China and Taiwan stand for the governments of the respective countries: (1) China has always threatened to use force if Taiwan declared independence. (BNC) Metonymy resolution is the task of automatically recognizing these words and determining their referent. It is therefore generally split up into two phases: metonymy recognition and metonymy interpretation (Fass, 1997).</Paragraph> <Paragraph position="1"> The earliest approaches to metonymy recognition identify a word as metonymical when it violates selectional restrictions (Pustejovsky, 1995). Indeed, in example (1), China and Taiwan both violate the restriction that threaten and declare require an animate subject, and thus have to be interpreted metonymically. However, it is clear that many metonymies escape this characterization. Nixon in example (2) does not violate the selectional restrictions of the verb to bomb, and yet, it metonymically refers to the army under Nixon's command.</Paragraph> <Paragraph position="2"> (2) Nixon bombed Hanoi.</Paragraph> <Paragraph position="3"> This example shows that metonymy recognition should not be based on rigid rules, but rather on statistical information about the semantic and grammatical context in which the target word occurs. This statistical dependency between the reading of a word and its grammatical and semantic context was investigated by Markert and Nissim (2002a) and Nissim and Markert (2003; 2005). The key to their approach was the insight that metonymy recognition is basically a subproblem of Word Sense Disambiguation (WSD). 
Possibly metonymical words are polysemous, and they generally belong to one of a number of predefined metonymical categories. Hence, like WSD, metonymy recognition boils down to the automatic assignment of a sense label to a polysemous word. This insight thus implied that all machine learning approaches to WSD can also be applied to metonymy recognition.</Paragraph> <Paragraph position="4"> There are, however, two differences between metonymy recognition and WSD. First, theoretically speaking, the set of possible readings of a metonymical word is open-ended (Nunberg, 1978). In practice, however, metonymies tend to stick to a small number of patterns, and their labels can thus be defined a priori. Second, classic WSD algorithms take training instances of one particular word as their input and then disambiguate test instances of the same word. By contrast, since all words of the same semantic class may undergo the same metonymical shifts, metonymy recognition systems can be built for an entire semantic class instead of one particular word (Markert and Nissim, 2002a).</Paragraph> <Paragraph position="5"> To this end, Markert and Nissim extracted from the BNC a corpus of possibly metonymical words from two categories: country names (Markert and Nissim, 2002b) and organization names (Nissim and Markert, 2005). All these words were annotated with a semantic label -- either literal or the metonymical category they belonged to. For the country names, Markert and Nissim distinguished between place-for-people, place-for-event and place-for-product. For the organization names, the most frequent metonymies are organization-for-members and organization-for-product. In addition, Markert and Nissim used a label mixed for examples that had two readings, and othermet for examples that did not belong to any of the pre-defined metonymical patterns.</Paragraph> <Paragraph position="6"> For both categories, the results were promising. 
The best algorithms returned an accuracy of 87% for the countries and of 76% for the organizations. Grammatical features, which gave the function of a possibly metonymical word and its head, proved indispensable for the accurate recognition of metonymies, but led to extremely low recall values, due to data sparseness. Therefore, Nissim and Markert (2003) developed an algorithm that also relied on semantic information, and tested it on the mixed country data. This algorithm used Dekang Lin's (1998) thesaurus of semantically similar words in order to search the training data for instances whose head was similar, and not just identical, to the test instances. Nissim and Markert (2003) showed that a combination of semantic and grammatical information gave the most promising results (87%).</Paragraph> <Paragraph position="7"> However, Nissim and Markert's (2003) approach has two major disadvantages. The first of these is its complexity: the best-performing algorithm requires smoothing, backing-off to grammatical roles, iterative searches through clusters of semantically similar words, etc. In section 2, I will therefore investigate whether a metonymy recognition algorithm needs to be that computationally demanding. In particular, I will try to replicate Nissim and Markert's results with the 'lazy' algorithm of Memory-Based Learning.</Paragraph> <Paragraph position="8"> The second disadvantage of Nissim and Markert's (2003) algorithms is their supervised nature. Because they rely so heavily on the manual annotation of training and test data, an extension of the classifiers to more metonymical patterns is extremely problematic. Yet, such an extension is essential for many tasks throughout the field of Natural Language Processing, particularly Machine Translation. This knowledge acquisition bottleneck is a well-known problem in NLP, and many approaches have been developed to address it. 
One of these is active learning, or sample selection, a strategy that makes it possible to selectively annotate those examples that are most helpful to the classifier. It has previously been applied to NLP tasks such as parsing (Hwa, 2002; Osborne and Baldridge, 2004) and Word Sense Disambiguation (Fujii et al., 1998). In section 3, I will introduce active learning into the field of metonymy recognition.</Paragraph> </Section></Paper>