<?xml version="1.0" standalone="yes"?> <Paper uid="W06-2504"> <Title>What's in a name? The automatic recognition of metonymical location names.</Title> <Section position="3" start_page="0" end_page="25" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> In the last few years, metonymy has emerged as an important focus of research in many areas of linguistics. In Cognitive Linguistics, it is often defined as &quot;a cognitive process in which one conceptual entity, the vehicle, provides mental access to another conceptual entity, the target, within the same domain, or idealized cognitive model (ICM)&quot; (Kövecses, 2002, p. 145). In example (1), for instance, China and Taiwan provide mental access to the governments of the respective countries: (1) China has always threatened to use force if Taiwan declared independence. (BNC) This paper is concerned with algorithms that automatically recognize such metonymical country names. These are extremely relevant in Natural Language Processing, since any system that automatically builds semantic representations of utterances needs to be able to recognize and interpret metonymical words.</Paragraph> <Paragraph position="1"> Early approaches to metonymy recognition, such as Pustejovsky's (1995), identified a word as metonymical when it violated certain selectional restrictions. Indeed, in example (1), China and Taiwan both violate the restriction that threaten and declare require an animate subject, and thus have to be interpreted metonymically. This view is present in the psycholinguistic literature, too. Some authors argue that a figurative interpretation of a word typically comes about when all literal interpretations fail; see Gibbs (1994) for an overview. This failure is often due to the violation of selectional restrictions.</Paragraph> <Paragraph position="2"> However, in psycholinguistics as well as in computational linguistics, this approach has lost much of its appeal. 
It has become clear to researchers in both fields that many metonymies do not violate any restrictions at all. In &quot;to like Shakespeare&quot;, for instance, there is no explicit linguistic trigger for the metonymical interpretation of Shakespeare. Rather, it is our world knowledge that pre-empts a literal reading of the author's name. Examples like this one demonstrate that metonymy recognition should not be based on rigid rules, but rather, on information about the semantic class of the target word and the semantic and grammatical context in which it occurs. In psycholinguistics, this insight (among others) has given rise to theories claiming that a figurative interpretation does not follow the failure of a literal one, but that both processes occur in parallel (Frisson and Pickering, 1999). In computational linguistics, it has led to the development of statistical, corpus-based approaches to metonymy recognition. This view was first put into computational practice by Markert and Nissim (2002a). Their key to success was the realization that metonymy recognition is a sub-problem of Word Sense Disambiguation (WSD). They found that most metonymies in the same semantic class belong to one of a limited number of metonymical patterns that can be defined a priori. The task of metonymy recognition thus consists of the automatic assignment of one of these readings to a target word. Since all words in the same semantic class may undergo the same semantic shifts, there only has to be one classifier per class (and not per word, as in classic WSD).</Paragraph> <Paragraph position="3"> In this paper I will be concerned with the automatic identification of metonymical location names. 
More particularly, I will test two new approaches to metonymy recognition on the basis of Markert and Nissim's (2002b) corpora of 1,000 mixed country names and 1,000 instances of the country name Hungary. The most important metonymical patterns in these corpora are place-for-people, place-for-event and place-for-product. In addition, there is a label mixed for examples that have two readings, and othermet for examples that do not belong to any of the pre-defined metonymical patterns.</Paragraph> <Paragraph position="4"> On the mixed country data, Nissim and Markert's (2003) classifiers achieved an accuracy of 87%. This was the result of a combination of both grammatical and semantic information. Their grammatical information included the grammatical function of a target word and its head. The semantic information, in the form of Dekang Lin's (1998) thesaurus of semantically similar words, allowed the classifier to search the training set for instances whose head was similar, and not just identical, to that of a test instance.</Paragraph> <Paragraph position="5"> Markert and Nissim's (2002a) and Nissim and Markert's (2003) studies are the only ones to approach metonymy recognition from a data-driven, statistical perspective. However, their approach also has a number of disadvantages. First, it requires the annotation of a large number of training and test instances. This compromises its possible application to a wide variety of metonymical patterns across a large number of semantic categories. Second, its algorithms are rather complex. In the training phase, they calculate smoothed probabilities on the basis of a large annotated training corpus and in the test phase, they iteratively search through a thesaurus of semantically similar words. This raises the question of whether this complexity is indeed necessary in metonymy recognition.</Paragraph> <Paragraph position="6"> This paper investigates two approaches that each tackle one of these problems. 
The unsupervised algorithm in section 2 has the intuitive appeal of not requiring any annotated training instances. I will show that it is nevertheless often able to distinguish between two data clusters that correlate with the two target readings. In section 3, I will again have recourse to a supervised learning method, but one that explicitly incorporates a much simpler learning phase than its competitors in the literature: Memory-Based Learning. I will demonstrate that this 'lazy learning' algorithm gives state-of-the-art results in metonymy recognition. Moreover, although their psychological validity is not a focus of the present investigation, the two algorithms studied have clear links to models of human behaviour.</Paragraph> </Section></Paper>