File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/06/w06-2505_intro.xml
Size: 4,968 bytes
Last Modified: 2025-10-06 14:04:04
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-2505"> <Title>Multilingual versus Monolingual WSD</Title> <Section position="2" start_page="0" end_page="33" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Word Sense Disambiguation (WSD) is concerned with the choice of the most appropriate sense of an ambiguous word given its context.</Paragraph> <Paragraph position="1"> The applications for which WSD has been thought to be helpful include Information Retrieval, Information Extraction, and Machine Translation (MT) (Ide and Veronis, 1998). The usefulness of WSD for MT, in particular, has recently been the subject of debate, with conflicting results. Vickrey et al. (2005), e.g., show that the inclusion of a WSD module significantly improves the performance of their statistical MT system. Conversely, Carpuat and Wu (2005) found that WSD does not yield significantly better translation quality than a statistical MT system alone. In this latter work, however, the WSD module was not specifically designed for MT: it is based on the use of monolingual methods to identify the source language senses, which are then mapped into the target language translations. In fact, although it has been agreed that WSD is more useful when it is meant for a specific application (Wilks and Stevenson, 1998; Kilgarriff, 1997; Resnik and Yarowsky, 1997), little has been done on the development of WSD modules specifically for particular applications. WSD models in general are application-independent, and focus on monolingual contexts, particularly English.</Paragraph> <Paragraph position="2"> Approaches to WSD as an application-independent task usually apply standardised sense repositories, such as WordNet (Miller, 1990). For multilingual applications, a popular approach is to carry out monolingual WSD and then map the source language senses into the corresponding target word translations (Carpuat and Wu, 2005; Montoyo et al., 2002).
Although this strategy can yield reasonable results for certain pairs of languages, especially those which share a sense repository, such as EuroWordNet (Vossen, 1998), mapping senses between languages is a very complex issue (cf. Section 2). We believe that WSD is an intermediate, application-dependent task, and thus WSD modules for particular applications must be developed following the requirements of such applications.</Paragraph> <Paragraph position="3"> Many key factors of the process are application-dependent. The main factor is the sense inventory. As emphasized by Kilgarriff (1997), no sense inventory is suitable for all applications. Even for the same application there is often little consensus about the most appropriate sense inventory. For example, the use of WordNet, although very frequent, has been criticized due to characteristics such as the level of sense granularity and the abstract criteria used for the sense distinctions in that resource (e.g., Palmer, 1998). In particular, it is generally agreed that the granularity in WordNet is too refined for MT.</Paragraph> <Paragraph position="4"> In addition to requiring different sense inventories (Hutchins and Somers, 1992), the disambiguation process itself can often be varied according to the application. For instance, in monolingual WSD, the main information source is the context of the ambiguous word, that is, the surrounding words in a sentence or paragraph.
For MT purposes, the context can also be that of the translation in the target language, i.e., words which have already been translated.</Paragraph> <Paragraph position="5"> In this paper we focus on the differences in the sense inventory, contrasting the WordNet inventory for English disambiguation, which was created according to psycholinguistic principles, with the Portuguese translations assigned to a set of eight verbs in a corpus, simulating MT as a Computational Linguistics application.</Paragraph> <Paragraph position="6"> We show that the relation between senses and translations is not one-to-one, and that it is not only a matter of the level of refinement of WordNet. The number of translations can be either smaller or larger, i.e., either two or more senses can be translated as the same word, or the same sense can be translated using different words. With that, we present evidence that employing a monolingual WSD method for the task of MT is not appropriate, since monolingual information offers little help to multilingual disambiguation. In other words, we argue that multilingual WSD is different from monolingual WSD, and thus requires specific strategies. We start by presenting approaches that show cognate results for different pairs of languages, and also approaches developed with the reverse goal of using multilingual information to help monolingual WSD (Section 2). We then present our experiments (Sections 3 and 4) and their results (Section 5).</Paragraph> </Section> </Paper>