Multilingual versus Monolingual WSD 
Lucia Specia 
ICMC – University of São Paulo 
Av. do Trabalhador São-Carlense, 400 
São Carlos, 13560-970, Brazil 
lspecia@icmc.usp.br
Maria das GraçasVolpe Nunes 
ICMC – University of São Paulo 
Av. do Trabalhador São-Carlense, 400 
São Carlos, 13560-970, Brazil 
gracan@icmc.usp.br
Mark Stevenson 
Computer Science – University of Sheffield 
Regent Court, 211 Portobello Street 
Sheffield, S1 4DP, UK  
M.Stevenson@dcs.shef.ac.uk  
Gabriela Castelo Branco Ribeiro 
DL - Pontificial Catholic University - Rio 
R. Marquês de São Vicente, 225 - Gávea 
Rio de Janeiro, RJ, Brazil. CEP: 22.453-900 
gabrielacastelo@globo.com
  
Abstract 
Although it is generally agreed that Word 
Sense Disambiguation (WSD) is an ap-
plication dependent task, the great major-
ity of the efforts has aimed at the devel-
opment of WSD systems without consid-
ering their application. We argue that this 
strategy is not appropriate, since some 
aspects, such as the sense repository and 
the disambiguation process itself, vary 
according to the application. Taking Ma-
chine Translation (MT) as application 
and focusing on the sense repository, we 
present evidence for this argument by ex-
amining WSD in English-Portuguese MT 
of eight sample verbs. By showing that 
the traditional monolingual WSD strate-
gies are not suitable for multilingual ap-
plications, we intend to motivate the de-
velopment of WSD methods for particu-
lar applications.  
1 Introduction 
Word Sense Disambiguation (WSD) is con-
cerned with the choice of the most appropriate 
sense of an ambiguous word given its context. 
The applications for which WSD has been 
thought to be helpful include Information Re-
trieval, Information Extraction, and Machine 
Translation (MT) (Ide and Verónis, 1998). The 
usefulness of WSD for MT, particularly, has 
been recently subject of debate, with conflicting 
results. Vickrey et al. (2005), e.g., show that the 
inclusion of a WSD module significantly im-
proves the performance of their statistical MT 
system. Conversely, Carpuat and Wu (2005) 
found that WSD does not yield significantly bet-
ter translation quality than a statistical MT sys-
tem alone. In this latter work, however, the WSD 
module was not specifically designed for MT: it 
is based on the use of monolingual methods to 
identify the source language senses, which are 
then mapped into the target language transla-
tions. 
In fact, although it has been agreed that WSD 
is more useful when it is meant for a specific ap-
plication (Wilks and Stevenson, 1998; Kilgarriff, 
1997; Resnik and Yarowsky, 1997), little has 
been done on the development of WSD modules 
specifically for particular applications. WSD 
models in general are application independent, 
and focus on monolingual contexts, particularly 
English. 
Approaches to WSD as an application-
independent task usually apply standardised 
sense repositories, such as WordNet (Miller, 
1990). For multilingual applications, a popular 
approach is to carry out monolingual WSD and 
then map the source language senses into the cor-
responding target word translations (Carpuat and 
Wu, 2005; Montoyo et al., 2002). Although this 
strategy can yield reasonable results for certain 
pairs of languages, especially those which have a 
common sense repository, such as EuroWordNet 
(Vossen, 1998), mapping senses between lan-
guages is a very complex issue (cf. Section 2).  
33
We believe that WSD is an intermediate, applica-
tion dependent task, and thus WSD modules for 
particular applications must be developed fol-
lowing the requirements of such applications. 
Many key factors of the process are application-
dependent. The main factor is the sense inven-
tory. As emphasized by Kilgarriff (1997), no 
sense inventory is suitable for all applications. 
Even for the same application there is often little 
consensus about the most appropriate sense in-
ventory. For example, the use of WordNet, al-
though very frequent, has been criticized due to 
characteristics such as the level sense granularity 
and the abstract criteria used for the sense dis-
tinctions in that resource (e.g., Palmer 1998). In 
particular, it is generally agreed that the granular-
ity in WordNet is too refined for MT.  
In addition to requiring different sense inven-
tories (Hutchins and Somers, 1992), the disam-
biguation process itself often can be varied ac-
cording to the application. For instance, in mono-
lingual WSD, the main information source is the 
context of the ambiguous word, that is, the sur-
rounding words in a sentence or paragraph. For 
MT purposes, the context can be also that of the 
translation in the target language, i.e., words 
which have been already translated.  
In this paper we focus on the differences in the 
sense inventory, contrasting the WordNet inven-
tory for English disambiguation, which was cre-
ated according to psycholinguistics principles, 
with the Portuguese translations assigned to a set 
of eight verbs in a corpus, simulating MT as a 
Computational Linguistics application.  
We show that the relation between the number 
of senses and translations is not a one-to-one, 
and that it is not only a matter of the level of re-
finement of WordNet. The number of transla-
tions can be either smaller or larger, i.e., either 
two or more senses can be translated as the same 
word, or the same sense can be translated using 
different words. With that, we present evidence 
that employing a monolingual WSD method for 
the task of MT is not appropriate, since monolin-
gual information offers little help to multilingual 
disambiguation. In other words, we argue that 
multilingual WSD is different from monolingual 
WSD, and thus requires specific strategies. We 
start by presenting approaches that show cognate 
results for different pairs of languages, and also 
approaches developed with the reverse goal of 
using multilingual information to help monolin-
gual WSD (Section 2). We then present our ex-
periments (Sections 3 and 4) and their results 
(Section 5).  
2 Related work  
Recently, others have also investigated the dif-
ferences between sense repositories for monolin-
gual and multilingual WSD. Chatterjee et al. 
(2005), e.g., investigated the ambiguity in the 
translation of the English verb “to have” into 
Hindi. 11 translation patterns were identified for 
the 19 senses of the verb, according to the vari-
ous target syntactic structures and/or target 
words for the verb. They argued that differences 
in both these aspects do not depend only on the 
sense of the verb. Out of the 14 senses analyzed, 
six had 2-5 different translations each.  
Bentivogli et al. (2004) proposed an approach 
to create an Italian sense tagged corpus (Mul-
tiSemCor) based on the transference of the anno-
tations from the English sense tagged corpus 
SemCor (Miller et al., 1994), by means of word-
alignment methods. A gold standard corpus was 
created by manually transferring senses in Sem-
Cor to the Italian words in a translated version of 
that corpus. From a total of 1,054 English words, 
155 annotations were considered non-
transferable to their corresponding Italian words, 
mainly due to the lack of synonymy at the lexical 
level.  
Miháltz (2005) manually mapped senses from 
the English in a sense tagged corpus to Hungar-
ian translations, in order to carry out WSD be-
tween these languages. Out of 43 ambiguous 
nouns, 38 had all or most of their English senses 
mapped into the same Hungarian translation. 
Some senses of the remaining nouns had to be 
split into different Hungarian translations. On 
average, the sense mapping decreased the ambi-
guity from 3.97 English senses to 2.49 Hungar-
ian translations. 
As we intend to show with this work, differ-
ences like those mentioned above in the sense 
inventories make it inappropriate to use mono-
lingual WSD strategies for multilingual disam-
biguation. Nevertheless, some approaches have 
successfully employed multilingual information, 
especially parallel corpora, to support monolin-
gual WSD. They are motivated by the argument 
that the senses of a word should be determined 
based on the distinctions that are lexicalized in a 
second language (Resnik and Yarowsky, 1997). 
In general, the assumptions behind these ap-
proaches are the following:  
(1) If a source language word is translated dif-
ferently into a second language, it might be am-
biguous and the different translations can indi-
cate the senses in the source language.  
34
(2) If two distinct source language words are 
translated as the same word into a second lan-
guage, it often indicates that the two are being 
used with similar senses.  
Ide (1999), for example, analyzes translations 
of English words into four different languages, in 
order to check if the different senses of an Eng-
lish word are lexicalized by different words in all 
the other languages. A parallel aligned corpus is 
used and the translated senses are mapped into 
WordNet senses. She uses this information to 
determine a set of monolingual sense distinctions 
that is potentially useful for NLP applications. In 
subsequent work (Ide et al., 2002), seven lan-
guages and clustering techniques are employed 
to create sense groups based on the translations.  
Diab and Resnik (2002) use multilingual in-
formation to create an English sense tagged cor-
pus to train a monolingual WSD approach. An 
English sense inventory and a parallel corpus 
automatically produced by an MT system are 
employed. Sentence and word alignment systems 
are used to assign the word correspondences be-
tween the two languages. After grouping all the 
words that correspond to translations of a single 
word in the target language, all their possible 
senses are considered as candidates. The sense 
that maximizes the semantic similarity of the 
word with the others in the group is chosen.  
Similarly, Ng et al. (2003) employ English-
Chinese parallel word aligned corpora to identify 
a repository of senses for English. The English 
word senses are manually defined, based on the 
WordNet senses, and then revised in the light of 
the Chinese translations. For example, if two oc-
currences of a word with two different senses in 
WordNet are translated into the same Chinese 
word, they will be considered to have the same 
English sense.  
In general, these approaches rely on the two 
previously mentioned assumptions about the in-
teraction between translations and word senses. 
Although these assumptions can be useful when 
using cross-language information as an approxi-
mation to monolingual disambiguation, they are 
not very helpful in the opposite direction, i.e., 
using monolingual information for cross-
language disambiguation, as we will show in 
Section 4.  
3 Experimental setting  
We focused our experiments on verbs, which 
represent difficult cases for WSD. In particular, 
we experimented with five frequent and highly 
ambiguous verbs identified as problematic for 
MT systems in a previous study (Specia, 2005): 
“to come”, “to get”, “to give”, “to look”, and “to 
make”; and other three frequent verbs that are 
not so ambiguous: “to ask”, “to live”, and “to 
tell”. The inclusion of the additional verbs allows 
us to analyze the effect of the ambiguity level in 
the experiment. These verbs will then be trans-
lated into Portuguese so that the resulting transla-
tions can be contrasted to the English senses. 
3.1 Corpus selection 
We collected all the sentences containing one of 
the eight verbs and their corresponding phrasal 
verbs from SemCor, Senseval-2 and Senseval-3 
corpora1. These corpora were chosen because 
they are both widely used and easily available. In 
each of these corpora, ambiguous words are an-
notated with WordNet 2.0 senses. Occurrences 
which did not identify a unique sense were not 
used. The numbers of sentences selected for each 
verb and its phrasal verbs are shown in Table 1. 
Verb # Verb  
Occurrences
# Phrasal Verb 
Occurrences 
ask 414 8
come 674 330
get 683 267
give 740 79
live 242 5 
look 370 213 
make 1463 105 
tell 509 3 
Table 1. Number of verbs and phrasal verbs ex-
tracted from SemCor and Senseval corpora 
It is worth mentioning that the phrasal verbs in-
clude simple verb-particle constructions, such as 
“give up”, and more complex multi-word expres-
sions, e.g., “get in touch with”, “make up for”, 
“come to mind”, etc. 
In order to avoid biasing the experiment due to 
possible misunderstandings of the verb uses, and 
to make the experiment feasible, with a reason-
able number of occurrences to be analyzed, we 
selected a subset of the total number of sentences 
in Table 1, which were distributed among five 
professional English-Portuguese translators (T1, 
T2, T3, T4, T5), according to the following crite-
ria:  
- The meaning of the verb/phrasal verb in the 
context of the sentence should be understandable 
and non-ambiguous (for human translators). 
                                                
1 Available at http://www.cs.unt.edu/~rada/downloads.html. 
35
- The experiment should be the most compre-
hensive possible, with the largest possible num-
ber of senses for each verb/phrasal. 
- Each translator should be given two occur-
rences (when available) of all the distinct senses 
of each verb/phrasal verb, in order to make it 
possible to contrast different uses of the verb.  
- The translators should not be given any in-
formation other than the sentence to select the 
translation.  
To meet these criteria, a professional translator, 
who was not involved in the translation task, 
post-processed the selected sentences, filtering 
them according to the criteria specified above. 
Due to both the scarce number of occurrences of 
each phrasal verb sense and the large number of 
different phrasal verbs for certain verbs, the post-
selection of phrasal verbs was different from the 
post-selection of verbs. In the case of verbs, the 
translator scanned the sentences in order to get 
10 distinct occurrences of each sense (two for 
each translator), eliminating those sentences 
which were too complex to understand or used 
the verb in an ambiguous way. This process did 
not eliminate any senses, and thus did not reduce 
the coverage of the experiment. When there were 
fewer than 10 occurrences of a given sense, sen-
tences were repeated among translators to guar-
antee that each translator would be given exam-
ples of all the senses of the verb. For instance, if 
a sense had only four occurrences, the first two 
occurrences were given to T1, T3 and T5, while 
the other two occurrences were given to T2 and 
T4. If a sense occurred only once for a verb, it 
was repeated for all five translators. 
For phrasal verbs, the same process was used 
to eliminate the complex and ambiguous sen-
tences. Two occurrences (when available) of 
each sense of a phrasal verb were then selected. 
Due to the large number of different phrasal 
verbs for certain verbs, they were divided among 
translators, so that each translator was given two 
occurrences of only some phrasal verbs of each 
verb. Sentences were distributed so that all trans-
lators had a similar number of cases, as shown in 
Table 2. 
In order to avoid biasing the translations ac-
cording to the English senses, the original sense 
annotations were not shown to the translators and 
the sentences for each of the verbs, together with 
their phrasal verbs, were randomly ordered. 
Additionally, we gave the same set of selected 
sentences to another group of five translators, so 
that we could analyze the reliability of the ex-
periment by investigating the agreement between 
the groups of translators on the same data. 
Translator
Verb 
# T1 # T2 # T3 # T4 # T5
ask 13 13 13 10 10
come 53 52 52 51 47
get 59 59 56 59 57
give 46 50 48 47 48
live 11 11 11 16 16
look 15 19 17 19 14
make 47 45 44 46 41
tell 14 12 12 15 10
Total  258 261 253 263 243
Table 2. Number of selected sentences and its 
distribution among the five translators 
3.2 English senses and Portuguese transla-
tions 
As mentioned above, the corpora used are tagged 
with WordNet senses. Although this may not be 
the optimal sense inventory for many purposes, it 
is the best option in terms of availability and 
comprehensiveness. Moreover, it is the most fre-
quently used repository for monolingual WSD 
systems, making it possible to generalize, to a 
certain level, our results to most of the monolin-
gual work. The number of senses for the eight 
selected verbs (and their phrasal verbs) in 
WordNet 2.0, along with the number of their 
possible translations in bilingual dictionaries2, is 
shown in Table 3. 
Verb # Senses # Translations 
ask  12 16
come 108 226
get  147 242
give  92 128
live  15 15
look  34 63
make 96 239
tell  12 28
Table 3. Verbs, possible senses and translations 
As we can see, the number of possible transla-
tions is different from the number of possible 
senses, which already shows that there is not a 
one-to-one correspondence between senses and 
translations (although there is a high correlation 
between the number of senses and translations: 
Pearson’s Correlation = 0.955). In general, the 
number of possible translations is greater than 
                                                
2 For example, DIC Pratico Michaelis®, version 5.1. 
36
the number of possible senses, in part because 
synonyms are considered as different transla-
tions. As we will show in Section 5 (Table 4), we 
eliminate the use of synonyms as possible trans-
lations. Moreover, we are dealing with a limited 
set of possible senses, provided by the SemCor 
and Senseval data. As a consequence, the num-
ber of translations pointed out by the human 
translators for our corpus will be considerably 
smaller than the total number of possible transla-
tions. 
4 Contrasting senses and translations 
In order to contrast the English senses with the 
Portuguese translations, we submitted the se-
lected sentences (cf. Section 3.1) to two groups 
of five translators (T1, T2, T3, T4, and T5), all 
native speakers of Portuguese. We asked the 
translators to assign the appropriate translation to 
each of the verb occurrences, which we would 
then compare to the original English senses. 
They were not told what their translations were 
going to be used for. 
The translators were provided with entire sen-
tences, but for practical reasons they were asked 
to translate only the verb and were allowed to 
use any bilingual resource to search for possible 
translations, if needed. They were asked to avoid 
considering synonyms as different translations.  
The following procedure was defined to ana-
lyze the results returned by the translators, for 
each verb and its phrasal verbs separately: 
1) We grouped all the occurrences of an Eng-
lish sense and looked at all the translations used 
by the translators in order to identify synonyms 
(in those specific uses), using a dictionary of 
Portuguese synonyms. Synonyms were consid-
ered as unique translations.  
2) We then analyzed the sentences which had 
been given to multiple translators of the same 
group (when there were not enough occurrences 
of certain senses, as mentioned in Section 3.1), in 
order to identify a single translation for the oc-
currence and eliminate redundancies. The trans-
lation chosen was the one pointed out by the ma-
jority of the translators. When it was not possible 
to elect only one translation, the n equally most 
used were kept, and thus the sentence was re-
peated n times. 
3) Finally, we examined the relation between 
senses and translations, focusing on two cases: 
(1) if a sense had only one or many translations; 
and (2) if a translation referred to only one or 
many senses, i.e., whether the sense was shared 
by many translations. We placed each sense into 
two of the following categories, explained be-
low: (a) or (b), mutually exclusive, representing 
the first case; and (c), (d) or (e), also mutually 
exclusive, representing the second case. 
(a) 1 sense barb2right 1 translation: all the occur-
rences of the same sense being translated as 
the same Portuguese word. For example, “to 
ask”, in the sense of “inquire, enquire”, is al-
ways translated as “perguntar”. 
(b) 1 sense barb2rightn translations: different oc-
currences of the same sense being translated as 
different, non-synonyms, Portuguese words. 
For example, “to look”, in the sense of “per-
ceive with attention; direct one's gaze to-
wards” can be translated as “olhar”, “assistir”, 
and “voltar-se”. 
(c) n senses barb2right 1 translation (ambiguous): 
Different senses of a word being translated as 
the same Portuguese word, which encom-
passes all the English senses. For example, 
“make”, in the sense of “engage in”, “create”, 
and “give certain properties to something”, is 
translated as “fazer”, which carries the three 
senses. 
(d) n senses barb2right 1 translation (non-
ambiguous): different senses of a word being 
translated using the same Portuguese word, 
which has only one sense. For example, “take 
advantage” in both the senses of “draw advan-
tages from” and “make excessive use of”, be-
ing translated as “aproveitar-se”.  
(e) n senses barb2right n translations: different 
senses of a word being translated as different 
Portuguese words. For example, the “move 
fast” and “carry out a process or program” 
senses of the verb “run” being translated re-
spectively as “correr” and “executar”. 
Items (a) and (e) represent cases where multilin-
gual ambiguity only reflects the monolingual 
one, that is, to all the occurrences of every sense 
of an English word corresponds a specific Portu-
guese translation. On the other hand, items (b), 
(c) and (d) provide evidence that multilingual 
ambiguity is different from monolingual ambigu-
ity. Item (b) means that different criteria are 
needed for the disambiguation, as ambiguity 
arises only during the translation, due to specific 
principles used to distinguish senses in Portu-
guese. Items (c) and (d) mean that disambigua-
tion is not necessary, as either the Portuguese 
37
translation is also ambiguous, embracing the 
same senses of the English word, or Portuguese 
has a less refined sense distinction. 
5 Results and discussion 
Table 4 presents the number of different sen-
tences analyzed for each of the verbs (after 
grouping and eliminating the repeated sen-
tences), the English (E) senses and (non-
synonyms) Portuguese (P) translations in our 
corpus, followed by the percentage of occur-
rences of each of the categories outlined in Sec-
tion 4 (a – e) with respect to the number of 
senses (# Senses) for that verb. Items (c) and (d) 
were grouped, since for practical purposes it is 
not important to tell if the P word translating the 
various E senses encompasses one or many 
senses. For items (b) and (c&d) we also present 
the average of P translations per E sense ((b) av-
erage), and the average of E senses per P transla-
tion, respectively ((c&d) average).  
We divided the analysis of these results ac-
cording to our two cases (cf. Section 4): the first 
covers items (c&d) and (e) (light grey in Table 
4), while the second covers items (a) and (b) 
(dark grey in Table 4). 
1) Items (c), (d) and (e): n senses → ? transla-
tion(s) 
The number of senses in the corpus is almost 
always greater than the number of translations, 
suggesting that the level of sense distinctions in 
WordNet can be too fine-grained for translation 
applications The numbers of senses and transla-
tions are in an opposite relation comparing to the 
one shown in Table 3, where the number of pos-
sible translations was larger than the number of 
possible senses. This shows that indeed many of 
the possible translations are synonyms.  
On average, the level of ambiguity decreased 
from 40.3 (possible senses) to 24.4 (possible 
translations), if the monolingual and multilingual 
ambiguity are compared in the corpus. If we con-
sider the five most ambiguous verbs, the level of 
ambiguity decreased from 58.8 to 35. For the 
other three less ambiguous verbs, the level of 
ambiguity decreased from 9.3 to 6.7.  
Column % (c&d) shows the percentage of 
senses, with respect to the total shown in the 
third column (# Senses), which share translations 
with other senses. A shared translation means 
that several senses of the verb have the same 
translation. (c&d) average indicates the average 
number of E senses per P translation, for those 
cases where translations are shared. For all verbs, 
on average translations cover more than two 
senses. The level of variation in the number of 
shared translations among senses is high, e.g., 
from 2 (translation = “organizar”) to 27 (transla-
tion = “dar”) for the verb “to give”. Contrasting 
the percentage of senses that share translations, 
in % (c), with the percentages in % (d), which 
refers to the senses for which translations are not 
shared, we can see that the great majority of 
senses have translations in common with other 
senses, and thus the disambiguation among these 
senses would not be necessary in most of the 
cases. In fact, it could result in errors, since an 
incorrect sense could be chosen. 
2) Items (a) and (b): 1 sense → ? translation(s) 
As previously mentioned, the differences in the 
sense inventory for monolingual and multilingual 
WSD are not only due to the fact that sense dis-
tinctions in WordNet are too refined. That would 
only indicate that using monolingual WSD for 
multilingual purposes implies unnecessary work. 
However, we consider that the most important 
problem is the one evidenced by item (b) in the 
sixth column in Table 4. For all the verbs except 
“to ask” (the least ambiguous), there were cases 
in which different occurrences of the same sense 
were translated into different, non-synonyms 
words. Although the proportion of senses with 
only one translation is greater, as shown by item 
(a) in the fifth column, the percentage of senses 
with more than one translation is impressive, 
especially for the five most ambiguous verbs. In 
face of this, the lack of disambiguation of a word 
during translation based on the fact that the word 
is not ambiguous in the source language can re-
sult in very serious translation errors when 
monolingual methods are employed for multilin-
gual WSD. Therefore, this also shows that, for 
these verbs, sense inventories that are specific to 
the translation between the pair of languages un-
der consideration would be more appropriate to 
achieve effective WSD. 
5.1 Agreement between translators 
In an attempt to quantify the agreement between 
the two groups of translators, we computed the 
Kappa coefficient for annotation tasks, as de-
fined by Carletta (1996). Kappa was calculated 
separately for our two areas of inquiry, i.e., cases 
(1) and (2) discussed in Section 5.  
In the experiment referring to case (1), groups 
were considered to agree about a sense of a verb 
if they both judged that the translation of such 
38
Verb # Sen-
tences
# Senses # Transla-
tions 
% (a)  % (b) (b) av-
erage 
%  
(c&d) 
(c&d) av-
erage 
% (e) 
ask 83 8 3 100 0 0 87.5 3.5 12.5 
come 202 68 42 62 38 3.1 73.2 6.3 26.8 
get 226 90 61 70 30 2.6 61.1 3.4 38.9 
give 241 57 12 48.7 51.3 3.3 84.2 6.3 15.8 
live 55 10 7 83.3 16.7 3.0 70 2.7 30 
look 82 26 18 63.2 36.8 2.4 84.6 2.7 15.4 
make 225 53 42 51.4 48.6 2.9 77.4 4.1 22.6 
tell 73 10 10 37.5 62.5 2.8 60 4.0 40 
Table 4. Results of the procedure contrasting senses and translations 
verb was or was not shared by other senses. For 
example, both groups agreed that the word 
“fazer” should be used to translate occurrences 
of many senses of the verb “to make”, including 
“engage in”, “give certain properties to some-
thing”, and “make or cause to be or to become”. 
On the other hand, the groups disagreed about 
the sense “go off or discharge” of the phrasal 
verb “to go off”: the first group found that the 
translation of that sense, “disparar”, did not refer 
to any other sense, while the second group used 
that word to translate also the sense “be dis-
charged or activated” of the same phrasal verb.  
In the experiment with case (2), groups were 
considered to agree about a sense if they both 
judged that the sense had or had not more than 
one translation. For example, both groups agreed 
that the sense “reach a state, relation, or condi-
tion” of the verb “to come” should be translated 
by more than one Portuguese word, including 
“terminar”, “vir”, and “chegar”. They also 
agreed that the sense “move toward, travel to-
ward something or somebody or approach some-
thing or somebody” of the same verb had only 
one translation, namely “vir”.  
The average Kappa coefficient obtained was 
0.66 for item (1), and 0.65 for item (2). There is 
not a reference value for this particular annota-
tion task (translation annotation), but the levels 
of agreement pointed by Kappa here can be con-
sidered satisfactory. The agreement levels are 
close to the coefficient suggested by Carletta as 
indicative of a good agreement level for dis-
course annotation (0.67), and which has been 
adopted as a cutoff in Computational Linguistics. 
6 Conclusions and future work 
We presented experiments contrasting monolin-
gual and multilingual WSD. It was found that, in 
fact, monolingual and multilingual disambigua-
tion differ in many respects, particularly the 
sense repository, and therefore specific strategies 
could be more appropriate to achieve effective 
multilingual WSD. We investigated the differ-
ences in sense repositories considering English-
Portuguese translation, using a set of eight am-
biguous verbs collected from sentences in Sem-
Cor and Senseval corpora. The English sense 
tags given by WordNet were compared to the 
Portuguese translations assigned by two groups 
of five human translators.  
Results corroborate previous cognate work, 
showing that there is not a one-to-one mapping 
between the English senses and their translations 
(to Portuguese, in this study). In most of the 
cases, many different senses were translated into 
the same Portuguese word. In many other cases, 
different, non-synonymous, words were neces-
sary to translate occurrences of the same sense of 
the source language, showing that differences 
between monolingual and multilingual WSD are 
not only a matter of the highly refined sense dis-
tinction criterion adopted in WordNet. Therefore, 
these results reinforce our argument that apply-
ing monolingual methods for multilingual WSD 
can either imply unnecessary work, or result in 
disambiguation errors.  
As future work we plan to carry out further in-
vestigation of the differences between monolin-
gual and multilingual WSD contrasting the Eng-
lish senses and translations into other languages, 
and analyzing other grammatical categories, par-
ticularly nouns.  

References  
Bentivogli, L., Forner, P., and Pianta, E. (2004). 
Evaluating Cross-Language Annotation Trans-
fer in the MultiSemCor Corpus. COLING-
2004, Geneva, pp. 364-370. 
Carletta, J. (1996). Assessing agreement on clas-
sification tasks: the kappa statistic. Computa-
tional Linguistics, 22(2), pp. 249-254. 
Carpuat, M. and Wu, D. (2005). Word sense dis-
ambiguation vs. statistical machine translation. 
43rd ACL Meeting, Ann Arbor, pp. 387–394. 
Chatterjee, N., Goyal, S., and Naithani, A. 
(2005). Pattern Ambiguity and its Resolution 
in English to Hindi Translation. RANLP-2005, 
Borovets, pp. 152-156. 
Diab, M. and Resnik, P. (2002). An Unsuper-
vised Method for Word Sense Tagging using 
Parallel Corpora. 40th ACL Meeting, Philadel-
phia. 
Hutchins, W.J. and Somers H.L. (1992) An In-
troduction to Machine Translation. Academic 
Press, Great Britain. 
Ide, N. and Véronis, J. (1998). Word Sense Dis-
ambiguation: The State of the Art. Computa-
tional Linguistics, 24 (1).  
Ide, N. (1999). Parallel Translations as Sense 
Discriminators. SIGLEX99 Workshop: Stan-
dardizing Lexical Resources, Maryland, pp. 
52-61. 
Ide, N., Erjavec, T., and Tufi, D. (2002). Sense 
Discrimination with Parallel Corpora. ACL'02 
Workshop on Word Sense Disambiguation: 
Recent Successes and Future Directions, 
Philadelphia, pp. 54-60.  
Kilgarriff, A. (1997). I Don't Believe in Word 
Senses. Computers and the Humanities, 31 
(2):91-113. 
Miháltz, M. (2005). Towards A Hybrid Ap-
proach to Word-Sense Disambiguation in Ma-
chine Translation. RANLP-2005 Workshop: 
Modern Approaches in Translation Technolo-
gies, Borovets.  
Miller, G.A., Beckwith, R.T., Fellbaum, C.D., 
Gross, D., and Miller, K. (1990). WordNet: 
An On-line Lexical Database. International 
Journal of Lexicography, 3(4):235-244. 
Miller, G.A., Chorodow, M., Landes, S., Lea-
cock, C., and Thomas, R.G. (1994). Using a 
Semantic Concordancer for Sense Identifica-
tion. ARPA Human Language Technology
Workshop - ACL, Washington, pp. 240-243. 
Montoyo, A., Romero, R., Vazquez, S., Calle, 
M., and Soler, S. (2002). The Role of WSD 
for Multilingual Natural Language Applica-
tions.  TSD’2002, Czech Republic, pp. 41-48. 
Ng, H.T., Wang, B., and Chan, Y.S. (2003). Ex-
ploiting Parallel Texts for Word Sense Disam-
biguation: An Empirical Study. 41st ACL 
Meeting, Sapporo, pp. 455-462.  
Palmer, M. (1998). Are WordNet sense distinc-
tions appropriate for computational lexicons? 
Senseval, Siglex98, Brighton.  
Resnik, P. and Yarowsky, D. (1997). A Perspec-
tive on Word Sense Disambiguation Methods 
and their Evaluating. ACL-SIGLEX Workshop 
Tagging Texts with Lexical Semantics: Why, 
What and How?, Washington. 
Specia, L. (2005). A Hybrid Model for Word 
Sense Disambiguation in English-Portuguese 
Machine Translation. 8th CLUK, Manchester, 
pp. 71-78. 
Vickrey, D., Biewald, L., Teyssier, M., and 
Koller, D. (2005). Word-Sense Disambigua-
tion for Machine Translation. HLT/EMNLP, 
Vancouver.
Vossen, P. (1998). EuroWordNet: Building a 
Multilingual Database with WordNets for 
European Languages. The ELRA Newsletter, 
3(1). 
Wilks, Y. and Stevenson, M. (1998). The Gram-
mar of Sense: Using Part-of-speech Tags as a 
First Step in Semantic Disambiguation. Natu-
ral Language Engineering, 4(1):1-9.  
