Experiments in Word Domain Disambiguation for Parallel 
Texts 
Bernardo Magnini and Carlo Strapparava 
ITC-irst, Istituto per la Ricerca Scientifica e Tecnologica, I-38050 Trento, ITALY 
email: {magnini,strappa}@irst.itc.it 
Abstract 
This paper describes some preliminary results about Word Domain
Disambiguation, a variant of Word Sense Disambiguation where words
in a text are tagged with a domain label in place of a sense label.
The English WORDNET and its aligned Italian version, MULTIWORDNET,
both augmented with domain labels, are used as the main information
repositories. A baseline algorithm for Word Domain Disambiguation
is presented and then compared with a mutual help disambiguation
strategy, which takes advantage of the shared senses of parallel
bilingual texts. 
1 Introduction 
This work describes some preliminary results 
about Word Domain Disambiguation (WDD), a 
variant of Word Sense Disambiguation (WSD) 
where for each word in a text a domain label 
(among those allowed by the word) has to be cho- 
sen instead of a sense label. Domain labels, such 
as MEDICINE and ARCHITECTURE, provide a nat- 
ural way to establish semantic relations among 
word senses, grouping them into homogeneous 
clusters. A relevant consequence of the appli- 
cation of domain clustering over the WORDNET 
senses is the reduction of the word polysemy (i.e. 
the number of domains for a word is generally 
lower than the number of senses for that word). 
We wanted to investigate the hypothesis that the polysemy reduction
caused by domain clustering can profitably help the word domain
disambiguation process. A preliminary experiment has been set up
with two goals: first, providing experimental evidence that a
frequency-based WDD algorithm can outperform a WSD baseline
algorithm; second, exploring WDD in the context of the
disambiguation of parallel, non-aligned texts. 
The English WORDNET and the aligned Italian version MULTIWORDNET,
both augmented with domain labels, are used as the main information
repositories. A baseline algorithm for Word Domain Disambiguation
is presented and then compared with a mutual help disambiguation
strategy, which makes use of the shared senses of parallel
bilingual texts. 
Several works in the literature have remarked that for many
practical purposes the fine-grained sense distinctions provided by
WORDNET are not necessary (see for example [Wilks and Stevenson,
1998], [Gonzalo et al., 1998], [Kilgarriff and Yallop, 2000] and
the SENSEVAL initiative) and make word sense disambiguation hard.
Two related works are [Buitelaar, 1998] and [Buitelaar, 2000],
where the reduction of the WORDNET polysemy is obtained on the
basis of regular polysemy relations. Our approach is based on sense
clusters derived by domain proximity, which in some cases may
overlap with clusters derived from regular polysemy (e.g. both
"book" as composition and "book" as physical object belong to
PUBLISHING), but in many cases may not (e.g. "lamb" as animal
belongs to ZOOLOGY, while "lamb" as meat belongs to FOOD).
Following this line we propose Word Domain Disambiguation as a
practical alternative for applications that do not require
fine-grained sense distinctions. 
The paper is organized as follows. Section 2 
introduces domain labels, their organization and 
the extensions to WORDNET. Section 3 discusses 
Word Domain Disambiguation and presents the 
algorithms used in the experiment. Section 4 gives 
the experimental setting. Results are discussed in 
Section 5. 
2 WordNet and Subject Field 
Codes 
In this work we will make use of an augmented WORDNET, whose
synsets have been annotated with one or more subject field codes.
This resource, discussed in [Magnini and Cavaglià, 2000], currently
covers all the noun synsets of WORDNET 1.6 [Miller, 1990], and it
is under development for the remaining lexical categories. 
Subject Field Codes (SFC) group together words relevant for a
specific domain. The best approximation of SFCs are the field
labels used in dictionaries (e.g. MEDICINE, ARCHITECTURE), even if
their use is restricted to word usages belonging to specific
terminological domains. In WORDNET, too, SFCs seem to be used
occasionally and without a consistent design. 
Information brought by SFCs is complementary to what is already in
WORDNET. First of all, a SFC may include synsets of different
syntactic categories: for instance MEDICINE (see footnote 1) groups
together senses from Nouns, such as doctor#1 and hospital#1, and
from Verbs, such as operate#7. Second, a SFC may also contain
senses from different WORDNET sub-hierarchies (i.e. deriving from
different "unique beginners", or from different "lexicographer
files"). For example, the SPORT SFC contains senses such as
athlete#1, deriving from life_form#1, game_equipment#1 from
physical_object#1, sport#1 from act#2, and playing_field#1 from
location#1. 
We have organized about 250 SFCs in a hier- 
archy, where each level is made up of codes of 
the same degree of specificity: for example, the 
second level includes SFCs such as BOTANY, LIN- 
GUISTICS, HISTORY, SPORT and RELIGION, while 
at the third level we can find specializations such 
as AMERICAN_HISTORY, GRAMMAR, PHONETICS 
and TENNIS. 
A problem arises for synsets that do not belong 
to a specific SFC, but rather can appear in almost 
all of them. For this reason, a FACTOTUM SFC has 
been created which basically includes two types of 
synsets: 
• Generic synsets, which are hard to classify in a particular SFC,
  are generally placed high in the WORDNET hierarchy and are
  related senses of highly polysemous words. For example:
    man#1   an adult male person (as opposed to a woman)
    man#3   the generic use of the word to refer to any human being
    date#1  day of the month
    date#3  appointment, engagement
• Stop Senses synsets, which appear frequently in different
  contexts, such as numbers, week days, colors, etc. These synsets
  usually belong to non polysemous words and they behave much like
  stop words, because they do not significantly contribute to the
  overall meaning of a text.

Footnote 1: Throughout the paper subject field codes are indicated
with this TYPEFACE while word senses are reported with this
typeface#1, with their corresponding numbering in WORDNET 1.6.
Moreover, we use subject field code, domain label and semantic
field with the same meaning. 
A single domain label may group together more than one word sense,
resulting in a reduction of the polysemy. Figure 1 shows an
example. The word "book" has seven different senses in WORDNET 1.6:
three of them are grouped under the PUBLISHING domain, so that the
polysemy of the word drops from seven senses to five domain labels. 
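The reduction for "book" can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary below is a hand-made transcription of Figure 1 as we read it (the assignment of book#1, book#2 and book#7 to PUBLISHING is our reading of the figure), and the function names are hypothetical, not part of WORDNET or MULTIWORDNET.

```python
# Hand-made transcription of Figure 1 (assumption: book#1, book#2 and
# book#7 are the three senses grouped under PUBLISHING).
BOOK_SENSE_DOMAINS = {
    "book#1": "PUBLISHING",
    "book#2": "PUBLISHING",
    "book#3": "FACTOTUM",
    "book#4": "THEATRE",
    "book#5": "COMMERCE",
    "book#6": "RELIGION",
    "book#7": "PUBLISHING",
}


def sense_polysemy(sense_domains):
    """Number of senses of a word."""
    return len(sense_domains)


def domain_polysemy(sense_domains):
    """Number of distinct domain labels among the senses of a word."""
    return len(set(sense_domains.values()))


print(sense_polysemy(BOOK_SENSE_DOMAINS))   # senses of "book"
print(domain_polysemy(BOOK_SENSE_DOMAINS))  # distinct domains of "book"
```

Under these assumptions the two calls give 7 and 5 respectively, matching the figures quoted in the text.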
3 Word Domain Disambiguation 
In this section we present two baseline algorithms 
for word domain disambiguation and we propose 
some variants of them to deal with WDD in the 
context of parallel texts. 
3.1 Baseline algorithms 
To decide a proper baseline for Word Domain Disambiguation we
wanted to be sure that it was applicable to both the languages
(i.e. English and Italian) used in the experiment. This caused the
exclusion of a selection based on the domain frequency computed as
a function of the frequency of the WORDNET senses, because we did
not have a frequency estimation for Italian senses. We adopted two
alternative frequency measures, based respectively on the intra
text frequency and the intra word frequency of a domain label. Both
of them are computed with a two-stage disambiguation process,
structurally similar to the algorithm used in [Voorhees, 1998]. 
Baseline 1: Intra text domain frequency. 
The baseline algorithm follows two steps. First, 
all the words in the text are considered and for 
each domain label allowed by the word the label 
score is incremented by one. At the second step 
each word is reconsidered, and the domain label 
(or labels, depending on how many best solutions 
are requested) with the highest score is selected 
as the result of the disambiguation. 
Baseline 2: Intra word domain frequency. 
In this version of the baseline algorithm, step 1 is modified in
that the score of each domain label allowed by the word is
incremented by the frequency of the label among the senses of that
word. For instance, if "book" is the word (see Figure 1),
PUBLISHING will receive 3/7, about .43 (i.e. three senses out of
seven belong to PUBLISHING), while the other domain labels will
receive 1/7, about .14, each. 

"book"
  PUBLISHING  {book#1 - published composition}
              {book#2 volume#3 - book as a physical object}
              {daybook#2 book#7 ledger#2 - an accounting book as a
               physical object}
  RELIGION    {book#6 - book of the Bible}
  THEATRE     {script#1 book#4 playscript#1 - written version of a
               play}
  COMMERCE    {book#5 ledger#1 - records of commercial account}
  FACTOTUM    {record#5 recordbook#1 book#3 - compilation of known
               facts regarding something or someone}
Figure 1: An example of polysemy reduction 
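The two baselines can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used in the experiment: the toy lexicon, the function names and the alphabetical tie-breaking rule are all hypothetical, and each word is represented by the list of domain labels of its senses (one entry per sense).

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: word -> domain label of each of its senses.
LEXICON = {
    "book": ["PUBLISHING", "PUBLISHING", "FACTOTUM", "THEATRE",
             "COMMERCE", "RELIGION", "PUBLISHING"],
    "editor": ["PUBLISHING"],
    "ledger": ["COMMERCE", "PUBLISHING"],
}


def word_domain_weights(sense_domains, intra_word=False):
    """Weight contributed by one word to each of its allowed domains.

    Baseline 1 (intra_word=False): every allowed domain gets 1.
    Baseline 2 (intra_word=True): every allowed domain gets its
    relative frequency among the senses of the word.
    """
    counts = Counter(sense_domains)
    if intra_word:
        total = len(sense_domains)
        return {d: c / total for d, c in counts.items()}
    return {d: 1.0 for d in counts}


def disambiguate(words, lexicon, intra_word=False):
    """Two-step word domain disambiguation of a list of words."""
    # Step 1: accumulate a score for every domain over the whole text.
    scores = defaultdict(float)
    for w in words:
        for d, wt in word_domain_weights(lexicon[w], intra_word).items():
            scores[d] += wt
    # Step 2: for each word choose the allowed domain with the best score
    # (ties are broken alphabetically, which is an arbitrary choice here).
    result = {}
    for w in words:
        allowed = sorted(set(lexicon[w]))
        result[w] = max(allowed, key=lambda d: scores[d])
    return result


TEXT = ["book", "editor", "ledger"]
print(disambiguate(TEXT, LEXICON))                   # baseline 1
print(disambiguate(TEXT, LEXICON, intra_word=True))  # baseline 2
```

In this toy text both baselines assign PUBLISHING to all three words, because PUBLISHING collects the highest text-level score at step 1.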
3.1.1 The "factotum" effect 
As we mentioned in Section 2, a FACTOTUM la- 
bel is used to mark WORDNET senses that do not 
belong to a specific domain, but rather are highly 
widespread across texts of different domains. A 
consequence is that very often, at the end of step 1 
of the disambiguation algorithm, FACTOTUM outscores 
the other domains, thus affecting the 
selection carried out at step 2 (i.e. in case of 
ambiguity FACTOTUM is often preferred). 
For the purposes of the experiment described 
in the next sections the FACTOTUM problem has 
been resolved with a slight modification at step 2 
of the baseline algorithm: when FACTOTUM is the 
best selection for a word, also the second available 
choice is considered as a result of the disambigua- 
tion process. 
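A possible rendering of this modified step 2 is sketched below. The function name, the score dictionary and the alphabetical tie-breaking are our own illustrative assumptions; the only behaviour taken from the text is that the second-best label is returned together with FACTOTUM whenever FACTOTUM ranks first.

```python
def select_labels(allowed, scores, factotum="FACTOTUM"):
    """Step 2 with the FACTOTUM adjustment.

    allowed: domain labels allowed by the word
    scores:  text-level domain scores computed at step 1
    Returns the best label, plus the second-best one when the best
    label is FACTOTUM and an alternative exists.
    """
    # Rank by decreasing score; ties are broken alphabetically (assumption).
    ranked = sorted(allowed, key=lambda d: (-scores.get(d, 0.0), d))
    if ranked and ranked[0] == factotum and len(ranked) > 1:
        return ranked[:2]
    return ranked[:1]


# Hypothetical step-1 scores for a short text.
SCORES = {"FACTOTUM": 5.0, "PUBLISHING": 3.0, "COMMERCE": 1.0}
print(select_labels({"FACTOTUM", "PUBLISHING", "COMMERCE"}, SCORES))
print(select_labels({"PUBLISHING", "COMMERCE"}, SCORES))
```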
3.2 Extensions for parallel texts 
We started with the following working hypothesis. Using aligned
wordnets to disambiguate parallel texts allows us to calculate the
intersection among the synsets accessible from an English text
through the English WORDNET and the synsets accessible from the
parallel Italian text through the Italian WORDNET. It would seem
reasonable that the synset intersection maximizes the number of
significant synsets for the two texts, and at the same time tends
to exclude synsets whose meaning is not pertinent to the content of
the text. 
Let us try to make the point clearer with an example. Suppose we
find in an English text the word "bank" and in the Italian parallel
text the word "banca", which we do not know to be the translation
of "bank", because we do not have word alignments. For "bank" we
get ten senses from WORDNET 1.6, while for "banca" we get two
senses from MULTIWORDNET (both lists are reported in Figure 2). As
the two wordnets are aligned (i.e. they share synset offsets), the
intersection can be straightforwardly determined. In this case it
includes 06227059, corresponding to bank#1 and banca#1, and
02247680, corresponding to bank#4 and banca#2, which both pertain
to the BANKING domain, and excludes, among the others, bank#2,
which happens to be a homonym sense in English but not in Italian. 
Incidentally, if "istituto di credito" were not in the synset
06227059 (e.g. because of the incompleteness of the Italian
WORDNET) and it were the only word present in the Italian news to
denote the bank#1 sense, the synset intersection would have been
empty. 
As far as disambiguation is concerned it seems 
a reasonable hypothesis that the synset intersec- 
tion could bring constraints on the sense selection 
for a word (i.e. it is highly probable that the cor- 
rect choice belongs to the intersection). Following 
this line we have elaborated a mutual help disam- 
biguation strategy where the synset intersection 
can be accessed to help the disambiguation pro- 
cess of both English and Italian texts. 
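Since aligned wordnets share synset offsets, the synset intersection reduces to a set intersection over offsets. The sketch below uses the offsets listed in Figure 2 for "bank" and "banca"; the dictionaries and function names are hypothetical stand-ins for lookups in WORDNET 1.6 and MULTIWORDNET.

```python
# Toy stand-ins for the two aligned wordnets: word -> set of synset offsets
# (offsets taken from Figure 2).
EN_WORDNET = {
    "bank": {"06227059", "06800223", "09626760", "02247680", "06250735",
             "03277560", "06739355", "09616845", "06800468", "00109955"},
}
IT_WORDNET = {
    "banca": {"06227059", "02247680"},
}


def text_synsets(words, wordnet):
    """All synsets reachable from the words of a text."""
    synsets = set()
    for w in words:
        synsets |= wordnet.get(w, set())
    return synsets


def synset_intersection(en_words, it_words, en_wordnet, it_wordnet):
    """Synsets shared by an English text and its Italian parallel text."""
    return text_synsets(en_words, en_wordnet) & text_synsets(it_words, it_wordnet)


shared = synset_intersection(["bank"], ["banca"], EN_WORDNET, IT_WORDNET)
print(sorted(shared))  # offsets of bank#4/banca#2 and bank#1/banca#1
```

Note that no word alignment is needed: the intersection is computed over the two texts as wholes, and it is empty when the Italian side offers no synset shared with the English side.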
In addition to the synset intersection, we wanted to consider the
intersection of domain labels, that is domains that are shared
among the senses of the parallel texts. In the example above the
domain intersection would include just one label (i.e. BANKING), in
place of the two synsets of the synset intersection. The hypothesis
is that domain intersection could reduce problems due to possible
misalignments among the synsets of the two wordnets. 

Bank (from WordNet 1.6) 
 1. * {06227059} depository financial institution, bank, banking
      concern, banking company -- (a financial institution that
      accepts deposits and channels the money into lending
      activities;)
 2.   {06800223} bank -- (sloping land (especially the slope beside
      a body of water))
 3.   {09626760} bank -- (a supply or stock held in reserve
      especially for future use (especially in emergencies))
 4. * {02247680} bank, bank building -- (a building in which
      commercial banking is transacted;)
 5.   {06250735} bank -- (an arrangement of similar objects in a
      row or in tiers;)
 6.   {03277560} savings bank, coin bank, money box, bank -- (a
      container (usually with a slot in the top) for keeping money
      at home;)
 7.   {06739355} bank -- (a long ridge or pile; "a huge bank of
      earth")
 8.   {09616845} bank -- (the funds held by a gambling house or the
      dealer in some gambling games;)
 9.   {06800468} bank, cant, camber -- (a slope in the turn of a
      road or track;)
10.   {00109955} bank -- (a flight maneuver; aircraft tips
      laterally about its longitudinal axis (especially in
      turning))
Banca (from MultiWordnet) 
 1. * {06227059} istituto_di_credito cassa banco banca
 2. * {02247680} banca
Figure 2: An example of synset intersection in MULTIWORDNET
(synsets marked with * belong to the intersection) 
Two mutual help algorithms have been imple- 
mented, weak mutual help and strong mutual help, 
which are described in the following. 
Weak Mutual help. In this version of the mutual help algorithm,
step 1 of the baseline is modified in that, if the domain label is
found in the synset or domain intersection, a bonus is assigned to
that label, doubling its score. In case of empty intersection (i.e.
either no synset or no domain is shared by the two texts) this
algorithm guarantees the same performance as the baseline. 
Strong Mutual help. In the strong version of 
the mutual help strategy, step 1 of the baseline 
is modified in that the domain label is scored if 
and only if it is found in the synset or domain 
intersection. While this algorithm does not guar- 
antee the baseline performance (because the inter- 
section may not contain all the correct synsets or 
domains), the precision score will give us indica- 
tions about the quality of the synset intersection. 
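The two variants differ only in how step 1 treats labels outside the intersection, as the following sketch shows. It is an illustration under explicit assumptions: unit increments are used for simplicity (the experiment builds on baseline 2), "doubling its score" is interpreted as doubling the increment, only the domain intersection is modelled, and the lexicon, labels and names are hypothetical.

```python
from collections import defaultdict


def score_domains(words, lexicon, shared_domains, mode="baseline"):
    """Step 1 with optional mutual help.

    lexicon:        word -> set of domain labels allowed by the word
    shared_domains: domain labels found in the intersection of the
                    two parallel texts
    mode:           "baseline", "weak" or "strong"
    """
    scores = defaultdict(float)
    for w in words:
        for d in lexicon[w]:
            if mode == "strong" and d not in shared_domains:
                continue  # strong: score only labels in the intersection
            bonus = 2.0 if (mode == "weak" and d in shared_domains) else 1.0
            scores[d] += bonus  # weak: labels in the intersection count double
    return dict(scores)


# Hypothetical toy data.
LEXICON = {"bank": {"BANKING", "EARTH"}, "loan": {"BANKING"}}
WORDS = ["bank", "loan"]
SHARED = {"BANKING"}

print(score_domains(WORDS, LEXICON, SHARED, mode="baseline"))
print(score_domains(WORDS, LEXICON, SHARED, mode="weak"))
print(score_domains(WORDS, LEXICON, SHARED, mode="strong"))
```

With an empty intersection the weak variant returns exactly the baseline scores, while the strong variant returns no scores at all, which is why only the former preserves the baseline performance.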
4 Experimental Setting 
The goal of the experiment is to establish some 
reference figures for Word Domain Disambigua- 
tion. Only nouns have been considered, mostly 
because the coverage of both MULTIWORDNET 
and of the domain mapping for verbs is far from 
being complete. 
            Lemmas    Senses    Mean Polysemy
WN 1.6       94474    116317        1.23
Ital WN      19104     25226        1.32
DISC         56134    118029        2.10
Table 1: Overview of the used resources (Noun part of speech) 
4.1 Lexical resources 
Besides the English WORDNET 1.6 we used MULTIWORDNET [Artale et
al., 1997; Magnini and Strapparava, 1997], an Italian version of
the English WORDNET. It is based on the assumption that a large
part of the conceptual relations defined for the English language
can be shared with Italian. From an architectural point of view,
MULTIWORDNET implements an extension of the WORDNET lexical matrix
to a "multilingual lexical matrix" through the addition of a third
dimension relative to the language. MULTIWORDNET currently includes
about 30,000 lemmas. 

Mean Values for Nouns              Italian News    English News
Lexical Coverage   WN 1.6               -              98%
                   ItalWN              93%              -
                   Disc               100%              -
# Synsets          English              -             155.21
                   Italian            111.38            -
                   Intersection               35.48
Table 2: Mean lexical coverage and synset amount for AdnKronos news 

Mean Values for Nouns              Italian News    English News
Sense Polysemy     WN 1.6               -              4.37
                   ItalWN              3.22             -
                   Disc                6.82             -
Domain Polysemy    English              -              3.58
                   Italian             2.68             -
Table 3: Mean sense and domain polysemy for AdnKronos news 
For comparison, and in particular to estimate the lack of coverage
of MULTIWORDNET, we consider some data from the Italian dictionary
"DISC" [Sabatini and Coletti, 1997], a large monolingual
dictionary, available both as a printed version and as a CD-ROM. 
Table 1 shows some general figures (only for 
nouns) about the number of lemmas, the number 
of senses and the average polysemy for the three 
lexical resources considered. 
4.2 Parallel Texts 
Experiments have been carried out on a news cor- 
pus kindly placed at our disposal by AdnKronos, 
an important Italian news provider. The corpus 
consists of 168 parallel news (i.e. each news has 
both an Italian and an English version) concerning 
various topics (e.g. politics, economy, medicine, 
food, motors, fashion, culture, holidays). The av- 
erage length of the news is about 265 words. 
Table 2 reports the average lexical coverage (i.e. percent of the
lemmas in the news corpus that are found in the resource) for
WORDNET 1.6, MULTIWORDNET and the Disc dictionary. The variance
among the various news is practically zero. We observe full
coverage for the Disc dictionary; in addition, the incompleteness
of MULTIWORDNET is limited to 5% with respect to WORDNET 1.6. The
table also reports the average number of unique synsets for each
news. In this case the incompleteness of the Italian WORDNET with
respect to WORDNET 1.6 rises to 30%, showing that a significant
number of word senses is missing. 
Table 3 shows the average polysemy of the news 
corpus considering both word senses and word do- 
main labels. The figures reveal a polysemy reduc- 
tion of 17-18% when we deal with domain poly- 
semy. 
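The 17-18% figure can be recomputed from the mean values in Table 3. The short check below assumes our reading of that table, namely that 3.22 and 2.68 are the sense and domain polysemy of the Italian news and 4.37 and 3.58 those of the English news; the function name is hypothetical.

```python
def polysemy_reduction(sense_polysemy, domain_polysemy):
    """Relative reduction obtained by moving from senses to domains."""
    return 1.0 - domain_polysemy / sense_polysemy


# Mean values from Table 3 (assumed pairing of rows and columns).
italian = polysemy_reduction(3.22, 2.68)
english = polysemy_reduction(4.37, 3.58)

print(f"Italian: {italian:.1%}")
print(f"English: {english:.1%}")
```

Under this pairing the reduction is slightly below 17% for Italian and slightly above 18% for English, in line with the range reported in the text.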
Manual Annotation. A subset of forty news pairs (about half of the
initial corpus) has been manually annotated with the correct domain
label. Annotators were instructed about the domain hierarchy and
then asked to select one domain label for each lemma among those
allowed by that lemma. 
Uncertain cases have been reviewed by a sec- 
ond annotator and, in case of persisting conflict, a 
third annotator was consulted to take a decision. 
Lemmatization errors as well as cases of incom- 
plete coverage of domain labels have been detected 
and excluded. The whole manual set consists of 
about 2500 annotated nouns. 
Although we do not have empirical evidence, 
our practical experience confirms the intuition 
that annotating texts with domain labels is an 
easier task than sense annotation. 
Forty-two domain labels, representing the more informative level of
the domain hierarchy mentioned in Section 2, have been used for the
experiment. Table 4 reports the complete list. 

administration  agriculture       alimentation       anthropology  archaeology  architecture
art             artisanship       astrology          astronomy     biology      chemistry
commerce        computer_science  earth              economy       engineering  factotum
fashion         history           industry           law           linguistics  literature
mathematics     medicine          military           pedagogy      philosophy   physics
play            politics          psychology         publishing    religion     sexuality
sociology       sport             telecommunication  tourism       transport    veterinary
Table 4: Domain labels used in the experiment. 

5 Results and Discussion 

                                   Weak Mutual (baseline 2)       Strong Mutual (baseline 2)
          Baseline 1  Baseline 2   Synset Inter.  Domain Inter.   Synset Inter.  Domain Inter.
Italian      .83         .86           .87            .88           .74 / .68      .77 / .91
English      .85         .86           .87            .87           .70 / .57      .80 / .91
Table 5: Precision and Recall (English and Italian) for different
WDD algorithms (entries of the form x / y give precision / recall) 

WSD and WDD on the Semcor Brown Corpus. In the first experiment we
wanted to verify that, because of the polysemy reduction induced by
domain clustering, WDD is a simpler task than WSD. For the
experiment we used a subset of the Semcor corpus. As for WSD, we
obtained .66 of correct disambiguation with a sense frequency
algorithm on polysemous nouns and .80 on all nouns (the latter
figure is also reported in the literature, for example in [Mihalcea
and Moldovan, 1999]). As for WDD, precision has been computed by
checking whether the sense tag reported in Semcor for a word is
among the word senses belonging to the domain label with the
highest score. Baseline 1 and baseline 2, described in Section 3.1,
gave .81 and .82 in precision respectively, a significant
improvement over the WSD baseline, which confirms the initial
hypothesis. 
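The precision measure used for WDD on Semcor can be sketched as follows: a prediction counts as correct when the gold sense of the word belongs to the predicted domain. The data structures, the sense identifiers and the function name below are hypothetical and serve only to illustrate the criterion.

```python
def wdd_precision(predictions, gold_senses, sense_domains):
    """Fraction of words whose gold sense falls under the chosen domain.

    predictions:   list of (word, predicted_domain) pairs
    gold_senses:   list of gold sense tags, aligned with predictions
    sense_domains: sense tag -> set of domain labels of that sense
    """
    if not predictions:
        return 0.0
    correct = 0
    for (_, domain), gold in zip(predictions, gold_senses):
        if domain in sense_domains[gold]:
            correct += 1
    return correct / len(predictions)


# Hypothetical toy annotation (sense numbers are illustrative only).
SENSE_DOMAINS = {
    "book#1": {"PUBLISHING"},
    "book#6": {"RELIGION"},
    "lamb#1": {"ZOOLOGY"},
}
PREDICTIONS = [("book", "PUBLISHING"), ("book", "PUBLISHING"), ("lamb", "ZOOLOGY")]
GOLD = ["book#1", "book#6", "lamb#1"]

print(wdd_precision(PREDICTIONS, GOLD, SENSE_DOMAINS))
```

In the toy data two predictions out of three are correct, since the second occurrence of "book" is tagged with a sense that belongs to RELIGION rather than PUBLISHING.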
WDD in parallel texts. In this experiment we wanted to test WDD in
the context of parallel texts. Table 5 reports the precision and
recall (the latter only when it is not 1) scores for six different
WDD algorithms applied to parallel English/Italian texts. Numbers
refer to polysemous words only. 
Both the baseline algorithms perform quite well: 83% for Italian
and 85% for English in case of baseline 1, and 86% for both
languages in case of baseline 2 are similar to the results obtained
on the SemCor corpus. 
The algorithm which includes word domain frequency (i.e. baseline
2) reaches the highest score in both languages, indicating that the
combination of domain word frequency (considered at step 1 of the
algorithm) and domain text frequency (considered at step 2) is a
good one. In addition, the fact that results are the same for both
languages indicates that the method can smooth the coverage
differences among the wordnets. 
We expected a better result for the bilingual ex- 
tensions. The weak mutual strategy, either con- 
sidering the synset intersection or the domain la- 
bels intersection, brings just minor improvements 
with respect to the baselines; the strong mutual 
strategy lowers both the precision and the recall. 
There are several explanations for these results. The difference in
sense coverage between the two wordnets, about 30%, may affect the
quality of the synset intersection: this would also explain the low
degree of recall (68% for Italian and 57% for English). This is
particularly evident for the strong mutual strategy, where the
relative lexical poorness of the Italian synsets can strongly
reduce the number of synsets in the intersection. Note also that
the size of the synset intersection is about 30-40% of the mean
synset number for Italian and English news respectively. This means
less material which the disambiguation algorithms can take
advantage of: relevant synsets can be left out of the intersection.
For these reasons it is crucial to have wordnet resources at the
same level of completion in order to exploit the mutual help
hypothesis. 
Furthermore, there may be a significant number of senses which are
"quasi" aligned. This may happen when two parallel senses map into
close synsets, but not into the same one (e.g. one is the direct
hypernym of the other). This problem could be overcome by
considering the IS-A relations during the computation of the
intersection. In this situation it is also probable that the senses
maintain the same domain label. This would explain why the domain
intersection behaves better than the synset intersection (precision
and recall move from 74%-68% to 77%-91% for Italian and from
70%-57% to 80%-91% for English). 
6 Conclusions 
We have introduced Word Domain Disambiguation, a variant of Word
Sense Disambiguation where words in a text are tagged with a domain
label in place of a sense label. Two baseline algorithms have been
presented, as well as some extensions to deal with domain
disambiguation in the context of parallel translation texts. 
Two aligned wordnets, the English WORDNET 
1.6 and the Italian MULTIWORDNET, both aug- 
mented with domain labels, have been used as the 
main information repositories. 
The experimental results encourage further investigation of the
potential of word domain disambiguation. There are two interesting
perspectives for future work: first, we want to exploit the
relations among different lexical categories (mainly nouns and
verbs) when they share the same domain label; second, it seems
reasonable that the disambiguation process may take advantage of
both WDD and WSD, where the initial word ambiguity is first reduced
with WDD and then resolved with more fine-grained information.
Finally, an in-depth investigation is necessary for what we called
the factotum effect, which is peculiar to WDD. 
As for the applicative scenarios, we want to apply WDD to the
problem of content-based user modelling. In particular we are
developing a personal agent for a news web site that learns the
user's interests from the requested pages, which are analyzed to
generate or to update a model of the user [Strapparava et al.,
2000]. Exploiting this model, the system anticipates which
documents in the web site could be interesting for the user. Using
MULTIWORDNET and domain disambiguation algorithms, a content-based
user model can be built as a semantic network whose nodes,
independent from the language, represent word sense frequency
rather than word frequency. Furthermore, the resulting user model
is independent from the language of the documents browsed. This is
particularly valuable for multilingual web sites, which are
becoming very common, especially among news sites and in electronic
commerce domains. 

References 
A. Artale, B. Magnini, and C. Strapparava. 
WordNet for Italian and its use for lexical 
discrimination. In AI*IA97: Advances in Artificial 
Intelligence. Springer Verlag, 1997. 
P. Buitelaar. CORELEX: An ontology of systematic 
polysemous classes. In Proceedings of 
FOIS98, International Conference on Formal 
Ontology in Information Systems, Trento, Italy, 
June 6-8 1998. IOS Press, 1998. 
P. Buitelaar. Reducing lexical semantic complex- 
ity with systematic polysemous classes and un- 
derspecification. In Proceedings of ANLP2000 
Workshop on Syntactic and Semantic Complex- 
ity in Natural Language Processing Systems, 
Seattle, USA, April 30 2000, 2000. 
J. Gonzalo, F. Verdejo, C. Peters, and N. Calzolari. 
Applying EuroWordNet to cross-language 
text retrieval. Computers and Humanities, 
32(2-3):185-207, 1998. 
A. Kilgarriff and C. Yallop. What's in a thesaurus? 
In Proceedings of LREC-2000, Second 
International Conference on Language Resources 
and Evaluation, Athens, Greece, June 
2000. 
B. Magnini and G. Cavaglià. Integrating subject 
field codes into WordNet. In Proceedings of 
LREC-2000, Second International Conference 
on Language Resources and Evaluation, Athens, 
Greece, June 2000. 
B. Magnini and C. Strapparava. Costruzione di 
una base di conoscenza lessicale per l'italiano 
basata su WordNet. In M. Carapezza, D. Gambarara, 
and F. Lo Piparo, editors, Linguaggio e 
Cognizione. Bulzoni, Palermo, Italy, 1997. 
R. Mihalcea and D. Moldovan. A method for word 
sense disambiguation of unrestricted text. In 
Proc. of ACL-99, College Park, Maryland, June 
1999. held in conjunction with UM'96. 
G. Miller. An on-line lexical database. International 
Journal of Lexicography, 3(4):235-312, 
1990. 
F. Sabatini and V. Coletti. Dizionario Italiano 
Sabatini Coletti. Giunti, 1997. 
C. Strapparava, B. Magnini, and A. Stefani. 
Sense-based user modelling for web sites. In 
Adaptive Hypermedia and Adaptive Web-Based 
Systems - Lecture Notes in Computer Science 
1892. Springer Verlag, 2000. 
E. Voorhees. Using wordnet for text retrieval. In 
C. Fellbaum, editor, WordNet - an Electronic 
Lexical Database. MIT Press, 1998. 
Y. Wilks and M. Stevenson. Word sense disambiguation 
using optimised combination of 
knowledge sources. In Proc. of COLING-ACL '98, 
1998. 
