i 
l 
! 
I 
I 
! 
! 
Aligning WordNet with Additional Lexical Resources 
Oi Yee Kwong 
Computer Laboratory, University of Cambridge 
New Museums Site, Cambridge CB2 3QG, U.K. 
oyk20@cl.cam.ac.uk 
Abstract 
This paper explores the relationship between Word- 
Net and other conventional linguistically-based lex- 
ical resources. We introduce an algorithm for align- 
ing word senses from different resources, and use 
it in our exper~nent to sketch the role played by 
WordNet, as far as sense discrimination is concerned, 
when put in the context of other lexical databases. 
The results show how and where the resources sys- 
tematically differ from one another with respect to 
the degree of polysemy, and suggest how we can (i) 
overcome the inadequacy of individual resources to 
achieve an overall balanced degree of sense discrim- 
ination, and (ii) use a combination of semantic clas- 
sification schemes to enrich lexical information for 
NLP. 
1 Introduction 
Lexical resources used in natural language process- 
ing (NLP) have evolved from handcrafted lexical 
entries to machine readable lexical databases and 
large corpora which allow statistical manipulation. 
The availability of electronic versions of linguistic 
resources was a big leap. Among these resources we 
find conventional dictionaries as well as thesauri. 
However, it does not often suffice to depend on any 
single resource, either because it does not contain 
all required information or the information is not 
organised in a way suitable for the purpose. Merg- 
ing different resources is therefore necessary. Cal- 
zolaxi's (1988) Italian lexical database, Knight and 
Luk's (1994) PANGLOSS ontology, and Klavans and 
Tzoukermann's (1995) bilingual lexicon axe some re- 
sponses to this need. 
Many attempts have also been made to trans- 
form the implicit information in dictionary defini- 
tions to explicit knowledge bases for computational 
purposes (Amsler, 1981; Calzolaxi, 1984; Chodorow 
et al., 1985; Maxkowitz et al., 1986; Klavans et 
al., 1990; Vossen and Copestake, 1993). Nonethe- 
less, dictionaries axe also infamous for their non- 
standaxdised sense granularity, and the taxonomies 
obtained from definitions axe inevitably ad hoc. It 
would therefore be a good idea if we could integrate 
73 
such information from dictionaries with some exist- 
ing, and widely exploited, classifications such as the 
system in Roget's Thesaurus (Roget, 1852), which 
has remained intact for years. 
We can see at least the following ways in which 
an integration of lexical resources would be useful in 
NLP: 
• Most NLP functions, notably word sense dis- 
ambiguation (WSD), need to draw information 
from a variety of resources and cannot suffi- 
ciently rely on any single resource. 
• When different systems of sense tagging are 
used in different studies, we need a common 
ground for comparison. Knowing where one 
sense in one resource stands in another would 
enable better evaluation. 
• In attempting integration, we can discover how 
one resource differs from another and thus iden- 
tify their individual limitations. This can guide 
improvement of the resources. 
An approach to the integration problem is offered 
by WordNet. WordNet is designed to enable con- 
ceptual search (Miller et al., 1993), and therefore it 
should provide a way of linking word level senses as 
those in dictionaries with semantic classes as those 
in thesauri. However, one important question is 
whether WordNet, a psycholinguistically-based re- 
source, will work the same way as conventional lin- 
guistic resources do. 
We can divide this question into two parts. First, 
we axe interested in how similar the sense discrimi- 
nation is in WordNet and in a conventional dictio- 
nary. Second, WordNet has a classificatory struc- 
ture, but the principle of classification is somehow 
different from that of a thesaurus. As a result, terms 
which axe close in a thesaurus, thus allowing con- 
textual sense disambiguation, may be found further 
apart in the WordNet taxonomy, which may there- 
fore not be informative enough. For example, "car" 
and "driver" axe located in two different branches 
in the WordNet hierarchy and the only way to re- 
late them is through the top node "entity". This 
fails to uncover the conceptual closeness of the two 
I 
I 
I 
I 
l 
li 
I 
i 
I 
I 
I 
i 
I 
I 
I 
words as Roget's Thesaurus does, for they are put in 
adjacent semantic classes ("Land travel" and "Trav- 
eller" respectively). Nevertheless, we believe that 
there must be some relation between the classes in 
WordNet and those in a thesaurus, which provides 
some means for us to make an association between 
them. 
We have therefore proposed an algorithm to link 
up the three kinds of resources, namely a conven- 
tional dictionary, WordNet and a thesaurus. This is 
made possible with the WordNet taxonomic hierar- 
chy as the backbone because traversing the hierarchy 
gives many kinds of linking possibility. The resulting 
integrated information structure should then serve 
the following functions: 
* enhancing the lexical information in a dictio- 
nary with the taxonomic hierarchy in WordNet, 
and vice versa 
• complementing the taxonomic hierarchy in 
WordNet with the semantic classification in a 
thesaurus, and vice versa 
We have carried out an experiment, using the al- 
gorithm, to map senses in a dictionary to those in 
WordNe t, and those in WordNet to the classes in a 
thesaurus. Our aim has been to (i) assess the plau- 
sibility of the algorithm, and to (ii) explore how the 
various resources differ from one another. The re- 
sults suggest that mappings axe in general successful 
(i.e. when links can be made, they are appropriate) 
while failures mostly arise from the inadequacy of 
individual resources. Based on these findings, we 
have also proposed some ways to overcome such in- 
adequacies. 
The algorithm is described in the next section. 
The test materials and the design are given in Sec- 
tion 3. The results are presented in Section 4. They 
are analysed and discussed in Section 5, where we 
also suggest some ways to apply them. 
2 Integrating Different Resources 
2.1 Relations between Resources 
The three lexic',d resources we used are the 1987 re- 
vision of Roger's Thesaurus (ROGET) (Kirkpatrick, 
1987), the Longman Dictionary of Contemporary 
English (LDOCE) (Procter, 1978) and the Prolog 
version of WordNet 1.5 (WN) (Miller et al., 1993). 
The linking of LDOCE and WN is in principle quite 
similar to Knight and Luk's (1994) approach in the 
PANGLOSS project. But our form of comparison 
between LDOCE and WN, motivated by the organ- 
isation of individual resources in relation to one an- 
other, was simpler but as effective. Figure 1 shows 
how word senses are organised in the three resources 
and the arrows indicate the direction of mapping. 
In the middle of the figure is the structure of 
WN, a hierarchy with the nodes formed from the 
74 
synsets. If we now look up ROGET for word x2 in 
synset X, since words expressing every aspect of an 
idea are grouped together in ROGET, we can ex- 
pect to find not only words in synset X, but also 
those in the coordinate synsets (i.e. M and P, with 
words ml, m2, Pl, P2, etc.) and the superordinate 
synsets (i.e. C and A, with words cl, c2, etc.) in 
the same ROGET paragraph. In other words, the 
thesaurus class to which x2 belongs should roughly 
include X U M U P U C t3 A. On the other hand, 
the LDOCE definition corresponding to the sense of 
synset X (denoted by Dz) is expected to be similar 
to the textual gloss of synset X (denoted by GI(X)). 
Nevertheless, given that it is not unusual for dic- 
tionary definitions to be phrased with synonyms or 
superordinate terms, we would also expect to find 
words from X and C, or even A, in the LDOCE 
definition. That means we believe D~: ~ Gl(X) and 
Dzn(XUCUA) ~ ¢. We did not include coordinate 
terms (called "siblings" in Knight and Luk (1994)) 
because we found that while nouns in WN usually 
have many coordinate terms, the chance of hitting 
them in LDOCE definitions is hardly high enough 
to worth the computation effort. 
2.2 The Algorithm 
Our algorithm defines a mapping chain from 
LDOCE to ROGET through WN. It is based on 
shallow processing within the resources themselves, 
exploiting their inter-relatedness, and does not rely 
on extensive statistical data (e.g. as suggested 
in Yarowsky (1992)). Given a word with part of 
speech, W(p), the core steps are as follows: 
Step 1: From LDOCE, get the sense definitions 
Dr, ..., Dt under the entry W(p). 
Step 2: From WN, find all the synsets 
Sn{wt,w2,...} such that W(p) E Sn. Also 
collect the corresponding gloss definitions, 
Gl(Sn), if any, the hypernym synsets Hyp(S,~), 
and the coordinate synsets Co(S,~). 
Step 3: Compute a similarity score matrix ,4 for 
the LDOCE senses and the WN synsets. A 
similarity score A(i, j) is computed for the i th 
LDOCE sense and the jta WN synset using 
a weighted sum of the overlaps between the 
LDOCE sense and the WN synset, hypernyms, 
and gloss respectively, that is 
A(i,j) = atlDi n Sil + a:lDi n Hup(S~)I 
+ a31D~ n Gt(S~)I 
For our tests, we just tried setting at = 3, 
a., = 5 and a3 -- 2 to reveal the relative signif- 
icance of finding a hypernym, a synonym, and 
any word in the textual gloss respectively in the 
dictionary definition. 
A 
120. N. cl. cZ ... (in C); 
ml. m2.... (in M): pl. p?.. B C 
... (in P): xl, x2,... (ia X) ~ ~ (~l.c2.... I, Ol(C'3 
V .... i \ 
.... E F M P X 
121.N .... {ml. m2.... }, Gt(M) {pl.p?,....1. GI{P) \[xl. ~..... }. GI(X) /N 
R T 
(RoGEr} (WN} 
xl 
1, ... dcfiailk'm (Dx) si milia¢ m GI(X) 
0¢ de211~d in ~ Of wofctS in 
X c~r C. ctc_ 
2 .... 
J .... 
X3 
I .... 
2 .... 
(LDGCE) 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
i 
I 
i 
Figure 1: Organisation of word senses in different resources 
Step 4: From ROGET, find all paragraphs 
Pm{wl,w2, ...} such that W(p) E P,,~. 
Step 5: Compute a similarity score matrix B for the 
WN synsets and the ROGET classes. A simi- 
larity score B(j, k) is computed for the jth WN 
synset (taking the synset itself, the hypernyms, 
and the coordinate terms) and the k th ROGET 
class, according to the following: 
B(j,k) = bIISj N P~I + b~iHyp(Sj) N Pkl 
+ b31Co(Sj) n Pkl 
We have set bl = b2 = b3 = 1. Since a ROGET 
class contains words expressing every aspect of 
the same idea, it should be equally likely to find 
synonyms, hypernyms and coordinate terms in 
common. 
Step 6: For i = 1 to t (i.e. each LDOCE sense), find 
max(.A(i,j)) from matrix ,4. Then trace from 
matrix B the jth row and find max(B(j,k)). 
The i th LDOCE sense should finally be mapped 
to the ROGET class to which Pk belongs. 
3 Testing. 
3.1 Materials 
Three groups of test words (at present we only test 
on nouns), each containing 12 random samples, were 
prepared. Words have five or fewer WN senses in 
the "low polysemy group" (LO), between six to ten 
in the "medium polysemy group" (MED) , and 11 
or more in the "high polysemy group" (HI). Table 1 
shows the list of test words with the number of senses 
they have in the various lexical resources. 
3.2 Design and Hypotheses 
Our investigation was divided into three parts. 
While the application of the algorithm was in the 
third part, the first two parts were preliminaries to 
gather some information about the three resources 
so as to give a guide for expecting how well the 
mapping algorithm would work and such informa- 
tion would also help explain the results. 
Part 1: First, we just consider whether the bulk of 
information, measured by a simple count of the 
number of senses, for each word captured by dif- 
ferent resources is similar in size. Basically if it 
was not, it would mean that words are treated 
too differently in terms of sense distinction in 
different resources for the mapping to succeed. 
A crude way is to look for some linear relation- 
ship for the number of senses per word in differ- 
ent resources. If WN is as linguistically valid as 
other lexical databases, we would expect strong 
positive correlation with other resources for the 
"amount" of information. 
Part 2: Second, we look for some quantitative char- 
acterisation of the relationship we find for the 
resources in Part 1. We achieve this by perform- 
ing some paired-sample t-tests. If the resources 
do in fact capture similar "amount" of informa- 
tion, we would not expect to find any statisti- 
cally significant difference for the mean number 
of senses per word among different resources. 
Part 3: Third, we now apply the mapping algo- 
rithm and try to relate the results to the in- 
formation regarding the resources found in the 
previous parts. We map LDOCE senses to WN 
synsets, and WN synsets to ROGET classes (i.e. 
we skip step 6 in this study). 
We first analyse the results by looking at map- 
ping accuracy and failure, and secondly by 
characterising the relationship between the two 
pairs of source and target (i.e. LDOCE and 
WN, WN and ROGET) by means of the Poly- 
semy Factor P. This measure reflects the gran- 
ularity of the senses in the source resource (S) 
with respect to those in the target resource (7"), 
and is defined as follows: 
Let 
f : S --} 7- (human judgement) 
g : $ -+ 7" (algorithm) 
T' = (t E T : t = g(s) = f(s) for some s E $} 
r s = {s ~S:g(s) =/(s)} 
p = IT'I 
IS'l 
75 
Group .h Low Polysemy Words 
• 'cheque plan party bench industry society chance copy music current letter ceremony 
W 1 3 5 5 3 3 4 4 4 3 4 3 
L '2 5 9 5 4 6 6 4 5 4 5 3 
R I 5 I0 6 5 6 2 9 2 3 5 4 
W 
L 
R 
step 
9 
9 
9 
number 
I0 
ii 
6 
statement 
6 
4 
9 
Group 2: ~edium Polysemy Words 
strike power pmture sense credit note title cast bridge 
6 9 8 7 8 9 8 9 9 
3 16 I0 8 8 tO 4 8 7 
5 7 6 4 8 I0 6 13 5 
Group 3: High PolysemyWords 
li~ service field card hand face case air place ptece call block 
W 13 14 15 II 13 13 16 13 15 II 12 II 
L 20 16 12 8 12 8 17 13 21 16 15 8 
R 8 I0 II I0 13 7 14 7 5' I0 5 II 
i 
i 
I 
I 
l 
I 
I 
I 
I 
I 
I 
I 
I 
I 
Table 1: List of test words and number of senses 
T' is therefore the ratio between the minimum 
number of target senses covering all mapped 
source senses and all mapped source senses. In 
other words, the smaller this number, the more 
fine-grained are the senses in S with respect to 
7". It tells us whether what is in common be- 
tween the source and the target is similarly dis- 
tingnished, and whether any excess information 
in the source (as found in Part 2) is attributable 
to extra fineness of sense distinction. 
4 Results 
Part 1: Table 2 shows the correlation matrix for 
the number of senses per word in different re- 
sources. The upper half of the matrix shows 
the overall correlations while the lower half 
shows correlations within individual groups of 
test words. (An asterisk denotes statistical sig- 
nificance at the .05 level.) Significant positive 
correlations are found in general, and also for 
the LO group between WN and the other two 
resources. Such abstract relations do not nec- 
essarily mean simple sameness, and the exact 
numerical difference is found in Part 2. 
LDOCE 
WN 
ROGET 
LDOCE WN ROGET 
- 0.8227* 0.4238* 
L: 0.7170. 
M: 0.6778* - 0.6133. 
H: 0.4874 
L: 0.5658 L: 0.5831. 
M: 0.1026 M: 0.2605 - 
H: -0.2424 H: 0.1451 
Table 2: Correlation of number of senses per word 
among various resources 
Part 2: The means for the number of senses per 
word in the three resources are shown in Ta- 
ble 3. LDOCE has an average of 1.33 senses 
more than WN for the LO Group, and this 
difference is statistically significant (t = 3.75, 
p < .05). Also, a significant difference is found 
in the HI group between WN and ROGET, with 
the latter having 3.83 senses fewer on average 
(t = -4.24, p < .05). 
LDOCE WN ROGET 
LO 4.83 3.50 4.83 
MED 8.17 8.17 7.33 
HI 13.83 13.08 9125 
Table 3: Mean'number of senses per word 
Part 3: Apart from recording the accuracy, we also 
analysed the various types of possible map- 
ping errors. They are summarised in Table 4. 
Incorrectly Mapped and Unmapped-a are both 
"misses", whereas Forced Error and Unmapped- 
b are both "false alarms". These errors are man- 
ifested either in a wrong match or no match at 
all. The performance of the algorithm on the 
three groups of nouns is shown in Table 5. 
Target Exists Mapping Outcome 
Wrong Match No Match 
Yes Incorrectly Mapped Unmapped-a 
No Forced Error Unmapped-b 
Table 4: Different types of errors 
Statistics from one-way ANOVA show that for 
the mapping between LDOCE and WN, there 
are significantly more Forced Errors in the HI 
group than both the LO and MED group (F = 
7.58, p < .05). 
For the mapping between WN and ROGET, the 
MED group has significantly more Incorrectly 
Mapped than the LO group (F = 3.72, p < .05). 
Also, there are significantly more Forced Errors 
in the HI group than the LO and MED group 
(F-- 8.15, p < .05). 
76 
Ii 
11 
I 
II 
Accurately Mapped 
Incorrectly Mapped 
Unm app ed- a 
Unmapped-b 
Forced Error 
LDOCE -+ WN 
LO 
64.77% 
15.37% 
"2.08% 
11.25% 
6.53!~ 
0.79 
MED HI 
65.63% 52.96% 
11.82% 15.17% 
3.13% 2.57% 
11.48% 8.46% 
7.95% 20.83% 
0.88 0.91 
WN --r ROGET 
LO 
78.89% 
0.00% 
2.08% 
17.36% 
i.67% 
0.80 
"MED HI 
69.63% 69.79% 
6.94% 3.64% 
"1.39% 2.55% 
14.03% 7.47% 
8.01% 16.56% 
0.66 0.59 
Table 5: Average mapping results 
As far as the Polysemy Factor is concerned, it is 
found to be significantly lower in the HI group 
than the LO group (F = 3.63, p < .05) for the 
mapping between WN and ROGET. 
5 Discussion 
Though the actual figures in Table 5 may not be 
significant in themselves given the small s~unple size 
in the test, they nevertheless indicate some underly- 
ing relationships among the resources, and suggest 
it will be worth pursuing larger scale tests. 
5.1 Resource Similarities and Differences 
Results from the first two parts of the inwestigation 
give us a rough idea of the similarity and difference 
among the resources. Comparing the overall corre- 
lation between WN and LDOCE with respect to the 
number of senses per word with that between WN 
and ROGET, the much higher positive correlation 
found for the former suggests that WN, though or- 
ganised like a thesaurus, its content is like that in a 
dictionary. 
While the correlation results give us an idea of how 
strong the linear relationship is, the t-test results 
suggest to us that a conventional dictionary seems 
to capture relatively more meanings than WN when 
a word has fewer than five WN senses. On the other 
hand, a similar relation was found between WN and 
ROGET for words which have more than 10 WN 
senses. However, this could mean two things: either 
that WN does contain more information than a the- 
saurus, or that the WN senses are getting relatively 
more fine-grained. 
In the experiment we divided the test nouns into 
three groups of different degree of polyserny. How- 
ever, a rough count from WordNet 1.5 reveals that 
out of the 88200 nouns, only 0.07% belongs to the 
HI group, with an average of 13.18 senses; whereas 
0.55% belongs to the MED group, with an average 
of 7.05 senses. Up to 99.37% of the nouns come 
from the LO group, averaging 1.18 senses. In other 
words, the idiosyncrasy found for the HI group may 
have been magnified in the test samples, and we can 
expect in general a rather consistent relatiorLship be- 
tween WN and the other two resources. 
5.2 Aligning WN senses with others 
The third part reveals more details of the inter- 
relationship among the resources. Knight and 
Luk (1994) reported a trade-off between coverage 
and correctness. Our results, albeit for a smaller test 
and with different ambiguity grouping, are compara- 
ble with theirs. Thus our Accurately Mapped figures 
correspond effectively to their pct correct at their 
confidence level > 0.0. A similar average of slightly 
more than 60% accuracy was achieved. 
Overall, the Accurately Mapped figures support 
our hypothesised structural relationship between a 
conventional dictionary, a thesaurus and WN, show- 
ing that we can use this method to align senses in one 
resource with those in another. As we expected, no 
statistically significant difference was found for ac- 
curacy across the three groups of words. This would 
mean that the algorithm gives similar success rates 
regardless of how many meanings a word has. 
In addition, we have also analysed the unsuccess- 
ful cases into four categories as shown earlier. It can 
be seen that "false alarms" were more prevalent than 
"misses", showing that errors mostly arise from the 
inadequacy of individual resources because there are 
no targets rather than from failures of the mapping 
process. Moreover, the number of "misses" can pos- 
sibly be reduced, for example, by a better way to 
identify genus terms, or if more definition patterns 
are considered. 
Forced Error refers to cases without any satisfac- 
tory target, but somehow one or more other targets 
score higher than the rest. We see that this figure is 
significantly higher in the HI group than in the other 
two groups for the mappings between WN and RO- 
GET, showing that there are relatively more senses 
in WN which can find no counterpart in ROGET. So 
WN does have something not captured by ROGET. 
The polysemy factor 79 can also tell us something 
regarding how fine-grained the senses are in one re- 
source with respect to the other. The significantly 
lower 79 in the HI group implies that as more mean- 
ings are listed for a word, these meanings can nev- 
ertheless be grouped into just a few core meanings. 
Unless we require very detailed distinction, a cruder 
discrimination would otherwise suffice. 
77 
I 
I 
I 
I 
! 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
Thus, the Forced Error and Polysemy Factor data 
show that both "more information" (in the sense of 
more coverage of the range of uses of a word) and 
"more granularity" contribute to the extra; senses in 
WN in the HI group. However, no precise conclusion 
can be drawn because this is rather variable even 
within one resource. 
Another observation was made regarding: the map- 
ping between WN and ROGET. Unlike the: mapping 
between LDOCE and WN which is easy to check 
by comparing the definitions, synonyms and so on, 
judging whether a mapping from WN to ROGET 
is correct is not always straightforward. This is be- 
cause either the expected target and the mapped 
target are not identical but are nevertheless close 
neighbours in the Roget class hierarchy, o:r because 
different targets would be expected depending on 
which part of the definition one's focus is on. For 
instance, "cast" has the sense of "the actors in a 
play". Strictly speaking it should be put under "as- 
semblage', but we may be unwilling to say a map- 
ping to "drama" is wrong. As we have said, WN 
and ROGET have different classificatory structures. 
Nevertheless, we may be able to take adw~tage of 
this difference as discussed in the next section. 
5.3 Making use of the findings 
Clearly successful mappings are influenced by the 
fineness of the sense discrimination in the resources. 
How finely they are distinguished can be inferred 
from the similarity score matrices generated from 
the algorithm for the two pairs of mappings. Read- 
ing the matrices row-wise shows how vaguely a cer- 
tain sense is defined, whereas reading them column- 
wise reveals how polysemous a word is. The presence 
of unattached senses also implies that using only one 
single resource in any NLP application is likely to be 
insufficient. 
This is illustrated for one of the test words (note) 
in Figure 2. (. = correctly mapped, o = ,expected 
target, x = incorrectly mapped, and * = fi~rced er- 
ror) Tables 6 and ? show the corresponding WN 
synsets and LDOCE senses. It can be seen that Dt 
and D~_ are both about musical notes, whereas D~ 
and Ds can both be construed as letters. 
Consequently, using this kind of mapping data, 
we may be able to overcome the inadequacy of WN 
in at least two ways: (i) supplementing the missing 
senses to achieve an overall balanced sense discrim- 
ination, and (ii) superimposing the WN taxonomy 
with another semantic classification scheme such as 
that found in ROGET. 
For the first proposal, we can, for example, con- 
flare the mapped senses and complement them with 
the detached ones, thus resulting in a more com- 
plete but not redundant sense discrimination. In 
the above case, we can obtain the following new set 
of senses for note: 
78 
Ot 
D2 
03 
D4 
Ds 
06 
D? 
Ds 
D9 
Olo 
SI $2 S3 $4 S5 S~ 
o 
e 
S? $8 
e 
e 
o 
Figure 2: LDOCE to WN mapping for note 
Synset WN Synsets 
S, eminence, distinction, preeminence, note 
$2 
$3 
$4 
note, promissory note, note of hand 
bill, note, government note, bank bill, banker's 
bill, bank note, banknote, Federal Reserve 
note, greenback 
note (tone of voice) 
$5 note, musical note, tone 
$6 note, annotation, notation 
S? note, short letter, line 
Ss note (written record) 
Sg note (emotional quality) 
Table 6: WordNet synsets for note 
1. promissory note 
2. bank note 
3. tone of voice 
4. musical note 
5. annotation 
6. letter 
7. vritten record 
8. eminence 
9. emotional quality 
10. element 
Note that 1 to 7 are the senses mapped and con- 
tinted, 8 and 9 are the unattached syusets in WN, 
and 10 is the unattached sense in LDOCE. 
The second proposal is based on the observation 
that the classificatory structures in WN and RO- 
GET may be used to complement each other because 
each of them may provide a better way to capture se- 
mantic information in a text at different times. As in 
our "cast" example, the WN taxonomy allows prop- 
erty inheritance and other logical inference from the 
information that "cast" is an assemblage, and thus 
is a social group; while the ROGET classification 
also captures the :'drama" setting, so that we know 
it is not just any group of people, but only those in- 
volved in drama. Imagine we have another situation 
as follows: 
He sang a song last night. The notes were 
too high for a bass. 
The hypernym chains for the underlined nouns in 
WN are as follows (assuming that we have spotted 
the intended senses): 
I 
I 
! 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
I 
Sense LDOCE Definition 
musical sound, usu. of a particular length and 
PITCH 
Dt 
D 2 a written sign for any of these sounds 
Ds a quality of voice 
D4 any quality ; ELEMENT 
Ds 
Ds 
a record or reminder in writing 
a remark added to a piece of writing and placed 
outside the main part of the writing (a~ at the 
side or bottom of a pa~e, or at the emi) 
a short usu. informal letter 
a formal letter between governments 
a'piece of paper money 
D7 
Ds 
D9 
D1o PROMISSORY NOTE 
Table 7: LDOCE definitions for note 
song -- (a short musical composition ...) 
=> musical composition, opus, ... 
8> music 
=> a.rt, fine art 
=> creation 
=> artifact, artefact 
=> object, inanimate objec~ .... 
=> entity 
Uol;e, musical noge, ~one 
=> musical notation 
8> notation, notational system 
8> writing, symbolic representation 
8> written co~unication, ... 
=> co~unicat ion 
8> social relation 
~> relation 
8> abstraction 
bass, basso -- (an adult male singer ...) 
=> singer, vocalist 
"> musician, instrumentalist, player 
8> performer, performing artist 
=> entertainer 
=> person, individual .... 
=> life form, organism .... 
=> ent ity 
=> causal agent, cause, ... 
=> entity 
Again, it is important that bass should be able to in- 
herit the properties from person or note from written 
communication, and so on, as WN now allows us to 
do. But at the same time, it can be seen that the 
nouns can hardly be related to one another in the 
WN hierarchical structure except at the top node en- 
tity, and it is then difficult to tell what the discourse 
is about. However, if we align the senses with the 
ROGET classes, we can possibly solve the problem. 
Consequently, the details of how we can fle~dbly use 
the two classifications together can be a future di- 
rection of this research. 
6 Conclusion 
In general we cannot expect that a single resource 
will be sufficient for any NLP application. WordNet 
is no exception, but we can nevertheless enhance its 
79 
utility. The study reported here began to explore 
the nature of WordNet in relation to other lexical 
resources, to find out where and how it differs from 
them, and to identify possible ways to absorb ad- 
ditional information. Apart from linking WordNet 
and LDOCE, as Knight and Luk (1994) did, we also 
experimented with ROGET to broaden the amount 
and type of information. The results suggest that, 
with an algorithm like the one described, WordNet 
can be fine-tuned and combined with other available 
resources to better meet the various information re- 
quirement of many applications. 

References 
R. Amsler. 1981. A taxonomy for English nouns and 
verbs. In Proceedings o/ACt '81, pages 133-138. 
N. Calzolari. 1984. Detecting patterns in a lexical data 
base. In Proceedings of COLING-84, pages 170-173. 
N. Calzolari. 1988. The dictionary and the thesaurus 
can be combined. In M.W. Evens, editor, Relational 
Models of the Lexicon: Representing Knowledge in Se- 
mantic Networks. Cambridge University Press. 
M.S. Chodorow, R.J. Byrd, and G.E. Heidorn. 1985. 
Extracting semantic hierarchies from a large on-line 
dictionary. In Proceedings of ACt '85, pages 299-304. 
B. Kirkpatrick. 1987. Roger's Thesaurus of English 
Words and Phrases. Penguin Books. 
J. Klavans and E. Tzoukermann. 1995. Combining cor- 
pus and machine-readable dictionary data for building 
bilingual lexicons. Machine Translation, 10:185-218. 
J. Klavans, M. Chodorow, and N. Wacholder. 1990. 
From dictionary to knowledge base via taxonomy. In 
Proceedings o/the Sixth Conference of the University 
of Waterloo, Canada. Centre for the New Oxford En- 
glish dictionary and Text Research: Electronic Text 
Research. 
K. Knight and S.K. Luk. 1994. Building a large-scale 
knowledge base for machine translation. In Proceed. 
ings of the Twelfth National Conference on Artificial 
Intelligence ( A A A I- g,t ), Seattle, Washington. 
J. Markowitz, T. Ahlswede, and M. Evens. 1986. Se- 
mantically significant patterns in dictionary defini- 
tions. In Proceedings of ACt '86, pages 112-119. 
G.A. Miller, R. Beckwith, C. Fellbanm, D. Gross, and 
K. Miller. 1993. Introduction to WordNet: An on- 
line lexical database. Five Papers on WordNet. 
P. Procter. 1978. Longman Dictionary of Contemporary 
English. Longman Group Ltd. 
P.M. Roget. 1852. Roget's Thesaurus of English Words 
and Phrases. Penguin Books. 
P. Vossen and A. Copestake. 1993. Untangling def- 
inition structure into knowledge representation. In 
T. Briscoe, A. Copestake, and V. de Pairs, editors; In- 
heritance, Defaults and the Lexicon. Cambridge Uni- 
versity Press. 
D. Yarowsky. 1992. Word-sense disambiguation using 
statistical models of Roger's categories trained on 
large corpora. In Proceedings o/COTING-92, pages 
454-460, Nantes, France. 
