The Use of WordNet in Information Retrieval 
Rila Mandala, Tokunaga Takenobu, and Tanaka Hozumi 
Department of Computer Science 
Tokyo Institute of Technology 
{rila, take, tanaka}@cs.titech.ac.jp
Abstract 
WordNet has been used by many researchers in information
retrieval, but has generally failed to improve the
performance of their retrieval systems. In this paper we
therefore investigate why the use of WordNet
has not been successful. Based
on this analysis we propose a method of making 
WordNet more useful in information retrieval 
applications. Experiments using several stan- 
dard information retrieval test collections show 
that our method results in a significant improve- 
ment of information retrieval performance. 
1 Introduction 
Development of WordNet began in 1985 at 
Princeton University (Miller, 1990). A team 
led by Prof. George Miller aimed to create
a source of lexical knowledge whose organiza- 
tion would reflect some of the recent findings of 
psycholinguistic research into the human lexi- 
con. WordNet has been used in numerous natural
language processing tasks, such as part-of-speech
tagging (Segond et al., 1997), word sense
disambiguation (Resnik, 1995), text categorization
(Gomez-Hidalgo and Rodriguez, 1997), infor- 
mation extraction (Chai and Biermann, 1997), 
and so on, with considerable success. However,
the usefulness of WordNet in information re- 
trieval applications has been debatable. 
Information retrieval is concerned with lo- 
cating documents relevant to a user's infor- 
mation needs from a collection of documents. 
The user describes his/her information needs 
with a query which consists of a number of 
words. The information retrieval system com- 
pares the query with documents in the collec- 
tion and returns the documents that are likely 
to satisfy the user's information requirements. 
A fundamental weakness of current information 
retrieval methods is that the vocabulary that 
searchers use is often not the same as the one by 
which the information has been indexed. Query 
expansion is one method to solve this problem. 
The query is expanded using terms which have 
similar meaning or bear some relation to those 
in the query, increasing the chances of matching 
words in relevant documents. Expanded terms 
are generally taken from a thesaurus. 
Obviously, given a query, the information re- 
trieval system must present all useful articles to 
the user. This objective is measured by recall, 
i.e. the proportion of relevant articles retrieved 
by the system. Conversely, the information re- 
trieval system must not present any useless ar- 
ticle to the user. This criterion is measured by
precision, i.e. the proportion of retrieved arti- 
cles that are relevant. 
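The two measures can be sketched as follows; the document identifiers here are hypothetical examples, not data from any of the test collections discussed later.

```python
# Recall and precision for a single query, computed over sets of
# document IDs. The IDs below are hypothetical examples.

def recall(retrieved: set, relevant: set) -> float:
    """Proportion of relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant)

def precision(retrieved: set, relevant: set) -> float:
    """Proportion of retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved)

retrieved = {"d1", "d2", "d3", "d4"}   # documents the system returned
relevant = {"d2", "d4", "d7"}          # documents judged relevant

print(recall(retrieved, relevant))     # 2 of 3 relevant found
print(precision(retrieved, relevant))  # 2 of 4 retrieved are relevant
```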
Voorhees used WordNet as a tool for query 
expansion (Voorhees, 1994). She conducted ex- 
periments using the TREC collection (Voorhees 
and Harman, 1997) in which all terms in the 
queries were expanded using a combination of 
synonyms, hypernyms, and hyponyms. She set 
the weights of the words contained in the orig- 
inal query to 1, and used a combination of 0.1, 
0.3, 0.5, 1, and 2 for the expansion terms. She 
then used the SMART Information Retrieval 
System Engine (Salton, 1971) to retrieve the 
documents. With this method, Voorhees succeeded in
improving performance only slightly, and only on short
queries, with no significant improvement for long
queries. She further tried
to use WordNet as a tool for word sense dis- 
ambiguation (Voorhees, 1993) and applied it to 
text retrieval, but the performance of retrieval 
was degraded. 
Stairmand (Stairmand, 1997) used WordNet 
to compute lexical cohesion according to the 
method suggested by Morris (Morris and Hirst, 
1991), and applied this to information retrieval.
He concluded that his method could not be ap- 
plied to a fully-functional information retrieval 
system. 
Smeaton (Smeaton and Berrut, 1995) tried 
to expand the queries of the TREC-4 collec- 
tion with various strategies of weighting expan- 
sion terms, along with manual and automatic 
word sense disambiguation techniques. Unfor- 
tunately all strategies degraded the retrieval 
performance. 
Instead of matching terms in queries and doc- 
uments, Richardson (Richardson and Smeaton, 
1995) used WordNet to compute the semantic 
distance between concepts or words and then 
used this term distance to compute the similar- 
ity between a query and a document. Although 
he proposed two methods to compute seman- 
tic distances, neither of them increased the re- 
trieval performance. 
2 What's wrong with WordNet? 
In this section we analyze why WordNet has 
failed to improve information retrieval perfor- 
mance. We run exact-match retrieval against 
9 small standard test collections in order to 
observe this phenomenon. An information re- 
trieval test collection consists of a collection of 
documents along with a set of test queries. The 
set of relevant documents for each test query 
is also given, so that the performance of the 
information retrieval system can be measured. 
We expand queries using a combination of syn- 
onyms, hypernyms, and hyponyms in WordNet. 
The results are shown in Table 1. 
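The four expansion conditions compared in Table 1 can be sketched as follows; the miniature lexicon below is a hypothetical stand-in for WordNet's synonym, hypernym, and hyponym relations (the experiments themselves used WordNet itself).

```python
# Sketch of the expansion conditions Exp. I-IV from Table 1. The tiny
# lexicon is an invented stand-in for WordNet's relations.

TOY_WORDNET = {
    # word: (synonyms, hypernyms, hyponyms)
    "car": ({"auto", "automobile"}, {"vehicle"}, {"cab", "coupe"}),
}

def expand(terms, syn=True, hyper=False, hypo=False):
    """Return the query terms plus the selected related terms."""
    out = set(terms)
    for t in terms:
        syns, hypers, hypos = TOY_WORDNET.get(t, (set(), set(), set()))
        out |= (syns if syn else set())
        out |= (hypers if hyper else set())
        out |= (hypos if hypo else set())
    return out

print(sorted(expand(["car"])))                          # Exp. I
print(sorted(expand(["car"], hyper=True)))              # Exp. II
print(sorted(expand(["car"], hypo=True)))               # Exp. III
print(sorted(expand(["car"], hyper=True, hypo=True)))   # Exp. IV
```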
In Table 1 we show the name of the test collection
(Collection), the total number of documents (#Doc) and
queries (#Query), and the total number of relevant
documents for all queries (#Rel) in that collection.
For each collection, we give the total number of
relevant documents retrieved (Rel-ret), the recall
(Rel-ret/#Rel), the total number of documents retrieved
(Ret-docs), and the precision (Rel-ret/Ret-docs) for each of
no expansion (Base), expansion with synonyms 
(Exp. I), expansion with synonyms and hyper- 
nyms (Exp. II), expansion with synonyms and 
hyponyms (Exp. III), and expansion with syn- 
onyms, hypernyms, and hyponyms (Exp. IV). 
From the results in Table 1, we can conclude 
that query expansion can increase recall per- 
formance but unfortunately degrades precision 
performance. We thus turned to investigating why all
the relevant documents could not be retrieved with the
query expansion method above. Some of the reasons are
stated below:
• Two terms that seem to be interrelated may have
different parts of speech in WordNet, as is the case
for stochastic (adjective) and statistic (noun). Since
words in WordNet are grouped on the basis of part of
speech, it is not possible to find a relationship
between terms with different parts of speech.
• Most relationships between two terms are not found in
WordNet. For example, how do we know that Sumitomo Bank
is a Japanese company?
• Some terms are not included in WordNet
(proper names, etc.).
To overcome all the above problems, we pro- 
pose a method to enrich WordNet with an au- 
tomatically constructed thesaurus. The idea 
underlying this method is that an automati- 
cally constructed thesaurus could complement 
the drawbacks of WordNet. For example, as we stated
earlier, proper names and the relations among them are
not found in WordNet; but if proper names and other
terms have some strong relationship, they often cooccur
in documents, so that their relationship may be
modelled by an automatically constructed thesaurus.
Polysemous words degrade the precision of in- 
formation retrieval since all senses of the origi- 
nal query term are considered for expansion. To 
overcome the problem of polysemous words, we apply the
restriction that queries are expanded by adding those
terms that are most similar to the entirety of the
query, rather than selecting terms that are similar to
a single term in the query.
In the next section we describe the details of
our method.
3 Method 
3.1 Co-occurrence-based Thesaurus 
The general idea underlying the use of term co- 
occurrence data for thesaurus construction is 
that words that tend to occur together in doc- 
uments are likely to have similar, or related, 
Table 1: Term Expansion Experiment Results using WordNet

Collection  #Doc    #Query  #Rel               Base      Exp. I    Exp. II   Exp. III  Exp. IV
ADI         82      35      170     Rel-ret    157       159       166       169       169
                                    Recall     0.9235    0.9353    0.9765    0.9941    0.9941
                                    Ret-docs   2,063     2,295     2,542     2,737     2,782
                                    Precision  0.0761    0.0693    0.0653    0.0617    0.0607
CACM        3204    64      796     Rel-ret    738       756       766       773       773
                                    Recall     0.9271    0.9497    0.9623    0.9711    0.9711
                                    Ret-docs   67,950    86,552    101,154   109,391   116,001
                                    Precision  0.0109    0.0087    0.0076    0.0070    0.0067
CISI        1460    112     3114    Rel-ret    2,952     3,015     3,076     3,104     3,106
                                    Recall     0.9479    0.9682    0.9878    0.9968    0.9974
                                    Ret-docs   87,895    98,844    106,275   108,970   109,674
                                    Precision  0.0336    0.0305    0.0289    0.0284    0.0283
CRAN        1398    225     1838    Rel-ret    1,769     1,801     1,823     1,815     1,827
                                    Recall     0.9625    0.9799    0.9918    0.9875    0.9940
                                    Ret-docs   199,469   247,212   284,026   287,028   301,314
                                    Precision  0.0089    0.0073    0.0064    0.0063    0.0060
INSPEC      12684   84      2543    Rel-ret    2,508     2,531     2,538     2,536     2,542
                                    Recall     0.9862    0.9953    0.9980    0.9972    0.9996
                                    Ret-docs   564,809   735,931   852,056   869,364   912,810
                                    Precision  0.0044    0.0034    0.0030    0.0029    0.0028
LISA        6004    35      339     Rel-ret    339       339       339       339       339
                                    Recall     1.0000    1.0000    1.0000    1.0000    1.0000
                                    Ret-docs   148,547   171,808   184,101   188,289   189,784
                                    Precision  0.0023    0.0020    0.0018    0.0018    0.0018
MED         1033    30      696     Rel-ret    639       662       670       671       673
                                    Recall     0.9181    0.9511    0.9626    0.9640    0.9670
                                    Ret-docs   12,021    16,758    22,316    22,866    25,250
                                    Precision  0.0532    0.0395    0.0300    0.0293    0.0267
NPL         11429   100     2083    Rel-ret    2,061     2,071     2,073     2,072     2,074
                                    Recall     0.9894    0.9942    0.9952    0.9942    0.9957
                                    Ret-docs   267,158   395,280   539,048   577,033   678,828
                                    Precision  0.0077    0.0052    0.0038    0.0036    0.0031
TIME        423     24      324     Rel-ret    324       324       324       324       324
                                    Recall     1.0000    1.0000    1.0000    1.0000    1.0000
                                    Ret-docs   23,014    29,912    33,650    32,696    34,443
                                    Precision  0.0141    0.0108    0.0096    0.0095    0.0094
meanings. Co-occurrence data thus provides a 
statistical method for automatically identifying 
semantic relationships that are normally con- 
tained in a hand-made thesaurus. Suppose two words A
and B occur f_a and f_b times, respectively, and
cooccur f_c times; then the similarity between A and B
can be calculated using a similarity coefficient such
as the Dice coefficient:

    sim(A, B) = 2 × f_c / (f_a + f_b)
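In code, the coefficient looks like the following; the counts are invented for illustration.

```python
# Dice coefficient between two words from cooccurrence counts.
# f_a, f_b: how often words A and B occur; f_c: how often they cooccur.
# The counts below are toy values for illustration.

def dice(f_a: int, f_b: int, f_c: int) -> float:
    return 2 * f_c / (f_a + f_b)

# A occurs in 40 documents, B in 60, and they cooccur in 25:
print(dice(40, 60, 25))  # -> 0.5
```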
3.2 Predicate-Argument-based 
Thesaurus 
In contrast with the previous section, this 
method attempts to construct a thesaurus ac- 
cording to predicate-argument structures. The 
use of this method for thesaurus construction 
is based on the idea that there are restrictions 
on what words can appear in certain environ- 
ments, and in particular, what words can be ar- 
guments of a certain predicate. For example, a cat may
walk and bite, but cannot fly. Each noun may therefore
be characterized according to the
verbs or adjectives that it occurs with. Nouns 
may then be grouped according to the extent to 
which they appear in similar constructions. 
First, all the documents are parsed using the 
Apple Pie Parser, which is a probabilistic chart 
parser developed by Satoshi Sekine (Sekine and
Grishman, 1995). Then the following syntactic
structures are extracted : 
• Subject-Verb 
• Verb-Object 
• Adjective-Noun 
Each noun has a set of verbs and adjectives that it
occurs with, and for each such relationship, a Dice
coefficient value is calculated:
• C_sub(v_i, n_j) = 2 × f_sub(v_i, n_j) / (f(v_i) + f_sub(n_j)),
  where f_sub(v_i, n_j) is the frequency of noun n_j
  occurring as the subject of verb v_i, f_sub(n_j) is
  the frequency of the noun n_j occurring as the subject
  of any verb, and f(v_i) is the frequency of the verb v_i.

• C_obj(v_i, n_j) = 2 × f_obj(v_i, n_j) / (f(v_i) + f_obj(n_j)),
  where f_obj(v_i, n_j) is the frequency of noun n_j
  occurring as the object of verb v_i, f_obj(n_j) is
  the frequency of the noun n_j occurring as the object
  of any verb, and f(v_i) is the frequency of the verb v_i.

• C_adj(a_i, n_j) = 2 × f_adj(a_i, n_j) / (f(a_i) + f_adj(n_j)),
  where f_adj(a_i, n_j) is the frequency of noun n_j
  occurring as the argument of adjective a_i, f_adj(n_j)
  is the frequency of the noun n_j occurring as the
  argument of any adjective, and f(a_i) is the frequency
  of the adjective a_i.
We define the similarity of two nouns with respect to
one predicate as the minimum of the two Dice
coefficients with respect to that predicate, i.e.

SIM_sub(v_i, n_j, n_k) = min{C_sub(v_i, n_j), C_sub(v_i, n_k)}
SIM_obj(v_i, n_j, n_k) = min{C_obj(v_i, n_j), C_obj(v_i, n_k)}
SIM_adj(a_i, n_j, n_k) = min{C_adj(a_i, n_j), C_adj(a_i, n_k)}
Finally the overall similarity between two 
nouns is defined as the average of all the similar- 
ities between those two nouns for all predicate- 
argument structures. 
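A minimal sketch of this computation follows, on toy parsed triples; the relation labels, predicates, and counts are invented for illustration, and f(pred) is taken as the predicate's overall frequency as in the definitions above.

```python
from collections import Counter
from statistics import mean

# Toy (relation, predicate, noun) triples, as would be extracted from
# parsed documents. All entries here are invented for illustration.
triples = [
    ("subj", "walk", "cat"), ("subj", "walk", "cat"), ("subj", "walk", "dog"),
    ("obj", "feed", "cat"), ("obj", "feed", "dog"),
    ("adj", "furry", "cat"), ("adj", "furry", "dog"),
]

f_rel_pn = Counter(triples)                       # f_rel(pred, noun)
f_pred = Counter(p for _, p, _ in triples)        # f(pred)
f_rel_n = Counter((r, n) for r, _, n in triples)  # f_rel(noun)

def C(rel, pred, noun):
    """Dice coefficient of a noun with one predicate in one relation."""
    return 2 * f_rel_pn[(rel, pred, noun)] / (f_pred[pred] + f_rel_n[(rel, noun)])

def SIM(rel, pred, n1, n2):
    """Similarity of two nouns w.r.t. one predicate: the minimum Dice value."""
    return min(C(rel, pred, n1), C(rel, pred, n2))

def similarity(n1, n2):
    """Average SIM over all (relation, predicate) contexts seen with either noun."""
    contexts = {(r, p) for r, p, n in triples if n in (n1, n2)}
    return mean(SIM(r, p, n1, n2) for r, p in contexts)

print(round(similarity("cat", "dog"), 3))  # -> 0.611
```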
3.3 Expansion Term Weighting Method 
A query q is represented by a vector q = (q_1, q_2,
..., q_n), where the q_i are the weights of the search
terms t_i contained in query q. The similarity between
a query q and a term t_j can be defined as below:

    sim_qt(q, t_j) = Σ_{t_i ∈ q} q_i × sim(t_i, t_j)
where the value of sim(t_i, t_j) is defined as the
average of the similarity values in the three types
of thesaurus. Since in Word-
Net there are no similarity weights, when there 
is a relation between two terms in WordNet, 
their similarity is taken from the average of the 
similarity between those two terms in the co- 
occurrence-based and in predicate-argument- 
based thesauri. 
With respect to the query q, all the terms in the
collection can now be ranked according to their
sim_qt. Expansion terms are terms t_j with high
sim_qt(q, t_j).
The weight(q, t_j) of an expansion term t_j is defined
as a function of sim_qt(q, t_j):

    weight(q, t_j) = sim_qt(q, t_j) / Σ_{t_i ∈ q} q_i

where 0 ≤ weight(q, t_j) ≤ 1.
An expansion term gets a weight of 1 if its 
similarity to all the terms in the query is 1. Ex- 
pansion terms with similarity 0 to all the terms 
in the query get a weight of 0. The weight of an 
expansion term depends both on the entire re- 
trieval query and on the similarity between the 
terms. The weight of an expansion term can 
be interpreted mathematically as the weighted 
mean of the similarities between the term tj and 
all the query terms. The weights of the original
query terms are the weighting factors of those
similarities.
Therefore the query q is expanded by adding the
following query:

    q_e = (a_1, a_2, ..., a_r)

where a_j is equal to weight(q, t_j) if t_j belongs to
the top r ranked terms, and 0 otherwise.

The resulting expanded query is:

    q_new = q ∘ q_e

where ∘ is defined as the concatenation operator.
The method above can accommodate the polysemous word
problem, because an expansion term which is taken from
a different sense to the original query term is given
a very low weight.
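A sketch of this weighting on toy values follows; the terms and similarity scores are invented, with `sim` playing the role of the thesaurus-averaged term similarity defined above.

```python
# Toy averaged term-term similarities from the three thesauri.
# All terms and values are invented for illustration.
sim = {
    ("bank", "finance"): 0.8, ("money", "finance"): 0.6,
    ("bank", "river"): 0.7,   ("money", "river"): 0.0,
}

query = {"bank": 1.0, "money": 1.0}   # original term weights q_i

def simqt(query, t):
    """sim_qt(q, t) = sum over query terms t_i of q_i * sim(t_i, t)."""
    return sum(q_i * sim.get((t_i, t), 0.0) for t_i, q_i in query.items())

def weight(query, t):
    """Expansion weight: sim_qt normalised by the total query weight."""
    return simqt(query, t) / sum(query.values())

# 'finance' is similar to the whole query; 'river' is similar only to
# 'bank', so the wrong-sense candidate receives a much lower weight.
print(weight(query, "finance"))  # (0.8 + 0.6) / 2
print(weight(query, "river"))    # (0.7 + 0.0) / 2
```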
4 Experimental Results 
In order to evaluate the effectiveness of the pro- 
posed method in the previous section we con- 
ducted experiments using the WSJ, CACM, IN- 
SPEC, CISI, Cranfield, NPL, and LISA test col- 
lections. The WSJ collection comprises part of 
the TREC collection (Voorhees and Harman, 
1997). As a baseline we used SMART (Salton, 
1971) without expansion. SMART is an in- 
formation retrieval engine based on the vector 
space model in which term weights are calcu- 
lated based on term frequency, inverse docu- 
ment frequency and document length normal- 
ization. The results are shown in Table 2. This table
shows the 11-point uninterpolated average
recall-precision for each of the baseline, expansion
using only WordNet, expansion using only the
predicate-argument-based thesaurus, expansion 
using only cooccurrence-based thesaurus, and 
expansion using all of them. For each method 
we give the percentage of improvement over the 
baseline. The performance using the combined thesauri
for query expansion is better than both the SMART
baseline and expansion using just one type of
thesaurus.
Table 2: Experiment Results using Combined Thesauri

                          Expanded with
Coll.    Base    WordNet only    Pred-arg only   Cooccur only    Combined
WSJ      0.245   0.250 (+2.0%)   0.258 (+5.2%)   0.294 (+20.0%)  0.384 (+56.7%)
CACM     0.269   0.281 (+4.5%)   0.291 (+8.3%)   0.297 (+10.4%)  0.533 (+98.2%)
INSPEC   0.273   0.283 (+3.7%)   0.284 (+4.3%)   0.328 (+20.4%)  0.472 (+73.1%)
CISI     0.215   0.231 (+7.2%)   0.238 (+9.4%)   0.262 (+21.8%)  0.301 (+40.0%)
Cran     0.412   0.421 (+2.3%)   0.441 (+7.0%)   0.487 (+18.3%)  0.667 (+61.9%)
NPL      0.201   0.210 (+4.2%)   0.217 (+8.0%)   0.238 (+18.4%)  0.333 (+65.5%)
LISA     0.304   0.313 (+3.1%)   0.327 (+7.6%)   0.369 (+21.4%)  0.485 (+59.5%)
5 Discussions 
In this section we discuss why our method of 
using WordNet is able to improve the perfor- 
mance of information retrieval. The important 
points of our method are : 
• the coverage of WordNet is broadened 
• weighting method 
The three types of thesaurus we used have 
different characteristics. Automatically con- 
structed thesauri add not only new terms but 
also new relationships not found in WordNet. 
If two terms often cooccur in a document then those two
terms are likely to bear some relationship. Why not use
only the automatically constructed thesauri? The answer to this
is that some relationships may be missing in 
the automatically constructed thesauri. For example,
consider the words tumor and tumour.
These words certainly share the same context, 
but would never appear in the same document, 
at least not with a frequency recognized by a 
cooccurrence-based method. In general, dif- 
ferent words used to describe similar concepts 
may never be used in the same document, and 
are thus missed by the cooccurrence methods. 
However their relationship may be found in the 
WordNet thesaurus. 
The second point is our weighting method. 
As already mentioned before, most attempts at 
automatically expanding queries by means of 
WordNet have failed to improve retrieval effec- 
tiveness. The opposite has often been true: ex- 
panded queries were less effective than the orig- 
inal queries. Besides the "incomplete" nature
of WordNet, we believe that a further problem,
the weighting of expansion terms, has not been 
solved. All weighting methods described in past
research on query expansion using WordNet have been
based on "trial and error" or ad-hoc methods; that is,
they have no underlying justification.
The advantages of our weighting method are: 
• the weight of each expansion term considers the
similarity of that term to all terms in the original
query, rather than to just one or some query terms.

• the weight of the expansion term accommodates the
polysemous word problem.
This method can accommodate the polysemous 
word problem, because an expansion term taken 
from a different sense to the original query term 
sense is given a very low weight. The reason for this
is that the weighting method depends on all query
terms and all of the thesauri. For ex-
ample, the word bank has many senses in Word- 
Net. Two such senses are the financial institu- 
tion and the river edge senses. In a document 
collection relating to financial banks, the river sense
of bank will generally not be found in the
cooccurrence-based thesaurus because of a lack of
articles talking about rivers. Even if (with small
probability) there are some documents in the collection
talking about rivers, if the query contains the finance
sense of bank then the other terms in the query will
also be concerned with finance and not rivers. Thus
river would have a relationship only with the term bank
and no relationships with the other terms in the
original query, resulting in a low weight. Since our
weighting method depends on both the query in its
entirety and the similarity in the three thesauri,
wrong-sense expansion terms are given very low weight.
6 Related Research 
Smeaton (Smeaton and Berrut, 1995) and 
Voorhees (Voorhees, 1994) have proposed an ex- 
pansion method using WordNet. Our method 
differs from theirs in that we enrich the cover- 
age of WordNet using two methods of automatic 
thesaurus construction, and we weight the expansion
terms appropriately so that they can accommodate the
polysemous word problem.
Although Stairmand (Stairmand, 1997) and 
Richardson (Richardson and Smeaton, 1995) 
have proposed the use of WordNet in information
retrieval, they did not use WordNet in the query
expansion framework.
Our predicate-argument-structure-based thesaurus is
based on the method proposed by Hindle (Hindle, 1990),
although Hindle did not apply it to information
retrieval. Instead, he used mutual information
statistics as a similarity coefficient, whereas we used
the Dice coefficient for normalization purposes. Hindle
only extracted the subject-verb and the object-verb
predicate-arguments, while we also extract
adjective-noun predicate-arguments.
Our weighting method follows the Qiu method (Qiu and
Frei, 1993), except that Qiu used it to expand terms
from only a single automatically constructed thesaurus
and did not consider the use of more than one
thesaurus.
7 Conclusions 
This paper analyzed why the use of WordNet 
has failed to improve the retrieval effectiveness 
in information retrieval applications. We found 
that the main reason is that most relationships 
between terms are not found in WordNet, and 
some terms, such as proper names, are not included
in WordNet. To overcome this problem
we proposed a method to enrich the WordNet 
with automatically constructed thesauri. 
Another problem in query expansion is that 
of polysemous words. Instead of using a word 
sense disambiguation method to select the appropriate
sense of each word, we overcame this
problem with a weighting method. Experiments 
proved that our method of using WordNet in 
query expansion could improve information re- 
trieval effectiveness. 
Future work will include experiments on 
larger test collections, and the use of WordNet 
in methods other than query expansion in infor- 
mation retrieval. 
8 Acknowledgements 
The authors would like to thank Mr. Timothy 
Baldwin (TIT, Japan) for his comments on the 
earlier version of this paper, Dr. Chris Buckley
(Cornell University) for the SMART support,
and Mr. Satoshi Sekine (New York University) 
for the Apple Pie Parser support. 

References 
J.Y. Chai and A. Biermann. 1997. The use of 
lexical semantics in information extraction. 
In Proceedings of the Workshop in Automatic 
Information Extraction and Building of Lexical
Semantic Resources, pages 61-70.
J.M. Gomez-Hidalgo and M.B. Rodriguez. 
1997. Integrating a lexical database and a 
training collection for text categorization. In 
Proceedings of the Workshop in Automatic
Information Extraction and Building of Lexical
Semantic Resources, pages 39-44.
D. Hindle. 1990. Noun classification from 
predicate-argument structures. In Proceed- 
ings of 28th Annual Meeting of the ACL, 
pages 268-275. 
G.A. Miller. 1990. Special issue, WordNet: An
on-line lexical database. International Journal
of Lexicography, 3(4).
J. Morris and G. Hirst. 1991. Lexical cohesion 
computed by thesaural relations as an indica- 
tor of the structure of text. In Proceedings of
the ACL Conference, pages 21-45.
Y. Qiu and H.P. Frei. 1993. Concept based query
expansion. In Proceedings of the 16th ACM
SIGIR Conference, pages 160-169.
P. Resnik. 1995. Disambiguating noun groupings
with respect to WordNet senses. In Proceedings
of the 3rd Workshop on Very Large Corpora.
R. Richardson and A.F. Smeaton. 1995. Using
WordNet in a knowledge-based approach to
information retrieval. Technical Report
CA-0395, School of Computer Applications,
Dublin City University.
G. Salton. 1971. The SMART Retrieval Sys- 
tem: Experiments in Automatic Document 
Processing. Prentice-Hall. 
F. Segond, A. Schiller, G. Grefenstette, and 
J. Chanod. 1997. An experiment in semantic
tagging using hidden Markov model tagging.
In Proceedings of the Workshop in Automatic 
Information Extraction and Building of Lex- 
ical Semantic Resources, pages 78-81. 
S. Sekine and R. Grishman. 1995. A corpus- 
based probabilistic grammar with only two
non-terminals. In Proceedings of the Interna- 
tional Workshop on Parsing Technologies. 
A.F. Smeaton and C. Berrut. 1995. Running
TREC-4 experiments: A chronological report
of query expansion experiments carried out
as part of TREC-4. Technical Report CA-2095,
School of Comp. Science, Dublin City Univer- 
sity. 
M.A. Stairmand. 1997. Textual context analy- 
sis for information retrieval. In Proceedings of
the 20th ACM-SIGIR Conference, pages
140-147.
E.M. Voorhees and D. Harman. 1997. Overview 
of the fifth text retrieval conference (TREC-5).
In Proceedings of the Fifth Text REtrieval
Conference, pages 1-28. NIST Special Publi- 
cation 500-238. 
E.M. Voorhees. 1993. Using WordNet to disambiguate
word senses for text retrieval. In Proceedings
of the 16th ACM-SIGIR Conference,
pages 171-180. 
E.M. Voorhees. 1994. Query expansion using 
lexical-semantic relations. In Proceedings of 
the 17th ACM-SIGIR Conference, pages 61- 
69. 
