Genus Disambiguation: A Study in Weighted Preference* 
Rebecca Bruce and Louise Guthrie 
Computing Research Laboratory
Box 30001
New Mexico State University
Las Cruces, NM 88003-0001
ABSTRACT 
The automatic construction of an IS-A taxonomy of noun senses
from a machine readable dictionary (MRD) has long been sought,
but achieved with only limited success. The task requires the
solution to two problems: 1) to define an algorithm to
automatically identify the genus or hypernym of a noun
definition, and 2) to define an algorithm for lexical
disambiguation of the genus term. In the last few years,
effective methods for solving the first problem have been
developed, but the problem of creating an algorithm for lexical
disambiguation of the genus terms is one that has proven to be
very difficult. In COLING 90 we described our initial work on
the automatic creation of a taxonomy of noun senses from
Longman's Dictionary of Contemporary English (LDOCE). The
algorithm for lexical disambiguation of the genus term was
accurate about 80% of the time and made use of the semantic
categories, the subject area markings and the frequency of use
information in LDOCE. In this paper we report a series of
experiments which weight the three factors in various ways, and
describe our improvements to the algorithm (to about 90%
accuracy).
1. Introduction 
Much of the previous research on the construction of networks
of genus terms from MRDs (Amsler and White 1979; Chodorow et
al. 1985; Nakamura and Nagao 1988; Vossen 1990) required human
intervention to distinguish the senses. Recently, several
researchers (Veronis and Ide 1990; Klavans et al. 1990;
Copestake 1990; Vossen 1991) have suggested techniques for
automatic disambiguation of these taxonomies based on neural
net techniques, word overlap, or bilingual dictionaries. The
techniques we have used to construct a network of noun senses
automatically from the Longman Dictionary of Contemporary
English (LDOCE) differ substantially from any of those methods.
* This research was supported by NSF Grant No. IRI-8811108.
In (Guthrie et al. 1990), we suggested an algorithm for
disambiguating the genus terms of noun definitions in LDOCE.
The procedure we used was based on the assumption that the
semantic relationship between the headword and its genus should
be reflected in their LDOCE semantic categories. In other
words, the semantic category of the genus word should be
identical to, or an ancestor of, the semantic category of the
headword (an ancestor is a superordinate term in the hierarchy
of semantic codes). Using a random sample of 520 noun word
senses from LDOCE, we tested this assumption.
The semantic categories used (there are thirty-four in all)
were defined by the LDOCE lexicographers, who placed sixteen of
the basic categories in a hierarchy. The notion of a "more
general semantic category" was somewhat subjective, as is
illustrated in the next section.
The disambiguation algorithm presented in (Guthrie et al. 1990)
utilized three factors in determining the correct genus sense.
The algorithm is stated as follows:
• Choose the genus sense with the same semantic category as the
headword (or the closest more general category if this is not
possible).
• In the case of a tie, choose a sense which has the same
pragmatic code.
• In case there is still a tie, or no genus sense meeting the
above criteria, choose the most frequently used sense‡ of the
genus word.
‡ In the 2nd edition of LDOCE, the publishers state that the
order in which word senses are listed corresponds to the
frequency with which each sense is used (i.e. the first sense
listed is the most commonly used, etc.). We have observed
ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992    1187    PROC. OF COLING-92, NANTES, AUG. 23-28, 1992
The algorithm was successful about 80% of the time.
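The three steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Sense` record, the function names, and the subject codes used in the usage note are hypothetical, and the parent links are one reading of the basic hierarchy in Figure 1. "Closest more general category" is assumed here to mean the fewest upward steps in that hierarchy.

```python
from dataclasses import dataclass
from typing import Optional

# Parent links for the basic semantic hierarchy, as read from Figure 1
# (an assumption of this sketch; T and C are treated as roots).
PARENT = {"I": "C", "Q": "C", "S": "I", "L": "I", "G": "I",
          "P": "Q", "A": "Q", "H": "Q", "J": "S", "N": "S",
          "B": "A", "D": "A", "M": "H", "F": "H"}

@dataclass
class Sense:
    number: int                      # listing order; 1 = listed first
    semantic: str                    # semantic code
    pragmatic: Optional[str] = None  # subject code, often absent

def generality(head_code, genus_code):
    """Upward steps from head_code to genus_code, or None if genus_code
    is neither identical to nor an ancestor of head_code."""
    steps, code = 0, head_code
    while code is not None:
        if code == genus_code:
            return steps
        code = PARENT.get(code)
        steps += 1
    return None

def choose_genus_sense(head, genus_senses):
    # Step 1: same category, or the closest more general one.
    scored = [(generality(head.semantic, g.semantic), g) for g in genus_senses]
    scored = [(d, g) for d, g in scored if d is not None]
    if scored:
        best = min(d for d, _ in scored)
        tied = [g for d, g in scored if d == best]
        # Step 2: among ties, prefer a sense with the same pragmatic code.
        same = [g for g in tied
                if head.pragmatic is not None and g.pragmatic == head.pragmatic]
        tied = same or tied
    else:
        tied = list(genus_senses)
    # Step 3: fall back on listing order (earliest listed sense).
    return min(tied, key=lambda g: g.number)
```

For a headword sense coded J with hypothetical subject code "XX" and genus senses coded T, S/"YY" and S/"XX", the sketch discards the T sense, ties the two S senses, and picks the third on the pragmatic match.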
In an effort to improve the disambiguation algorithm, we
conducted a series of experiments designed to identify more
completely the contribution of each factor considered in the
algorithm. Since we considered three factors in determining the
correct genus sense (the semantic code relationship, the
pragmatic code relationship, and the frequency information), we
designed experiments to first test each factor separately, and
then again in combination, weighting each input according to
its individual predictive value. Below we describe those
experiments, beginning with the formulation of each factor, and
ending with the assignment of weights to the contribution of
each input in the final disambiguation algorithm.
2. Sense Selection Based on LDOCE Semantic 
Codes 
This section describes our investigation of the use of semantic
category information for disambiguation, and outlines the
problems in using that type of information. The basic
hierarchical structure of the semantic codes provided by LDOCE
is depicted in Figure 1. In addition to the codes positioned in
that tree structure, seventeen other codes, which we refer to
as "composite", are defined as follows:
E = solid or liquid 
U = collective and animal or human 
O = animal or human (sex unspecified)
K = male (animal or human)
R = female (animal or human) 
V = plant or animal (not human) 
W = abstract and inanimate 
Y = abstract or animate 
X = not concrete or animal (abstract or human)
Z = unmarked (no semantic restriction) 
1 = human and solid 
2 = abstract and solid 
3 = "it" as subject or object 
4 = physical quantities 
5 = organic materials 
6 = liquid and abstract 
7 = gas and liquid 
To evaluate our assumption that the semantic category of the
genus word is the same or more general than the semantic
category of the headword, it was necessary to define what we
meant by "more general" for the composite categories. We did
this by incorporating the composite codes into the hierarchical
structure displayed in Figure 1, and defining a semantic
distance between word senses based on the placement of their
respective codes in the hierarchy.
‡ (continued) that the listing order of senses in the 1st
edition of LDOCE is similar to that of the 2nd, and have found
empirical evidence in the work of Guo (1989) and this study to
show that a similar connection between the order in which word
senses are listed and the frequency with which they are used
(in LDOCE) holds for the 1st edition as well.
It was obvious from the start that the addition of these codes
to the tree depicted in Figure 1 would create a tangled
hierarchy. The problem was to decide where these codes should
be placed in the tree structure in order to preserve
inheritance. For example, should "E" (the code for "solid or
liquid") be placed above or below "solid" and "liquid", and
would a similar placement hold for code 7, which reads "gas AND
liquid" (as opposed to "liquid OR solid")?
T (abstract)
C (concrete)
    I (inanimate)
        S (solid)
            J (movable solid)
            N (not movable solid)
        L (liquid)
        G (gas)
    Q (animate)
        P (plant)
        A (animal)
            B (animal female)
            D (animal male)
        H (human)
            M (human male)
            F (human female)
Figure 1:
Basic Hierarchy of LDOCE Semantic Codes
To answer such questions, two types of studies 
were conducted. The first was an in-depth look at the 
words marked with composite codes (nouns marked 
to identify a semantic category and adjectives and 
verbs marked as to their selection restrictions). The 
second was a survey of the genus senses for head- 
words with composite semantic codes. As might be 
expected, there were inconsistencies in the assignment of noun
categories. For example, within the "liquid" categories, we
observed that nouns which represent both liquids and solids can
be found in both categories L and E, and abstractions of
liquids can be found in categories L, 6, and 7. This is not
surprising, as it is difficult to create distinct categories
for overlapping concepts.
Our proposed placement of composite codes within the hierarchy
structure provided by LDOCE is presented in Figure 2. In
constructing Figure 2, we attempted to create a hierarchy which
would reflect not only the data gathered on the properties of
words assigned to each category, but also the most frequently
occurring superset for each composite code, based on the
results of the second study.
Z (no semantic restriction)
    T,W,X,Y,2,4,6,7 (abstract)
    C (concrete)
        I,W (inanimate)
            S,E,1,2,5 (solid)
                J (movable solid)
                N (not movable solid)
            L,E,6,7 (liquid)
            G,7 (gas)
        Q,Y,5 (animate)
            P,V (plant)
            A,O,V (animal)
                B,R (animal female)
                D,K (animal male)
            H,O,X,1 (human)
                M,K (human male)
                F,R (human female)
Figure 2:
Revised Hierarchy of LDOCE Semantic Codes
Based on this study of the semantic codes used in LDOCE, three
implementations of a partial genus sense selection algorithm
(partial because at this time we are only considering the
contribution made by the semantic code comparison to sense
selection) were found to be possible. They are as follows:
1. Select the genus sense with a minimum semantic distance from
the headword sense, where semantic distance is measured by the
placement of the respective codes in the hierarchy presented in
Figure 2. (This formulation of a genus sense selection
criterion is the basis of the algorithm reported in Guthrie et
al. 1990.)
2. Choose the genus sense with a semantic code belonging to the
same code set as the code of the headword, where the code sets
are the nodes of the tree structure presented in Figure 2.
3. Select the genus sense with a semantic code identical to the
headword.
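The three criteria can be compared side by side in a short sketch. It is an illustration under stated assumptions rather than the authors' code: only the concrete/inanimate branch of Figure 2 is included, the node names and function names are hypothetical, and semantic distance is assumed to be the fewest upward steps from a node containing the headword code to a node containing the genus code (the text does not give the exact measure).

```python
# A subset of the nodes of the revised hierarchy, as read from Figure 2.
NODE_CODES = {
    "concrete": {"C"},
    "inanimate": {"I", "W"},
    "solid": {"S", "E", "1", "2", "5"},
    "liquid": {"L", "E", "6", "7"},
    "gas": {"G", "7"},
}
NODE_PARENT = {"inanimate": "concrete", "solid": "inanimate",
               "liquid": "inanimate", "gas": "inanimate"}

def nodes_of(code):
    """All nodes whose code set contains the given code."""
    return [n for n, codes in NODE_CODES.items() if code in codes]

def semantic_distance(head_code, genus_code):
    """Fewest upward steps from a node holding head_code to a node
    holding genus_code; None if no such node is reachable."""
    best = None
    for node in nodes_of(head_code):
        steps = 0
        while node is not None:
            if genus_code in NODE_CODES[node] and (best is None or steps < best):
                best = steps
            node = NODE_PARENT.get(node)
            steps += 1
    return best

def same_code_set(head_code, genus_code):
    """True if some node of the hierarchy contains both codes."""
    return any(head_code in c and genus_code in c for c in NODE_CODES.values())

def candidates(head_code, genus_codes, criterion):
    """Indices of the genus senses preferred under criterion 1, 2 or 3.
    An empty list means the criterion makes no selection."""
    if criterion == 1:
        dist = {i: semantic_distance(head_code, g)
                for i, g in enumerate(genus_codes)}
        reachable = {i: d for i, d in dist.items() if d is not None}
        if not reachable:
            return []
        best = min(reachable.values())
        return [i for i, d in reachable.items() if d == best]
    if criterion == 2:
        return [i for i, g in enumerate(genus_codes)
                if same_code_set(head_code, g)]
    return [i for i, g in enumerate(genus_codes) if g == head_code]
```

With a headword coded L and genus senses coded I and G, criterion 1 still selects the I sense (one step up), while criteria 2 and 3 select nothing and leave the decision to a default rule.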
3. Sense Selection Based on LDOCE Pragmatic 
Codes 
The pragmatic codes in LDOCE are another set of terms organized
into a hierarchy, although the hierarchy provided by LDOCE is
quite flat. As stated earlier, these terms are used to classify
words by subject area. The LDOCE pragmatic coding system
divides all possible subjects into 124 major categories,
ranging from aeronautics, aerospace, and agriculture, to winter
sports and zoology. The hierarchy is only two layers deep, and
the 124 major categories have equal and unrelated status.
Slator (1988) implemented a scheme which imposed deeper
structure onto the LDOCE pragmatic code hierarchy. He
restructured the LDOCE pragmatic code hierarchy by making
Communication, Economics, Entertainment, Household, Politics,
Science, and Transportation fundamental categories, and
grouping all other pragmatic codes under those headings. His
restructuring of the code hierarchy revealed that words
classified under Botany have pragmatic connections to words
classified as Plant-Names, as well as connections with other
words classified under Science.
We investigated four implementations of a genus sense selection
algorithm based on pragmatic codes. The first implementation
utilized the hierarchy developed by Slator. In that scheme, the
pragmatic codes were arranged in a tree structure in which each
node of the tree is a single pragmatic code.
In addition, pragmatic code sets were defined directly from
Slator's hierarchy by creating seven large groups corresponding
to the seven subtrees of the top level of the hierarchy. Each
of the seven code sets contained all codes descendant from the
corresponding top level node. Within this construction, lack of
common set membership is a strong indication of disjoint
subject areas.
In summary, we proposed four approaches to genus sense
selection based on pragmatic codes:
1. Choose the genus sense with minimum pragmatic distance from
the headword sense, where pragmatic distance is measured by the
placement of the respective codes in the hierarchy implemented
by Slator.
2. Select the genus sense with a pragmatic code belonging to
the same code set as the code of the headword. Seven code sets
were constructed corresponding to the seven major divisions of
Slator's hierarchy.
3. Rule out all headword/genus sense combinations with
pragmatic codes that are not in the same code set.
4. Select the genus sense with a pragmatic code identical to
the headword.
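Approaches 2 to 4 can be sketched as simple filters over the candidate genus senses. The mapping from subject codes to Slator's seven top-level groups below is hypothetical and covers only a handful of codes for illustration; the function names are likewise assumptions. Approach 3 is read here as excluding only senses whose code falls in a different set, so that senses without a pragmatic code remain eligible; the text does not spell this out. Approach 1 is omitted because it needs Slator's full hierarchy.

```python
# Hypothetical assignment of a few subject codes to top-level groups.
TOP_LEVEL = {
    "botany": "Science",
    "plant-names": "Science",
    "zoology": "Science",
    "aeronautics": "Transportation",
    "winter sports": "Entertainment",
}

def group_of(code):
    """Top-level group of a pragmatic code, or None if uncoded/unknown."""
    return TOP_LEVEL.get(code) if code is not None else None

def prefer_same_set(head_code, genus_codes):
    """Approach 2: senses whose code lies in the headword's code set."""
    head_group = group_of(head_code)
    return [i for i, g in enumerate(genus_codes)
            if head_group is not None and group_of(g) == head_group]

def exclude_other_sets(head_code, genus_codes):
    """Approach 3: drop senses coded in a different set; keep the rest,
    including senses with no pragmatic code (an assumption)."""
    head_group = group_of(head_code)
    return [i for i, g in enumerate(genus_codes)
            if head_group is None or group_of(g) is None
            or group_of(g) == head_group]

def identical_code(head_code, genus_codes):
    """Approach 4: senses whose pragmatic code equals the headword's."""
    return [i for i, g in enumerate(genus_codes)
            if head_code is not None and g == head_code]
```

Under this reading, approach 2 and approach 3 differ only in how they treat uncoded senses, which may be one reason the two rows for code set membership in Table 1 report the same result.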
4. Results of the Experimentation 
All tests of the proposed sense selection criteria were run on
the same random sample of 520 definitions. Table 1 provides a
summary of the relevant test results. Although each selection
mechanism was evaluated separately, because of the large number
of word senses having either redundant code markings, or no
markings at all (particularly with pragmatic codes), it was
necessary to introduce a default or "tie breaking" mechanism
for all selection criteria other than usage frequency. Usage
frequency was established as the default selection mechanism
for all tests. When no sense selection (or no unique sense
selection) could be made based on the criteria being tested,
the sense selection was based on usage frequency (i.e., of the
competing senses, the sense occurring first in the listing
order was selected).
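The default mechanism amounts to a small wrapper around whichever criterion is being tested. A minimal sketch, assuming senses are identified by their 0-based position in the listing order and that the criterion returns the positions it prefers (the function name is hypothetical):

```python
def select_with_default(preferred, n_senses):
    """Return the position of the chosen genus sense.

    preferred: positions (0-based listing order) picked by the criterion
               under test; may be empty or contain several tied senses.
    n_senses:  number of senses listed for the genus word (at least 1).
    """
    # No selection at all: every sense competes.
    pool = list(preferred) if preferred else list(range(n_senses))
    # Usage frequency as tie breaker: the earliest listed sense wins.
    return min(pool)
```

When the criterion yields exactly one sense, that sense is returned unchanged; the listing order only matters for empty or tied selections.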
The variation in performance between all approaches developed
for genus sense selection was relatively small, no more than
8%. Both the best and the worst performance of a single sense
selection parameter were achieved using pragmatic code
relationships. The best performance (80% success rate) resulted
from requiring identical code markings for headword and genus
senses. The worst disambiguation performance was the result of
sense selection based on common pragmatic code set membership.
The variation in disambiguation performance was small in the
experiments which used only the semantic code information. The
maximum success rate of 77% resulted from stipulating common
code set membership, while the minimum success rate was 75% for
identical code designation.
Some of the test results were unexpected: for instance, we did
not expect selection of the first sense listed to yield a 76%
success rate. Nor did we expect sense selection based on a
subset/superset relationship between codes to be as
unsuccessful as it was, yielding no more than a 78% success
rate for both pragmatic and semantic codes.
Although the experiments showed that a direct match of
pragmatic codes was the most successful single selection
mechanism, the result is somewhat misleading. Because many
words have no pragmatic code, the default rule was applied
often, resulting in the selection of the most frequently used
sense. Having said that, it remains true that the tests show
pragmatic code information to be the best predictor of the
correct genus sense, when it is present.
SUMMARY OF DISAMBIGUATION EXPERIMENTS

GENUS SENSE SELECTION MECHANISM                   TEST RESULTS

Selection based on semantic codes:
  subset/superset relationship
  implemented with code hierarchy                 75% correct
  common code set membership                      77% correct
  identical code designation                      75% correct

Selection based on pragmatic codes:
  common code set membership [illegible]          72% correct
  common code set membership (exclusive)          72% correct
  identical code designation                      80% correct

Selection based on usage frequency:               76% correct

Weighted 3-parameter selection algorithm:
  common semantic code set - weight 1
  identical pragmatic code - weight 1             80% correct
  usage frequency - tie breaker

  common semantic code set - weight 1
  identical pragmatic code - weight 2             80% correct
  usage frequency - tie breaker

  semantic code hierarchy - weight 1
  pragmatic code hierarchy - weight 2             79% correct
  usage frequency - tie breaker

  semantic code set - weight 1
  identical pragmatic code - weight 2             90% correct
  usage frequency - tie breaker
  hand-coded exceptions included

Table 1: Summary of Disambiguation Experiments
Table 1 also displays the results of tests performed using all
three factors in combination. These experiments were conducted
to determine the optimum weight to assign each of the three
factors when considering their cumulative predictive
capability. The selection of weights was based on the
performance of each factor individually. Again, the variation
in performance across all tests of different weightings was
small (less than 1%). The highest success rate was achieved
when pragmatic code information received the greatest weight.
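One way to read the weighted tests is as additive voting: each factor adds its weight to every sense it favours, and listing order breaks ties. The sketch below assumes that additive scheme, which the text does not state explicitly; the function and parameter names are hypothetical, and the default weights follow the "weight 1 / weight 2" rows of Table 1.

```python
def weighted_choice(semantic_votes, pragmatic_votes, n_senses,
                    w_semantic=1, w_pragmatic=2):
    """Pick a genus sense position (0-based listing order).

    semantic_votes / pragmatic_votes: positions favoured by the semantic
    and pragmatic criteria respectively (either may be empty).
    """
    scores = [0] * n_senses
    for i in semantic_votes:
        scores[i] += w_semantic
    for i in pragmatic_votes:
        scores[i] += w_pragmatic
    # list.index returns the first maximum, i.e. the earliest listed
    # sense among those with the top score (usage frequency tie breaker).
    return scores.index(max(scores))
```

With weights 1 and 2, a sense backed only by the pragmatic criterion outranks one backed only by the semantic criterion; with equal weights the same disagreement is settled by listing order.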
As a result of these tests, our disambiguation algorithm was
formulated as follows:
• Choose the most frequently used genus sense unless an
alternate sense choice is indicated by a strong relationship
between headword and genus codes, either semantic or pragmatic.
• If the sense selection based on semantic codes
differs from that inferred by the pragmatic
codes, base the sense selection on the pragmatic codes.
• Select among competing genus senses with identical code
markings by choosing the most frequently used sense.
By a "strong relationship" in the case of semantic codes, we
mean membership in the same code set. This is not surprising
due to the limited scope of the code sets, and the inherent
overlap of the composite codes. Strong relationship for
pragmatic codes means an exact match.
5. The Final Disambiguation Algorithm
Review of the output data from each disambiguation trial using
the three parameter algorithm revealed that the majority of the
failures were on a very small number of frequently occurring
genus words. Often, the pragmatic and semantic classifications
of these word senses were either deficient (lacking in code
information), or redundant (more than one word sense having the
same markings). Such situations frequently arise with very
abstract words (e.g. part, quality, piece, and number) where
there are numerous word senses, and most (if not all) senses
have identical semantic codes and no pragmatic codes.
The final modification to our genus sense selection algorithm
was introduced to solve this problem: the correct sense
selections for words with errors in their code information, as
well as certain very general words, are pre-selected and
assumed to be constant. Fewer than ten words required hand
coding of the correct sense, and almost all were abstract words
such as part or quality. While it is true that the majority of
these words are "disturbed heads" (Guthrie et al. 1990), and
will, in the future, not serve as genus terms but rather as
identifiers of alternate link types, we still require that they
be sense disambiguated to serve as relation descriptors. This
final modification to the sense selection algorithm increased
performance by 10%, resulting in a success rate of 90%.
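The pre-selection step can be pictured as a lookup consulted before any code comparison. In the sketch below the table contents are placeholders: the words come from the examples in the text, but the sense numbers are hypothetical and do not reproduce the authors' hand-coded choices. The function name and the `fallback` callable are also assumptions.

```python
# Hand-coded exceptions: genus word -> pre-selected sense number.
# The sense numbers here are placeholders, not values from the paper.
PRESELECTED = {"part": 1, "quality": 1}

def final_choice(genus_word, fallback):
    """Return the pre-selected sense for a hand-coded genus word;
    otherwise defer to the weighted algorithm via the fallback callable."""
    if genus_word in PRESELECTED:
        return PRESELECTED[genus_word]
    return fallback()
```

Because the table holds fewer than ten entries, it stays easy to audit, and every other genus word still goes through the code-based selection.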
6. References
Amsler, Robert A., and John S. White (1979). Development of a
    Computational Methodology for Deriving Natural Language
    Semantic Structures via Analysis of Machine-readable
    Dictionaries. Technical Report MCS77-01315, NSF.
Copestake, A. (1990). An approach to building the hierarchical
    element of a lexical knowledge base from a machine readable
    dictionary. Proceedings of the First International Workshop
    on Inheritance in Natural Language Processing, Tilburg, The
    Netherlands, pp. 19-29.
Chodorow, Martin S., Roy J. Byrd, and George E. Heidorn (1985).
    Extracting Semantic Hierarchies from a Large On-Line
    Dictionary. Proceedings of the 23rd Annual Meeting of the
    ACL, Chicago, IL, USA, pp. 299-304.
Guo, Cheng-Ming (1989). Constructing a Machine Tractable
    Dictionary From Longman Dictionary of Contemporary English.
    Memoranda in Computer and Cognitive Science, MCCS-89-156.
    Computing Research Laboratory, New Mexico State University.
Guthrie, Louise, Brian Slator, Yorick Wilks, and Rebecca Bruce
    (1990). Is there content in Empty Heads? Proceedings of the
    13th International Conference on Computational Linguistics
    (COLING-90), Helsinki, Finland, 3, pp. 138-143.
Klavans, J., Chodorow, M., Wacholder, N. (1990). From
    Dictionary to Knowledge Base Via Taxonomy. Proc. of the 6th
    Conference UW Center for the New OED, Waterloo, pp. 110-132.
Nakamura, Jun-ichi, and Makoto Nagao (1988). Extraction of
    Semantic Information from an Ordinary English Dictionary
    and its Evaluation. Proceedings of the 12th International
    Conference on Computational Linguistics (COLING-88),
    Budapest, Hungary, pp. 459-464.
Slator, Brian M. (1988). Constructing Contextually Organized
    Lexical Semantic Knowledge-bases. Proceedings of the Third
    Annual Rocky Mountain Conference on Artificial Intelligence
    (RMCAI-88), Denver, CO, pp. 142-148.
Ide, N.N. and J. Veronis (1990). Very Large Neural Networks for
    Word Sense Disambiguation. European Conference on
    Artificial Intelligence, ECAI '90, Stockholm.
Vossen, P. (1991). Polysemy and Vagueness of Meaning
    Descriptions in the Longman Dictionary of Contemporary
    English. In J. Svartvik and H. Wekker (eds.), Topics in
    English Linguistics. Mouton de Gruyter.
