Statistical Matching of Two Ontologies 
Satoshi Sekine 
Computer' Scmnce Department 
New York Umvermty 
715 Broadway, 7th floor 
New York, NY 10003 USA 
\[seklne \[klyo7793\] @cs nyu edu 
Kiyoshi Sudo Takano Ogino 
Electromc Dlctmnary Research 
78-1 Sakumahlgan, Kanda, Chmda-ku 
Tokyo, 101-0026 Japan 
oglno@edr co 3P 
1 Introductmn 
Standardizing ontologms ~s a challenging task 
Ontologms have been created based on different 
backgrounds, different purposes and different peo- 
ple However, standardizing them is useful not 
only for applications, such as Machine Translation 
and Information Retrmval, but also to Improve the. 
ontologms themselves During the process of stan- 
dardization, people can find bugs or gaps in on- 
tologms So standardlzatmn bnngs benefits com- 
pared to just using them separately There is a 
committee for standardlzmg ontologaes at ANSI, 
the "ANSI Ad-Hoc Group for Ontology Stan- 
dards" (Hovy 1996) 
Although there have been a few attempts to 
merge and compare ontologaes, th,s work ~s still 
at a prehmmary stage of research (Ogmo et al 
1997) attempts manual mergang of EDR (EDR 
1996) (Mlyoshl et al 1996) and WordNet (Word- 
net) (Miller 1995), (Utlyama and Hashlda 1997) 
used statistical methods to merge EDR and Word- 
Net (Pangloss) is also working on standardizing 
ontologms It is certain that manual methods have 
great difficulty in matching the entire ontologles 
It would require three thousand years for a per- 
son to check all possible node pairings, if the two 
ontologms have 40 000 nodes each.and eachjudge- 
ment takes a minute So automatic methods are 
needed to find matches automatically or at least 
to narrow down the candidates for matching 
In this paper, we investigate a simple statistical 
method for matching two ontologms The method 
can appl~ to any ontologms which are formulated 
from ls-a relationships In our experiments, we 
used EDR and \VotdNet Tins ~ork is sumlar to 
the work in (UtL~ama and Hashlda 1997) They 
defined the task as the MWM (Maximum V~elgnt 
klatch) of bipartite graphs, an approach which 
is bas~cally common to most ontology matching 
schemes The information they used is partially 
fuzzy, i e for calculating the distance between two 
nodes, they used the information from each node 
and its neighborhood, not distinguishing between 
mformatmn from parent and child nodes How- 
ever, since the structure of the ontologms (the re- 
lation between parent and children) is slgmficant, 
it might be better to utilize such structural refor- 
mation In our experiments, we will focus on this 
issue, rather than trying to achieve a higher per- 
formance The importance of parent, child and 
grandchild information will be examined We will 
conduct several experiments with or without some 
of the mformatlon It is also important to dlsco~er 
what welghtmg balance gives good matches 
2 Ontologies 
First we will briefly explain the ontologms we used 
m our experiments 
2.1 EDR 
The EDR Concept Dmtlonary contains 400,000 
concepts hsted m the Japanese and Enghsh Word 
Dmtlonanes of 200,000 words each The EDR 
Concept Dictionary is one of the five types of EDR 
dictionaries, the others are the Word Dmtlonarms 
for English and Japanese the Blhngual Dictio- 
nary, the Coocurrence Dictionary, and the Tech- 
mcal Telmmology Dxctlonar} The EDR Con- 
cept Dictionary consists of three sub-dmuonanes 
the Headconcept Dlctxonaz} contains concept ex- 
planations m natural language (both m Engh~h 
and Japanese),~the Concept Classification Dmuo- 
nar} contains a set of ls-a relationships, and the 
Concept Description Dictionary contains pairs of 
concepts that have certain semantic relationships 
other than ls-a relationship 1 e object, agent 
9oal, zmplement a-object (object of a particular at- 
tribute), place, scene and cause 
The Concept Classification Dmtlonar~ classifies 
all the 400 000 concepts based on their meaning 
A polysemous ~ord is put into several word cias- 
sffieatmns (concepts) As multiple inheritance l~ 
allowed, the entire structure is not a tree but a 
DAG (directed acychc graph) There are 6,000 
intermediate nodes and the maximum depth is 16 
2 2 WordNet 
WordNet (Wordnet) is an English ontology The 
nodes are represented by a set of synonym words 
(called ' s? nsets ') WordNet contains 60,557 noun 
69 
synsets, 11,363 adjective synsets, and 3,243 ad- 
verb synsets Between synsets, there are rela- 
tions whmh include (but are not hmited to) hy- 
pernymy/hyponymy, antonymy, entailment and 
meronymy/holonymy A word or collocatmn may 
appear m more than one synset, and in more than 
one part of speech The words m a synset axe log- 
ically grouped such that they are interchangeable 
m some context 
3 Experiments 
The basic idea of the matching m to find the dm- 
tance (similarity) between a node in EDIt and a 
node m WordNet There could be several strate- 
gins for defining a distance between two nodes, ~e 
will use the words attached to each node and its 
parent, child and grandchild m the computatmn 
We did not use the descmptmns of concepts 
As a prehmmary experiment, we restricted the 
number of nodes to be considered, because both 
ontologms are big We used the nodes at the top 
5 levels (distance from the top is at most 5) and 
deleted nodes which have no English words and no 
descendents In EDIt (some EDIt nodes have only 
Japanese words) This left 14,712 nodes In EDIt 
and 5,185 m WordNet Even with these restric- 
tion, the number of possible pmrmgs Is 76,281,720 
Our target m to find good matches among them 
3 1 Definition of Distance 
The dmtance between nodes is defined based on 
the notion which ~s commonly used, the dine co- 
efficient Assume the node N1 m ontologyl has 
nl words and N~ m ontology2 has n2 words, and 
there are m words m common The dice coefficmnt 
(DC) is defined as follows 
2m DC(NI,N2) = 
nl -t- n 2 
Now we define the basic distance as 1 minus the 
~alue The smaller the distance, the closer the two 
nodes 2m 
dzst(N1, N~) = 1 (1) 
n 1 -4- n2 
We now define the distance of two nodes 
(N1,N~.) based on the basra dlstance definition 
The words m parents, children and glandchildren 
are also used Such nodes are taken as a bag of 
nodes, le only one set of words is created for 
each category regardless of the number of node~ 
Such a bag of nodes is represented as N parent and 
so on The distance of each category is calcu- 
lated just hke the basic d~stance In the following 
equation, cat should be replaced by parent, ztsel/, 
chzld and gchdd (for grandchild) 
dzstcat(N1,N2) = dzst(N~at,N~ ~t) 
2rn cat = 1 
Then interpolation is used to merge the four 
basic distances in order to keep the range 
from 0 to 1 We Introduce four coefficients 
cParent,cttsel/,acMld,c gch~ld tO define the node dls- 
tance , D(Nt, N2) 
D(N1,N2) = c p .... t dzstP~.e,~t(N1,N2 ) + 
c 't~e~/ d:st'ts~i(Ni,N~) + 
cCa'ta dzstChad(Ni,N2) + 
cgchtld dzstgChaU(N1, N2) 
cParent "1" cttself "f" cChtld + C gch'ld : 1 (2) 
The coefficients (cent's) will be the lraportant 
factor in the experiments As will be described 
m the next section, we use several combinations 
of the coefficients to observe which mformation ts 
important 
3 2 Experiments 
We conducted eight experiments using different 
combinations of the coefficients The first expem- 
ment uses only the reformation in the nodes them- 
selves, while other expemments use the node and 
parent, the node and children, or all four sets 
Table 1 shows the coefficient combinations used 
m the expemments 
E? 
2 
3 
4 
5 
6 
7 
8 
parents self child gchild 
00 10 00 00 
00 03 07 00- 
00 05 05 00 
O0 07 03 O0 
03 07 O0 O0 - 
02 05 03 O0 
02 06 02 00 
02 05 02 01-- 
Table.1 Coefficmnt- C0mbmatlon 
3 2 1 Analysm of the statmt~cal results 
Before descmblng the e~aluatlon results, some 
interesting anal~ ses are presented m thin sectmn 
These analyses do not concern directly the evalu- 
atlon of the experiment, but indicate the natme 
of the expemments Ol the nature of the ontolog~es 
Number of outputs 
We used a threshold to resulct the nuInber of 
outputs If the distance ~s greater than 0 9, the 
result is not generated Table 2 shows the number 
of outputs m each experiment Itecall that there 
are 76,281,720-possible pairings of nodes It is 
interesting to see that the numbers are almost the 
same The number of outputs in E'cperlment-4 
is shghtly smaller, we believe thin is because the 
weight asmgned to the nodes themsel~es, wluch 
gl~es the greatest contmbutmn, ~s low 
70 
Experiment (Coefficients) 
(00, ~ o, 00,00) 
2 (00,07,03,00) 
3 (00, 05, 05,00) 
4 (00, 03, 0 7, 00) 
5 (03, 07, 00,00) 
" 6 (02,05,03,00) 
7 (02, 06, 0 2, 00) 
8 (02, 05, 02, 0 1) 
Output 
10,275 
10,151 
10,151 
9,093 
10,799 
10,098 
10,206 
10,028 
Table 2 Number of Outputs 
The numbers are around 10,000, which repre- 
sents 0 013% of the possible matches This sug- 
gests that there is a posslblllty of narrowing down 
the matches to be examined by a human, as the 
distance 0 9 ,s very large and the number of out- 
puts ,s so small To prove th,s assumption, we 
have to conduct an evaluatmn to see ff there are 
good matches which were not generated Th,s ,s 
beyond the evaluatmn m thls paper, because it 
reqmres manual matching from scratch We will 
discuss this later 
Complete Match 
We can find the number of complete matches 
(which have exactly the same word(s)) by count- 
mg the pmrs w~th d,stance 0 0 m Expenment-1 
The number of complete matches is 1778, whlch ,s 
qmte large compared to the number of nodes un- 
der conslderatmn m WordNet (about 5,000) Also, 
by counting up the number of pmrs w,th distance 
0 0 m Experiment-5, we can find parent-complete 
matches whmh are complete matches where the 
parents also have the same words The number of 
parent-complete matches is 1 This is surprisingly 
small, even cons,dermg that we used only subsets 
of the 0ntologms The only parent-match is the 
following- 
parent Invertebrate 
child arthropod 
Naturally people mlght guess that there would be 
more parent-complete matches For example, the 
name of a mammal might be a plaus,ble candi- 
date (where the parent is "mammal" and child 
is, for example, "elephant") However, this is not 
the case "Elephant" and "mammal" appear as 
follows (unrelated nodes are not sho~n) 
EDR 
<no Engl~.sh word, Japanese=mammal> 
+ ..... <mammal, J-Descnptxon- 
\[ an ~nstance of mammal> 
+ ..... <elephant> 
WordNet 
<mammal> 
+ ..... <probos c~dean, probosc~dlan> 
+ ...... <elephant> 
Thls is one of the typlcal problems of ontolog} 
deslgn, how detail concepts should be mtrocuced 
Also, there is a translatlon problem m EDR, ,e 
sometimes there ,s words or a descnptmn m only 
one language 
There are some other "reasons why the number 
of parent-matches ,s so small 
• Some nodes m EDR have no words assocl- 
ated wlth them Thls is how the EDR Class,- 
ficatlon Dmtlonary was deslgned It ~s based 
on the classfficat,on of words into some pre- 
defined boxes, and not creating hmrarchy of 
words It would be better to use the con- 
cept descnptlons of the dlctlonary, although 
it is not clear how to compare a s)nset (set of 
words) and a descnptlon Also, we mlght be 
able to use mformatlon written m Japanese 
when there ,s no Enghsh word but there are 
Japanese words 
• WordNet uses a synset to represent a node, 
whereas EDR's node Is pnmarlly represented 
by a descriptlon, there could be differences 
caused by thls The average numbers of 
words m a node are also different 
There were no chlldren-matches, whmh are 
complete matches where the words m the child 
nodes are also the same The closest matches m 
Experiment-2 and 3 are the following 
EDR 
parent (*) year 
children school year 
WordNet 
parent (*) year 
children anomallstlc year, lunar 
year, school year, academlc year, 
solar year, troplcal year, astro- 
nomlcal year, equinoctial year 
(There are actually 4 child nodes ) 
3 2 2 Evaluatmn 
As ~t ~s lmposs~ble to evaluate all the results, ~e 
selected four ranges (rank 1 to 20, 501 to 520, 2001 
to 2020, and 9001 to 9020) and the data m these 
ranges was evaluated manually E~aluatmn ~as 
done by putting the matches into three categories 
• A Two nodes are completel:y the same con- 
cept 
• B Other than A and C 
• C Two nodes me completely d~fferent con- 
cepts 
Category B includes several different things, in- 
cluding partml matches and ambiguous cases b3 
the manual evaluatmn However, the number of 
results m th~s category was not so large, so ~t 
should not affeSt the overall evaluatmn Table 3 
shows the evaluatmn result The columns repre- 
sent the four ranges and the each row represents 
one of the e,ght experiments An element has 
71 
Experiment 1-20 501-520 2001-2020 9001-9020- 
1(00, 10,00,00) 
2 (00, 07,03, 00) 
3 (00, 05,05, 00) 
4 (00,03,07, 00) 
5 (03, o 7, o o, oo) 
6 (02,05,03,00) 
7(02,06,02,00) 
8 (02, 05,02,01) 
311116 8/1111 611/13 6/1113 
611113 6/1113 
211117 101317 10/1/9 7/1/12 
11/1/8 6/1/13 
11/1/8 6/1/13 11/1/8 6/1/13 
4/2/14 
3/3/14 
3/3/14 
4/4/12 
2/3/15 
2/3/15 
2/3/15 
2/3/15 
5/4/11 
1/2/17 i'/2/17 
5/5/10 6/5/9 
5/9/6 1/7/12 
5/6/9 
Table 3 Evaluauon Result 
.q,~ • 
three numbers, corresponding to the categorms A, 
B and C, separated by "/" We can't make a direct 
comparison to other methods For example, while 
(Utlyama and Hashlda 1997) also used EDR and 
WordNet, they used only.connected components 
and we i/v/pose d the level restnctmn However, 
relative comparisons among our 8 experiments ar, e 
meaningful and important We will discuss them 
m the next section 
3 3 Dmcussmn 
Using only the nodes themselves (Exp-1) 
In Experiment-i, only the words m the nodes be- 
mg compared are used The evaluatmn result was 
not very good For example, there are only 3 
matches of category A m the highest range Based 
on an exammatmn of the results, we observed that 
this is due to word polysemy Even ff two nodes 
have a word m common, the word could have sev- 
eral meanings, and hence the corresponding nodes 
could have different meamngs For example, the 
word "love" can mean "emotion" or "no point in 
tenms" To see how the results we obtained m~ght 
arise, suppose a word has 4 senses in ontology1 
and 5 m ontology2, and there are 3 senses which 
are the same m the' two ontologms Then there are 
20 pairings of the senses and out of them only3 
can be judged as category A Although this is just 
an assumptmn, the reahty m~ght not be that far 
from this explanation based on the observation of 
the result 
Adding chdd nodes (Exp-2,3,4) 
In Experiment-2,3 and 4, we used the mforma- 
tmn of the nodes themselves and their child nodes 
The evaluatmn results for Experiment-2 and 3 are 
the same, both of them have 6 A's in the h~ghest 
range The number is twine that in Expenment- 
1 This improvement is due to dlsamblguatmn of 
polysemous words For example, the same sense 
of a polysemous word might have similar words 
in the child nodes, whereas it might be rare that 
different senses have the same words m the two 
ontologms 
In Experiment-4, we put more weight on child 
nodes rather than the nodes themselves This 
experiment was conducted based on the assump- 
tmn that the number of words m child nodes may 
be much larger than the number of words in the 
nodes themselves However, th~s turns out to give 
a degradation at the higher range Observing the 
result, the matches at the h~gher range have ~er~ 
few words m the child nodes If the number of 
chdd nodes are small in both ontologms and they 
have many m common, the d~stance between the 
nodes becomes extremely small Th~s could be 
both beneficml and harmful It can p,ck up some 
matches which could not be found m Experiment- 
1, but the matches could be good or bad ones The 
followmg example is a good one which is actually 
found at the ninth rank m Experiment-4 
EDIt 
parent(*) No Engllsh word, J-descrlptlon 
"target anlmals huntlng or flshlng" 
chlldren game, k111 
WordNet 
pareni; (*) prey, quarry 
children game 
Adding parent nodes (Exp-5) 
In Experiment-5, the words in the nodes them- 
selves and their parent nodes are used It can 
be naturally thought that the words in the par- 
ent nodes are useful to dlsamblguate polysemous 
words The result confirmed this In the high- 
est range, category A has 10 matches out of 20 
which ,s three t,mes as much as m Experiment-I, 
and twice that m Experiment-2 and 3 
Using parents, self and chddren (Exp-6,7) 
In Expernnent-6 and 7 ~olds in parent, self and 
child nodes are used with different welghtmgs All 
e~aluaUon results are ~dentlcal e<cept the lowest 
range, and these have the largest number of A's 
at the hlghest range among all of the experiments 
This mdmates that three sources together is better 
than any two or an~ single source of reformation 
Adding grandchdd nodes (Exp-8) 
Finally, m Experiment-8, words m all four kinds 
of nodes, parent, self, child and grandchild, are 
used The evaluation result is the same as that m 
72 
Experiment-6, and we could not see improvement 
by adding grandchild information Actually, by 
observing the result, we can see that the informa- 
tion at the grandchild level is not so useful 
Observing the evaluation process 
From the evaluation process, we understand that 
a human uses not only the four kinds of mforma- 
tmn, but also mformatmn ,n grandparent or the 
successor's nodes Some ,mprovement rmght be 
obta, ned if we used such mformatmn Also, we 
m,ght be able to achmve more improvement by 
using sibhng nodes, and the result of distance cal- 
culation of other nodes 
As we presented by the example of "mammal" 
and "elephant", there are the cases where m one 
ontology a relatmnshlp m parent-child, but m the 
other ontology ~t m a grandparent-grandchild re- 
laUonsh,p or a slbhng-relationship It would be 
better ff we took the charactenstms of each ontol- " 
ogy and differences of the ontologms into account 
m the calculatmn In particular, the reformation 
m ancestors might be very useful 
Other distance definitions 
In our method, we simply used the dice coefficmnt 
However, we can use more comphcated or sophmt,- 
cared measures For example, (Resmk 1995) pro- 
posed a measure of semant,c similarity based on 
the notmn of information content Although thin 
proposal defines mmflanty between two nodes m 
a single taxonomy or ontology, we may be able to 
apply ,tm our mtuatmn 
(Aglrre et al 1995) proposed conceptual dm- 
tance between nodes on ontologles captured by a 
Conceptual Density formula It is also a defimtmn 
m a single ontology 
Recently, (O'Hara and et al 1998) conducted 
an experiment of matchmg two ontologms, Word- 
consider the characteristics of the ontologms One 
goal of our future ~ork is to understand how to 
incorporate such characteristics into these statm- 
t,cal methods 
5 Acknowledgements 
We would like to thank ProfRalph Grmhman 
at New York Un,verslty for his suggestions and 
anonymous revmwers who gave us some severe 
comments 
other heunstms It m not so clear hov~"to compare 
the method to our method, as they used several 
heur,stms whmh m not d,rectly comparable to our 
method However we noticed that it is a very Im- 
portant to mvestlgate their methods 
4 Conclusion 
We proposed a statmtmal method of matching two 
ontologms Since It m impossible to exhaust,vely 
consider all matches by hand, automatic methods 
to mak~ matches or to narrow down the cand,date 
matches are needed Although the experiments 
are prehmmary, they show what kinds of mforma- 
tmn m useful m statmtlcal matching We found 
that parent nodes, bemdes the nodes themselves, 
are the most useful for matching by dmamb,guat- 
mg the synonyms of words The best performance 
was achieved by using words m parent, ,tself and 
child nodes We observed that ~t is important to 
Net and the Mlkrokosmos Ontology They used .i~..
the definmon proposed m (Resmk 1995) among .' 

References 

Eneko Aglrre and German Rlgau, "A Proposal for 
Word sense Dmamb,guatmn using Conceptional 
distance" Proc of the 1st Internatzonal Confer- 
ence on Recent Advances m natural Language 
Processing 1995 

EDRElectromc D1ctmnary Version 1 5 Techmcal 
Guide EDR TR2-O07, 1996 

Eduard Hovy "Creating a Large Ontology", ANSI 
Ad Hoc Group on Ontology, Stanford Umver- 
sity, 1996 

George Miller "WordNet A lex,cal database for 
English" Communzcatzons of the ACM, 38{1i) 
pp39-~1, 1995 

Hldeo Mlyoshl, Ken j, Sugiyama, Masah~ro 
Kobayash, and Takano Ogmo "An Overview of 
the EDR Electronic D,ctionary and the Current 
Status of Its Utflmatmn", Proc o/COLING-g6, 
1996 

Takano Ogmo, Hldeo Mlyoshl, Masahlro 
Kobayashl, Fumlhlto Nmhmo and Junhch, Tsu- 
ju "An Experiment on Matching EDR Concept 
Classfficahon Dictionary with WordNet', Proc 
o/IJICAI-97, 1997 

Tom O'Hara, Kaw Mahesh and Serge, Nlren- 
burg, "LexlcalAcqu~sltmn with WordNet and 
the Mlkrokosmos Ontolog)" Proc o/the COL- 
ING/ACL Workshop on Usage of WordNet m 
Natural Language Processzng Systems t998 

Pangloss Project (InformaUon Scmnces Insutute 
(ISl) / Uinver- 
mty of Southern Cahforma (USC)) homepage 
"http//www ,s, edu/natural-language/nlp.at. 
is, html" 

Phlhp Resmk, "Using Information Content to 
Evaluate Semant,c S,mflant~ m a Taxonomy" 
Proc o/ the 14th Internatwnal Joint Confer- 
ence on Artzficzal Intelhgence, 1995 

UTtYAMA Masao, HASHIDA Ko,tL "Bottom- 
up Ahgnment of Ontologms" Proc of IJCAI97 
Workshop on Ontologzes and Multdzngual Nat- 
ural Language Processing 1997 

WordNet Homepage 
"http//www cogsc, princeton edu/wn/" 
