Structural Semantic Interconnection: a knowledge-based approach to Word 
Sense Disambiguation 
 
Roberto NAVIGLI 
Dipartimento di Informatica, 
Università di Roma “La Sapienza” 
Via Salaria, 113 - 00198 Roma, Italy 
navigli@di.uniroma1.it 
 
Paola VELARDI 
Dipartimento di Informatica, 
Università di Roma “La Sapienza” 
Via Salaria, 113 - 00198 Roma, Italy 
velardi@di.uniroma1.it 
 
Abstract 
In this paper we describe the SSI algorithm, a 
structural pattern matching algorithm for 
WSD. The algorithm has been applied to the 
gloss disambiguation task of Senseval-3. 
1 Introduction 
Our approach to WSD lies in the structural 
pattern recognition framework. Structural or 
syntactic pattern recognition (Bunke and Sanfeliu, 
1990) has proven to be effective when the objects 
to be classified contain an inherent, identifiable 
organization, such as image data and time-series 
data. For these objects, a representation based on a 
“flat” vector of features causes a loss of 
information that negatively impacts 
classification performance. Word senses clearly 
fall under the category of objects that are better 
described through a set of structured features.  
The classification task in a structural pattern 
recognition system is implemented through the 
use of grammars that embody precise criteria to 
discriminate among different classes. Learning a 
structure for the objects to be classified is often a 
major problem in many application areas of 
structural pattern recognition. In the field of 
computational linguistics, however, several efforts 
have been made in the past years to produce large 
lexical knowledge bases and annotated resources, 
offering an ideal starting point for constructing 
structured representations of word senses. 
2 Building structural representations of 
word senses 
We build a structural representation of word 
senses using a variety of knowledge sources, i.e. 
WordNet, Domain Labels (Magnini and Cavaglia, 
2000), and annotated corpora like SemCor and 
LDC-DSO¹. We use this information to automatically 
generate labeled directed graph (digraph) 
representations of word senses. We call these 
semantic graphs, since they represent alternative 
conceptualizations for a lexical item. 

¹ LDC http://www.ldc.upenn.edu/ 
Figure 1 shows an example of the semantic 
graph generated for sense #1 of market, where 
nodes represent concepts (WordNet synsets), and 
edges are semantic relations. In each graph, we 
include only nodes with a maximum distance of 3 
from the central node, as suggested by the dashed 
oval in Figure 1. This distance has been 
experimentally established.  
[Figure 1. Graph representation for sense #1 of market: nodes are WordNet synsets (e.g. goods#1, trading#1, merchandise#1, monopoly#1, export#1, activity#1, commerce#1, industry#2, enterprise#1, artifact#1), connected by kind-of, has-kind, has-part, gloss and topic edges.] 
All the used semantic relations are explicitly 
encoded in WordNet, except for three relations 
named topic, gloss and domain, extracted 
respectively from annotated corpora, sense 
definitions and domain labels. 
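Concretely, building one of these graphs amounts to a breadth-first traversal of the relation triples around the central sense, pruned at distance 3. The triples below are a hypothetical subset of the Figure 1 edges, and all helper names are illustrative rather than the paper's actual implementation:

```python
from collections import defaultdict, deque

# Hypothetical subset of the relation triples behind Figure 1.
TRIPLES = [
    ("market#1", "kind-of", "activity#1"),
    ("market#1", "gloss", "goods#1"),
    ("goods#1", "has-kind", "export#1"),
    ("goods#1", "has-kind", "merchandise#1"),
    ("activity#1", "has-kind", "commercial_enterprise#2"),
    ("commercial_enterprise#2", "has-part", "commerce#1"),
    ("commerce#1", "has-part", "transportation#5"),
]

def semantic_graph(center, triples, max_dist=3):
    """Keep only the triples whose nodes lie within max_dist of center."""
    adj = defaultdict(set)
    for s, _, t in triples:
        adj[s].add(t)
        adj[t].add(s)          # distance ignores edge direction
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue           # prune: do not expand beyond the cutoff
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return [(s, l, t) for s, l, t in triples if s in dist and t in dist]

graph = semantic_graph("market#1", TRIPLES)
```

With the cutoff at 3, a node like transportation#5 (four hops from market#1 in these toy triples) is excluded, mirroring the dashed oval of Figure 1.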
3 Summary description of the SSI algorithm 
The SSI algorithm consists of an initialization step 
and an iterative step.  
In a generic iteration of the algorithm the input 
is a list of co-occurring terms T = [t_1, …, t_n] and 
a list of associated senses I = [S_{t_1}, …, S_{t_n}], i.e. the 
semantic interpretation of T, where S_{t_i}² is either 
the chosen sense for t_i (i.e., the result of a previous 

² Note that with S_{t_i} we refer interchangeably to the semantic 
graph associated with a sense or to the sense name. 
[SENSEVAL-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, Barcelona, Spain, July 2004. Association for Computational Linguistics.] 
disambiguation step) or the empty set (i.e., the 
term is not yet disambiguated).  
A set of pending terms is also maintained, 
P = { t_i | S_{t_i} = ∅ }. I is named the semantic context of T 
and is used, at each step, to disambiguate new 
terms in P.  
The algorithm works iteratively: at each stage 
either at least one term is removed 
from P (i.e., at least one pending term is 
disambiguated) or the procedure stops because no 
more terms can be disambiguated. The output is 
the updated list I of senses associated with the 
input terms T.
Initially, the list I includes the senses of 
monosemous terms in T. If no monosemous terms 
are found, the algorithm makes an initial guess 
based on the most probable sense of the least 
ambiguous term. The initialization policy is 
adjusted depending upon the specific WSD task 
considered. Section 5 describes the policy adopted 
for the task of gloss disambiguation in WordNet. 
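The initialization and iteration scheme described above can be sketched in Python. This is a minimal sketch with hypothetical helper signatures (the paper publishes no code); note also that the real algorithm disambiguates every interconnected term in each iteration, whereas this sketch greedily commits one argmax choice at a time:

```python
def ssi(terms, senses_of, score):
    """Sketch of the SSI loop: terms is T; senses_of(t) lists the candidate
    senses of t; score(S, t, context) plays the role of f_I(S, t)."""
    # Initialization: monosemous terms seed the semantic context I.
    I = {t: senses_of(t)[0] for t in terms if len(senses_of(t)) == 1}
    P = {t for t in terms if t not in I}            # pending terms
    while P:
        best = None
        for t in P:
            for S in senses_of(t):
                s = score(S, t, I.values())
                if s > 0 and (best is None or s > best[0]):
                    best = (s, t, S)
        if best is None:                            # P cannot be reduced further
            break
        _, t, S = best
        I[t] = S                                    # disambiguate t as sense S
        P.discard(t)
    return I

# Toy usage: "a" is monosemous; "b" prefers b#1 once a#1 is in the context.
I = ssi(["a", "b"],
        lambda t: {"a": ["a#1"], "b": ["b#1", "b#2"]}[t],
        lambda S, t, ctx: 1.0 if S == "b#1" and "a#1" in list(ctx) else 0.0)
```

The greedy argmax variant keeps the sketch short while preserving the termination property: each pass either shrinks P or stops.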
During a generic iteration, the algorithm selects 
those terms t in P showing an interconnection 
between at least one sense S of t and one or more 
senses in I. The likelihood for a sense S of being 
the correct interpretation of t, given the semantic 
context I, is estimated by the function 
f_I : C × T → ℝ, where C is the set of all the 
concepts in the ontology O, defined as follows: 

  f_I(S, t) = ρ({ φ(S, S') | S' ∈ I })  if S ∈ Senses(t), and 0 otherwise, 

where Senses(t) is the subset of concepts in C 
associated with the term t, and 

  φ(S, S') = ρ'({ w(e_1·e_2·…·e_n) | S →e_1 S_1 →e_2 … →e_n S' }), 

i.e. a function (ρ') of the weights (w) of each path 
connecting S with S', where S and S' are 
represented by semantic graphs. A semantic path 
between two senses S and S', 

  S →e_1 S_1 →e_2 … →e_n S', 

is represented by a sequence of edge labels 
e_1·e_2·…·e_n. A proper choice for both ρ and ρ' may 
be the sum function (or the average sum function). 
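Under the choice ρ = ρ' = sum, f_I reduces to a double summation over context senses and connecting paths. A minimal sketch, where the helper names senses, paths and weight are hypothetical stand-ins rather than the paper's API:

```python
def f_I(S, t, I, senses, paths, weight):
    """f_I(S, t) with rho = rho' = sum.
    senses(t)    -- the candidate concepts Senses(t)
    paths(S, S2) -- edge-label sequences of the paths connecting S and S2
    weight(p)    -- w(e1.e2...en), the grammar-derived path weight"""
    if S not in senses(t):
        return 0.0
    # phi(S, S') = sum of path weights; f_I sums phi over the context I.
    return sum(sum(weight(p) for p in paths(S, S2)) for S2 in I)

# Toy usage mirroring the gloss example of Section 5: one kind-of^2 path
# between exhibition#2 and retrospective#1, with a made-up weight.
score = f_I("exhibition#2", "exhibition", ["retrospective#1"],
            lambda t: {"exhibition#1", "exhibition#2"},
            lambda S, S2: ([["kind-of", "kind-of"]]
                           if (S, S2) == ("exhibition#2", "retrospective#1")
                           else []),
            lambda p: 0.4)   # hypothetical weight for a kind-of chain
```

A sense with no paths into the context (here exhibition#1) scores 0 and is never preferred.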
A context-free grammar G = (E, N, S_G, P_G) 
encodes all the meaningful semantic patterns. The 
terminal symbols (E) are edge labels, while the 
non-terminal symbols (N) encode (sub)paths 
between concepts; S_G is the start symbol of G and 
P_G the set of its productions. 
We associate a weight with each production 
A → α in P_G, where A ∈ N and α ∈ (N ∪ E)*, i.e. 
α is a sequence of terminal and non-terminal 
symbols. If the sequence of edge labels e_1·e_2·…·e_n 
belongs to L(G), the language generated by the 
grammar, and provided that G is not ambiguous, 
then w(e_1·e_2·…·e_n) is given by the sum of the 
weights of the productions applied in the 
derivation S_G ⇒+ e_1·e_2·…·e_n. The grammar G is 
described in the next section. 
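Since G is not published in full, a sketch can approximate w(e_1·e_2·…·e_n) by matching the edge-label sequence against a few regular patterns standing in for productions of G. Both the patterns and the weights below are hypothetical illustrations, not the paper's actual grammar:

```python
import re

# Hypothetical stand-ins for meaningful patterns of G, with made-up weights.
RULES = [
    (re.compile(r"^(kind-of/)+$"), 0.8),               # hyperonymy chains
    (re.compile(r"^(has-kind/)+$"), 0.7),              # hyponymy chains
    (re.compile(r"^(has-kind/)+(has-part/)+$"), 0.6),  # hyponymy/holonymy
    (re.compile(r"^gloss/$"), 0.5),                    # gloss relation
    (re.compile(r"^topic/$"), 0.5),                    # topic (co-occurrence)
]

def path_weight(edges):
    """Weight of an edge-label sequence; 0 if no pattern accepts it,
    i.e. the sequence is not in L(G)."""
    s = "".join(e + "/" for e in edges)
    for pattern, w in RULES:
        if pattern.match(s):
            return w
    return 0.0
```

With a genuinely context-free (non-regular) G, the same role would be played by a weighted parse of the edge sequence; the regular approximation just keeps the sketch compact.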
Finally, the algorithm selects argmax_{S ∈ C} f_I(S, t) as 
the most likely interpretation of t and updates the 
list I with the chosen concept. A threshold can be 
applied to f_I(S, t) to improve the robustness of 
the system’s choices. 
At the end of a generic iteration, a number of 
terms are disambiguated and each of them is 
removed from the set of pending terms P. The 
algorithm stops with output I when no sense S can 
be found for the remaining terms in P such that 
f_I(S, t) > 0, that is, when P cannot be further reduced. 
In each iteration, interconnections can only be 
found between the sense of a pending term t and 
the senses disambiguated during the previous 
iteration.  
A special case of input for the SSI algorithm is 
given by I = [∅, ∅, …, ∅], that is, when no initial 
semantic context is available (there are no 
monosemous words in T). In this case, an 
initialization policy selects a term t ∈ T and the 
execution is forked into as many processes as the 
number of senses of t. 
4 The grammar 
The grammar G has the purpose of describing 
meaningful interconnecting patterns among 
semantic graphs representing conceptualisations 
in O. We define a pattern as a sequence of 
consecutive semantic relations e_1·e_2·…·e_n, where 
e_i ∈ E, the set of terminal symbols, i.e. the 
vocabulary of conceptual relations in O. Two 
relations e_i, e_{i+1} are consecutive if the edges 
labelled with e_i and e_{i+1} are incoming and/or 
outgoing from the same concept node, that is 

  S →e_i S' →e_{i+1}, S ←e_i S' ←e_{i+1}, S →e_i S' ←e_{i+1}, or S ←e_i S' →e_{i+1}. 

A meaningful pattern between two senses S and S' 
is a sequence e_1·e_2·…·e_n that belongs to L(G). 
In its current version, the grammar G has been 
defined manually, by inspecting the intersecting 
patterns automatically extracted from pairs of 
manually disambiguated word senses co-occurring 
in different domains. Some of the rules in G are 
inspired by previous work on the eXtended 
WordNet project described in (Mihalcea and 
Moldovan, 2001). The terminal symbols e_i are the 
conceptual relations extracted from WordNet and 
other on-line lexical-semantic resources, as 
described in Section 2. 
G is defined as a quadruple (E, N, S_G, P_G), 
where E = { e_kind-of, e_has-kind, e_part-of, e_has-part, e_gloss, 
e_is-in-gloss, e_topic, … }, N = { S_G, S_s, S_g, S_1, S_2, S_3, S_4, S_5, 
S_6, E_1, E_2, … }, and P_G includes about 50 
productions.  
As stated in the previous section, the weight 
w(e_1·e_2·…·e_n) of a semantic path e_1·e_2·…·e_n is given 
by the sum of the weights of the productions 
applied in the derivation S_G ⇒+ e_1·e_2·…·e_n. These 
weights have been learned using a perceptron 
model, trained with standard word sense 
disambiguation data, such as the SemCor corpus. 
Examples of the rules in G are provided in 
Section 5. 
5 Application of the SSI algorithm to the 
disambiguation of WordNet glosses 
For the gloss disambiguation task, the SSI 
algorithm is initialized as follows: In step 1, the 
list I includes the synset S whose gloss we wish to 
disambiguate, and the list P includes all the terms 
in the gloss and in the gloss of the hyperonym of 
S. Words in the hyperonym’s gloss are useful to 
augment the context available for disambiguation.  
In the following, we present a sample execution of 
the SSI algorithm for the gloss disambiguation 
task applied to sense #1 of retrospective: “an
exhibition of a representative selection of an 
artist’s life work”. For this task the algorithm uses 
a context enriched with the definition of the synset 
hyperonym, i.e. art exhibition#1: “an exhibition of 
art objects (paintings or statues)”.  
Initially we have: 
I = { retrospective#1 }³ 
P = { work, object, exhibition, life, statue, artist, 
selection, representative, painting, art } 
At first, I is enriched with the senses of 
monosemous words in the definition of 
retrospective#1 and its hyperonym: 
I = { retrospective#1, statue#1, artist#1 }
P = { work, object, exhibition, life, selection, 
representative, painting, art }
since statue and artist are monosemous terms in 
WordNet. During the first iteration, the algorithm 
finds three matching paths⁴: 

retrospective#1 →kind-of^2 exhibition#2, statue#1 →kind-of^3 art#1, and statue#1 →kind-of^6 object#1. 

³ For convenience, here we denote I as a set rather 
than a list. 
⁴ With S →R^i S' we denote a path of i consecutive 
edges labeled with the relation R interconnecting S 
with S'. 
This leads to: 
I = { retrospective#1, statue#1, artist#1, 
exhibition#2, object#1, art#1 }
P = { work, life, selection, representative, painting 
}
During the second iteration, a 
hyponymy/holonymy path (rule S_2) is found: 

art#1 →has-kind^2 painting#1 (painting is a kind of art) 

which leads to: 
I = { retrospective#1, statue#1, artist#1, 
exhibition#2, object#1, art#1, painting#1 }
P = { work, life, selection, representative }
The third iteration finds a co-occurrence (topic 
rule) path between artist#1 and sense 12 of life 
(biography, life history): 
artist#1 →topic life#12 
then, we get: 
I = { retrospective#1, statue#1, artist#1, 
exhibition#2, object#1, art#1, painting#1, life#12 
}
P = { work, selection, representative }
The algorithm stops because no additional 
matches are found. The chosen senses concerning 
terms contained in the hyperonym’s gloss were of 
help during disambiguation, but are now 
discarded. Thus we have: 
GlossSynsets(retrospective#1) = { artist#1, 
exhibition#2, life#12, work#2 }
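The walkthrough above can be replayed mechanically. The table of matches below is copied from the paths reported in the example (collapsing the three iterations into a single fixpoint loop; the order of discoveries may differ, but the final context is the same):

```python
# term -> (sense chosen, context sense its path connects to), per the example.
MATCHES = {
    "exhibition": ("exhibition#2", "retrospective#1"),  # kind-of^2
    "art":        ("art#1",        "statue#1"),         # kind-of^3
    "object":     ("object#1",     "statue#1"),         # kind-of^6
    "painting":   ("painting#1",   "art#1"),            # has-kind^2
    "life":       ("life#12",      "artist#1"),         # topic
}

def replay(I, P):
    """Iterate until no pending term connects to the semantic context I."""
    changed = True
    while changed:
        changed = False
        for t in sorted(P):                 # sorted() copies P before mutation
            sense, anchor = MATCHES.get(t, (None, None))
            if sense and anchor in I:
                I.add(sense)
                P.discard(t)
                changed = True
    return I, P

I = {"retrospective#1", "statue#1", "artist#1"}  # after monosemous seeding
P = {"work", "object", "exhibition", "life", "selection",
     "representative", "painting", "art"}
I, P = replay(I, P)
```

As in the paper, work, selection and representative remain pending: no matching path reaches them from the context.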
6 Evaluation  
The SSI algorithm is currently tailored for noun 
disambiguation. Additional semantic knowledge 
and ad-hoc rules would be needed to detect 
semantic patterns centered on concepts associated 
with verbs. Current research is directed towards 
integrating in semantic graphs information from 
FrameNet and VerbNet, but the main problem is 
harmonizing these knowledge bases with 
WordNet’s senses and relations inventory.  A 
second problem of SSI, when applied to 
unrestricted WSD tasks, is that it is designed to 
disambiguate with high precision, possibly low 
recall. In many interesting applications of WSD, 
especially in information retrieval, improved 
document access may be obtained even when only 
a few words in a query are disambiguated, but the 
disambiguation precision needs to be well over 
the 70% threshold. Supporting experiments are 
described in (Navigli and Velardi, 2003). 
The results obtained by our system in Senseval-3 
reflect these limitations (see Figure 2).  
The main run, named OntoLearn, uses a 
threshold to select only those senses with a weight 
over a given value. OntoLearnEx uses a non-
greedy version of the SSI algorithm. Again, a 
threshold is used to accept or reject sense 
choices. Finally, OntoLearnB uses the “first 
sense” heuristic to select a sense whenever a 
sense choice is below the threshold (or no patterns 
are found for a given word).  
            OntoLearn  OntoLearnB  OntoLearnEx 
Precision     82.6%      68.5%       75.3% 
Recall        32.3%      68.4%       37.5% 
Attempted     39.1%      99.9%       49.7% 

Figure 2. Results of three runs submitted to Senseval-3. 
Table 1 shows the precision and recall of the 
OntoLearn main run by syntactic category. It 
shows that, as expected, the SSI algorithm is 
currently tuned for noun disambiguation. 

            Nouns   Verbs   Adj. 
Precision   86.0%   69.4%   78.6% 
Recall      44.7%   13.5%   26.2% 
Attempted   52.0%   19.5%   33.3% 

Table 1. Precision and Recall by syntactic category.  
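As a quick consistency check on the reported numbers, every (precision, recall, attempted) triple in Figure 2 and Table 1 satisfies recall ≈ precision × attempted, as expected when recall is measured over all instances and precision only over attempted ones:

```python
# (precision, recall, attempted) triples reported in Figure 2 and Table 1.
runs = {
    "OntoLearn":   (0.826, 0.323, 0.391),
    "OntoLearnB":  (0.685, 0.684, 0.999),
    "OntoLearnEx": (0.753, 0.375, 0.497),
    "Nouns":       (0.860, 0.447, 0.520),
    "Verbs":       (0.694, 0.135, 0.195),
    "Adj.":        (0.786, 0.262, 0.333),
}
# Deviation of each reported recall from precision * attempted.
deviations = {name: abs(r - p * a) for name, (p, r, a) in runs.items()}
```

All deviations stay within rounding error of the reported figures.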
The official Senseval-3 evaluation has been 
performed against a set of so-called “golden 
glosses” produced by Dan Moldovan and his 
group⁵. This test set, however, had several 
problems, which we partly detected and submitted to 
the organisers. 
Besides some technical errors in the data set 
(presence of WordNet 1.7 and 2.0 senses, missing 
glosses, etc.) there are sense-tagging 
inconsistencies that are very evident. 
For example, one of our highest performing 
sense-tagging rules in SSI is the direct 
hyperonymy path. This rule reads as follows: “if 
the word w_j appears in the gloss of a synset S_i, and 
if one of the synsets of w_j, S_j, is the direct 
hyperonym of S_i, then select S_j as the correct 
sense for w_j”.  
An example is custom#4, defined as “habitual 
patronage”. We have that: 

{custom-n#4} →kind-of {trade, patronage-n#5} 

⁵ http://xwn.hlt.utdallas.edu/wsd.html 

therefore we select sense #5 of patronage, while 
Moldovan’s “golden” sense is #1. 
We do not intend to dispute whether the 
“questionable” sense assignment is the one 
provided in the golden gloss or rather the 
hyperonym selected by the WordNet 
lexicographers. In any case, the detected patterns 
show a clear inconsistency in the data. 
These patterns (313) have been submitted to the 
organisers, who then decided to remove them 
from the data set.    
7 Conclusion 
The interesting feature of the SSI algorithm, 
unlike many co-occurrence-based and statistical 
approaches to WSD, is that it provides a justification 
(i.e. a set of semantic patterns) to support a sense choice. 
Furthermore, each sense choice has a weight 
representing the confidence of the system in its 
output. Therefore SSI can be tuned for high 
precision (possibly low recall), an asset that we 
consider more realistic for practical WSD 
applications. 
Currently, the system is tuned for noun 
disambiguation, since we build structural 
representations of word senses using lexical 
knowledge bases that are considerably richer for 
nouns. Extending the semantic graphs associated with 
verbs and adding appropriate interconnection 
rules implies harmonizing WordNet and available 
lexical resources for verbs, e.g. FrameNet and 
VerbNet. This extension is in progress. 

References

H. Bunke and A. Sanfeliu (editors) (1990)
Syntactic and Structural Pattern Recognition:
Theory and Applications, World Scientific, Series
in Computer Science, vol. 7, 1990.

A. Gangemi, R. Navigli and P. Velardi (2003)
The OntoWordNet Project: extension and
axiomatization of conceptual relations in
WordNet, 2nd Int. Conf. ODBASE, Springer
Verlag, 3-7 November 2003, Catania, Italy.

B. Magnini and G. Cavaglia (2000)
Integrating Subject Field Codes into WordNet,
Proceedings of LREC 2000, Athens, 2000.

R. Mihalcea and D. I. Moldovan (2001)
eXtended WordNet: progress report, NAACL
2001 Workshop on WordNet and Other Lexical
Resources, Pittsburgh, June 2001.

R. Navigli and P. Velardi (2003) An Analysis
of Ontology-based Query Expansion Strategies,
Workshop on Adaptive Text Extraction and
Mining, September 22nd, 2003, Cavtat-Dubrovnik
(Croatia), held in conjunction with ECML 2003.