Word Sense Disambiguation in a Korean-to-Japanese 
MT System Using Neural Networks 
You-Jin Chung, Sin-Jae Kang, Kyong-Hi Moon, and Jong-Hyeok Lee 
Div. of Electrical and Computer Engineering, Pohang University of Science and Technology (POSTECH) 
and Advanced Information Technology Research Center(AlTrc) 
San 31, Hyoja-dong, Nam-gu, Pohang, R. of KOREA, 790-784 
{prizer,sjkang,khmoon,jhlee}@postech.ac.kr 
 
Abstract  
This paper presents a method to resolve 
word sense ambiguity in a 
Korean-to-Japanese machine translation 
system using neural networks. The 
execution of our neural network model is 
based on the concept codes of a thesaurus. 
Most previous word sense disambiguation 
approaches based on neural networks have 
limitations due to their huge feature set size. 
By contrast, we reduce the number of 
features of the network to a practical size by 
using concept codes as features rather than 
the lexical words themselves. 
Introduction 
Korean-to-Japanese machine translation (MT) 
employs a direct MT strategy, where a Korean 
homograph may be translated into a different 
Japanese equivalent depending on which sense 
is used in a given context. Thus, word sense 
disambiguation (WSD) is essential to the 
selection of an appropriate Japanese target word. 
Much research on word sense disambiguation 
has revealed that several different types of 
information can contribute to the resolution of 
lexical ambiguity. These include surrounding 
words (an unordered set of words surrounding a 
target word), local collocations (a short sequence 
of words near a target word, taking word order 
into account), syntactic relations (selectional 
restrictions), parts of speech, morphological 
forms, etc (McRoy, 1992, Ng and Zelle, 1997). 
Some researchers use neural networks in 
their word sense disambiguation systems 
Because of its strong capability in classification 
(Waltz et al., 1985, Gallant, 1991, Leacock et al., 
1993, and Mooney, 1996). Since, however, most 
such methods require a few thousands of 
features or large amounts of hand-written data 
for training, it is not clear that the same neural 
network models will be applicable to real world 
applications. 
We propose a word sense disambiguation 
method that combines both the neural net-based 
approach and the work of Li et al (2000), 
especially focusing on the practicality of the 
method for application to real world MT 
systems. To reduce the number of input features 
of neural networks to a practical size, we use 
concept codes of a thesaurus as features. 
In this paper, Yale Romanization is used to 
represent Korean expressions. 
1 System Architecture 
Our neural network method consists of two 
phases. The first phase is the construction of the 
feature set for the neural network; the second 
phase is the construction and training of the 
neural network. (see Figure 1.) 
For practical reasons, a reasonably small 
number of features is essential to the design of a 
neural network. To construct a feature set of a 
reasonable size, we adopt Li’s method (2000), 
based on concept co-occurrence information 
(CCI). CCI are concept codes of words which 
co-occur with the target word for a specific 
syntactic relation. 
In accordance with Li’s method, we 
automatically extract CCI from a corpus by 
constructing a Korean sense-tagged corpus. To 
accomplish this, we apply a Japanese-to-Korean 
MT system. Next, we extract CCI from the 
constructed corpus through partial parsing and 
scanning. To eliminate noise and to reduce the 
number of CCI, refinement proceesing is applied 
Japanese Corpus
COBALT-J/K
(Japanese-to-Korean
MT system)
Sense Tagged
Korean Corpus
Partial Parsing
& Pattern Scanning
Raw CCI
CCI Refinement
Processing
Refined CCI
Feature Set Construction Neural Net Construction
Feature Set
Network
Construction
Neural Network
Network
Learning
Stored in
MT Dictionary
Network
Parameters
Figure 1. System Architecture 
noun
nature  character          society institute  things 
0            1                     7          8          9
astro- calen- animal        pheno-
nomy    dar                          mena
00      01              06             09
goods drugs  food       stationary    machine
90      91       92             96             99
orga- ani- sin- intes- egg    sex
nism  mal              ews   tine    
060    061               066   067    068   069
supp- writing- count- bell
lies      tool      book
960     961      962               969
•••••
•••••
•••••
••••• •••••
•••••
•••••
•••••
•••••
•••••
L
1
L
2
L
3
L
4
Figure 2. Concept hierarchy of the Kadokawa
thesaurus
to the extracted raw CCI. After completing 
refinement processing, we use the remaining 
CCI as features for the neural network. The 
trained network parameters are stored in a 
Korean-to-Japanese MT dictionary for WSD in 
translation. 
2 Construction of Refined Feature Set 
2.1 Automatic Construction of Sense-tagged 
Corpus 
For automatic construction of the sense-tagged 
corpus, we used a Japanese-to-Korean MT 
system called COBALT-J/K
1
. In the transfer 
dictionary of COBALT-J/K, nominal and verbal 
words are annotated with concept codes of the 
Kadokawa thesaurus (Ohno and Hamanishi, 
1981),
1,100 semantic classe
Concept nodes in level 
divided into 10 subclasses. 
COBALT-
translations from a J
nom
codes at level
a result, a Korean se
1,060,000 se
Japanes
Newspaper of Econom
                              
1
Translato
pract
The quality of the constructed sense-tagged 
corpus is a critical issue. To evaluate the quality, 
we collected 1,658 sample sentences (29,420 
eojeols
2
) from the corpus and checked their 
precision. The total number of errors was 789, 
and included such errors as morphological 
analysis, sense ambiguity resolution and 
unknown words. It corresponds to the accuracy 
of 97.3% (28,631 / 29,420 eojeols). 
Because almost all Japanese common nouns 
represented by Chinese characters are 
monosemous little transfer ambiguity is 
exhibited in Japanese-to-Korean translation. In 
our test, the number of ambiguity resolution 
errors was 202 and it took only 0.69% of the 
overall corpus (202 / 29,420 eojeols). 
Considering the fact that the overall accuracy of 
the constructed corpus exceeds 97% and only a 
few sense ambiguity resolution errors were 
 
 
 
 
 which has a 4-level hierarchy of about 
s, as shown in Figure 2. 
L
1
, L
2
 and L
3
 are further 
We made a slight modification of 
J/K to enable it to produce Korean 
apanese text, with all 
inal words tagged with specific concept 
 L
4
 of the Kadokawa thesaurus. As 
nse-tagged corpus of 
ntences can be obtained from the 
e corpus (Asahi Shinbun, Japanese 
ics, etc.). 
                        
 COBALT-J/K (Collocation-Based Language 
r from Japanese to Korean) is a high-quality 
ical MT system developed by POSTECH. 
                                                     
found in the Japanese-to-Korean translation of
nouns, we regard the generated sense-tagged
corpus as highly reliable. 
2.2 Extraction of Raw CCI 
Unlike English, Korean has almost no syntactic 
constraints on word order as long as the verb
appears in the final position. The variable word 
order often results in discontinuous constituents. 
Instead of using local collocations by word order, 
Li et al. (2000) defined 13 patterns of CCI for 
homographs using syntactically related words in 
a sentence. Because we are concerned only with 
 
2
 An Eojeol is a Korean syntactic unit consisting of a
content word and one or more function words. 
Table 2. Concept codes and frequencies in CFP 
({<C
i
,f
i
>}, type
2
, nwun(eye)) 
Code Freq. Code Freq. Code Freq. Code Freq.
103 4 107 8 121 7 126 4 
143 8 160 5 179 7 277 4 
320 8 331 6 416 7 419 12
433 4 501 13 503 10 504 11
505 6 507 12 508 27 513 5 
530 6 538 16 552 4 557 7 
573 5 709 5 718 5 719 4 
733 5 819 4 834 4 966 4 
987 9 other
*
210     
 � ‘other’ in the table means the set of concept codes 
with the frequencies less than 4. 
Table 1. Structure of CCI Patterns 
CCI type Structure of pattern 
type
0
 unordered co-occurrence words 
type
1
 noun + noun  or  noun + noun 
type
2
 noun + uy + noun 
type
3
 noun + other particles + noun 
type
4
 noun + lo/ulo + verb 
type
5
 noun + ey + verb 
type
6
 noun + eygey + verb 
type
7
 noun + eyse + verb 
type
8
 noun + ul/lul + verb 
type
9
 noun + i/ka + verb 
type
10
 verb + relativizer + noun 
noun homographs, we adopt 11 patterns from 
them excluding verb patterns, as shown in Table 
1. The words in bold indicate the target 
homograph and the words in italic indicate 
Korean particles. 
For a homograph W, concept frequency 
patterns (CFPs), i.e., ({<C
1
,f
1
>,<C
2
,f
2
>, ... , 
<C
k,
f
k
>}, type
i
, W(S
i
)), are extracted from the 
sense-tagged training corpus for each CCI type i 
by partial parsing and pattern scanning, where k 
is the number of concept codes in type
i
, f
i
 is the 
frequency of concept code C
i
 appearing in the 
corpus, type
i
 is an CCI type i, and W(S
i
) is a 
homograph W with a sense S
i
. All concepts in 
CFPs are three-digit concept codes at level L
4
 in 
the Kadokawa thesaurus. Table 2 demonstrates 
an example of CFP that can co-occur with the 
homograph ‘nwun(eye)’ in the form of the CCI 
type
2
 and their frequencies. 
2.3 CCI Refinement Processing 
The extracted CCI are too numerous and too 
noisy to be used in a practical system, and must 
to be further selected. To eliminate noise and to 
reduce the number of CCI to a practical size, we 
apply the refinement processing to the extracted 
CCI. CCI refinement processing is composed of 
2 processes: concept code discrimination and 
concept code generalization. 
2.3.1 Concept Code Discrimination 
In the extracted CCI, the same concept code may 
appear for determining the different meanings of 
a homograph. To select the most probable 
concept codes, which frequently co-occur with 
the target sense of a homograph, Li defined the 
discrimination value of a concept code using 
Shannon’s entropy (Shannon, 1951). A concept 
code with a small entropy has a large 
discrimination value. If the discrimination value 
of the concept code is larger than a threshold, 
the concept code is selected as useful 
information for deciding the word sense. 
Otherwise, the concept code is discarded. 
2.3.2 Concept Code Generalization 
After concept discrimination, co-occurring 
concept codes in each CCI type must be further 
selected and the code generalized. To perform 
code generalization, Li adopted to Smadja’s 
work (Smadja, 1993) and defined the code 
strength using a code frequency and a standard 
deviation in each level of the concept hierarchy. 
The generalization filter selects the concept 
codes with a strength larger than a threshold. We 
perform this generalizaion processing on the 
Kadokawa thesaurus level L
4
 and L
3
. 
After processing, the system stores the 
refined conceptual patterns ({
type
real texts. These refined CCI are used as input
features for the neural network. 
specific des
explained in Li (2000). 
3 
3.1 
Because of it
the 
used in our sense clas
shown in Figure 3, each n
represents a 
C
1
, C
2
, C
3
, ...}, 
i
, W(S
i
)) as a knowledge source for WSD of 
 
The more 
cription of the CCI extraction is 
Construction of Neural Network 
Neural Network Architecture 
s strong capability for classification, 
multilayer feedforward neural network is 
sification system. As 
ode in the input layer 
concept code in CCI of a target 
. .
CCI type i
2
CCI type i
1
input
CCI type 0
input
CCI type 1
input
CCI type 8
input
CCI type 2
input
74
26
022
078
080
50
696
028
419
38
23
239
323
nwun
1
(snow)
nwun
2
(eye)
.
.
.
Figure 5. The Resulting Network for ‘nwun’ 
word and each node in the output lay
represents th
num
nodes in a hi
To determine a good topology for the network, 
we implemented a 2-layer (no hidden layer) and 
a 3-layer (with a single hidden layer of 5 nodes) 
network and compared their performance. The 
comparison result is given in Section 5. 
Each homograph has a network of its own. 
Figure 4
3
 demonstrates a construction example 
of the input layer for the homograph ‘nwun’ 
with the sense ‘snow’ and ‘eye’. The left side is 
the extracted CCI for each sense after refinement 
processing. We construct the input layer for 
‘nwun’ by merely integrating the concept codes 
in both senses. The resulting input layer is 
partitioned into several subgroups depending on 
their CCI types, i.e., type 0, type 1, type 2 and 
type 8. Figure 5 shows the overall network 
architecture for ‘nwun’. 
                              
3
 
for th  
concept codes for ‘
3.2 Network Learning 
We selected 875 Korean homographs requring 
the WSD processing in a Korean-to-Japanese 
translation. Among the selected nouns, 736 
nouns (about 84%) had two senses and the other 
139 nouns had more than 3 senses. Using the 
extracted CCI, we constructed neural networks 
and trained network parameters for each 
homograph. The training patterns were also 
extracted from the previously constructed 
sense-tagged corpus. 
The average number of input features (i.e. 
input nodes) of the constructed networks was 
approximately 54.1 and the average number of 
senses (i.e. output nodes) was about 2.19. In the 
case of a 2-layer network, the total number of 
parameters (synaptic weights) needed to be 
trained is about 118 (54.1×2.19) for each 
homograph. This means that we merely need 
storage for 118 floating point numbers (for 
sy
features) for
reasonable size to be used in real applications. 
• CCI type 0 : {26, 022}
• CCI type 1 : {080, 696}
nwun
1
(snow)
CCI type 0
input
CCI type 1
74
26
022
078
080
Refined CCI
4 
Our WSD approach is a h
co
knowledge-based 
overall WSD 
sense disa
First, we 
Korean-to-Ja
                        
 The concept codes in Figure 4 are simplified ones
e ease of illustration. In reality there are 87
• CCI type 8 : {38, 239}
Total 13 concept codes
integrate
input
CCI type 8
input
CCI type 2
input
13 nodes
nwun
2
(eye)
• CCI type 0 : {74, 078}
• CCI type 2 : {50, 028, 419}
• CCI type 8 : {23, 323}
50
696
028
419
38
23
239
323
Figure 4. Construction of Input layer for ‘nwun’
.
.
.
.
.
Output
(senses of the 
target word)
Inputs Outputs
.
.
Hidden
Layers
input
CCI type i
k
input
.
.
.
Figure 3. Topology of Neural Network 
er 
e sense of a target word. The 
ber of hidden layers and the number of 
dden layer are another crucial issue. 
nwun’. COBALT-K/
naptic weights) and 54 integers (for input 
 each homograph, which is a 
Word Sense Disambiguation 
ybrid method, which 
mbines the advantage of corpus-based and 
methods. Figure 6 shows our 
algorithm. For a given homograph, 
mbiguation is performed as follows. 
search a collocation dictionary. The 
panese translation system 
J has an MWTU (Multi-Word 
{078}
CCI type 0
CCI type 0
input
CCI type 1
input
CCI type 8
input
CCI type 2
input
CCI type 1
nwunmwul-i   katuk-han   kunye-uy   nwun-ul   po-mye
input               :눈물이가득한그녀의눈을보며…
[078] [274]concept code  : [503] [331]
target
word
CCI type        : (type 0) (type 0) (type 2) (type 8)
CCI type 2
CCI type 8
{none}
{503}
{331}
078
022
74
26
028
50
696
080
239
38
23
419
323
Input Layer
Similarity
Calculation
{274}
(0.000)
(0.285)
(0.250)
(1.000)
(0.000)
(0.857)
(0.000)
(0.000)
(0.000)
(0.285)
(0.000)
(0.000)
(0.250)
similarity values
Figure 7. Construction of Input Pattern by Using
Concept Similarity Calculation 
Neural Networks
Select the most frequent sense
Success
Success
Answer
NO
NO
NO
YES
YES
YES
Selectional Restrictions of the Verb
Collocation Dictionary
Success
 
Figure 6. The Proposed WSD Algorithm 
Translation Units) dictionary, which contains 
idioms, compound words, collocations, etc. If a 
collocation of the target word exists in the 
MWTU dictionary, we simply determine the 
sense of the target word to the sense found in the 
dictionary. This method is based on the idea of 
‘one sense per collocation’. Next, we verify the 
selectional restriction of the verb described in 
the dictionary. If we cannot find any matched 
patterns for selectional restrictions, we apply the 
neural network approach. WSD in the neural 
network stage is performed in the following 3 
steps. 
Step 1. Extract CCI from the context of the 
target word. The window size of the context is a 
single sentence. Consider, for example, the 
sentence in Figure 7 which has the meaning of 
“Seeing her eyes filled with tears, …”. The 
target word is the homograph ‘nwun’. We 
extract its CCI from the sentence by partial 
parsing and pattern scanning. In Figure 7, the 
words ‘nwun’ and ‘kunye(her)’ with the concept 
code 503 have the relation of <noun + uy + 
noun>, which corresponds to ‘CCI type 2’ in 
Table 1. There is no syntactic relation between 
the words ‘nwun’ and ‘nwunmul(tears)’ with the 
concept code 078, so we assign ‘CCI type 0’ to 
the concept code 078. 
Similarly, we can obtain all pairs of CCI types 
and their concept codes appearing in the context. 
All the extracted <CCI-type: concept codes> 
pairs are as follows: {<type 0: 078,274>, <type 
2: 503>, <type 8: 331>}. 
Step 2. Obtain the input pattern for the 
network by calculating concept similarities 
between the features of the input nodes and the 
concept code in the extracted <CCI-type: 
concept codes>. Concept similarity calculation 
is performed only between the concept codes 
with the same CCI-type. The calculated concept 
similarity score is assigned to each input node as 
the input value to the network. 
Csim(C
i
, P
j
) in Equation 1 is used to calculate 
the concept similarity between C
i
 and P
j
, where 
MSCA(C
i
, P
j
) is the most specific common 
ancestor of concept codes C
i
 and P
j
, and weight 
is a weighting factor reflecting that C
i
 as a 
descendant of P
j
 is preferable to other cases. 
That is, if C
i
 is a descendant of P
j
, we set weight 
to 1. Otherwise we set weight to 0.5. 
weight
PlevelClevel
PCMSCAlevel
PCCsim
ji
ji
ji
×
+
×
=
)()(
)),((2
),(
 (1)
The similarity values between the target 
(all 0.000)
(0.375)
(0.857)
(0.667)
(0.285)
(0.250) (0.250)
L
1
L
2
L
3
L
4
…
C
i
P
1
P
2
P
3
P
4
P
5
P
5
TOP
Figure 8. Concept Similarity on the Kadokawa
Thesaurus Hierarchy 
concept C
i
 and each P
j
 on the Kadokawa 
thesaurus hierarchy are shown in Figure 8. 
These similarity values are computed using 
Equation 1. For example, in ‘CCI-type 0’ part 
calculation, the relation between the concept 
codes 274 and 26 corresponds to the relation 
between C
i
 and P
4
 in Figure 8. So we assign the 
similarity 0.285 to the input node labeled by 26. 
As another example, the concept codes 503 and 
50 have a relation between C
i
 and P
2
 and we 
obtain the similarity 0.857. If more than two 
concept codes exist in one CCI-type, such as 
<CCI-type 0: 078, 274>, the maximum 
similarity value among them is assigned to the 
input node, as in Equation 2. 
In Equation 2, C
i
 is the concept code of the 
input node, and P
j
 is the concept codes in the 
<CCI-type: concept codes> pair which has the 
same CCI-type as C
i
. 
By adopting this concept similarity calculation, 
we can achieve a broad coverage of the method. 
If we use the exact matching scheme instead of 
concept similarity, we may obtain only a few 
concept codes matched with the features. 
Consequently, sense disambiguation would fail 
because of the absence of clues. 
Step 3. Feed the obtained input pattern to the 
neural network and compute activation strengths 
for each output node. Next, select the sense of 
the node that has a larger activation value than 
all other output node. If the activation strength is 
lower than the threshold, it will be discarded and 
5 Experimental Evaluation 
For an experimental evaluation, 10 ambiguous 
Korean nouns were selected, along with a total 
of 500 test sentences in which one homograph 
appears. In order to follow the ambiguity 
distribution described in Section 3.2, we set the 
number of test nouns with two senses to 8 (80%). 
The test sentences were randomly selected from 
the KIBS (Korean Information Base System) 
corpus. 
The experimental results are shown in Table 3, 
where result A is the case when the most 
frequent sense was taken as the answer. To 
compare it with our approach (result C), we also 
performed the experiment using Li’s method 
(result B). For sense disambiguation, Li’s 
method features which are similar to our method. 
However, unlike our method, which combines 
all features by using neural networks, Li 
considers only one clue at each decision step. As 
shown in the table, our approach exceeded Li’s 
)),((max)(
ji
P
i
PCCsimCInputVal
i
=    (2)
Table 3. Comparison of WSD Results 
Precision (%) 
Word Sense No 
(A) (B) (C)
father & child 33 
pwuca
rich man 17 
66 64 72
liver 37 
kancang
soy source 13 
74 84 74
housework 39 
kasa
words of song 11 
78 68 82
shoes 45 
kwutwu
word of mouth 5 
90 70 92
eye 42 
nwun
snow 8
84 80 86
container 41 
yongki 82 72 88
the network will not make any decisions. This 
process is represented in Figure 9. 
courage 9
doctor 27 
uysa
intention 23
54 80 84
district 27 
cikwu
the earth 23 
54 84 92
whole body 39 
one’s past 6 
censin
telegraph 5 
78 84 80
one’s best 27 
military strength 13 
electric power 7 
cenlyek
past record 3 
54 50 72
Average Precision 71.4 73.6 82.2
 � (A) : Baseline   (B) : Li’s method 
(C) : Proposed method (using a 2-layer NN) 
nwun
1
(snow)
nwun
2
(eye)
.
.
.
threshold
(0.000)
(0.285)
(0.250)
(1.000)
(0.000)
(0.857)
(0.000)
(0.000)
(0.000)
(0.285)
(0.000)
(0.000)
(0.250)
Figure 9. Sense Disambiguation for ‘nwun’ 
in most of the results except ‘kancang’ and 
‘censin’. This result shows that word sense  
disambiguation can be improved by combining 
several clues together (e.g. neural networks) 
rather than using them independently (e.g. Li’s 
method). 
The performance for each stage of the 
proposed method is shown in Table 4. Symbols 
COL, VSR, NN and MFS in the table indicate 4 
stages of our method in Figure 6, respectively. 
In the NN stage, the 3-layer model did not show 
a performance superior  to the 2-layer model 
because of the lack of training samples. Since 
the 2-layer model has fewer parameters to be 
trained, it is more efficient to generalize for 
limited training corpora than the 3-layer model. 
Conclusion 
To resolve sense ambiguities in 
Korean-to-Japanese MT, this paper has proposed 
a practical word sense disambiguation method 
using neural networks. Unlike most previous 
approaches based on neural networks, we reduce 
the number of features for the network to a 
practical size by using concept codes rather than 
lexical words. In an experimental evaluation, the 
proposed WSD model using a 2-layer network 
achieved an average precision of 82.2% with an 
improvement over Li’s method by 8.6%. This 
result is very promising for real world MT 
systems. 
We plan further research to improve precision 
and to expand our method for verb homograph 
disambiguation. 
Acknowledgements 
This work was supported by the Korea Science 
and Engineering Foundation (KOSEF)  through 
the Advanced Information Technology Research 
Center(AITrc). 
Table 4. Average Precision and Coverage 
for Each Stage of thePproposed Method 
 
<Case 1 : 2-layer NN> 
 COL VSR NN MFS 
Avg. Prec 100.0% 91.2% 86.3% 56.1%
Avg. Cov 3.6% 6.8% 73.2% 16.4%
 
<Case 2 : 3-layer NN> 
 COL VSR NN MFS 
Avg. Prec 100.0% 91.2% 87.1% 56.0%
Avg. Cov 3.6% 6.8% 72.5% 17.1%
 
References 
Gallant S. (1991) A Practical Approach for 
Representing Context and for Performing Word 
Sense Disambiguation Using Neural Networks. 
Neural Computation, 3/3, pp. 293-309 
Leacock C., Twell G. and Voorhees E. (1993) 
Corpus-based Statistical Sense Resolution. In 
Proceedings of the ARPA Human Language 
Technology Workshop, San Francisco, Morgan 
Kaufman, pp. 260-265 
Li H. F., Heo N. W., Moon K. H., Lee J. H. and Lee 
G. B. (2000) Lexical Transfer Ambiguity 
Resolution Using Automatically-Extracted Concept 
Co-occurrence Information. International Journal 
of Computer Processing of Oriental Languages, 
13/1, pp. 53-68  
McRoy S. (1992) Using Multiple Knowledge Sources 
for Word Sense Discrimination. Computational 
Linguistics, 18/1, pp. 1-30 
Mooney R. (1996) Comparative Experiments on 
Disambiguating Word Senses: An Illustration of 
the Role of Bias in Machine Learning. In 
Proceedings of the Conference on Empirical 
Methods in Natural Language Processing, 
Philadelphia, PA, pp. 82-91 
Ng, H. T. and Zelle J. (1997) Corpus-Based 
Approaches to Semantic Interpretation in Natural 
Language Processing. AI Magazine, 18/4, pp. 
45-64 
Ohno S. and Hamanishi M. (1981) New Synonym 
Dictionary. Kadokawa Shoten, Tokyo 
Smadja F. (1993) Retrieving Collocations from Text: 
Xtract. Computational Linguistics, 19/1, pp. 
143-177 
Waltz D. L. and Pollack J. (1985) Massively Parallel 
Parsing: A Strongly Interactive Model of Natural 
Language Interpretation. Cognitive Science, 9, pp. 
51-74 
