Proceedings of EACL '99 
Detection of Japanese Homophone Errors by a Decision List 
Including a Written Word as a Default Evidence 
Hiroyuki Shinnou 
Ibaraki University 
Dept. of Systems Engineering 
4-12-1 Nakanarusawa 
Hitachi, Ibaraki, 316-8511, JAPAN 
shinnou@lily, dse. ibaraki, ac. jp 
Abstract 
In this paper, we propose a practical 
method to detect Japanese homophone 
errors in Japanese texts. It is very 
important to detect homophone errors 
in Japanese revision systems because 
Japanese texts suffer from homophone 
errors frequently. In order to detect ho- 
mophone errors, we have only to solve 
the homophone problem. We can use the 
decision list to do it because the homo- 
phone problem is equivalent to the word 
sense disambiguation problem. However, 
the homophone problem is different from 
the word sense disambiguation problem 
because the former can use the written 
word but the latter cannot. In this pa- 
per, we incorporate the written word into 
the original decision list by obtaining the 
identifying strength of the written word. 
The improved decision list can raise the 
F-measure of error detection. 
1 Introduction 
In this paper, we propose a method of detect- 
ing Japanese homophone errors in Japanese texts. 
Our method is based on a decision list proposed by 
Yarowsky (Yarowsky, 1994; Yarowsky, 1995). We 
improve the original decision list by using writ- 
ten words in the default evidence. The improved 
decision list can raise the F-measure of error de- 
tection. 
Most Japanese texts are written using Japanese 
word processors. To input a word composed of 
kanji characters, we first input the phonetic hira- 
gana sequence for the word, and then convert it 
to the desired kanji sequence. However, multiple 
converted kanji sequences are generally produced, 
and we must then choose the correct kanji se- 
quence. Therefore, Japanese texts suffer from ho- 
mophone errors caused by incorrect choices. Care- 
lessness of choice alone is not the cause of homo- 
phone errors; Ignorance of the difference among 
homophone words is serious. For example, many 
Japanese are not aware of the difference between 
'.~.,'~,' and '~,~,', or between '~.~.' and ,~,t. 
In this paper, we define the term homophone 
set as a set of words consisting of kanji charac- 
ters that have the same phone 2. Then, we define 
the term homophone word as a word in a ho- 
mophone set. For example, the set { ~/~-~ (proba- 
bility), ~7 (establishment)} is a homophone set 
because words in the set axe composed of kanji 
characters that have the same phone 'ka-ku-ri-tu'. 
Thus, q/~' and '~f_' are homophone words. In 
this paper, we name the problem of choosing the 
correct word from the homophone set the homo- 
phone problem. In order to detect homophone 
errors, we make a list of homophone sets in ad- 
vance, find a homophone word in the text, and 
then solve the homophone problem for the homo- 
phone word. 
Many methods of solving the homophone prob- 
lem have been proposed (Tochinai et al., 1986; 
Ibuki et al., 1997; Oku and Matsuoka, 1997; Oku, 
1994; Wakita and Kaneko, 1996). However, they 
are restricted to the homophone problem, that is, 
they are heuristic methods. On the other hand, 
the homophone problem is equivalent to the word 
sense disambiguation problem if the phone of the 
homophone word is regarded as the word, and the 
homophone word as the sense. Therefore, we can 
solve the homophone problem by using various 
1 '~'.-~.,~. and '~.~..m~,' have a same phone 'i-sift'. The 
meaning of '~,' is a general will, and the meaning of 
'~:~'.~.,,... is a strong positive will. '~.~.' and '~' have 
a same phone 'cho-kkan'. The meaning of 'l-ff__,~. i is an 
intuition through a feeling, and the meaning of '~' 
is an intuition through a latent knowledge. 
ZWe ignore the difference of accents, stresses and 
parts of speech. That is, the homophone set is the 
set of words having the same expression in hiragana 
characters. 
180 
Proceedings of EACL '99 
statistical methods proposed for the word sense 
disambiguation problem(Fujii, 1998). Take the 
case of context-sensitive spelling error detection 
3, which is equivalent to the homophone problem. 
For that problem, some statistical methods have 
been applied and succeeded(Golding, 1995; Gold- 
ing and Schabes, 1996). Hence, statistical meth- 
ods axe certainly valid for the homophone prob- 
lem. In particular, the decision list is valid for 
the homophone problem(Shinnou, 1998). The de- 
cision list arranges evidences to identify the word 
sense in the order of strength of identifying the 
sense. The word sense is judged by the evidence, 
with the highest identifying strength, in the con- 
text. 
Although the homophone problem is equivalent 
to the word sense disambiguation problem, the 
former has a distinct difference from the latter. 
In the homophone problem, almost all of the an- 
swers axe given correctly, because almost all of the 
expressions written in the given text are correct. 
It is difficult to decide which is the meaning of 
'crane', 'crane of animal' or 'crane of tool'. How- 
ever, it is almost right that the correct expression 
of '~' in a text is not '~-~' but '~1~'. In 
the homophone problem, the choice of the writ- 
ten word results in high precision. We should use 
this information. However, the method to always 
choose the written word is useless for error detec- 
tion because it doesn't detect errors at all. The 
method used for the homophone problem should 
be evaluated from the precision and the recall of 
the error detection. In this paper, we evaluate it 
by the F-measure to combine the precision and 
the recall, and use the written word to raise the 
F-measure of the original decision list. 
We use the written word as an evidence of the 
decision list. The problem is how much strength 
to give to that evidence. If the strength is high, 
the precision rises but the recall drops. On the 
other hand, if the strength is low, the decision list 
is not improved. In this paper, we calculate the 
strength that gives the maximum F-measure in a 
training corpus. As a result, our decision list can 
raise the F-measure of error detection. 
2 Homophone disambiguation by a 
decision list 
In this section, we describe how to construct the 
decision list and to apply it to the homophone 
problem. 
SFor example, confusion between 'peace' and 
'piece', or between 'quiet' and 'quite' is the context- 
sensitive spelling error. 
2.1 Construction of the decision list 
The decision list is constructed by the following 
steps. 
step 1 Prepare homophone sets. 
In this paper, we use the 12 homophone sets 
shown in Table 1, which consist of homophone 
words that tend to be mis-chosen. 
Table 1: Homophone sets 
Phone Homophone set 
sa-i-ken 
ka-i-hou 
kyo-u-cho-u 
ji-shi-n 
ka-n-shi-n 
ta-i-ga-i 
{ ~, ~¢~ } 
{~, ~} 
{ t~-~, ~ } {~,~#} 
{ ~,~,, r~,c, } { ~, ~,~% } 
u-n-ko-u { ~, ~T } 
do-u-shi { NN, N± } 
ka-te-i { ~_, ~..~:? } 
ji-kko-u { ~, ~ } 
syo-ku-ryo-u { ~, ~ } 
syo-u-ga-i { ~=-~, \[~=-~ } 
step 2 Set context information, i.e. evidences, to 
identify the homophone word. 
We use the following three kinds of evidence. 
• word (w) in front of H: Expressed as w- 
• word (w) behind H: Expressed as w+ 
• fi~tu words 4 surrounding H: We pick up 
the nearest three fir/tu words in front of and 
behind H respectively. We express them as 
w±3. 
step 3 Derive the frequency frq(wi,ej) of the 
collocation between the homophone word wl 
in the homophone set {Wl,W~,-.-,wn} and 
the evidence e j, by using a training corpus. 
For example, let us consider the homophone set 
{ ~_~1~ (running (of a ship, etc.)), ~_~7 (running 
(ofa train, etc.))} and the following two Japanese 
sentences. 
Sentence 1 r~g~)~J~;o~ ~ - b J~'~7~_ 
(A west wind of 3 m/s did not prevent the 
plane from flying.) 
4The firitu word is defined as an independent word 
which can form one bun-setu by itself. Nouns, verbs 
and adjectives are examples. 
181 
Proceedings of EACL '99 
Table 2: Answers and identifying strength for 
Evid. 
~: + (to+) 
(of-) 
~T~ ±3 (plane±3) 
°.. 
~+ (hour+) 
~.~ ±3 (midnight±3) 
~K~ ±3 (shorten±3) 
.,. 
default 
I Freq. of Freq. of ,~_~, ,~, 
77 53 
252 282 
4 0 
14 11 
0 48 
0 4 
1468 1422 
evidences 
Ans. I Identifying 
Strength 
~ 0.538 
~ 0.162 
~ 5.358 
~.~t~ 0.345 
~ 8.910 
~ 5.358 
~ 0.046 
Sentence 2 F-~-~7)~'~~s~:~ '~.-f,= o J 
(Running hours in the early morning and dur- 
ing the night were shortened.) 
From sentence 1, we can extract the following 
evidences for the word '~': 
and from sentence 2, we can extract the following 
evidences for the word '~': 
"~#r~? +", "¢) -", "~+~ ±3", "~@ +3", 
"@r~ +Y', "~ +3", "~ +3". 
step 4 Define the strength est(wi, ej) of estimat- 
ing that the homophone word wl is correct 
given the evidence e j: 
est(wi, ej ) = log( w, P(Pif:j l),e ~ ) 2.,k#i ~ kl 
j\] 
where P(wi\]ej) is approximately calculated 
by: 
frq(wi, ej ) + a P(wl \[ej) = )-~k frq(wk, ej) + a" 
a in the above expression is included to avoid 
the unsatisfactory case of frq(wl, ej) = O. In 
this paper, we set a : 0.15. We also use the 
special evidence default, frq(wl, default) is 
defined as the frequency of wl. 
step5 Pick the highest strength est(wh,ej) 
among 
5As in this paper, the addition of a small value is 
an easy and effective way to avoid the unsatisfactory 
case, as shown in (Yarowsky, 1994). 
{est(wl, ), ea(w , e#), • • •, e e#)), 
and set the word wk as the answer for the 
evidence ej. In this case, the identifying 
strength is est(wk, ej). 
For example, by steps 4 and 5 we can construct 
the list shown in Table 2. 
step 6 Fix the answer wkj for each ej and sort 
identifying strengths est(wkj, ej) in order of 
dimension, but remove the evidence whose 
identifying strength is less than the identi- 
fying strength est(wkj,default) for the evi- 
dence default from the list. This is the deci- 
sion list. 
After step 6, we obtain the decision list for the 
homophone set { ~_~, ~.~ } as shown in Table 3. 
Table 3: Example of decision list 
~id. ~gth 
1 ~lJ~ ±3 (train±3) ~.~ 9.453 
2 ~ ±3 (ship±3) ~.~l~ 9.106 
3 ~ ±3 ~.~ 8.910 
(midnight±3) 
701 ~r,~- (hour-) ~.~ 0.358 
746 ¢)+ (of+) ~.~ 0.162 
. ... , ... . ...... 
760 default ~_~ 0.046 
2.2 Solving by a decision llst 
In order to solve the homophone problem by the 
decision list, we first find the homophone word w 
in the given text, and then extract evidences E for 
the word w from the text: 
E = {el, e:,..., e, }. 
182 
Proceedings of EACL '99 
Next, picking up the evidence from the deci- 
sion list for the homophone set for the homophone 
word w in order of rank, we check whether the ev- 
idence is in the set E. If the evidence ej is in the 
set E, the answer wkj for ej is judged to be the 
correct expression for the homophone word w. If 
wkj is equal to w, w is judged to be correct, and 
if it is not equal, then it is shown that w may be 
the error for wkj. 
3 Use of the written word 
In this section, we describe the use of the writ- 
ten word in the homophone problem and how to 
incorporate it into the decision list. 
3.1 Evaluation of error detection systems 
As described in the Introduction, the written word 
cannot be used in the word sense disambiguation 
problem, but it is useful for solving homophone 
problems. The method used for the homophone 
problem is trivial if the method is evaluated by 
the precision of distinction using the following for- 
mula: 
number of correct discriminations 
number of all discriminations 
That is, if the expression is '~\]~' (or '~.~'), 
then we should clearly choose the word '~t~' 
(or the word '~') from the homophone set { 
~_~t~, ~_~T }. This distinction method probably 
has better precision than any other methods for 
the word sense disambiguation problem. However, 
this method is useless because it does not detect 
errors at all. 
The method for the homophone problem should 
be evaluated from the standpoint of not error dis- 
crimination but error detection. In this paper, we 
use the F-measure (Eq.1) to combine the precision 
P and the recall R defined as follows: 
number of real errors in detected errors P= 
R= number 
number of detected errors 
of real errors in detected errors 
number of errors in the tezt 
2PR 
F- P+R (1) 
3.2 Use of the identifying strength of the 
written word 
The distinction method to choose the written 
word is useless, but it has a very high precision 
of error discrimination. Thus, it is valid to use 
this method where it is difficult to use context to 
solve the homophone problem. 
The question is when to stop using the deci- 
sion from context and use the written word. In 
this paper, we regard the written word as a kind 
of evidence on context, and give it an identifying 
strength. Consequently we can use the written 
word in the decision list. 
3.3 Calculation of the identifying 
strength of the written word 
First, let z be the identifying strength of the writ- 
ten word. We name the set of evidences with 
higher identifying strength than z the set a, and 
the set of evidences with lower identifying strength 
than z the set f~, 
Let T be the number of homophone problems 
for a homophone set. We solve them by the orig- 
inal decision list DLO. Let G (or H) be the ratio 
of the number of homophone problems by judged 
by a (or f~ ) to T. Let g (or h) be the precision of 
a (or f~), and p be the occurrence probability of 
the homophone error. 
The number of problems correctly solved by a 
is as follows: 
aT(1 - p), (2) 
and the number of problems incorrectly solved by 
a is as follows: 
GTp. (3) 
The number of problems detected as errors in Eq.2 
and Eq.3 are GT(1 - p)(1 - g) and GTpg respec- 
tively. Thus, the number of problems detected as 
errors by a is as follows: 
GT((1 - p)(1 - g) + pg). (4) 
In the same way, the number of problems detected 
as errors by/~ is as follows: 
HT((1 - p)(1 - h) + ph). (5) 
Consequently the total number of problems de- 
tected as errors is as follows: 
T(G((1 -p)(1 -g) + pg) +H((1 -p)(1 - h)+ph)). 
(6) 
The number of correct detections in Eq.6 is 
Tp(Gg + Hh). Therefore the precision P0 is as 
follows: 
Po = p(Gg + Hh)/{G((1 - p)(1 - g) + pg) 
+ H((1 - p)(1 - h) + ph)} 
Because the number of real errors in T is Tp, the 
recall R0 is Gg+Hh. By using P0 and R0, we can 
get the F-measure F0 of DL0 by Eq. 1. 
Next, we construct the decision list incorporat- 
ing the written word into DL0. We name this deci- 
sion list DL1. In DL1, we use the written word to 
solve problems which we cannot judge by c\[. That 
183 
Proceedings of EACL '99 
iEvid. Ans. Strength 
DLO 
% 
Evid. Ans. Strength 
x+~ 
Evid. Arts. 
written f.ritten~ .ord ~ .,,rd / 
DLI 
Strength 
x+ ~ 
X 
Figure 1: Construction of DL1 
is, DL1 is the decision list to attach the written 
word as the default evidence to a (see Fig.l). 
Next, we calculate the precision and the recall 
of DL1. Because a of DL1 is the same as that of 
DL0, the number of problems detected as errors by 
a is given by Eq.4. In the case of DL1, problems 
judged by ~ of DL0 are judged by the written 
word. Therefore, we detect no error from these 
problems. 
As a result, the number of problems detected as 
errors by DL1 is given by Eq.4, and the number of 
real errors in these detections is TGpg. Therefore, 
the precision P1 of DL1 is as follows: 
p1 = Pg 
(1 - p)(1 - g) + pg" 
Because the number of whole errors is Tp, the 
recall R1 of DL1 is Gg. By using P1 and t/1, we 
can get the F-measure F1 of DL1 by Eq.1. 
Finally, we try to define the identifying strength 
z. z is the value that yields the maximum F~ un- 
der the condition F1 > F0. However, theoretical 
calculation alone cannot give z, because p is un- 
known, and functions of G,H,g, and h are also 
unknown. 
In this paper, we set p = 0.05, and get values of 
G, H, g, and h by using the training corpus which 
is the resource used to construct the original deci- 
sion list DL0. Take the case of the homophone set 
{'~', '~.~T'}. For this homophone set, we try to 
get values of G, H, g, and h. The training corpus 
has 2,890 sentences which include the word '~.~\]~' 
or the word '~.~'. These 2,890 sentences are ho- 
mophone problems for that homophone set. The 
identifying strength of DL0 for this homophone 
set covers from 0.046 to 9.453 as shown in Table 3. 
Next we give z a value. For example, we set z = 
2.5. In this case, the number of problems judged 
by a is 1,631, and the number of correct judgments 
in them is 1,593. Thus, G = 1631/2890 = 0.564 
and g = 1593/1631 = 0.977. In the same way, 
under this assumption z -- 2.5, the num- 
ber of problems judged by j3 is 1,259, and 
the number of correct judgments in them 
is 854. Thus, H = 1259/2890 = 0.436 and 
h = 854/1259 = 0.678. As a result, if z = 2.5, 
then P0 = 0.225, R0 = 0.847, F0 = 0.356, 
P1 = 0.688, R1 = 0.551 and F1 = 0.612. In Fig.2, 
Fig.3 and Fig.4, we show the experiment result 
when z varies from 0.0 to 10.0 in units of 0.1. By 
choosing the maximum value of F1 in Fig.4, we 
can get the desired z. In this homophone set, we 
obtain z = 3.0. 
4 Experiments 
First, we obtain each identifying strength of the 
written word for the 12 homophone sets shown 
in Table 1, by the above method. We show this 
result in Table 4. LRO in this table means the 
lowest rank of DL0. That is, LR0 is the rank of 
the default evidence. LR1 means the lowest rank 
of DL1. That is, LR1 is the rank of the evidence of 
the written word. Moreover, LR0 and LR1 mean 
the sizes of each decision list DL0 and DL1. 
Second, we extract sentences which include a 
word in the 12 homophone sets from a corpus. We 
note that this corpus is different from the training 
corpus; the corpus is one year's worth of Mainichi 
newspaper articles, and the training corpus is one 
year's worth of Nikkei newspaper articles. The 
extracted sentences are the test sentences of the 
experiment. We assume that these sentences have 
no homophone errors. 
Last, we randomly select 5% of the test sen- 
tences, and forcibly put homophone errors into 
these selected sentences by changing the written 
184 
Proceedings of EACL '99 
1 
0.9 
0.8 
0.7 
0.6 
0.5 
0,4 
0.3 
0.2 
o 
¢ 
o~ 'DL-I" o 
'DI.-O" + 
o~ g 
o~ 
I r I = r I B = B 
1 2 3 4 S 6 7 It 9 
Figure 2: Precisions Po and P1 
Table 4: Identifying strength of the expression 
Identifying 
homophone set strength LR0 LR1 
of expression 
{ ~, ~ } 4.9 
{ ~, ~ } 4.6 
{ ~, ~j~ } 4.3 
{ ~, ~$P} 4.8 
{ ~,,~,, r~,t:, } 
{/~-, ~t. } 
5.7 
3.9 
{ ~.~, ~.~T } 3.0 
{ ~\],:~,, ~\]=\]= } 4.5 
5.1 ,~° { ,~+~, ~+~ } 
4.3 
{ ~}~, J~}~ } 4.7 
{ t~-~-=, ~=-~ } 5.1 
1062 844 
1104 671 
1120 667 
1134 622 
1007 424 
921 921 
760 319 
811 788 
799 469 
760 665 
697 255 
695 397 
0,9 
o.8 
0.7 
0.6 
0.5 
0,4 
0.3 
o.2 
o.1 
0 
0.7 
0.6 
0s 
0.4 
0.3 
0,2 
01 
0 
\ 
e 
o 
I i I I 
1 2 3 4 
% 
~oooo0oo ~ 
i i i i i 
S 6 7 8 9 
Figure 3: Recalls Ro and Rt 
'DL'I' o 
'DL'O' + 
o 
j - 
% 
% 
% o~ 
I r f I f I I L I 
1 2 3 4 5 6 7 8 9 
Figure 4: F-measures Fo and Ft 
homophone word to another homophone word. 
As a result, the test sentences include 5% errors. 
From these test sentences, we detect homophone 
errors by DL0 and DL1 respectively. 
We conducted this experiment ten times, and 
got the mean of the precision, the recall and the 
F-measure. The result is shown in Table 5. 
For all homophone sets, the F-measure of our 
proposed DL1 is higher than the F-measure of the 
original decision list DL0. Therefore, it is con- 
cluded that our proposed method is effective. 
5 Remarks 
The recall of DL1 is no more than the recall of 
DL0. Our method aims to raise the F-measure 
by raising the precision instead of sacrificing the 
recall. We confirmed the validity of the method by 
experiments in sections 3 and 4. Thus our method 
has only a little effect if the recall is evaluated 
with importance. However, we should note that 
the F-measure of DL1 is always not worse than 
the F-measure of DL0. 
We set the occurrence probability of the homo- 
phone error at p = 0.05. However, each homo- 
phone set has its own p. We need decide p exactly 
because the identifying strength of the written 
word depends on p. However, DL1 will produce 
better results than DL0 if p is smaller than 0.05, 
because the precision of judgment by the written 
word improves without lowering the recall. The 
recall does not fall due to smaller p because It0 
and R1 are independent of p. Moreover, from the 
definitions of P0 and Pt, we can confirm that the 
precision of judgments by the written word im- 
proves with smaller p. 
185 
Proceedings of EACL '99 
Table 5: Result of experiments 
homophone set Number of 
problems 
{ ~, t~ } 1,254 
{ ~, ~-~ } 1,938 { } 
{ { r ,c, } 
{ ) 
4,845 
3,682 
2,032 
618 
588 { ~,~,~,, ~\]:J= } 
1,436 
{ ~, ~¢ } 1,220 
{ ) 
mean 
1,563 
1,074 
1,636 
I DLO DL1 
Po \[ Ro I Fo et \] R1 I Fx 
0.190 0.824 0.309 0.310 0.774 0.443 
0.295 0.899 0.443 0.573 0.835 0.680 
0.583 0.957 0.724 0.616 0.934 0.742 
0.343 0.911 0.499 0.470 0.725 0.571 
0.773 0.987 0.867 0.804 0.981 0.884 
0.708 0.980 0.822 0.806 0.980 0.885 
0.127 0.745 0.217 0.289 0.420 0.342 
0.391 0.939 0.552 0.440 0.913 0.594 
0.789 0.990 0.879 0.903 0.910 0.906 
0.548 0.966 0.700 0.617 0.911 0.736 
0.091 0.692 0.161 0.135 0.287 0.183 
0.681 0.976 0.802 0.760 0.858 0.806 
II 0.46010-906 I 0-581 II 0.560 10.79410.648 1,824 
The number of elements of all homophone sets 
used in this paper was two, but the number of 
elements of real homophone sets may be more. 
However, the bigger this number is, the better 
the result produced by our method, because the 
precision of judgments by the default evidence of 
DL0 drops in this case, but that of DL1 does not. 
Therefore, our method is better than the original 
one even if the number of elements of the homo- 
phone set increases. 
Our method has an advantage that the size of 
DL1 is smaller. The size of the decision list has 
no relation to the precision and the recall, but a 
small decision list has advantages of efficiency of 
calculation and maintenance. 
On the other hand, our method has a problem in 
that it does not use the written word in the judg- 
ment from a; Even the identifying strength of the 
evidence in a must depend on the written word. 
We intend to study the use of the written word 
in the judgment from a. Moreover, homophone 
errors in our experiments are artifidal. We must 
confrm the effectiveness of the proposed method 
for actual homophone errors. 
6 Conclusions 
In this paper, we used the decision list to solve the 
homophone problem. This strategy was based on 
the fact that the homophone problem is equivalent 
to the word sense disambiguation problem. How- 
ever, the homophone problem is different from the 
word sense disambiguation problem because the 
former can use the written word but the latter 
cannot. In this paper, we incorporated the writ- 
ten word into the original decision list by obtain- 
ing the identifying strength of the written word. 
We used 12 homophone sets in experiments. In 
these experiments, our proposed decision list had 
a higher F-measure than the original one. A fu- 
ture task is to further integrate context and the 
written word in the decision list. 
Acknowledgments 
We used Nikkei Shibun CD-ROM '90 and 
Mainichi Shibun CD-ROM '94 as the corpus. The 
Nihon Keizai Shinbun company and the Mainichi 
Shinbun company gave us permission of their col- 
lections. We appreciate the assistance granted by 
both companies. 
References 
Atsushi Fujii. 1998. Corpus-Based Word 
Sence Disambiguation (in Japanese). Journal 
of Japanese Society for Artificial Intelligence, 
13(6):904-911. 
Andrew R. Golding and Yves Schabes. 1996. 
Combining Trigram-based and Feature-based 
Methods for Context-Sensitive Spelling Correc- 
tion. In 3~th Annual Meeting of the Association 
for Computational Linguistics, pages 71-78. 
Andrew R. Golding. 1995. A Bayesian Hybrid 
Method for Context-Sensitive Spelling Correc- 
tion. In Third Workshop on Very Large Corpora 
(WVLC-95), pages 39-53. 
Jun Ibuki, Guowei Xu, Takahiro Saitoh, and Ku- 
nio Matsui. 1997. A new approach for Japanese 
Spelling Correction (in Japanese). SIG Notes 
NL-117-21, IPSJ. 
186 
Proceedings of EACL '99 
Masahiro Oku and Koji Matsuoka. 1997. A 
Method for Detecting Japanese Homophone 
Errors in Compound Nouns based on Char- 
acter Cooccurrence and Its Evaluation (in 
Japanese). Journal of Natural Language Pro- 
cessing, 4(3):83-99. 
Masahiro Oku. 1994. Handling Japanese Homo- 
phone Errors in Revision Support System; RE- 
VISE. In 4th Conference on Applied Natural 
Language Processing (ANLP-9$), pages 156- 
161. 
Hiroyuki Shinnou. 1998. Japanese Homohone 
Disambiguation Using a Decision List Given 
Added Weight to Evidences on Compounds (in 
Japanese). Journal of Information Processing, 
39(12):3200-3206. 
Koji Tochinai, Taisuke Itoh, and Yasuhiro Suzuki. 
1986. Kana-Kanji Translation System with Au- 
tomatic Homonym Selection Using Character 
Chain Matching (in Japanese). Journal of In- 
formation Processing, 27(3):313-321. 
Sakiko Wakita and Hiroshi Kaneko. 1996. Ex- 
traction of Keywords for "Homonym Error 
Checker" (in Japanese). SIG Notes NL-111-5, 
IPSJ. 
David Yarowsky. 1994. Decision Lists for Lex- 
ical Ambiguity Resolution: Application to Ac- 
cent Restoration in Spanish and French. In 32th 
Annual Meeting of the Association for Compu- 
tational Linguistics, pages 88-95. 
David Yarowsky. 1995. Unsupervised Word Sense 
Disambiguation Rivaling Supervised Methods. 
In 33th Annual Meeting of the Association for 
Computational Linguistics, pages 189-196. 
187 
