To what extent does case contribute to verb sense 
disambiguation? 
FUJI1 Atsushi, INUI Kentaro, TOKUNAGA Takenobu and TANAKA Hozmni 
Det)artlnent of CoInl)utox Scien(:e 
Tokyo Institute of Technology 
2-12-11 ()ookayama Meguroku ~\[bkyo 152, JAPAN 
{fujii,inui,take,tanaka}@cs.titech.ac.jp 
Abstract 
Word sense disambugation has re- 
cently been utilized in corpus-based aI)- 
proaches, reflecting the growth in the 
number of nmehine readable texts. One 
(:ategory ()f al)l)roa(:hes disambiguates an 
input verb sense based on the similar- 
ity t)etween its governing (:its(; fillers and 
those in given examl)les. In this palter , 
we introdu<:c the degree of (:<mtriblltion 
of cast; to verb sells(', disambignation intt) 
this existing method, in this, greater di- 
versity of semanti(: range of case filler ex- 
amples will lead to that ease contributing 
to verb sense disambiguation more. We 
also report th(; result of a coml)arative 
ext)eriment, in which the t)erfornlance of 
disaml)igui~tion is iml)rt)ved t)y consider- 
ing this notion of semantic contribution. 
1 Introduction 
Word sense disambiguation is a crucial task in 
many kinds of natural language I)rot:essing at)l)li- 
cations, such as word selection in iIla(;hine trans- 
lation (Sato, 1991), pruning of syntactic struc- 
tures in parsing (l,ytinen, 1986; Nagao, 11994) 
an(l text retrieval (Krovets and Croft, 1992; 
Voorht'.es, 1993). Various researches on word 
sense disamil)ignation have recently been utilized 
in (:orlms-based apt)roache.s, reflecting the growth 
in the numlmr of machine readable texts. Unlike 
rule-basel1 ~l)l)roa('.hes, eortms-l)asext al)proa(:hes 
free us fl'om the task of generalizing observed 1)he- 
nt)Illena to l)roduce rnles for word sense, disaln- 
\])igmttion, e.g. subt:ittegorization rules. Cortms- 
based al)proaches are exet:ut(;(1 based on the in- 
tuitively t'easibh', assmnption that the higher the 
degree of similarity betwee, n the context of an ill- 
put word and tim context ill which tit(; word ap- 
l)cars in a sens(~' in a tort)us , the more plausible it 
becomes that the word is used in the same s(.~nse. 
Corpus-/)ased m(;thotls are. classified into two ap- 
1)rt)aches: examI)le-I)ased approaches (Kurohashi 
and Nagao, 1994; Urmnoto, 1994) and statistic- 
based apl)roa(:hes (l~rown et al., 1991; 1)tLglm and 
Itai, 1!)94; Niwa and Nitta, 11994; Schiitze, 1992; 
Ym'owsky, 1995). We follow the examt)h>based 
apl)roach ill exl)laining its effe.etivity for verb sense 
disamibiguation in Japanese. 
A representative example-based method for 
verb sense disambiguation was proposed by Kuro- 
hashi and Nagao (Kurohashi's inethod) (Kuro- 
hashi rand Nagao, 1994). Their method uses an 
0,xamph; database, containing examples of colloca- 
tions as in figure 1. Figure 1 shows a fragment 
of tim entry associated wittl the Japan(;se verb 
to'ru. As with most words, the ve, rb to'r"¢t has multi- 
pie senses, examples of whit:h are "to take/steal," 
"to attain," "to subst'ril)e" and "to reserve," The 
database gives one or more case frame(s) associ- 
ated with tilt', verbs for each of their senses. In 
.Japanese, a coutI)lelnt;nt Of a verb, which is a con- 
stituent of the case frame of the verb, consists 
of a nonii phrase (case filler) followed by a case 
marker such ms ga (nominative) or o (accusative). 
The database has ~m example set of case fillers for 
each case. As shown in figure 1, examples of a 
comi)lement c.an be considered as an extensional 
description of the selectional restriction on it. 
The task (:onside.red in this paper is %o in- 
terpret" a verb in an input s('.ntcnt:e, i.e. to 
choose ()lit) sense from a set of candidate senses 
of the verb. Given an input sentence, Kuro- 
hashi's method interprets the verb in the input by 
computing semantic similarity between the input 
and exalnples. For this computation, Kurohashi's 
nmthod experimeIltally uses the Ja,panese word 
thesaurus Bunruigoihyo (National-Language R(> 
search Institute, 1964). As with Inost thesauruses, 
the length of the 1lath between two words in Bun- 
r'uigoihyo is exl)e, eted tt) reflect the similarity be,- 
tween them. Figure 2 ilhlstrates a fragment of 
B'unruigoihyo in(:hlding some of the nouns in fig- 
ure 1. I,et us take the example sentence (1). 
(1) hisho .qa sh, indaish, a o tor,u. 
(set:retm'y-NOM) (siegel,trig (:ar-ACC) (?) 
lit this examph',, it may t)e judged according to 
tigure 2 that h, ish, o ("secretary") and shindaisha 
("sleeping car")in (1)i~l(, ,'~emantically similar 
to joshu ("assistant") att(l hikbki ("airplane"), re- 
Sl)ectively, which are cxamI)les that collocate with 
t(rru ("to reserve"). As sut'h, the sense of rot'u, in 
(1) can be interpreted as "to reserve." llowever, 
in Kurohashi's nmthod, several usefifl properties 
for verb disambuguatittn are missing: 
1. httuitively speaking, the, contribution of the 
5 9 
sur{ (pickpocket) } kanojo (she) 
ga an'i ( }n'ot her) 
/,:a ,,, (he) } 
l,:a*to2o (she) 
shachO (conlpany president) ga 
gal,:~sei (student) 
kane (money) } 
saifu (wallet) 
otoko (man) o u,n- (m,r~o) 
aidea {idea) 
menkyoshd (license) 
sh ikaku (qualification) 
biza (visa) 
tora (to take/steal) 
o attain) tora (to 
} { } ka,'e (he) shinbun (newspaper) ch.ichi (father) 9a o /,,yak,, (client) zasshi (journal) toru (to subscribe) 
d,,ntai (group) kippu (ticket) ,,,a,maa:~j, ~,~ (pas,~oHge,') ,aa h~'V~ (room) o 
josh,,, (assistant) hikdki (airplane) 
tortt (to reserve) 
Figure 1: A fragment of an example database, and the entry associated with Japanese verb torn 
~ kare £anojo otoko oshu 
~isho lt?na 
~-- aidea 
~-shin b ~tn \[~zasshi 
shgkanshi 
shikaka 
r---menkyosh5 ~-biza 
-~a%e 
saifl~ 
~__~ kippa 
hikdki 
shindaisha 
heya 
o~ocha 
Figure 2: A fragment of Bunruigoihyo 
accusative to verb sense disambiguation is 
greater than that of the nominative with the 
case of verb ~t(-)ru. 1' 
2. The seleetional restriction of a certain case is 
stronger than those of others. For example, 
in tile accusative, the selectional restriction 
of "to subscribe" is stronger than that of "to 
take/steal" which Mlows various kinds of ob- 
jects as its case filler. 
In this p~tt)er, we improve on Kurohashi's method 
by introducing a formalization of these notions, 
and report the result of a comparative experiment. 
2 Motiw, tion 
Property 1 in section 1 is exemplified by the 
input sentence (2). 
(2) shach5 ga sh£kanshi o toru. 
(presideut-NOM) (magazine-ACe) (?) 
The nominative, shachd ("company president"), 
in (2) is found in the %o attaiIf' ease frame of torn 
and there is no other co-occurrence in any other 
sense of toru; therefore, the nominative supports 
an interpretation "to attain." On the other hand, 
© 
nominative accusative 
Figure 3: The semantic ranges of the nominative and 
accusative with verb torn 
the accusative, ,sh, gtkanshi ("magazine"), is most 
similar to the examples included ill tile accusative 
of the "to subscribe" and therefore the accusative 
supports another interpretation "to snt)scribe." 
Although tile most plausible interpretation here 
is actually the latter, Kurohashi's method would 
choose tile former since (a) the degree in which 
the nominative sut)ports "to attain" happe.ns to be 
stronger than the degree in which the accusatiw'~ 
supports "to subscribe," and (b) their method al- 
ways relies equally on the similarity in the nomi- 
native and the accusative. Itowever, in the case of 
torn, since the semantic range of nouns collocating 
with the verb in the nominative does not seem to 
have a strong delinearization in a semantic sense, 
it would be difficult, or even risky, to properly 
interpret the verb sense based on tile similarity 
in the nominative. In contrast, since the ranges 
are diverse in the accusative, it would lm fe.asible. 
to rely more strongly on the similarity in the ac- 
cusative. This argument can be illustrated as in 
figure 3, in which the symbols "1" and "2" de- 
note example case fillers of different case fraines 
respectively, and an input sentence includes two 
case fillers denoted by "x" and "y." The figure 
shows the distribution of example case fillers tie- 
noted by those symbols in a semantic space, where 
the semantic similarity between two case fillers is 
represented by the physical distance between two 
symbols. In the nominative, since "x" ha.ptmns to 
60 
})e iliuch cl()s(;r to & "2" th~Ln ~tlly "1~" "X" IIh~y 
be estimated to belong to the range of "2"s al- 
l, hough "x" ae('.ually belongs to both sets of "l"s 
a.nd "2"s. Ill the accusative, however, "y" would 
he prol)erly estimated to belong to "l"s due. to 
(;tie mutuM indet)en(lence of the two ac(:usative 
case filler sets, even though examples (lid not fully 
(:over e~tch of the ranges of "t"s and "2"s. Note 
that this diiferen(:e would he critieM if example 
(1,~t~ w(;re sparse. This argument suggests that 
we introduce, the degree of (:ontribution of case to 
verb sense disaml)iguation. One may argue that 
this l)roperty ca.n tie generMized as the notion tha.t 
the system Mways r(~lies only on the similarity in 
the a(;(:usa.tiv(~ \[or v(;r\[) setlse (lisami)iguation. Al- 
though some tYl)i(:M verbs show this genera.1 no- 
tion, it is not gum'~mte:ed for any ki~,d of vert). 
Our al)l)roach, whi('h c.omputes the degree (if (:on- 
tril)ntion fl)r e~(:h vert) resi)(~(:tively , (:all tmndl(' 
exel)tionM cas(~s ~ts we.ll as tyl)ical ones. 
1)roperty 2 is exemplified 1)y the inlmt sentence 
(3). 
^ 
(3) ord.~n ga omoch, a o toru. 
(1,rother-NOM) (toy-ACC) ('?) 
In (3) th(! mosl: plausible inte.rpretati(m of l.or,u is 
"to st(~al." Tim nonlina.tiv(~ does llot give mu(:h 
inf(~rtna.ti()n for interl)r(Mtip; the vert) for t;h(~ same 
reason as exa.uiph+ (2). lu the accusative, the 
datallase in tigure\] has two example case lillers 
that arm (;(lU;fl\]y similar to om, ocha ("toy"): saiftt 
("wallet") and h, ikaki ("airplane"). These exam- 
i)les equMly SUl)t)ort two (lifferent interi)ret;ttions: 
"t() steal" mM "to res(;rve," which me.ires thnt the 
verl) sense aml)igui(;y still rcmMns. \]lea'e, one ina.y 
noti(:e thai; since tile a(:(;ust~l;ive examples in tile 
C;tSe \[l'i/,lIle of \[,OT'lt ("to reserve:') ~Ll'e, less diverse 
in niea.uing than the other case fr;tmes, the se\[(!e- 
l;ion;tl restrit:l;ion on the ;t(:(:us~tiv(; of to'v'tt ('%o re- 
starve') is relatively strong, ~md thus that it can be 
estiniated tt) lie reJatively ilnplausible for ornocha 
("toy") to sa.tis\[y it. If su(:h reasoning is correct, 
given that the ex~mll)les in the accusative of tor"u 
("to steal" ) are most widely distributed, the inlmt 
verl) (:an lie interl)reted as "to steal." The consid- 
eration M)ove motivated us to introduce the no- 
tion of rela.tive strength t)f select\]ohM restriction 
into our e~xaJnple-1)ased verb sense disalnbigu~tion 
method. 
3 Algorithm 
We assume that inputs ~re simple sentences, 
e~mh one of which consists of a sequellce of eases 
fl)llowe.d by their governing verb. The. task is to 
identify the sense of each input verb. The set of 
verl) senses we use are those defined in the existing 
machine re~tdal)le (li(:ti()llary "IPAL" (IPA, 1987), 
which also (:olltains example case fillers as shown 
in figure .t. As well as Kuroh~tshi's method the 
similarity between two (:as(; tillers, or more pre- 
('isely the semantic-head nouns of them, is corn- 
Table 1: The relation I)t'.tweell the length of path I)e- 
|;ween two i\[()llns A" {Mid Y (lt:7/,(.k', }:)) ill IJtL:l~r,Lil:o'i- 
bye and the similarity hetween them (.sirn(X, Y)) 
\[~a.n(X,Y) l 0,. . : 2 9468 l012 t 
\[ s.zm(A, ~ ) tl 10 8 7 5 0 
tinted by using IIv, rwuigoih, yo (National-Language 
l{esearch lnstil;ute, 1964). Following Kurohashi's 
method, we define .sim(X,~), whi(:h stands for 
the silnilarity 1)etween words X mM Y, as in ta- 
ttle 1. It should he noted here that both nl(~t;h()ds 
~tre theoreti(:ally indel)endent of wh;tt resources 
}ire use(t. 
~lb illustl'~te tit(; overall a.lgorithm, we r(~t)la.(:(~ 
the illustra.tive cases mentioned in section 1 wilh a 
slightly re(ire gelmral case as in figure. 4. The iut)ut 
is {nc,-'mc), nc:'m.ce, v}, where he. i all!notes the 
case filler in the case ci, a.nd 'ntc~ denotes the case 
maker of <:i. The candidates of ilH;(~rl)ret;ttion for 
v, which ~re ,sl, ,s2 ~md s3, are deriv(;d froln the 
datal)ase. The. d;ttal)ase also gives a set ~;si c i of 
case filler ex~mq)les for each case. c:.i (if each sense 
si. " " den()tes thnt the eorresl)ondit~t~; case is 
not allowed. 
~'S I ,/:\[ gS I ,(:2 I' ('gl) 
datat)ase &'s~,e i (-.~c., ,c., ,-.s.,c _ .c.~. t: (,s:~) 
i" ~-','13,,: 2 
" (,~31 
Figure 4: An inl)uL aud Lhe database 
in the course of tlle verb sense disanll)iguation 
process, the system tirst discards the candidates 
whose case Dame coi~straint is grammatically vi- 
olated by the input (this parallels Kurohashi's 
method). Ill the c}lse of figure 4, .s:) is dist:arded 
bec3.use the ('.&se fl'~Li\[ie of v (,s3) does ilOt su\])- 
eategrize the. case ct i. lit ('ontrast, s~ will not be 
reject(;d ~tt this step. This is based on the fact 
that in ,J;tl)~UleSe , t'~ts(!s t:tm lie easily omitted if 
they ;~re inferable from the given context. 
Thereafter, the system comt)utes the 1)la.usibil- 
ity of the remaining candidates of interpret~ttion 
and chooses the most pla,usit)le interpretatiou as 
its output, in Kurohashi's method, tim plausil>il- 
ity of tui interl)retation is eonq)uted t)y aver;tging; 
the degree of similarity between the inl)ut com- 
1)leinent and the exalnple complements 'e for each 
case &S in e(\[u&tiOll (1): where P(,q) is thc \[)\[~LU- 
I Since I I'AI, does not necessarily eliIlll~(~lligte all the 
possible optional cases, the ~LbSellCe of C;tse C I from "v 
(.~a) in the figure may denotl; that ¢:1 is optioual. If 
so, the interpretation s:) sht)uld not be dis(:arded in 
this stooge. To avoid this problem, we use the same 
technique as used in Kurohashi's method. That is, 
we deline several particular ea.ses befl)reha.nd, such as 
l, he nomin~d;ive, the accusative i~Iltl the (l~ttive, to be. 
obligatory, and impose tilt; graulm~rti(:~tI t:ase fHtllle 
t:onstrmnt as ~d)ove only in those obligatory (:ases. ()p- 
tionality of case needs to be further exl)h)red. 
2g's2,ca is not taken into consideration in the com- 
put~ttion since ca does not ~H)pe~tr in tile input. 
(52 
sibility of interpreting the input verb as sense 3, 
and SIM(nc, $~,c) is the degree of the similarity 
between the input complement nc and example 
complements $s,c. ws is the weight on an inter- 
pretation 3 such that more obligatory cases im- 
posed by s being found in tile input, will lead to 
a greater value of the weight a. 
P(3) = w3 E SIM(nc, Ss,c) (1) 
c 
SIM(nc, £3,c) is the maximum degree of similar- 
ity between nc and each of £3,e as in equation 
(2). 
SIM(,  c, &,e) = max sim(,+c, (2) ec~8,c 
In our method, on the other hand, for the rea- 
son indicated in section 1, we introduce two new 
factors: 
• contribution of case to verb sense disambigna- 
lion (CCD), 
• relative strength of selectional restriction 
(RSSR). 
First, in regard to CCD, we compute the plausi- 
bility of an interpretation by the weighted average 
of the degree of similarity for each case as in equa- 
tion (a), replacing equation (1). 
P(3) = w3 Ec g3,e)" CCD(c) 
Ec CUD(c) (3) 
Here, CCD(c) is a newly introduced weight, such 
that CCD(c) is greater when the degree of case 
e's contribution is higher. 
Second, in regard to RSSR, the stronger the se- 
lectional restriction on a case of a case frame is, 
the less plausible all input complement satisfies 
that restriction as mentioned in section 1. Note 
here that tile plausibility of an interpretation of an 
input verb can be regarded as the plausibility that 
the input complements satisfy the selectional re- 
striction associated with that interpretation. This 
leads us to replace SIM(nc, Es,c) in equation (3) 
with PSS(nc, £s,c), which denotes the plausibil- 
ity that the case filler nc satisfies the selectional 
restriction described by the example case fillers 
~S,C. 
P(3) = w3 Ec PSS('nc, g3,c) • CCD(c) 
EcCCD(c) (4) 
From the assumption that PSS(nc,Es,c) should 
be greater for a larger SIM(ne,£s,c) and lesser 
relative strength of the selectional restriction de- 
scribed by £s,c, we can derive equation (5). 
PSS(nc, £s,c) = SIM(nc, Ss,c) - RSSR(3, c) 
Here, RSSR(3, c) denotes the relative strength of 
tile selectional restriction on a case c associated 
with a sense 3. 
3For more detail, see Kurohashi's paper (Kuro- 
hashi and Nagao, 1994). 
4 Computation of CCD and RSSR 
The degree of contribution of case to verb sense 
disambiguation (CCD) is computed in the follow- 
ing way. The degree of contribution of a case 
should be high if the semantic range of the exam- 
ple case fillers in that case is diverse in the case 
frame (see figure 3). Let a certain verb have n 
senses (sl, 32,..., s~) and the set of example case 
fillers of a case c associated with 3~ be $3~,c. Then, 
the degree of c's contribution to disambiguation, 
CCD(c), is expected to be higher if the example 
case filler sets {£si,c I i = 1,..., n} share less ele- 
ments. This can be realized by equation (6). 
CCD( ) = 
1 I&.d + I&j, l - n &j, l 
i=1 j=i+t 
(6) 
a is the constant for parameterizing to what ex- 
tent CCD influences verb sense disambiguation. 
When a is larger, CCD more strongly influences 
the system's output. Considering the data sparse- 
ness problem, we do not distinguish two nonns 
X and Y in equation (6) if X and Y are similar 
enough, as in equation (7). 
{X} + {Y} = {X} if 3im(X,Y) >= 9 (7) 
Relative strength of selectional restriction 
(RSSR) is computed in the following way. Tile 
selectional restriction on a ease of a case frame is 
expected to be strong if the example case fillers 
of tile case are similar to each ()tiler. Given a set 
of example case fillers ill a case associated with 
a verb sense, the strength of the selectional re- 
striction on that case (SSR) can be estimated by 
averaging the similarity between any combination 
of two elements of that set. Thus, given a set Es,c 
of example case fillers in a case c associated with 
a verb sense s, tile SSR of c associated with s Call 
be estimated by equation (8), where £~,c is an i4h 
element of £3,c, and m is the number of elements 
in £s,c, i.e. m = \[$3,c\[. 
E =I Ej=++, 
SSR(s, c) = ,+C2 if m > 1 
maximum otherwise 
(8) 
In the case m = 1, that is, the case has only one 
example case filler, tile SSR becomes maxinmm, 
because the selectional constraint associated with 
the case is highest (following table 1, we assign 11 
as the maximum to SSR). The relative strength of 
selectional restriction (RSSR) of a case associated 
with a verb sense is estimated by the ratio of tile 
SSR of tile case to the summation of the SSRs 
of each case associated with the verb sense, as in 
62 
equation (9) 4 
ssR(. , ,0 (9) 
a Evahmtion 
Our experiment compared the performance of 
the following methods: 
1. tOlrohashi's method: equation (1) 
2. our method (considering CCD): equation (3) 
3. our method (considering /)oth CCD and 
RSSR): equation (4) 
In method 2 and 3, the influence of CCD, i.e. (~ in 
equation (6), was extremely large. We will show 
the relation between the w~riation of c~ and tile 
performance of the system later in this section. 
The training/test data used in tile ext)eriment 
contained over one thousand simple Japanese sen- 
tences collected from slews articles. The examples 
given by IPAL were also used as training data s. 
!),ach of tile sentences in the training/test data 
used in our experiment consisted of one or more 
complement(s) followed by one of the ten verbs 
enumerated in table 2. For each of the ten verbs, 
we conducted six-fold cross validation; that is, we 
divided the training/test data into six equal parts, 
and conducted six trials in each of which a differ- 
ent one of the six parts was used as test data and 
the rest was used as training data. We shall call 
the former the "test set" and the latter the "train- 
ing set," in each (:ase. 
When inore than one interpretation of an input 
verb is assigned the highest t)lausibility score, any 
of the above methods will (;hoose as its outt)ut the 
one that appears most frequently in the training 
data. Therefore, tile applicability in each method 
is 100%, given that the applicability is tile ratio 
of the number of the cases where the system Rives 
only one intert)retation, to the numt)er of inputs. 
Thus, in tile ext)eriment, we compared the preci- 
sion of each method, which is in our case equal to 
the ratio of the nuinber of correct outputs, to tile 
nulnt)er of int)uts. 
Since tile 1)erformance of any corpus-based 
method depends on the size of training data, we 
tirst investigated how the precision of each method 
was improved as the training data increased. In 
this, we initially used only the examples given by 
IPAL, and progressively increased the size of the 
training data used, by considering an extra part 
of the training set (five parts of the total six data 
portions used) at each iteration, until finally tak- 
ing all five l)arts in the training of our system. 
4Note that., in equation (5), while SIM is an integer, 
PlSSI/. ranges in its value h'om 0 to 1. Therefore, II, SSI{, 
is influential only when several verb senses take the 
same value of SIM for a given ease. 
'~The number of examples given by IPAL was, on 
~verage, :1.7 for each ease of each case frame. 
The results are shown in figure 5, in which the 
x-axis denotes the ratio of the data used froln the 
training set, to tile total size of the training set. 
85 I I i ' 
i J ! i --4 
8O 
65 j..'" ; .... i CCD -~ = 
CCD'~RSSR -~--" 
KurOhashi .t~.. : 
8o .......... i .......................... ............ ........... 
55-- ~ ~_ i ~__ 
20 40 80 80 100 
proportion of training sat used (%) 
Figure 5: The precision of each method, for each size 
of training data 
What can be derived fl'om figure 5 are the fol- 
lowing. First, as more training data was consid- 
ered, tile precision got higher for each method. 
Second, tile consideration of CCD, i.e. contri- 
bution of case. to verb sense disambiguation, im- 
proved on Kurohashi's method regardless of tile 
size of training data. (liven the whole training 
set, the precision improved from 75.2% to 82.4% 
(7.2% gain). Third, the introduction of the notion 
of RSSR did not fltrther improve on the inethod 
using only CCD. 
Table 2 shows tile performance for each verb 
on using the whole training set. The column of 
"lower bound" denotes tile precision gained in a 
naive method such that the system always chooses 
tile interpretation most frequently al)pearing in 
the training data (Gale et al., 1992). Tile col- 
umn of "two highest CCD" gives the two highest 
CCD values from the cases for each verb, which 
are calculated using whole training set. 
Finally, let us see to what extent we should al-. 
low CCD to influence verb sense disambiguation. 
Figure 6 shows the performance with the paramet- 
ric constant ~ in equation (6) set to w~rious val- 
ues. c~ = (/ corresponds with Kurohashi's method, 
in which CCD is never considered. As shown in 
figure 6, the stronger influence we allow CCD to 
have, the better performance we gain. 
6 Conclusion 
In this paper, we proposed a slew example-based 
method for verb sense (tisambiguation, which lin- 
t)roved the performance of the existing method by 
considering the degree of contribution of case to 
verb sense disambigu~tion. 
The performance of our method significantly de- 
pends on the method of assigning degree of sim- 
ilarity to a t)air of case fillers. Since Bunr'i~itloi- 
hyou is fundamentally based on human intuition, 
it does not reflect the similarity between a pair 
of case fillers computationaly. Proposed methods 
63 
Table 2: Performance for each verb (ga: nominative, ni: dative, o: accusative, kava: locative, de: instrumental) 
# of lower 
candidates bound (%) 
66.9 
25.6 
53.9 
45.2 
two highest COl) 
o (0.98) 0a (0.86) 
o (0.99) ni (9.98) 
o (0.98) ni (0.95) 
',~i (0.90) 0" (0.9')) 
o (0.95) ni (0.94) 25.0 
19.8 de (1.0) o (0.98) 
26.2 kara (1.O) o (0.99) 
o (1.O) ga (0.94) 
~(* (0.96) ,~i (o.ro) 
81.1 
48.3 
59.3 o (1.0) de (0.71) 
precision (%) 
77.2 80.0 
66.3 76.9 
82.6 88.0 
82.5 81.0 
73.2 70.4 
59.2 84.9 
56.0 71.4 
100 98.9 
05.0 70,O" 
96.3 96.3 
r5.2 I 82.4_~ 
data 
wn'b size 
ataer~t 136 4 
kakeru 160 29 
kztwa, eru 107 5 
n o'r~t 126 I O 
osamcr'u 108 8 
tsul,'wrn 12('; 15 
to*'~l 84 29 
~n~u 90 2 
wokaru 60 5 
ya'm, ertt 54 2 
tottd l1 t1111 43,7 
83 
so ........ ........... ............ .............. ...................... i ............... 
~z9 ....... 
zo .............. ....... ; .................. ................ .................. .................. 
zz ! ....... 
0 5 10 15 20 25 30 
Figure 6: The relation between the degree of CCD 
and 1)recision 
of word clustering (Tokunaga et al., 1995, etc.) 
can 1)otentially be used ill conjunction with our 
method to overcome this human reliance. 
In our current implenmntation, we consider the 
collocation between case fillers and verbs, but ig- 
nore the combination of case fillers. Instead of a 
database as in figure 1, we could store a set of com- 
binations of example case fillers, e.g. the combina- 
tion of s~wi ("pickpocket") and saifu ("wallet"), 
but not that of suri and otoko ("man"). Itow- 
ever, this way of data storage would require the 
collection of a much larger number of examples 
than the current method. This issue needs to be 
fl~rther investigated. 
Acknowledgments 
The authors would like to thank Dr. Man- 
aim Okumura (JAIST, Japan), Dr. Michael 
Zock (LIMSI, France) and Mr. Timothy Baldwin 
(TITech, Jat)an) for their comments on the earlier 
version of this paper. 
References 
Peter F. Brown, Stephen A. Della Pietra, and Vincent 
J. Della Pietra. 1991. Word-Sense Dismnbiguation 
Using Statistical Methods. In the Proc. of ACL, 
pages 264-270. 
ldo l)agan and Alon ltai. 1994. Word Sense Dis- 
ambiguation Using a Second Language MonolinguM 
Corpus. Computational Linguistics, 20(4):563-596. 
William Gale, Kenneth Ward Church, and David 
Yarowsky. 1992. Estimating Upper and l~ower 
Bounds on the Performance of Word-Sense \])isam- 
biguation Programs. In the t)roc, of AUL, pages 
249 256. 
IPA, 1987. IPA Lexicon of the Japanese Language for 
computers IPAL (Basic Vcvbs) (in Japanese). 
Robert Krovets and W. Bruce Croft. 1992. Lexical 
Ambiguity and information Retrieval. ACM Trans- 
actions on Information Systems, 10(2):115 141. 
Sadao Kurohashi and Mal~oto Nagao. 1994. A 
Method of Case Structure Analysis h)r Japanese 
Sentences Based on Examples in Case Frame Dic- 
tionary. IEICE 'I. TtANSA CT'IONS on .Information 
and Systems, E77-D(2):227 239. 
Steven L. Lytinen. 11986. l)ynamicatly Combining 
Syntax and Semantics in Natural Language I)ro- 
cessing. \[n the Proc. of AAAI, pages 574 578. 
Katashi Nagao. :1994. A Preferential Constraint Sat- 
isNetion Technique for Natural Language Analysis. 
IEICE 517~.ANSAC770NS on b~forrnation and Sys- 
tems, E77-1)(2):161 1.70. 
National-Language Research Institute, editor. 1964. 
Bunruigoihyo (in .Japanese). Syuei put)lisher. 
Yoshiki Niwa and Yoshihiko Nitta. 1994. Co- 
occurrence vectors froxn corpora vs. distance w'.ctors 
from dictionaries. In the Proc. of COLING, pages 
304-309. 
Satoshi Sato. 1991. MB'F1: li;xample-Based Word Se- 
lection (in Japanese). Journal of Japanese Society 
for Artificial Intelligence, 6(4):592 600. 
Hinrich Schfitze. 1992. Word sense disambigua- 
tion with sublexical representations. In Workshop 
Notes, Statistically-Based NLP Techniques, AAA\[, 
pages 199-113. 
'.l'akenobu Tokunaga, Makoto Iwayama, and llozumi 
Tanaka. 1995. Automatic Thesmtrus Construction 
Based on Grammatical Rela.tions. In the Proc. of 
1JCAI; pages 1308-1313. 
Naohiko Uramoto. 71994. Example-l/ased Word-Sense 
l)isambiguation. LI','ICE TRANSA UTIONS on In- 
formation and Systems, ET7-D(2):240 246. 
Ellen M. Voorhees. 19!)3. Using Wor(tNet to l)isam- 
biguate Word Senses for Text Retrieval. In Proc. of 
SIGIR, pages 171-180. 
David Yarowsky. 1995. Unsupervised Word Sense 
Disambiguation Rivaling Supervised Methods. In 
the Proc. of ACL, pages 189-196. 
64 
