Machine Translation by Case Generalization 
Hiroshi Nomiyama 
IBM Research, Tokyo Research Laboratory 
5-19 Sanbancho, Chiyoda-ku,Tokyo 102 Japan 
E-Mail:nomiyama@trl.vnet.ibm.com 
Abstract 
Case-based machine translation is a promising approach to
resolving problems in rule-based machine
translation systems, such as difficulties in control of 
rules and low adaptability to specific domains. We pro- 
pose a new mechanism for case-based machine trans- 
lation, in which a large set of cases is generalized into 
a smaller set of cases by using a thesaurus. 
1 Introduction 
Case-Based/Example-Based Machine Translation 
(CBMT/EBMT) has been proposed as a way of over- 
coming the knowledge acquisition bottleneck in ma- 
chine translation. This approach is based on the simple 
concept of translating sentences by analogy with similar
cases stored in a set of cases (a case-base) \[1, 2, 3, 4\].
This approach has two advantages in terms of knowledge
acquisition. CBMT/EBMT ensures that (1) if the
same case as the input exists in the case-base, then the 
same result will be obtained, and (2) if a similar case 
exists in the case-base, then a similar result will be obtained.
In the first instance, which cases are regarded
as the same depends on the equality metrics of the system.
In the second instance, which cases are regarded
as similar depends on the similarity metrics. Rule de- 
velopers or users can control the system on the basis 
of equality and similarity without understanding the 
global flow of controls. 
In applying this idea to practical machine transla- 
tion systems, there are still two serious problems. One 
is that CBMT/EBMT requires a great deal of compu- 
tation because of its inherent need to retrieve a huge 
number of cases and calculate their similarities to the 
input. For practical systems, several hundreds of thou- 
sands of cases must be accessible. 
CBMT/EBMT systems should not impose any re- 
strictions on cases to be added to the case-base in an 
effort to keep the case-base small, since the similarity
metrics depend on the frequencies of cases. If cases are
restricted, sufficient information to control the rules is 
not acquired. 
The other problem of CBMT/EBMT is the difficulty of
defining a semantic distance. Though thesauri are used as
bases for semantic distance calculation in CBMT/EBMT, it
may be impossible to define a general semantic distance by
using thesauri alone. Semantic distances between words are
defined according to which specific words are related to
their translations. For example, in translating the word
"食べる" (eat, feed, ...), "犬 (dog)-が-食べる" is equivalent to
"a dog eats," "牛 (cow)-が-食べる" is equivalent to
"a cow feeds," and "馬 (horse)-が-食べる" is equivalent to
"a horse feeds." In these cases, "牛" (cow) is closer to
"馬" (horse) than to "犬" (dog), because different words are
selected for the translation of "食べる" with "牛" (cow) and
with "犬" (dog). But in translating "走る" (run, gallop, ...),
"犬 (dog)-が-走る" is equivalent to "a dog runs,"
"牛 (cow)-が-走る" is equivalent to "a cow runs," and
"馬 (horse)-が-走る" is equivalent to "a horse gallops."
In these cases, "牛" (cow) is closer to "犬" (dog) than to
"馬" (horse).
If such incomplete semantic distances calculated by
thesauri alone are used for CBMT/EBMT, exceptional cases
may be interpreted as general ones (over-generalization).
Over-generalization is a major problem in translating
idiomatic expressions. For example, "頭 (head)-が-切れる" has
two translations: "hurt one's head" or, idiomatically,
"be smart." But "頭部 (head)-が-切れる" has only one
interpretation, "hurt one's head," though the word "頭部"
has almost the same meaning as "頭."
It is obvious that "頭部-が-切れる" can be translated
correctly by adding this translation pair into the
case-base. The addition, however, cannot prevent the
idiomatic expression "頭-が-切れる" from being interpreted
generally. The idiomatic interpretation still may be
adopted for "X-が-切れる" if X is more similar to the word
"頭" than the words in the case-base whose pattern is
"X-が-切れる."
ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 714 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992
Sato \[3\] and Sumita \[4\] weigh each slot depending 
on how much it affects the translation. However, since 
such weights are calculated only for each slot, the over- 
generalization that occurs inside of a slot is not re- 
solved. To avoid over-generalization, we need some 
mechanism to encapsulate exceptions rather than to 
adjust the semantic distance. 
2 Machine Translation by Case 
Generalization 
A case-base, in contrast to a set of rules, has inherent
redundancy, because cases are collected without
pre-selection. In the simplest case, if the sentence "A" has
only one translation equivalent "a," then the single
case "A" → "a" is enough to translate "A."
But if we view the case-base as a collection of sentences,
the same sentences rarely seem to occur¹. Sentences can,
however, be divided into smaller fragments which are
meaningful units for translation according to some
linguistic model. We call these units translation patterns.
These fragments are combined for use in translating
sentences. Fragments divided on the basis of translation
patterns are obviously more effective than sentences,
because smaller fragments are more likely to
match than full sentences.
We generalize such fragments extracted according
to each translation pattern, using a thesaurus, by
replacing the words that occur in cases by more general
concepts in the thesaurus. The words to be replaced
are determined by their frequencies in the case-base.
Frequently occurring fragments should be assigned more
weight than less frequently occurring fragments. The
frequencies of fragments are used to weight generalized
cases in generalization.
Semantic distances are calculated for each translation
pattern as the importances of generalized cases.
Only meaningful categories for the translation pattern
are stored as generalized cases, except that the most
meaningful category is taken as a default. For example,
the word "犬" (dog) may be generalized into the concept
<dog>² for translation of "鳴く" ("a dog barks"),
whereas it may be replaced by the more general concept
<animal> for other translation patterns in which
the concept <dog> is not meaningful.

¹ The case-base should contain natural sentences rather than
examples which are only the smallest fragments effective for
translation. We distinguish CBMT from EBMT in accordance
with this viewpoint.
While generalizing cases, we can identify exceptional 
cases as those which cannot be generalized. Once we 
identify exceptions, then we can prevent such excep- 
tions from being interpreted generally. 
In this way, cases are generalized according to the
translation pattern into generalized cases with
concepts as the values of their variables.
In addition to generalized cases, rules can be formulated
according to translation patterns. Generalized
cases and manually written rules are treated as the same
kind of object in CBMT. It is valuable to have
rules available as well as cases, especially when the
case-base contains insufficient cases. If rules are not
available, there must be sufficient cases from the time
the system is first used. Incremental development of
any domain is possible only if general rules are available.
In accordance with these basic ideas, we propose a
method of machine translation in which cases are
generalized. In our approach, we define linguistic patterns
in translation. According to these patterns, the cases
in the case-base are divided into smaller fragments and
are generalized. Both rules and generalized cases are
used to translate sentences.
CBMT is divided into two sub-processes: (1) best
matching, to search for the most similar cases in the
case-base, and (2) application control, to control the
combination of similar cases for translation. Application
control is a general problem in machine translation,
whereas best matching is a problem unique to
CBMT. If the best matching process returns certainty
factors, the system is controlled using these factors on
the basis of some other model such as Watanabe's \[5\].
In this paper, we concentrate on best matching using
a thesaurus.

² Concepts are enclosed in angle brackets (< and >) in
this paper.
3 Generalizing Cases 
3.1 Division and Linearization of Cases
First, we define a translation pattern (TPi) as follows.
TPi = \[Ps, Vs, Pt, Vt\]
Ps : Structural Pattern in Source Language (SL)
Vs : List of Lexical Variables in SL
Pt : Transformation into Target Language (TL)
Vt : List of Variables in TL
We call the number of variables in Vs the term number
(Mi) of TPi.
Next, we extract translation pattern cases (TPCi)
from the case-base by applying the pattern matches
described in TPi to all cases in the case-base.
TPCi = \[Ls, Cs, Lt\]
Ls : List of Values of Lexical Variables in SL
Cs : List of Constraints in SL
Lt : List of Values of Variables in TL
If some patterns other than those specified in Ps are
related in translation, those patterns are described in
constraints (Cs).
These TPCis are linearized into linearized translation
pattern cases (LTPCi).
LTPCi : Ls → (Cs, Lt)
We call the right-hand part of LTPCi the value
(V). The examples in Fig. 1 are extracted LTPCis
in Japanese-to-English translations of "NOUN ni
VERB," where we assume a translation pattern in
which an English preposition is determined by a
binary relation of a Japanese noun and a Japanese verb.
In the following section, we show how to generalize
LTPCis into generalized linear translation pattern
cases (GLTPCi) by replacing words with more general
concepts in the thesaurus, and calculate degrees
of importance for them.
\["Sangatu"(March),"Kowasu"(destroy)\] → (\[\],\["in"\])
\["Sigatu"(April),"Gironsuru"(discuss)\] → (\[\],\["in"\])
\["Gogatu"(May),"Saiketusuru"(vote)\] → (\[\],\["in"\])
\["Rokugatu"(June),"Hieru"(cool)\] → (\[\],\["in"\])
\["Getuyou"(Monday),"Arau"(wash)\] → (\[\],\["on"\])
\["Kayou"(Tuesday),"Kimaru"(decide)\] → (\[\],\["on"\])
\["Syuumatu"(weekend),"Agaru"(raise)\] → (\[\],\["on"\])
\["Higasi"(east),"Uturu"(move)\] → (\[\],\["to"\])
\["Toukyou"(Tokyo),"Idousuru"(move)\] → (\[\],\["to"\])
\["Sitigatu"(July),"Idousuru"(move)\] → (\[\],\["in"\])
Figure 1: Translations of "NOUN ni VERB"
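The cases of Fig. 1 can be written down as plain data. The following is a minimal sketch, assuming a Python representation; the names `LTPC`, `case`, and `FIG1_CASES` are illustrative and do not come from the paper.

```python
from collections import Counter
from typing import NamedTuple


class LTPC(NamedTuple):
    """Linearized translation pattern case: Ls -> (Cs, Lt)."""
    source: tuple       # Ls: values of the lexical variables in SL
    constraints: tuple  # Cs: constraints in SL (empty for every case in Fig. 1)
    target: tuple       # Lt: values of the variables in TL


def case(noun: str, verb: str, preposition: str) -> LTPC:
    """Build one "NOUN ni VERB" case (hypothetical helper)."""
    return LTPC((noun, verb), (), (preposition,))


FIG1_CASES = [
    case("Sangatu", "Kowasu", "in"),
    case("Sigatu", "Gironsuru", "in"),
    case("Gogatu", "Saiketusuru", "in"),
    case("Rokugatu", "Hieru", "in"),
    case("Getuyou", "Arau", "on"),
    case("Kayou", "Kimaru", "on"),
    case("Syuumatu", "Agaru", "on"),
    case("Higasi", "Uturu", "to"),
    case("Toukyou", "Idousuru", "to"),
    case("Sitigatu", "Idousuru", "in"),
]

# Frequency of each value (Cs, Lt) over the whole translation pattern.
value_counts = Counter((c.constraints, c.target) for c in FIG1_CASES)
```

Counting values this way gives the raw frequencies from which the importances in the next section are derived.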
3.2 Case Generalization by Means of 
a Thesaurus 
3.2.1 Creation of N-Term Partial Thesauri 
We create working thesauri, PTHi(j) (1 ≤ j ≤ Mi),
for each term. They include every word in the j-th
term, and set pairs of values and their frequencies in
each word node.
Here we define the importances used to weight
generalized cases.
Importance of a Link (IL) The importance of a
link (IL) is the probability of occurrence of cases that
occurred in the subtree of PTHi(j). IL is defined as
follows.

$$IL = \frac{S}{C_i}$$

where S is the total number of cases in the subtree
connected with the link, and $C_i$ is the total number
of LTPCis extracted from the case-base according to
TPi.
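As a small worked instance of this ratio, the sketch below computes IL for the cases of Fig. 1, where ten cases are extracted in total. The function name `link_importance` is an assumption made for illustration.

```python
def link_importance(cases_in_subtree: int, total_cases: int) -> float:
    """IL = S / C_i: share of all extracted cases that fall under a link."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return cases_in_subtree / total_cases


# Three of the ten cases in Fig. 1 have a day-like noun
# (Getuyou, Kayou, Syuumatu) in the first term.
il_days = link_importance(3, 10)
# Each single word node is reached by exactly one case.
il_word = link_importance(1, 10)
```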
Importance of a Node (IN) The importance of a
node (IN) shows the degree of variance of values in a
subtree. IN is defined as follows.

$$IN = \sqrt{\sum_k P_k^2}$$

where $P_k$ is the probability of each value in the
subtree³.
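The sketch below shows one way to compute IN. It assumes the root-sum-of-squares form $\sqrt{\sum_k P_k^2}$ used in memory-based reasoning \[6\], to which the footnote refers; the function name `node_importance` is hypothetical.

```python
import math
from collections import Counter


def node_importance(values) -> float:
    """IN under the assumed form sqrt(sum_k P_k^2).

    `values` is an iterable with one entry per case in the subtree,
    giving the value (for example the preposition) of that case.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("subtree contains no cases")
    return math.sqrt(sum((n / total) ** 2 for n in counts.values()))


uniform = node_importance(["on", "on", "on"])  # a single value in the subtree
mixed = node_importance(["in", "to"])          # two equally likely values
```

Under this assumption a subtree whose cases all share one value gets importance 1, and the importance falls as the values become more evenly spread.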
Importance of a Value (IV) The importance of a
value L (IV) in the node k is defined as follows.

$$IV_{kL} = \begin{cases} \text{frequency of value } L \text{ in node } k & \text{if node } k \text{ is a word node} \\ IN_k \times \sum_m (IL_m \times IV_{mL}) & \text{otherwise} \end{cases}$$

where m is a node linked to node k, and $IV_{mL}$
is the importance of value L in node m.

³ We adopt the same expression as that used by Stanfill \[6\]
and Sumita \[4\].
Importance of a Generalized Case (IC) The
importance of a GLTPCi (IC) is defined as follows.

$$IC = \sum_{j=1}^{M_i} IV_{jL}$$

where $IV_{jL}$ is the importance of value L, which is the
same as the value of the GLTPCi.
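Read as a sum over terms, IC can be computed as in the sketch below. The figures are the ones given in the worked example of Section 3.2.5 for the value "to"; the function name and the dictionary layout are assumptions.

```python
def case_importance(value, term_ivs) -> float:
    """IC: sum over the terms of the IV that the case's value has in
    the node chosen for each term. A value absent from a node counts as 0."""
    return sum(ivs.get(value, 0.0) for ivs in term_ivs)


# IVs of <Direction> (first term) and of the root node <> (second term),
# as listed in the inter-term generalization example.
first_term = {"to": 0.1}
second_term = {"in": 0.011, "to": 0.008, "on": 0.006}

ic_to = case_importance("to", [first_term, second_term])
ic_in = case_importance("in", [first_term, second_term])
```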
3.2.2 Subdivision of Conceptual Leaf Nodes 
According to the definitions given in the previous
section, ILs and INs are first set in all the links
and nodes in PTHi(j), and IVs are calculated in
conceptual leaf nodes in PTHi(j).
If an IV is not the maximum value in a conceptual
leaf node, is greater than the pre-defined threshold
value, and its frequency is greater than 2, the node
is subdivided into more specific concepts.
Subdivision occurs because a specific category which
does not exist in the thesaurus is effective for a specific
translation pattern. Only the difference from the
thesaurus is kept as the translation pattern thesaurus i
(TPTHi).
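The subdivision test can be sketched as a predicate over one conceptual leaf node. The function name, the argument layout, and all numbers in the example are hypothetical; only the three conditions come from the text.

```python
def should_subdivide(ivs: dict, freqs: dict, threshold: float) -> bool:
    """True if some value in the node is not the maximum-IV value,
    has an IV above the threshold, and occurs more than twice."""
    if not ivs:
        return False
    top = max(ivs.values())
    return any(
        iv < top and iv > threshold and freqs.get(value, 0) > 2
        for value, iv in ivs.items()
    )


# Illustrative numbers only: a node where "in" dominates but "on"
# is still frequent and important enough to deserve its own concept.
split = should_subdivide({"in": 0.35, "on": 0.21}, {"in": 5, "on": 3}, 0.1)
# With only two "on" cases the frequency condition fails.
no_split = should_subdivide({"in": 0.35, "on": 0.21}, {"in": 5, "on": 2}, 0.1)
```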
3.2.3 Propagation of Importance of Values 
Next, we calculate IV in all nodes other than
conceptual leaf nodes by propagating IV. The propagation
is done by multiplying the importances of values
by the importances of links, and the sum of all the
propagated values is multiplied by the importance of
the node. At first, the propagation is done upward,
starting from the conceptual leaf nodes. During
upward propagation, downward propagation is done if a
child node is a conceptual node and a propagated value
is greater than the maximum importance of values in
the child node. Downward propagation prevents
over-generalization.
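Upward propagation can be sketched as a recursive pass over a small tree. This is a minimal illustration under stated assumptions: IN is taken as $\sqrt{\sum_k P_k^2}$, downward propagation is left out, and the class `Node` and function `propagate_up` are hypothetical names.

```python
import math
from collections import Counter


class Node:
    """Thesaurus node; word nodes carry value -> frequency."""

    def __init__(self, name, children=(), values=None):
        self.name = name
        self.children = list(children)
        self.values = Counter(values or {})

    def subtree_values(self) -> Counter:
        total = Counter(self.values)
        for child in self.children:
            total += child.subtree_values()
        return total


def propagate_up(node: Node, total_cases: int) -> dict:
    """Return value -> IV for `node` (upward propagation only)."""
    if not node.children:                 # word node: IV is the frequency
        return dict(node.values)
    dist = node.subtree_values()
    n = sum(dist.values())
    node_imp = math.sqrt(sum((f / n) ** 2 for f in dist.values()))  # IN (assumed form)
    summed = Counter()
    for child in node.children:
        link_imp = sum(child.subtree_values().values()) / total_cases  # IL
        for value, child_iv in propagate_up(child, total_cases).items():
            summed[value] += link_imp * child_iv
    return {value: node_imp * x for value, x in summed.items()}


days = Node("<*X*>", [Node(w, values={"on": 1})
                      for w in ("Getuyou", "Kayou", "Syuumatu")])
direction = Node("<Direction>", [Node("Higasi", values={"to": 1})])

iv_days = propagate_up(days, total_cases=10)
iv_direction = propagate_up(direction, total_cases=10)
```

With the ten cases of Fig. 1 this gives an IV of about 0.3 for "on" in <*X*> and 0.1 for "to" in <Direction>, the latter matching the value quoted for <Direction> in Section 3.2.5.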
We show examples of results of importance calculation
in Fig. 2 and Fig. 3, for the first and second
terms respectively. In Fig. 2, the subdivision occurred
in the node <Time> and the new node <*X*>
was created. A downward propagation occurred in
the node <Concrete> in Fig. 2. The word "in" was
made more important than the word "to" in the node
<Concrete>.
\[<>,<Destruction>\] → (\[\],\["in"\])
\[<>,<Speech>\] → (\[\],\["in"\])
\[<>,"Saiketusuru"(decide)\] → (\[\],\["in"\])
\[<>,<>\] → (\[\],\["in"\])
\[<*X*>,<Action>\] → (\[\],\["on"\])
\[<*X*>,"Kimaru"(decide)\] → (\[\],\["on"\])
\[<*X*>,<Up-D...>\] → (\[\],\["on"\])
\[<Location>,<Abstract>\] → (\[\],\["to"\])
\[<Direction>,<Abstract>\] → (\[\],\["to"\])
\[<>,"Idousuru"(move)\] → (\[\],\["to"\])
Figure 4: Result of the Intra-Term Generalization
3.2.4 Intra-Term Generalization of LTPCi 
Using the importances calculated by the method described
in the previous section, LTPCis are generalized in the
j-th term. If the value with the highest IV in the child
node is the same as the value with the highest IV in the
parent node, then the word in the term is generalized by
the concept in the parent node. This process of
generalization is repeated until no further generalization
is possible, and only the most generalized cases are kept.
If identical cases are obtained as a result, only one case
is kept.
We show an example of intra-term generalization of
\["Kayou"(Tuesday),"Kimaru"(decide)\] → (\[\],\["on"\]).
Initially, the first term "Kayou"(Tuesday) is generalized.
The value of this case, (\[\],\["on"\]), is the same
as the value with the highest IV in the parent node
<*X*> (see Fig. 2), so "Kayou"(Tuesday) is replaced
by <*X*>. The value (\[\],\["on"\]) is not the
value with the highest IV in the parent node of
<*X*>, and therefore generalization stops at the first
term. Next, the second term "Kimaru"(decide) is
generalized. In the parent node <Decision> of
"Kimaru"(decide), the value that is the same as the value
of the case is one of the values with the highest
IV. Consequently, parent nodes are checked to
determine which value is more important. In the root
node, (\[\],\["on"\]) is less important than (\[\],\["in"\]), so
no generalization occurs for the second term. Finally,
\[<*X*>,"Kimaru"(decide)\] → (\[\],\["on"\]) is obtained
as the result of intra-term generalization.
The result of intra-term generalization for all the
LTPCis in Fig. 1 is shown in Fig. 4.
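The climbing step of intra-term generalization can be sketched as follows. The sketch makes several assumptions: the thesaurus is given as a child-to-parent mapping, IVs are already computed per node, ties between equally important values are not resolved by looking further up as the text describes, and the IVs shown for <Time> are invented for illustration.

```python
def generalize_term(word: str, value: str, parent_of: dict, iv_of: dict) -> str:
    """Climb from `word` while the parent's highest-IV value equals `value`.

    Returns the most general node reached.
    """
    node = word
    while node in parent_of:
        parent = parent_of[node]
        ivs = iv_of[parent]
        if max(ivs, key=ivs.get) != value:
            break                      # the parent prefers another value: stop
        node = parent
    return node


# First term of ["Kayou","Kimaru"] -> "on". The parent of <*X*> is assumed
# to be <Time>, where "in" is assumed to dominate (hypothetical numbers).
parent_of = {"Kayou": "<*X*>", "<*X*>": "<Time>"}
iv_of = {"<*X*>": {"on": 0.3}, "<Time>": {"in": 0.35, "on": 0.21}}

generalized = generalize_term("Kayou", "on", parent_of, iv_of)
```

Here "Kayou" is lifted to <*X*> and no further, which agrees with the first-term step of the example above.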
3.2.5 Inter-Term Generalization of LTPCi 
Next we generalize cases over terms. Inter-term
generalization takes ICs into consideration. If Mi = 1,
then the result of intra-term generalization with ICs
is the generalized linear translation pattern case i
(GLTPCi). If Mi > 1, j-th term maximum generalization
(1 ≤ j ≤ Mi) is done for each term. In j-th
term maximum generalization, terms other than the
j-th term are fixed first and the j-th term is generalized
as much as possible. Then, the maximum possible
generalization is done for the remaining terms in turn.
If Mi > 1, then Mi × (Mi - 1) GLTPCis are obtained.
If identical cases are obtained as a result, only one case
is kept.

Figure 2: First-Term Partial Thesaurus

Figure 3: Second-Term Partial Thesaurus
We show an example of inter-term generalization
of \[<Direction>,<Abstract>\] → (\[\],\["to"\]). Initially,
first-term maximum generalization is done. IVs in the
node <Abstract> are shown below (see PTHi(2) in
Fig. 3).
(\[\],\["to"\]) : 0.027
(\[\],\["in"\]) : 0.020
(\[\],\["on"\]) : 0.006
IVs in the node <Abstract>, which is the parent node
of <Direction>, are shown below (see PTHi(1) in Fig.
2).
(\[\],\["in"\]) : 0.192
(\[\],\["on"\]) : 0.035
(\[\],\["to"\]) : 0.007
Their totals are as follows.
(\[\],\["to"\]) : 0.027 + 0.007 = 0.034
(\[\],\["in"\]) : 0.020 + 0.192 = 0.212
(\[\],\["on"\]) : 0.006 + 0.035 = 0.041
Since (\[\],\["to"\]) doesn't have the highest importance,
the case is not generalized any further in the first term.
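The decision in this step reduces to summing the IVs of the two terms and checking whether the case's own value comes out on top. A minimal sketch with the figures from the example follows; the function name `may_generalize` is an assumption.

```python
def may_generalize(value, fixed_term_ivs: dict, parent_ivs: dict):
    """Sum the IVs of the fixed term and of the candidate parent node,
    and report whether `value` has the highest total."""
    keys = set(fixed_term_ivs) | set(parent_ivs)
    totals = {k: fixed_term_ivs.get(k, 0.0) + parent_ivs.get(k, 0.0) for k in keys}
    return max(totals, key=totals.get) == value, totals


# First term: second-term node <Abstract> is fixed, and the parent of
# <Direction> is the candidate.
ok_first, totals_first = may_generalize(
    "to",
    {"to": 0.027, "in": 0.020, "on": 0.006},
    {"in": 0.192, "on": 0.035, "to": 0.007},
)

# Second term: first-term node <Direction> is fixed, and the root node <>
# is the candidate.
ok_second, totals_second = may_generalize(
    "to",
    {"to": 0.1},
    {"in": 0.011, "to": 0.008, "on": 0.006},
)
```

The first call rejects the generalization because "in" totals 0.212, while the second accepts it because "to" totals 0.108.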
Next, the second term is generalized. The IVs in
the node <Direction> are shown below (see Fig. 2).
(\[\],\["to"\]) : 0.1
IVs in the node <>, which is the parent node of
<Abstract>, are shown below (see Fig. 3).
(\[\],\["in"\]) : 0.011
(\[\],\["to"\]) : 0.008
(\[\],\["on"\]) : 0.006
Their totals are as follows.
(\[\],\["to"\]) : 0.1 + 0.008 = 0.108
(\[\],\["in"\]) : 0 + 0.011 = 0.011
(\[\],\["on"\]) : 0 + 0.006 = 0.006
Since (\[\],\["to"\]) has the highest importance, the
second term is generalized into the root node <>, and the
generalization stops because there are no more parent
nodes. Therefore \[<Direction>,<>\] → (\[\],\["to"\]) 0.108
is the result of first-term maximum generalization of
\[<Direction>,<Abstract>\] → (\[\],\["to"\]).

\[<>,<>\] → (\[\],\["in"\]) 0.118
\[<*X*>,<>\] → (\[\],\["on"\]) 0.306
\[<Concrete>,<Abstract>\] → (\[\],\["to"\]) 0.037
\[<Location>,<>\] → (\[\],\["to"\]) 0.108
\[<Direction>,<>\] → (\[\],\["to"\]) 0.108
Figure 5: Result of the Inter-Term Generalization
The result of inter-term generalization for all the
LTPCis in Fig. 1 is shown in Fig. 5.
3.2.6 Addition of Translation Rules
Finally, translation rules (TRis) are added to the set
of GLTPCis. TRis are descriptions in which concepts
are specified as the values of variables of Ls of LTPCis.
If the same case already exists in the set of GLTPCis,
then it is not added. If only the value of the case is
different from TRi, then it is replaced by TRi. Otherwise,
TRi is added with its IC. The ICs for TRis are
calculated in the same way as for GLTPCis.
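The three cases of this merge step can be sketched with a dictionary keyed by the source-side terms. This layout assumes that each Ls occurs at most once in the set, and that a replaced case takes the IC of the rule; neither point is stated in the text. The function name `add_rule` and the rule shown in the example are hypothetical.

```python
def add_rule(cases: dict, rule_terms: tuple, rule_value: str, rule_ic: float) -> str:
    """Merge one translation rule into `cases` (Ls -> (value, IC)).

    Returns which of the three situations applied.
    """
    existing = cases.get(rule_terms)
    if existing is not None and existing[0] == rule_value:
        return "kept"                              # same case already present
    outcome = "replaced" if existing is not None else "added"
    cases[rule_terms] = (rule_value, rule_ic)      # IC of the rule is assumed here
    return outcome


cases = {
    ("<*X*>", "<>"): ("on", 0.306),
    ("<Direction>", "<>"): ("to", 0.108),
}

r1 = add_rule(cases, ("<*X*>", "<>"), "on", 0.306)       # identical case
r2 = add_rule(cases, ("<Direction>", "<>"), "for", 0.2)  # hypothetical rule, value differs
r3 = add_rule(cases, ("<Person>", "<>"), "to", 0.05)     # hypothetical new rule
```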
4 Best-Matching Algorithm 
The TPis, the set of GLTPCis, the TPTHis, and the
thesaurus are used in best matching. The values of
variables in Vs are extracted from the input sentence by
applying pattern matching according to the description
of TPi. The best-matching process retrieves the most
similar case from the set of GLTPCis.
If Mi = 1, words which are equivalent to the word
that is a value of the variable in Vs are first searched
for in the value of the corresponding variable in Ls
of GLTPCis. If none are found, upper concepts
retrieved in either TPTHi or the thesaurus are searched
in turn. The GLTPCi which is found first is the
shortest-distance GLTPCi (SDGLTPCi). If Cs in
GLTPCi is not null, then it is also evaluated as
true or false.
If Mi > 1, the j-th term shortest-distance GLTPCis
(SDGLTPCj) of each term are searched for. If Mi =
2, SDGLTPC1 holds the shortest-distance word or
concept in the first term, and SDGLTPC2 holds the
shortest-distance word or concept in the second term.
If Mi > 1, (Mi - 1) SDGLTPCjs are obtained for each
j-th term. A total of Mi × (Mi - 1) SDGLTPCjs are
obtained. The SDGLTPCj with the highest importance
is selected as the SDGLTPC.
We will show an example of retrieving the most similar
case for "Getuyou(Monday)-ni-Huru(rain)."
Suppose the parent node of "Huru" is <Climate>.
First, SDGLTPC1 is searched for in the GLTPCis
(see Fig. 5). "Getuyou" does not exist in any first
terms in the set of GLTPCis. Therefore <*X*>, which
is the parent node of "Getuyou," is searched for and
\[<*X*>,<>\] → (\[\],\["on"\]) 0.306 is found. The second
term of this GLTPCi is an upper concept of
"Huru," so this is SDGLTPC1. Next, SDGLTPC2
is searched for and is found to be the same as
SDGLTPC1. Consequently, the most similar GLTPCi
is \[<*X*>,<>\] → (\[\],\["on"\]) 0.306, and the word "on"
is set as a preposition.
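The retrieval in this example can be sketched as a search that climbs the thesaurus one term at a time. The sketch below is an illustration under several assumptions: constraints Cs are ignored, the hierarchy above <*X*> and <Climate> is guessed, ties among candidates found at the same level are broken by IC, and the names `ancestors` and `best_match` are hypothetical.

```python
def ancestors(word: str, parent_of: dict) -> list:
    """The word followed by its upper concepts, nearest first."""
    chain = [word]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return chain


def best_match(words, cases, parents):
    """words: input word per term; cases: list of (terms, value, IC);
    parents: one child -> parent mapping per term."""
    chains = [ancestors(w, p) for w, p in zip(words, parents)]
    found = []
    for j, chain in enumerate(chains):
        for concept in chain:                      # nearest concept first
            hits = [c for c in cases
                    if c[0][j] == concept
                    and all(t in chains[k] for k, t in enumerate(c[0]))]
            if hits:                               # shortest distance for term j
                found.append(max(hits, key=lambda c: c[2]))
                break
    return max(found, key=lambda c: c[2]) if found else None


# Generalized cases of Fig. 5 (the IC of the first case is read as 0.118).
FIG5 = [
    (("<>", "<>"), "in", 0.118),
    (("<*X*>", "<>"), "on", 0.306),
    (("<Concrete>", "<Abstract>"), "to", 0.037),
    (("<Location>", "<>"), "to", 0.108),
    (("<Direction>", "<>"), "to", 0.108),
]
first_term = {"Getuyou": "<*X*>", "<*X*>": "<Time>", "<Time>": "<>"}
second_term = {"Huru": "<Climate>", "<Climate>": "<>"}

match = best_match(("Getuyou", "Huru"), FIG5, (first_term, second_term))
```

For the input "Getuyou ni Huru" the sketch returns the case with <*X*> in the first term, so the preposition selected is "on", as in the example above.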
5 Discussion 
In the CBMT approach, the linguistic model, which
is a set of translation patterns, is important both for
the compaction ratio of a case-base and for similarity
metrics. If the model is not appropriate, most
cases remain ungeneralized, and unnatural cases are
retrieved as similar cases to the input. The problems
of constructing linguistic models are the same as in
rule-based systems.
However, our approach assumes that the linguistic
model does not include controls of rules and generalized
cases. Whether or not this assumption is correct,
it is very difficult to define controls in such a way that
any exceptional cases are encapsulated properly. Our
approach provides an engineering solution to these
difficulties.
In our approach, the quality of translations depends 
on the quantity of cases rather than the quality of 
the thesaurus. Therefore, it is important to explore 
(semi-)automatic case acquisition from bilingual cor- 
pora. 
Constructing a huge case-base is easier than constructing
a well-defined thesaurus, because cases are constructed
locally without taking account of side-effects.
To define an effective thesaurus for translation, every
effective category for translation must be included, and
every intermediate category that is effective for
translation must be included in order to calculate semantic
distances properly.
If, on the other hand, thesauri can be developed
independently from the case-base, developers or users
can select the most appropriate thesaurus for the
domain.
6 Concluding Remarks 
This paper has described a framework for machine
translation using a mixture of rules and cases generalized
by means of a thesaurus. The resulting set of generalized
cases is much smaller than the case-base itself. Since the
importances of rules and generalized cases are calculated
in advance by generalization, it is not necessary to
calculate them during best matching, which is done by
exact matching of words or upper concepts in the thesaurus.
Acknowledgements 
I would like to thank Masayuki Morohashi and the 
members of Japanese Processing Group of Tokyo Re- 
search Laboratory, IBM Japan, for their valuable sug- 
gestions and encouragement, and Michael McDonald 
for his helpful advice on the wording of this paper. 
References 

\[1\] Nagao, M., "A Framework of a Mechanical Translation
between Japanese and English by Analogy Principle,"
Elithorn, A. and Banerji, R. (eds.): Artificial and
Human Intelligence, NATO, 1984.

\[2\] Sadler, V., "Working with Analogical Semantics:
Disambiguation Techniques in DLT," Foris Publications,
1989.

\[3\] Sato, S., and Nagao, M., "Toward Memory-based
Translation," Proc. of Coling '90.

\[4\] Sumita, E., Iida, H., and Kohyama, H., "Translating
with Examples: A New Approach to Machine Translation,"
Proc. of the 3rd Int. Conf. on Theoretical and
Methodological Issues in Machine Translation of Natural
Languages, 1990.

\[5\] Watanabe, H., "A Similarity-Driven Transfer System,"
Proc. of Coling '92.

\[6\] Stanfill, C. and Waltz, D., "Toward Memory-Based
Reasoning," Comm. of ACM, Vol.29, No.12, pp. 1213-
1228, 1986.
