Classifiers in Japanese-to-English Machine Translation 
Francis Bond and Kentaro Ogura and Satoru Ikehara 
NTT Communication Science Laboratories 
\]-2356 Take, Y()kosuka,-shi, t<;ma,gawa-ken, ,1 APAN 2384)3 
bond~nttkb, ntt. j p 
Abstract 
This l)a.i)cr t)rot)oses ;m mmlysis of clas- 
sifters into \['our ma,jor l;yl)(;s: t0Nl'l', 
ME'I't(I(1, (III()UI' il,lld SPI,;(',IES, \])~lS(!(1 Oil 
prot)('xl;ies ()\[ l)ol,h .la.l/a.tw, s(~ and 10,u - 
glish. Tlm nnalysis mak(!s tlossibl(~ a 
mfiform :m(l straightforward l;r(~atm(;nt 
of noun t)hras(',s h('.a(h;d t/y (:lassitio, r,~ 
in Jallmmse--1;()q~;nglish ma(:hin(~ trnnsla.- 
lion, mM has b(',el~ iinph'm('att,(',d in t,he 
MT sysl;(',m ALT-J/E. Alth(lugh the 
analysis is bas('.d (m l;tm (;}mra(;l;eri,qt;it;s 
of, and diff(;r(m(:(~s b(',tw(~en, Ja.l)anes(~ 
and English, it is shown to 1)e also at)- 
t)li(:al/l(~ to tim mn(~lated languag(~ Thai. 
1 Introduction 
N()m~ I)hras(',s in Ja.pmms(' difl'(!r from l,hos(~ in l';n- 
glish in two ilnt)orl;mll; ways. \]?irsl;, ,\]al)ml(~s(! has 
no ('quivalent synl;a(:l;i(: (;a,l,('gory l;(/ l",nglish (l(;t(',r- 
tniners. S(;(;on(\[, thtu'c is tit) grmmuatical tam'k- 
ing of tl/lllil)(*,r, t \])~(~(;~l.ll,q(; ()f tlt(~SC diIl'(w(m(:es~ ml 
m('.ri(:a.1 expr(~ssi(/ns a.r(~ r('.aliz(;d very (lifft!r('.ntly 
in .la.lm, nes(~ and English. \[n English, comltal)l(; 
nouns can \])t; directly m()dili(!d l)y a mun(!ral: 2 
dog,s. In ,la,limmse , h()w(!v('.r, nmncrals (:a,nnot, (li- 
re(:tly mo(tiily C(/llllllOtl llOllliS, inst('ml a (:lassitier 
is us(;(t, in l;he stunt; way l;h~d; ~t \])arl;il;ivc noun 
is used wil,h a,n unc(mnta.ble noun in English: 2 
pieces of fltrnit',,r'e. In addition, when .\]a.I)an(*a ~. 
is translated int(/li;nglish, tim scle(:tion ()I' a,l/pr()- 
t)ria, l;(', d(;t(,,rmin(;rs, su(:h sis re'tit;los a,n(t t)ossc.ssiv(~ 
l/rOnouns, tLtl(t l;h(', d(;1,('amin~l;i(nt (if (:()unta.t)ilit;y 
a.nd mmfl)(;r is l)r()l)l('ma.tic. 
Various s(/lutions to t;he pr(/I)h~ms ()f gen(wat- 
ing a.rti(:l(',s ;rod t)oss(',ssiv(~ 1)ronouns a.nd (t(;t('~r- 
mining (:omltal)ilil;y a.nd munbcr have be(',n I/r()- 
t)oseA (MunLta and Nagao, 1993; Cornish, Fujit;% 
and Sugimura, 1994; ll(md, Ogm'a, and Kawaoka., 
1!)95). The (tiff(~r('.n(',('.s t)cl,w(wal the way mun(',ri- 
(:M Cxlirt',ssi(/ns are realized in ,la.t)all(!s(~ a.ud En- 
glish lists 1)Cell lo, ss studi(',d (Asa.hioka, llira.knwa, 
&n(l Alll;l, tl(), 1.990). In this 1)a.t)(;r we l)roI)()s(~ m~ 
mmlysis of (:lassifiers based on lirOl)('rties of l)oth 
,lapan(~,q('. mM English. Our caLegory (If classi- 
tier includes both ,\]a.im,nesc josusM 'tmm(.q'al clas-. 
~.lal)mW~se does noI, \]ta,ve contrasting singular and 
l)hlr;fl forms of nouns. 
silicrs' mM English lmr(;itivo, nouns. W(', divide 
classifiers inl;o four m~ki(ir l,yl)c.s: UNIT, METIll(:, 
(;l/()\[Jl' ~-Llld S\['E(JII,;S. UNIT (:lassiIio.rs are I'urtlmr 
divided inl,o (II,;NEIIAI,, TYPI(?AI, ~LII(I SI'I,;(:IAI., 
whil(~ MI,YI'I/I(? classifiers are divided into h,ll,;:\StJl¢l,; 
~/,lld (R)NTAINEII. classifiers. All,hough our ~malysis 
was tmscd on l;hc characteristics of, and difl'o, rences 
lmtwt!en, .bq)ane,q(! and Euglish, we fomtd il; t;o }m 
strikingly similaa to the, annlysis for Thai 1)rol)OS(~d 
by Sornlcrtlamwmic, het al. (1994), which sugg(',sts 
I;ha.l, Lhe re,quits ma.y bc useful for e.xa.miniug ol;hcr 
la.ngllages. 
The analysis inl;rotlut;ed in this tin.per has })eun 
iml)lem(!nla',(l in NTT Commutfi(:al;i(m S(:ien(:c 
l,a.1)or;~tori('.s' .J;qm.n(~s(>t()-English nmchitm tr;ms- 
lal;i(m system ALT-J/E (lko, lm.r;~ el, al., 1!)91; 
()gm'a ut al., 1993) ,qnc(! 1994. Ex~mlt)les o1' how 
it, has l)c(~n inipl(un(!nlx~(1 in ALT-J/E m(~ w(/v(!n 
l, hrough()ut the l;(!x\[;, nlth(mgh tim analysis it;self 
is not ti(',(1 t;o any t'ornmli,qm or \])ml;it:ulm repr(> 
scnl;a,l;ion, so is ;Ma,ptat)lt: to any sysl;t;m, 
We Sl;al'l; O\[\[ l)y (!xmnining re(rot/lingual mm.ly 
ses o\[ ,laIm.no, s(~ classiticrs and English pm'l,it;ivt,, 
(!Xl)l't~SSit)llS (Sccl:ion 2). Then we introduce ore 
bilingual mmlysis o1 classifiers a,nd show how this 
a.na\]ysis can lm used in a..la.pmms(:-l:()-lC, nglish mn- 
(:hin(~ trm>lal,i(m sysI,em (Se(;l;i(m 3). We ~lls() ex- 
;Llltill(! II!.()l'(~ (',Olllpl(~,X (',aS(!S who, l'C (;lassi/i(ws are 
used liD; normal nouns (S(w, ti(m 4). Flintily we 
(:Omllarc ore mmlyMs 1,() oth(;r l)(~ol)le's (S(w.l;i()n 5). 
Thr(/ughout the, pallor we us(; th('. following al)- 
ln'(;viations: A, B (/r N: z,oun or noun t)hrase; C: 
cla~sifi(;r, X: Nmneral, with ,\]apa.n('.s('. in it;ali(:s. 
2 Monolingual Analyses of 
Classifiers 
2.1 .latmnese 'Classitiers' 
,\]a,pmw~se is a mmmral cla.ssifier language (Allan, 
1977), in which clnssiIiers m'e obligat;ory in llHLlly 
(!xl)ressions of tlUmttity. Wc will reli!r to l)roto 
l,ytfica\] .\]almnt.',qe c.lassitier,q as .')os'iishi 'lmmeric.M 
classilic.rs'. 
Synl,acl;ically, ,josushi ",nc a subclass of nouns 
(Miyazaki, Shirai, a.nd lkeha.r;t, 1995). Th, main 
t)rt)l)(!rLy (lisl,il~guishing I;hem from u()rmal nomls 
is t;hat l;hey (:an \])()sl;tix t;o mmmrals, the, (lumltiti(,x 
su %ram' or the int;errogativc nani 'whaL', Lo form 
a noun plm~se. \[Jnlike normal nouns fix ,lalm.nes(~, 
125 
jos~shi can not form grammatical noun phrases 
on their own} 
(1) 2-hiki '2 animals' (Numeral) 
(2) sg-hiki 'some animals' (Quant.) 
(3) nan-hiki 'how many animals' (Int.) 
The resulting numeral-classifier noun phrase 
can modify another noun phrase, either linked 
by no 'of' 'XC-no-N', or 'floating' elsewhere 
in the sentence, typically directly after the 
noun phrase it modifies 'NXC'. It can also oc- 
cur on its own, with anaphoric or deictic ref- 
erence. Asahioka, Hirakawa, and Amano (1990) 
identify seven different patterns of use. In order 
to concentrate on the translation of classifiers and 
number, we will restrict our discussion to noun 
phrases of the type 'XC-no-N' and not discuss 
the problems of resolving anaphoric reference and 
floating quantifiers. 
Semantically, each classifier relates to a class 
of nouns (Kuno, 1973, 25), often fairly arbitrar- 
ily. For example -hiki '(small) animal' is used to 
count small animals excluding rabbits, which are 
counted with -wa 'bird'. There is a default classi- 
fier -tsu 'piece' which can be used to count almost 
anything. 
2.2 English 'Classifiers' 
In English, numerals can directly modify count- 
able nouns 'X N'. In order to enumerate uncount- 
able nouns, either the uncountable nouns have to 
be reclassified as countable nouns, or embedded 
in a partitive construction: two beers or two cans 
of beer 'X N' or 'X C of N' (Quirk et al., 1985, 
249). This partitive construction is similar to the 
Japanese quantifying construction 'XC-no-N'. 
Quirk et al. (1985, 249-51) divide partitive 
nouns into three main categories QUALITY PAR- 
TITIVES, QUANTITY PARTITIVES, and MEASURE 
PARTITIVES. QUANTITY PARTITIVES are further 
divided into three cases, the first where the em- 
bedded noun phrase is uncountable, the second 
where it is plural, and the third where it is singular 
and countable. All the partitive nouns themselves 
are fully countable. 
QUANTITY PARTITIVES where the embedded 
noun phrase is headed by an uncountable noun, 
the first case, are then divided into GENERAL PAR- 
TITIVES such as piece which serve only to quantify 
and TYPICAL PARTITIVES such as grain which are 
more descriptive. 
2There are some examples of words that can be 
either a common noun or josftshi: for example gy5 
'line' or hako 'box', which can follow a numeral or 
stand alone. These nouns can be handled in two ways: 
(a) as a lexical class that combines the properties of 
common nouns and josftshi, or (b) as two separate 
lexical entities. ALT-J/E follows option (b), such 
nouns are entered into the lexicon twice, once as a 
common noun and once as a jos~ishi. 
3 A Bilingual Analysis of classifiers 
As there is no direct fit between English and 
Japanese, it is necessary to categorize the 
Japanese and English classifiers and to define rules 
which will enable effective machine translation. 
We divide classifiers into four major types: IIN\]T 
(Section 3.1), METRIC (Section 3.2), GROUP (Sec- 
tion 3.3) and SPECIES (Section 3.4). The main cri- 
teria for the analysis are the restrictions placed, 
in English, on the countability and number of 
the embedded noun phrase in a partitive con- 
struction. Whether a noun is a classifier, and if 
so which type, is marked in the lexicon for each 
Japanese/English noun pair. 
We distinguish between five major different 
noun countability preferences, based on the anal- 
ysis of Allan (1980), adapted for use in machine 
translation by Bond, Ogura, and lkehara (1994). 
'Fully countable' nouns, such as knife, have both 
singular and plural forms, and cannot be used 
with determiners such as much. 'Uncountable' 
nouns, such as furniture, have no plural form, and 
can be used with much. Between these two ex- 
tremes are nouns such as cake, which can be used 
in both countable and uncountable noun phrases. 
They have both singular and plural forms, and can 
also be used with much. We divide such nouns 
into two groups: 'strongly countable', those that 
are more often used to refer to discrete entities, 
such as cake, and 'weakly countable', those that 
are more often used to refer to unbounded refer- 
ents, such as beer. The fifth major type of count- 
ability preference is 'pluralia tanta': nouns that 
only have a plural form, such as scissors. 
3.1 Unit classifiers 
UNIT classifiers are the prototypical classifiers. 
A UNIT classifier will be realized in Japanese a~ a 
jos{shi. However, there are three possible transla- 
tions of a Japanese noun phrase of the form ~XC- 
no-N', where C is a unit classifier: 
Individuate: Translate as 'X N', where the clas- 
sifier C is not translated and the numeral 
directly modifies the countable English noun 
phrase: 
1-hiki-no-inu 'l-piece of dog' --+ 1 dog. 
Part: Translate as 'X C of N', where the classi- 
fier is translated by its translation equivalent 
(from the transfer dictionary) and N is un- 
countable (headed by a bare singular noun): 
1-tsubu-no-kome 'l-grain of rice' 
-+ 1 'grain of rice. 
Default: Translate as 'X C of N' where the clas- 
sifier is replaced by a default that depends 
on the embedded noun and N is uncountable. 
The default is normally piece, but this can be 
over-ridden by an explicit entry for N's de- 
fault classifier in the lexicon: 
126 
Table 1: Unit Classifiers 
Noun Type General Typical Special 
Fully Countable 
Strongly Countable 
Weakly Countable 
Uncountable 
Pluralia Tanta (pair) 
1 (log 
1 cake 
1 hair 
1 piece of information 
1 pair of scissors 
1 dog 
1 crumb of cake 
1 strand of hair 
1 grain of information 
1 pair of scissors 
1 slice of dog 
1 slice of cake 
1 slice of hair 
1 slice of information 
1-tsu-no-kagu 'l-piece of furniture' 
-~ 1 piece of furniture. 
The three types of UNIT classifier are summa- 
rized in Table 1. a 
Having established three possible translations 
of the 'XC-no-N' construction, we can proceed to 
divide UNH' classifiers into three types, depending 
on which of the above alternatives is most suit- 
able. The first, OI,'NEItAL classifiers, are those that 
have no special meaning of their own, but are used 
only to quantify the denotation of a noun. Typical 
examples are - tsu 'piece' and -ko 'piece'. If N is 
fully, strongly or weakly countable, then the clas- 
sifter is not translated (individuate). If N is un- 
countable, then the classifier is translated as the 
default (default). The second type of classifer, 
TYPICAL, consists of those classifiers which are de- 
scriptive in their own right, such as -teki 'drop'. If 
N is fully countable, then the classifier will not be 
translated (individuate), otherwise the classifier is 
translated (part). The final type of classifier, SPE- 
CIAl,, is rare: classifiers which force an uncount- 
able interpretation of even countable nouns, for 
example -kire 'slice'. N is always parted: 1-kire- 
no-inu 'l-slice of dog' -+1 slice of dog. 
The translation of classifiers is complicated by 
the fact that classifiers and their relationships 
to nouns are both arbitrary and language de- 
pendent. Consider the Japanese classifier -mai 
'sheet', which is used for counting fiat objects. 
This has no direct English equivalent. As a de- 
fault, it is entered in the dictionary as a GI.'NEI{AL 
classifier with the translation piece. There are 
however several fiat, objects for which piece is in- 
appropriate in English: food-stuffs (slice); paper, 
glass, cloth and leather (sheet); bacon (rasher); 
and financial contracts (contract). The selection 
of an appropriate translation is not dependent on 
this analysis and can be left, to the normal ma- 
chine translation process. In ALT-J/E it is done 
by examining the semantic category of the embed- 
aIf N's countability preference is pluralia tanta then 
N will never be individuatcd. If N is parted or de- 
faulted there axe two possibilities: either, if the dic- 
tionary entry for N has the default classifier pair then 
it will be used as the classifier or, if N has no default 
classifier, then a different translation is searched for 
in the dictionary and used instead. If there is no non- 
pluralia tanta translation equivalent, then the trans- 
lation will default to 'X C of N' as above, but with N 
headed by a bare plural noun. 
ded noun. Once an appropriate translation of the 
classifier has been found, knowledge of its type al- 
lows the system to decide the appropriate form of 
the final translation. 
3.2 Metric classifiers 
The next overall category is METRI() classifiers: 
A noull phrase of the form 'XC-no-N', where C is 
a METRI(; classifier will be translated as 'X C of 
N', where N will be plural if it is headed by a fully 
countable or pluralia tanta noun. We fllrther sub- 
divide METI/,IC classifiers depending on whether 
the resulting English noun phrase will have singu- 
lar verb agreement (MEASURI'; classifiers), or plu- 
ral verb agreement (CONTAINFat classifiers) as its 
default. 
(4) 2-kg-no-kami-ha jgbun da '2 kg of paper- 
TOP enough is' -~ 2 kg of paper is enough 
(5) 2-hako-no-kami-ha jubun da '2 box of 
paper-TOP enough is' -+ 2 boxes of paper" 
are enough 
In fact both (4) and (5) could be translated with 
singular or plural verb agreement. The differen- 
tiation into MEASURE and CONTAINER provides a 
graceful default. Examples are given in Table 2. 
3.3 Group classifiers 
GROUP classifiers combine with plural or uncount- 
able noun phrases to make a countable noun 
phrase representing a group or set. A noun phrase 
of the form 'XC-no-N', where C is a GROUP clas- 
sifier will be translated as 'X C of N', where N 
will be plural if it is headed by a fully or strongly 
countable noun or a pluralia tanta. Noun phrases 
of the form 'N-no-C', where C is a GROUP classi- 
fier (but not a jos~shi) will also be translated as 
'C of N' where N will be plural if it is headed by 
a fully or strongly countable noun or a pluralia 
tanta. This allows us to give a uniform treatment 
of noun phrases such as (6) and (7) during English 
generation, even though their Japanese structure 
is very different. 
(6) 2-hako-no-pen '2 box of pen' 
2 boxes of pens 'XC-no-N' 
(7) pen-no-hako 'box of pen' 
a box of pens 'N-no-C' 
127 
Table 2: Container and Measure Classifiers 
Noun Type Container Measure 
Fully Countable 
Strongly Countable 
Weakly Countable 
Uncountable 
Pluralia Tanta 
1 box of clogs 
1 box of cake 
l box of beer 
\] box of fllrniture 
I box of scissors 
l kg of ants 
1 kg of cake 
1 kg of beer 
1 kg of flHniture 
I kg of scissors 
Table 3: Group and Species Classifiers 
Noun Type Group Species (Si) Species (Pl) 
lqflly Countable 
Strongly Countable 
Weakly Countable 
Uncountable 
Pluralia Tanta 
1 set of dogs 
1 set of cakes 
l set of beer 
1 set of information 
1 set of scissors 
1. kind of dog 
1 kind of c}~ke 
1 kind of beer 
1. kind of infbrmation 
1 kind of scissors 
2 kinds of clogs 
2 kinds of cakes 
2 kinds of beer 
2 kinds of informal;ion 
2 kinds of scissors 
Whether a notln is a GIll)UP classifier or not 
carl also be used to help determine the Irtlmber 
of ascriptive and appositive noun phrases. For 
example, in ALT-J/E the countability and num- 
ber of two at)positive noun phrases are made to 
match each other, unless one element is plural 
and the other is a GI{OUP classifier. For example, 
many insects, a whole swarm, ... as opposed to 
many insects, bees I think, ... (Bond, Ogura, and 
Kawaoka, 1995). Examples of (;Rein, classifiers 
are given in Table 3. 
3.4 Speeies classifiers 
The last type of classifier is sP,,;cn,;s classifiers. 
SI'ECII:S classifiers are partitives of quality and 
(;an occur with countable or uneo,lnt&ble llOlln 
phrases. The embedded noun phr~se will agree 
in number with the head noun phrase if flflly or 
strongly countable: a kind of car, 2 kinds of cars; 
a kind of equipment, 2 kinds of equipment. Exaln- 
ples of SPE(:mS classifier's are given in Table 3. 
4 When is a Classifier a Classifier? 
In the analysis given above for Japanese noun 
phrases of the form 'XC-no-N', we have given no 
consideration 1;o the denotation of N, except for 
when choosing the at)propi'iate translation for C. 
Thus we assume that 'XC~no-N' will be translated 
as 'X C of N' or just 'X N' if N is countable, as 
in (8) or (9). 
(8) 1-pal-no mizu 'l-cup of water' 
--> 1 e?Lp of wate°f (CONTAINFI/.) 
(9) i-tsu-no koppu 'l-piece of cup' 
--+ 1 C~tp (GENEI/AI,) 
However if N is a noun that denotes an at- 
tribute, such as PRICE or WEIGIIT, then the trans- 
lation process becomes more complicated. In the 
simplest case the noun phrase 'XC-no-N' should 
be. translated as though the classifier' were a nor- 
real noun, giving 'the N of X C', for examph'. (10), (ll). 
(10) 1-pal-no nedan 'l-cup of price' 
the price of lcv, p 
(11) 1-tsu-no ncda'n \[-ha lOen da\] 'l-piece of 
t)riee \[-'cop 1(\] yen is\]' 
-> the price of 1 (thing) fis 10 yen\] 
In other words, if N has the attribute AMOUNT 
then the noun phrase should normally be trans- 
lated as though C were not a classifier. The inter- 
pretation of C is, however, ambiguous. C could 
be used as a elassiiier with the amount N in its 
scope (12), or C could have anaphoric reference 
(13). ALT-J/E chooses the interpret~tion shown 
in example (13) as its defmflt. 
(12) 1-sh'ili-no 'n, edan '1 kind of price' 
-~ 1 kind of price 
(13) 1-st\]u-no neda'n, '1 kind of price' 
-~ th.e price of .l kind/of something\] 
Further, when N is an attribute and C measures 
the same attribute, the interpretation is again dif- 
ferent. N)r exainple, if C measures N's attribute 
then the resulting noun phrase will be indefinite 
by default: a height of lore or a price of 10 yen. 
Ilowever if the noun phrase is used ascriptively 
then it; should be converted (;ither to an adjective 
it is lore high or a prel)osit;ional pin'as(; it is lO 
yen in price. Finally, if a noun phrase of this type 
is used to modify at\]other noun then it line(is to lm 
converted to an adjective a .lOre high building or 
a post modifying prepositional phrase a chocolate 
10 yen in price. 
The coml)inatio\]ls of nouns and classifiers men- 
tioned above can all be translated by the ma- 
chine translation systerit ALT-J/E using the 
analysis of classifiers presented in this paper 
ill combination with a semantic hierarchy of 
2,800 categories common to all nouns, as de- 
scribed in Ikeharaetal. (1991). The parti- 
cle no 'of', has many possible interpretations, 
Shimazu, Naito, and Nomura (1987) identify tire 
main types of A-'n.o-\]l expressions, and some 80 
128 
Ta})l(~ 4: l>rol)osed Analysis of (A tssitiers 
-- \]~xanq)h." " ~ ~ - " - .lalmnese POS Enghsh Restriction on eml)edded NP 
Unit --tsv, 'I)ie(:e' jos'ashi l)efmfll; classifi(;r if un(:ount;al)l<~ head, 
n() (:lassiti(~r if' (:ounl;al)le 
os',,slii ~l'r:mSldi:(ih:lassit~ir if ,m({0{mtal)l}{, 
n() (:la.ssiiier if (:o(ml,at)le 
.'i&~',,,~Sh.i - " q),/a;nsl;~l;e (:hl.ssiti({l:, : 
t'()r(:(', h(!a.(\[ 1,() I)e lllH:ollnl,a})l(~ 
Metric - Mc.asm'e josUshi l~lmM if l)ossi})h~7 sitlgular agr(>merll. 
C(>fil, aiiw, r n<)tfii/jo,s'&d6 Plu,'i{i if i)<issil)le, no,,md i~.gt:;:efim,i~ 
Group - mu'rc 'gr<m\[)' n(nm/josv;sh, i I'h||;-d if 1)ossil>le 
Species " " ' " ' : ' = @/w(',a,z km(l n(,un/do,s',Mv~ Nufid)er &r(',es it' l)ossil)le 
'\['able 5: A comparison of (liff(;r<;nl, analyses 
Proposed Analysis Quirk-et al Kamei et al ~ Sorldertlamvanieh et al 
Unl~t ---, ~@('J!.et'a~Lll,y:(J(~n~,,ra.~ ..... V -- -- -- 
|~P~yI>~e'al - . - . Pie(:e Unit, ; 
Si;(',<:i~ - Q uanl.|l;y/l'yl)l(.a. 1 
Metric, gMea,sm'(i M(ii~sUr( : ~7i~1, - ~+< . • Mel,ric 
\] t,(~m~mer v, )ntamer _ 
_Group Qua||t,ib,-l'hirM %el, C<)lh~<:l, iv(', 
Species Qualil;y - l(in(1 
(Unit) 7 "\]'i,ncs F,<@wi(:y 
(Unit) V<M)a,1 
Classifier type 
General 
Typical - il:s'u, bv, -'gi:ai~l c 
Sl,&:ia,f ---ki,'(: 'sli<?~ ~ 
-i'n, ch, i 'inch' 
h.(&o 'bc)x' 
sub l,yt)es. Our analysis (:ul;s across Shima.Z|l et: 
al.'s l;ypes, includillg al; leasl; t:hre<; of I;h(! su})l;yl)es, 
a,t(1 also makes (:lear some l'el~-tl;iolls l;\]lal; ill'(! II()L 
explMIJy nam(~<l. 
5 Comparisons with other 
Analyses 
We summm.'iz(~ our m~alysis of classifiers in Ta- 
ble d. Our aualysis was based mainly (m I;h<! 
I)rot)eri;ies of Lhe g<;n(',rat;<',(1 li;nglish, so ix nat- 
urally (luilx~ (:lose t,o I;he division ()f t)arl;i@v(! 
nouns propose<l by Quirk el; al. (1985). The anal- 
ysis is also (tuil;e, (:lose to those t)roi)ose(\[ by 
Kamei and Muraki (1995) for Jal)ancsc and S()III- 
leri;lmnwufi(:h el; al. (1994) f()r Thai. This sup- 
ports Allah'S (1977) ass('a'l:io,l t;hat; "diverse lan- 
guage eOlnlnunil, i(~s (:ateg/orize. 1)er(:eived phenom-- 
(',ha in similar ways". The <lit\['erenl; analyses are 
(:oml)ar(',d in Tabh', 5. 
We make th('. distinct, ion b(',l;we(m <:lassitiers 
of frequen<:y and ol;her UNIT (:lassitiers })y u,q- 
ing our general sema.nl;i(: hi<~rarchy. Sornlerl;lam- 
wmi(:h el; M.'s VEHBAI, (:lassiti(;rs "ally (:lassifier 
whi{:h is derived from a verb \[...1 /kraa(l haa 
muan/ 'five roils of pal)er'." (:an be in(:lu<h',d 
in the METRIC (:a.t;egory, ;dr;hough il; may })e t;he 
(:ase l;hal; l;hey have a diflhrenl; parl; of sl)ee(:h in 
Thai. Kamei and Muraki (1995) put UNIT (:lassi- 
tiers inl;o {;wo (:lasses: 'Counl;ing T()l;aJ Amomtl;': 
3kg of su.(tar' and ~Coull.Ling; all Al;t;ril>lll;e Yahte': 
a .spc.ed of (iOmph. '\]'his disl:in(:l;ion t)elongs I;o l;he 
inW, rt)ret, ation ol' (;he (:lassitier in (:ontx~xt, rat;h<~r 
than il;s hdw, l'(!nt prOl)erl;ies , so we fe(!l t:he dis- 
tin(:l,ion sh(mh\[ l)e made (\[llI'illg l>ro<:essint,~, as (h> 
s(:ril)ed in Se(-1,ion d, val,her t;ha.n its l)arl, oJ t,he 
analysis <)f l,he (:lassiti<'.rs t;hems(~lv<~s. 
6 Conclusion 
In this im,l)er we pres,:ull; a,n analysis off (:lassitieT~, 
suii;al)le for use ill a .\]apanese-to-ldnglish ma<:hin<~ 
I, ranslal;ion sysl,('m. We divide (:lassitiers into four 
ma,ior tyl)es: UNIT, METI/.IC~ (~l/.()Ul) /Hid SI)},',CII,;S. 
IJNIT classifiers arc, further divided inl;o (IIqNEI{.AI,~ 
TYI'ICAI, ;I,II(l SI'E(:IAI,, whil(; METI{IC (;\]assitiers 
are divided iltl;o MEASUI/.I,; all(\[ CONTAINEI{ (:\]ils- 
sifters. The analysis ix 1)ased on <:ha.ra.<:lx~risl;i(:s 
1)e<'uliar 1;o Japanese mM English, a.s well as i,h(', 
differences bel;we(m I;hem. The resuli;ing amtlysis 
ix shown {;<> }>e similar t;o ()n(~ pr()pose(t tbr Thai, 
an unrelaJ;(xl laJtguage, suggesl:iug thai, it may I)e 
more widely al)pli(:al)le. 
The az|Mysis has }>e(!n imt)lemen(x~d in NTT's 
,/at)a.nes<>lx)-English machine (,ranslal;ion syst;(;m AL'.r-J/E 
si~<:(,. 19.()4.11; makes 1)ossible a uniform 
an(l st;raigh(;forwa.rd l;re~-tt, m(:nL of n()un phrases 
headed t)y classifiers. 
Furl;h('J" work remains 1;() be done in (',xmnining 
the <listribul;ion of classifiers in differ(;nt, domains, 
and possibly identit~ying classifiers a.ul,omal;i<:a,lly. 
129 
References 
Allan, Keith. 1977. Classifiers. Language, 53:285- 
311. 
Allan, Keith. 1980. Nouns and countability. Lan- 
guage, 56(3):541-67. 
Asahioka, Yoshimi, Hideki Hirakawa, and Shin-ya 
Amano. 1990. Semantic classification and an an- 
alyzing system of Japanese numerical expressions. 
IPSJ SIG Notes 90-NL-78, 90(64):129 136, July. 
(in Japanese). 
Bond, Francis, Kentaro Ogura, and Satoru Ikehara. 
1994. Countability and number in Japanese-to- 
English machine translation. In Proceedings of the 
15th International Conference on Computational 
Linguistics (COLING 'g~), pages 32--38, August. 
(cmp-lg/9511001). 
Bond, Francis, Kentaro Ogura, and Tsukasa Kawaoka. 
1995. Noun phrase reference in Japanese-to-English 
machine translation. In Proceedings of the Sixth In- 
ternational Conference on Theoretical and Method- 
ological Issues in Machine Translation (TMI '95), 
pages 1-14, July. (cmp-lg/9601008). 
Cornish, Tim, Kimikazu Fujita, and Ryochi Sugimura. 
1994. Towards machine translation using contex- 
tual information. In Proceedings of the 15th Inter- 
national Conference on Computational Linguistics 
(COLING '9~), pages 51-56, August. 
Ikehara, Satoru, Satoshi Shirai, Akio Yokoo, and Hi- 
romi Nakaiwa. 1991. Toward an MT system with- 
out pre-editing - effects of new methods in ALT- 
J/E-. In Proceedings of MT Summit III, pages 
101 106. (cmp-lg/9510008). 
Kamei, Shin-ichiro and Kazunori Muraki. 1995. An 
analysis of NP-like quantifiers in Japanese. In Pro- 
ceedings of the Natural Language Processing Pacific 
Rim Symposium (NLPRS '95), volume 1, pages 
163 167. 
Kuno, Susumu. 1973. The Structure of the Japanese 
Language. MIT Press, Cambridge, Massachusetts, 
and London, England. 
Miyazaki, Masahiro, Satoshi Shirai, and Satoru Ike- 
hara. 1995. A Japanese syntactic category sys- 
tem based on the constructive process theory and 
its use. Journal of Natural Language Processing, 
2(3):3-25, July. (in Japanese). 
Murata, Masaki and Makoto Nagao. 1993. Determi- 
nation of referential property and number of nouns 
in Japanese sentences for machine translation into 
English. In Proceedings of the Fifth International 
Conference on Theoretical and Methodological Is- 
sues in Machine Translation (TMI '93), pages 218- 
25, July. 
Ogura, Kentaro, Akio Yokoo, Satoshi Shirai, and 
Satoru Ikehara. 1993. Japanese to English ma- 
chine translation and dictionaries. In Proceedings 
of the ~th Congress of the International Astronau- 
tical Federation, Graz, Austria. 
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, 
and Jan Svartvik. 1985. A Comprehensive Gram- 
mar of the English Language. Longman, Essex. 
Shimazu, Akira, Shozo Naito, and Hirosato Nomura. 
1987. Semantic structure analysis of Japanese noun 
phrases with adnominal particles. In Proceedings of 
the 25th Annual Meeting of the ACL, pages 123- 
130. Association for Computational Linguistics. 
Sornlertlamvanich, Virach, Wantanee Pantachat, and 
Surapant Meknavin. 1994. Classifier assignment 
by corpus-based approach. In Proceedings of the 
15th International Conference on Computational 
Linguistics (COLING '9\]t), pages 556 561, August. 
130 
