Countability and Number 
in Japanese to English Machine Translation 
Francis Bond, Kentaro Ogura, Satoru Ikehara 
NTT Communication Science Laboratories 
1-2356 Take, YokosUka-shi, K~nagawa-ken, JAPAN 238-03 
{bond,ogura,ikehara} @nttkb.ntt.jp 
Abstract 
This paper presents a heuristic method that 
uses information in the Japanese text along 
with knowledge of English countability and 
number stored in transfer dictionaries to de- 
termine the countability and number of En- 
glish.noun phrases. Incorporating this method 
into the machine translation system ALTJ/E, 
helped tO raise the percentage of noun phrases 
generated with correct use of articles and num- 
ber from 65% to 73%. 
1 Introduction 
Correctly determining number is a difficult 
problem when translating from Japanese to 
English. This is because in Japanese, noun 
phrases are not normally marked with respect 
to number. Japanese nouns have no equivalent 
to the English singular and plural forms and 
verbs do not inflect to agree with the number 
of the subject (Kuno 1973). In addition, there 
is no grammatical marking of countability, t 
In order to generate English correctly, it 
is necessary to know whether a given noun 
phrase is countable or uncountable and, if 
countable, whether it is singular or plural. 
Deciding this is a problem even for humans 
translating from Japanese to English, but they 
have their own knowledge of both languages 
t Japanese does not have obligatory plural mor- 
phemes. Plurality can be marked but only rarely is, for 
example by adding a suffix such as tachi "and others" 
(this can normally only be used with people or mfimals). 
tO draw on. A machine translation system 
needs to have this knowledge codilied in some 
way. As generating articles and number is only 
important when the rest of the sentence has 
been correctly generated, them has not been a 
lot of research devoted to it. Recently, Murata 
and Nagao (1993) have proposed a method 
of determining the referentiality property and 
number of nouns in Japanese sentences for 
machine translation into English, but tim re- 
search has not yet been extended to include 
the actual English generation. 
This paper describes a method that extracts 
information relevant to countability and num- 
ber from the Japanese text and combines it 
with knowledge about countability and num- 
ber in English. First countability in English 
is discussed at the noun phrase and then the 
noun level. As a noun phrase's countability in 
English is affected by its referential property 
(generic, referential or ascriptive) we present 
a method of determining the referential use of 
Japanese noun phrases. Next the process of 
actually determining noun phrase countability 
and number is described. This is followed 
by some examples of sentences translated by 
the proposed method and a discussion of the 
results. 
The processing described in this paper has 
been implemented in NTI" Communication 
Science Laboratories' experimental machine 
translation system ALTJ/E (\[kehara et al. 
1991). Along with new processing for the 
generation of articles, wifich is not discussed 
in detail in this paper, it improved the percent- 
32 
age of noun phrases with correctly generated 
detemfiners and number flom 65% to 73%. 
2 Countability 
2.1 Noun Phrase Countability 
We adopt the definition of countability in 
English given in Allan (1980:541-3). A 
countable noun phrase is defined as follows: 
I If the head constituent of an NP falls within 
the scope ofa denumerator it is countable. 
II If the head constituent of an NP is plural it 
is countable. 
Where "the phrase 'fiflls within tile scope 
\[or domain\] of a denumerator' means 'is de- 
numerated' by it; i.e tim NP reference is quan- 
tiffed by the denumerator as a number of 
discrete entities." 
Not all nouns in English can become the 
head of a countable noun phrase. In particu- 
lar, noun phrases whose heads fall within the 
scope of a denumerator ('denumerated' noun 
phrases) must be headed by a noun that has 
both singular and plural forms. Nouns that 
do not have both forms, like equipment or 
scissors, require a classifier to be used. The 
classifier becomes the head of a countable 
noun phrase with the original noun attached 
as the complement of a prepositional phrase 
headed by of.. a pair of scissors, a piece oJ" 
equipment. 
Whether a noun can be used to head a 
countable noun plu'ase or not depends both 
on how it is interpreted, and on its inherent 
countability preference. Noun countability 
preferences are discussed ill the next section. 
2.2 Noun Countability Preferences 
A noun's countability preference determines 
how it will behave in different environments. 
We classify nouns into seven countability 
preferences, live major and two minor, as 
described below. 
The two most basic types are 'fully cotmt- 
able' and 'uncountable'. Fully countable 
nouns, such as knife have both singular and 
plural forms, and cannot be used with deter- 
miuers such as mttclt. 2 Uncotmtable nouns, 
such as Jiirniture, have no plural form, and 
can be used with much. 
Between these two extremes there are a 
vast number of nouns, such as cake, that can 
be used in both countable and uncountable 
noun phrases. They have both singular and 
plural forms, and can also be used with much. 
Whether such nouns will be used countably or 
uncountably depends on whether their referent 
is being thought of as made up of discrete units 
or not. As it is not always possible to explicitly 
determine this when translating from Japanese 
to English, we divide these nouns into two 
g,'oups: 'strongly countable', those that are 
more often used to refer to discrete entities, 
such as cake, and 'weakly countable', those 
that are more often used to refer to unbounded 
referents, such as beer. 
The last major type of countability prefer- 
ence is 'phlralia tanta': nouns that only have a 
phn'al form, such as scissors. They can neither 
be denumerated nor modilied by much. We 
further subdivide pluralia tanta into two types, 
those that can use the classifier pair to be de~ 
numerated, such as a pair of scissors and those 
that can't, such as clothes. 'pair' pluralia tanta 
have a singular lorm when used as modifiers 
(a scissor movement). Pluralia tanta such as 
clothes, use the plural form even as modifiers 
(a clothes horse), and need a countable word 
of similar meaning to be substituted when they 
are denumerated: a garment, a suit ..... 
The two minor types are subsets of flflly 
countable and uncountable nouns respec- 
tively. Unless explicitly indicated, they will 
be treated the same as their supersets. 'Col- 
lective' nouns share all the properties of fully 
countable nouns, in addition they can have 
singular or plural verb agreement with the 
singular form of tile noun: The government 
has/trove decided. 'Semi-countable' nouns 
share the properties of uncountable nouns, ex- 
cept that they can be modiffed directly by alas; 
for example a knowledge l of.lapanese\]. 
I 
2The determiners much, 'little, a little, less and 
overmuch, can all be used for this test 
33 
Table 1: Lexical information for nouns 
Noun Countability Default Default 
English Japanese Preference Number Classifier 
-~ife houchou 
noodles men 
group mure 
cake ke-ki 
beer bi-ru 
furniture kagu 
knowledge chishiki 
scissors hasami 
clothes ifuku 
FULLY COUNTABLF~ SINGULAR -- 
FULLY COUNTABLE PLURAL -- 
(COLLECTIVE) SINGULAR -- 
STRONGLY COUNTABLE SINGULAR -- 
WEAKLY COUNTABLE SINGULAR -- 
UNCOUNTABLE SINGUI,AR piece 
(SEMI-COUNq<ABLE) SINGULAR piece 
PLURALIA TANTUM PLURAL pair 
PLURALIA TANTUM PLURAL -- 
Examples of the information about count- 
ability and number stored in the Japanese to 
English noun transfer dictionary are given in 
table 1. The information about noun count- 
ability preferences cannot Joe found in stan- 
dard dictionaries and must Ix: entered by an 
English native speaker. Some tests to help 
determine a given noun's countability prefer- 
ences are described in Bond and Ogura (1993), 
which discusses the use of noun countability 
preferences in Japanese to English machine 
translation. 
3 Determination of NP Ref- 
erentiality 
The first stage in generating the count- 
ability and number of a translated English 
noun phrase is to determine its referentiality. 
We distinguish three kinds of referentiality: 
'generic', 'referential' and 'ascriptive'. 
We call noun phrases used to make general 
statements about a class generic; for example 
Mammoths are extinct. The way generic noun 
phrases are expressed in English is described 
in Section 3.1. Referential noun phrases are 
ones that refer to some specific referent; for 
example Two dogs chase .a cat. Their num- 
ber and countability are ideally determined by 
the properties of the referent. Ascriptive noun 
phrases are used to ascribe a property to some- 
thing; for example Ilathi is an elephant. They 
normally have the same number and count- 
ability as the noun phrase whose property 
they are describing. 
2. 
3. 
if restrictively modilied then 'referential' 
my book, the man who came to dinner 
if subject of extinct, evolve ... 'generic' 
Mammoths are extinct 
if the semantic category of the subject of a 
copula is a daughter of the semantic cate- 
gory of the object then 'generic' 
Mammoths are animals 
at, Jbr ...then 4. if modified hy aimed 
' generic' 
' A magazine for women 
5. if object of like.., then 'generic' 
I like cake 
6. if complement of a copula then 'ascriptive' 
N77' is a telephone company 
7. if appositive then 'ascriptive' 
NT/, a telephone company ... 
8. default 'referential' 
Figure 1: Determination of NP refemntiality 
The process of determining the referential- 
ity of a noun phrase is shown in Figure 1. 
The tests are processed in the order shown. 
As far as possible, simple criteria that can be 
implemented using the dictionary have been 
chosen. For example, Test 4" if a NP is mod- 
ilied by aimed at,for.., then it is 'generic'" is 
34 
applied as part of translating NPl-muke into 
"for NPI". The transfer dictionary includes 
the information that in this case, NPI should 
be generic. 
"li~sts 2 a3 show two more heuristic methods 
for determining whether a noun phrase has 
generic reference. In Test 2, if the predicate 
is marked in the dictionary as one that only 
applies to classes as a whole, such as evolve 
or be extinct, then the sentence is taken to 
be generic. In "lest 3, AUI',I/I,:'s semantic 
hierarchy is used to test whether a sentence is 
generic or not. For example in Mamnloths are 
animals, mammoth has the semantic category 
ANIMAL so the sentence is judged to be stating 
a fact true of all mannnoths and is thus generic. 
3.1 Generic iioun phrases 
A generic noun phrase (with a countable head 
noun) can generally be expressed in three 
ways (Huddleston 1984). We call these GEN 
'a', where the noun phrase is indefinite: A 
mammoth is a mammal; GEN 'the', where the 
noun phrase is definite: The mammoth is a 
mammal; and GEN ~/5, where there is no arti- 
cle: Mammoths are mammals. Uncountable 
nouns and pluralia tanta can only he expressed 
by GEN </~ (eg: Furniture is expensive). They 
cannot take GEN 'a' because they cannot be 
moditied by a. They do not take GEN 'the', 
because then the noun phrase woukl normally 
be interpreted as having detinite reference. 
Nouns that can be either countable or un- 
countable also only take GEN (,: Cake is 
delicious/Cakes are delicious. These combi- 
nations are shown in Table 2, noun phrases 
that can not be used to show generic reference 
are marked *. 
Table 2: Genericness and Countahility 
Countable 
it mammofll 
\] 'the'| the mammoth 
I ...4'. / ,  ,mmott,  
Noun Countability Preference 
Both Uncounlable 
*a cake 
*the cake 
cake/cakes 
*a furniture 
*the furniture 
furniture 
The use all three kinds of generic noun 
phrases is not acceptable in some cofltexts, for 
example * a mammoth evolved. Sometimes a 
noun phrase can be ambiguous, for example 1 
like the elel~hant, where the speaker could like 
a particular elephant, or all elephants. 
Because the use of GEN (~ is acceptable 
in all Contexts, AI,T-J/E generates all generic 
noun phrases as such, that is as bare noun 
phrases. The number of the noun phrase is 
then determined by the countability preference 
of the noun phrase heading it. Fully countable 
nouns and pluralia tanta will be plural, all 
others are singular. 
4 Determination of NP 
Countability and Number 
The following discussion deals only with ref- 
erential and ascriptive noun phrases as generic 
noun phrases were discussed in Section 3.1, 
The delinilious of noun phrase countability 
given in Section 2, while useful for analyzing 
English, are not suf\[icient for translating fl'om 
Japanese to English. This is hecause in many 
cases it is impossible to tell fl'om the Japanese 
form or syntactic shape whether a translated 
noun phrase will fall within the scope of a 
denumerator or not. Japanese has no equiva- 
lent to a/an and does not distinguish between 
countable and uncountable quantifiers such 
as many/much and little/J'ew. Therefore to 
deternfine countability and generate number 
we need to use a combination of iuformatiou 
from the Japanese original sentence, and de- 
\['ault itflBrmation from the Japanese to English 
Iransfer dictionary. As much as possible, de- 
tailed intbrmation is entered in the lransier 
dictionaries to allow the translation process 
itself to be made simple. 
The process of determining a noun phrase's 
countability and ntnnl)er is shown in Figure 2. 
The process is carried out during the transfer 
stage so information is available from both 
the .lapanese original and the selected English 
translation. 
To make the task of determining countabil~ 
ity and number simpler,' we deline combina- 
tions of dil'fereut countabilities for nouns with 
35 
Table 3: Noun Phrase Countability and Number 
Noun Type 
Fully Countable 
Strongly Countable 
Weakly Countable 
Uncountable 
Pluralia Tantum 
Denumerated 
Singular t Plural a dog two clogs 
a cake 
a beer 
a piece of information 
a pair of scissors 
two cakes 
two beers 
two pieces of information 
two pairs of scissors 
Mass 
Cotmtable Uncountable 
dogs 
cakes 
beer 
in formation 
scissors 
(logs 
cake 
beer 
in format ion 
scissors 
1. 
2. 
. 
4. 
. 
if the Japanese is explicitly plural then 
countable and plural 
determine according to determiner 
one dog, all dogs 
determine according to classifier 
a slice of cake, a pile of cakes 
determine according to complement 
schools all over the country 
aseriptive NPs match their subjects 
A computer is a piece of equipment 
6. determine according to verb 
I gather flowers 
7. use default value 
(a) uncountable, weakly countable be- 
come: 
uncountable and singular 
(b) pluralia tanta become: 
countable and plural 
(c) countable and strongly countable be- 
come: 
countable and singular or plural 
according to the dictionary default 
Figure 2: Determination of English noun phrase 
Countability mid Number 
different countability preferences that we can 
use in the dictionaries. The effects of the four 
most common types on the five major noun 
countability preferences are shown in Table 3. 
Noun phrases modi fled by Japanese/English 
pairs that are translated as denumerators we 
call denumerated. For example a noun mod- 
ified by onoono-no "each" is denumerated - 
singular, while one modilied by ryouhou-no 
"both" is denumerated - plural. Uncountable 
and pluralia tantum nouns in denumerated en- 
vironments are translated as the prepositional 
complement of a classifier. A default classilier 
is stored stored in the dictionary for uncount- 
able nouns and pluralia tanta. Ascriptive noun 
phrases whose subject is countable will also 
be denumerated. 
The two 'mass '3 environments shown in Ta- 
ble 3 are used to show the countability of nouns 
that can be either countable or uncountable. 
Weakly countable nouns will only be count- 
able if used with a denumerator. Strongly 
countable nouns will be countable and plural 
in such mass - countable environments as tim 
object of collect (vt): I collect cakes, and un- 
countable and singular in mass -uncountable 
enviromnenls such as I ate too much cake. 
In fact, both I collect cake and I ate too 
many cakes are possible. As Japanese does 
not distinguish between the two the system 
must make the best choice it can, in the same 
way a human translator would have to. The 
rules have been implemented to generate the 
translation that has the widest application, for 
example generating I ale too much cake, which 
is true whether the speaker only ate part or all 
3We called these environments 'mass' because they 
both can be used to show a mass or unbounded 
interpretation. 
36 
of one cake or if they ate many cakes, rather 
than I ate too many cakes which is only true if 
the speaker ate many cakes, 
Sometimes the choice of the English trans- 
lation of a modifier will depend on the count- 
ability of the noun phrase. For example, 
kazukazu-no and takusan-no can all be trans- 
lated as "many". kazukazu-no implies that it's 
modificant is made up of discrete entities, so 
the noun phrase it modifies should be trans- 
lated as denumerated - plural, takusan-no 
does not carry this nuance so ALT-J/E will 
translate a noun phrase modified by it as mass 
- uncountable, and takusan-no as many il' the 
head is countable and much otherwise. 
Rules that translate the nouns with differ- 
ent noun countability preferences into other 
combinations of countable and uncountable 
are also possible. For example, sometimes 
even fully countable nouns can be used ira 
uncountable noun phrases. If an elephant is 
referred to not as an individual elephant but its 
a source of meat, tben it will be expressed in 
an uncountable noun phrase: I ale a slice qf 
elephant. To generate this the following rule is 
used: "nouns quantilied with the classilier kire 
"slice" will be generated as tile prepositional 
complement of slice, they will be singular 
with no ,article unless they are pluralia tanta, 
when they will be plural with no article". 
Note that countable indefinite singular noun 
phrases without a determiner will have a/an 
generated. Countable indelinite plural noun 
phrases and uncountable noun phrases may 
have some generated; a full discussion of this 
is outside the scope of this article. 
5 Experimental Results 
Tiffs processing described above has been im- 
plemented in ALT-J/E. It was tested, together 
with new processing to generate articles, on 
a specially constructed set of test sentences, 
and on a collection of newspaper articles. The 
results are summarized in Table 4. 
In the newspaper articles tested, there were 
an average of 7.0 noun phrases in each sen- 
tence. For a sentence to be judged as correct 
%ble 4: Correct Generation of Articles and 
Number 
Test Sentences \[~ewspal)er Articles 
NPs \[Sentences-Vl;qPs Sentences 
~7I-- 90o/'o 173% 12~ 
y~_ 46°/,, 165% 5% 
all the uoun phrases nmst be correct. The in- 
troduction of the proposed method improved 
the percentage of correct sentences from 5% 
to 12%. 
Some examples of translations before and 
after tile introduction of the new processing 
are given below. The translations before 
the proposed processing was implemented are 
marked O1.D, the translations produced by 
AI;I'-J/I,; using the proposed t)rocessing are 
marked NEW. 
(1) taitei-no kodomo-ha ototm-ni 
most child adult 
llill'U 
become 
OI.D: "Most children become 
an adult" 
NEW: "Most children become adults" 
In (1), the noun phrase beaded by otona 
"adult" is judged to be prescriptive, as it is tile 
complement of tile coptflar narlt "become". 
Therefore the proposed method translates it 
with the same number as the subject. 
(2) manmo.~u-ha zetumetu-shita 
mammoth died-out 
OLI): "A mammoth died out" 
NEW: "Manamoths died out" 
zetumettt "die out", is entered in the lexicon 
as a verb whose subject must be generic. 
manmosu "mammoth" is fully countable so 
the gene,ic noun phrase is translated as a bare 
plural. 
(3) tolil 3-chou, ha.~ami l-chou, 
tofu =3, scissors 1, 
i houchou 2-chou~Ea aru 
knife 2 is 
37 
38 
OLD: "There are 3 piece tofu, 
1 scissors, 
and 2 knives" 
NEW: "There are 3 pieces oftofu, 
1 pair 
of scissors and 2 knives" 
The old version recognizes that a denumer- 
ated noun phrase headed by ,an uncountable 
noun tofu requires a classifier but does not 
generate the correct structure neither does it 
generate a classifier for the pluralia tanta scis- 
sors. The version using the proposed method 
does. 
(4) sore-ha dougu da 
that equipment is 
OLD: "That is equipment" 
NEW: "That is a piece of equipment" 
As the subject o f the copula that is countable 
it's complement is judged to be denumerated 
by the proposed method. As the complement 
is headed by an uncountable noun it must be 
embedded in the prepositional complement of 
a classifier. 
There are three main problems still remain- 
ing. The first is that currently the rules for 
determining the noun phrase referentiality are 
insufficiently fine. We estimate that if refer- 
entiality could be determined 100% correctly 
then the percentage of noun phrases with cor- 
rectly generated articles and number could 
be improved to 96% in the test set we stud- 
ied. The remaining 4% require knowledge 
from outside the sentence being translated. 
The biggest problem is noun phrases requir- 
ing world knowledge that cannot be expressed 
as a dictionary default. These noun phrases 
cannot be generated correctly by the purely 
heuristic methods proposed here. The last 
problem is noun phrases whose countability 
and number can be deduced flom informa- 
tion in other sentences. We would like to 
extend our method to use this information in 
the future. 
6 Conclusion 
The quality of the English in a Japanese to 
English Machine Translation system can be 
improved by the method proposed in this pa- 
per. This method uses the information avail- 
able in the original Japanese sentence along 
with information about English countability at 
both the noun phrase and noun level that can 
be stored in Japanese to English transfer dic- 
tionaries. Incorporating this method into the 
machine translation system ALT-J/E helped 
to improve the percentage of noun phrases 
with correctly generated articles and nmnber 
from 65% to 73%. 
References 
Allan, K. (1980). Nouns and countability. 
Language 56.541-67. 
Bond, F., and K. Ogura. (1993). Determina- 
tion of whether an English noun phrase 
is Countable or not using 6 levels of lex- 
ical countability (in Japanese). In 46th 
Transactions of the Information Process- 
ing Society of Japan, 3:107-108. 
Huddleston, R. (1984). Introduction to the 
Grammar of English. Cambridge text- 
books in linguistics. Cambridge: Cam- 
bridge University Press. 
lkehara, S., S. Shirai, A. Yokoo, and 
1t. Nakaiwa. (1991). Toward an MT 
System without Pre-Editing - Effects of 
New Methods in ALT-J/E -. In Proceed- 
ings o/'MT' Summit III, 101-106. 
Kuno, S. (1973). The Structure of the 
Japanese Language. Cambrige, Mas- 
sachusetts, and London, England: MIT 
Press. 
Murata, M., and M. Nagao. (1993). Determb 
nation of referential property and number 
of nouus in Japanese sentences for ma- 
chine translation into English. In 7'hefifth 
international confef'ence on Theoretical 
and Methodological Issues in Machine 
7)'anslation, 218-25. 
