COLING 82, J. Horecký (ed.)
North-Holland Publishing Company
© Academia, 1982
COMPUTATIONAL ANALYSIS OF MANDARIN SOUNDS 
WITH REFERENCE TO THE ENGLISH LANGUAGE 
Ching Y. Suen 
Department of Computer Science 
Concordia University 
1455 de Maisonneuve West 
Montreal, Canada H3G 1M8
In the analysis of Mandarin, the author used a 
corpus composed of over 750,000 samples transcribed 
automatically from Chinese characters by the com- 
puter through the sequential application of a set 
of phonetic rules developed by the author. The 
result is a classification and rank distribution 
of all speech sounds, the phonetic properties, 
frequency distribution of symbols, phonemes, syl- 
lables, tones, and their combinations. These 
statistical properties are compared with those of 
the English language. 
INTRODUCTION 
English is widely used throughout the world while Mandarin is spoken 
in China by one quarter of the world's population. It is interest- 
ing to compare the phonetic properties of these two languages and to 
study their similarities and differences to understand the problems 
involved when the speakers of one language learn to speak the other 
language. 
The linguistic aspects of written English and Chinese, such as those 
presented below, have already been studied extensively (see Suen 
(1979 a & b)), e.g. a) the frequency of occurrence of words/ 
characters, b) the statistical distribution of letters/radicals, 
and their combinations, c) the study of syntax, semantics, prag- 
matics, and related areas, and d) translation of one language into 
another. The linguistic aspects of spoken English have also been
described in well-known literature (Dewey (1923), Carterette et al 
(1974)), e.g. a) the frequency of occurrence of syllables, b) the 
statistical distribution of phonemes and their combinations, c) the
phonetic system and symbolic representation of sounds, and d) a- 
coustical analysis, recognition and synthesis of the spoken language. 
Such investigations, however, have rarely been conducted on Mandarin,
the official language spoken in China, for many reasons. In
order to close this gap, the author has initiated several research 
projects aimed specifically at the analysis of these aspects (Suen 
(1979 b)). 
PROPERTIES OF MANDARIN 
Mandarin sounds consist of consonants, vowels, semi-vowels and diph-
thongs. A Mandarin syllable comprises 1 to 3 such constituents; the
first symbol is usually a consonant. The syllabic structure of
Mandarin is shown below. A key to the phonetic symbols used in this 
paper can be found in Suen (1979 b). 
     Syllabic Structure                  Example
 1.  Vowel                               /A/
 2.  Diphthong                           /EI/
 3.  Triphthong                          /I AU/
 4.  Vowel + Nasal                       /AN/
 5.  Diphthong + Nasal                   /I ANG/
 6.  Consonant + Vowel                   /B O/
 7.  Consonant + Diphthong               /P AU/
 8.  Consonant + Triphthong              /M I AU/
 9.  Consonant + Vowel + Nasal           /R UN/
10.  Consonant + Diphthong + Nasal       /L I ANG/
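
One way to read the ten structures listed above is as an optional initial consonant followed by one of five nucleus types. The sketch below enumerates them on that reading. The grouping into "onset" and "nucleus" and the names NUCLEI and syllabic_structures are assumptions made for illustration; the paper itself only lists the ten patterns.

```python
from itertools import product

# The five nucleus types of rows 1-5 (assumed grouping, for illustration only).
NUCLEI = ["Vowel", "Diphthong", "Triphthong", "Vowel + Nasal", "Diphthong + Nasal"]

def syllabic_structures():
    """Return the ten structures: each nucleus alone, then with a leading consonant."""
    structures = []
    for onset, nucleus in product(["", "Consonant + "], NUCLEI):
        structures.append(onset + nucleus)
    return structures

structures = syllabic_structures()
for number, name in enumerate(structures, start=1):
    print(f"{number:2d}. {name}")
```

The output follows the same order as the list of syllabic structures, with the consonant-initial patterns in positions 6 to 10.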
Associated with each Mandarin syllable is a tone which gives the mu- 
sical quality. It is normally denoted by a diacritical mark as 
shown below. A tone specifies the pitch contour of the syllable. 
There are five tones in Mandarin and they can be described as 
follows (Suen (1979 b)) : 
     Tone   Description            Pitch
 1.  --     high level             55
 2.  /      high rising            35
 3.  v      low rising             214
 4.  \      high falling to low    51
 5.  °      neutral                5
For example, the syllable /WOV/, meaning "I" in English, has a low 
rising tone. Since there are only about 400 different syllables in 
the whole Mandarin language, the tone is crucial in signifying the 
meaning of words. This property distinguishes Mandarin from
English.
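
The pitch values in the tone list can be read as sequences of pitch levels. The following sketch stores the five tones and derives the overall direction of each contour from its first and last level. It assumes that each digit is one level on a five-point scale with 5 the highest; the paper does not state this explicitly, and the names TONES, pitch_levels and direction are illustrative only.

```python
# Tone number -> (description, pitch value), copied from the tone list.
TONES = {
    1: ("high level", "55"),
    2: ("high rising", "35"),
    3: ("low rising", "214"),
    4: ("high falling to low", "51"),
    5: ("neutral", "5"),
}

def pitch_levels(tone):
    """Split a tone's pitch value into integer levels (assumed: one digit per level)."""
    _, pitch = TONES[tone]
    return [int(digit) for digit in pitch]

def direction(tone):
    """Compare the first and last pitch level of the contour."""
    levels = pitch_levels(tone)
    if levels[-1] > levels[0]:
        return "rising"
    if levels[-1] < levels[0]:
        return "falling"
    return "level"

for tone, (description, pitch) in TONES.items():
    print(tone, description, pitch, direction(tone))
```

On this reading the 3rd tone (214) dips before it rises, which a simple first-to-last comparison reports as "rising".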
COMPUTATIONAL ANALYSIS OF DATA 
In computational linguistics, it is essential to have a large collec- 
tion of data in order to derive reliable results. The help of a 
computer is indispensable. In this study, computational analysis of 
a corpus composed of more than 750,000 samples of Mandarin syllables 
was made. More details can be found in Suen (1979 b). Owing to the
limitation of space here, this paper only compares the frequency 
distribution of Mandarin and English phonemes. The distribution of 
English sounds was derived from a study conducted by Carterette and 
Jones (1974). Their phonemic frequencies were obtained from a 
transcription of 15,694 words spoken by 24 subjects. 
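
A relative percent frequency of the kind compared here can be computed by counting symbols over transcribed syllables. The sketch below is a minimal illustration, not the author's program: the function name relative_percent is hypothetical, and the four sample syllables are taken from the examples of syllabic structure rather than from the 750,000-sample corpus.

```python
from collections import Counter

def relative_percent(syllables):
    """Return each symbol's share of all symbol occurrences, in percent."""
    counts = Counter(
        symbol for syllable in syllables for symbol in syllable.split()
    )
    total = sum(counts.values())
    return {symbol: 100.0 * n / total for symbol, n in counts.items()}

# Illustrative input only: symbols within a syllable are separated by spaces.
sample = ["B O", "P AU", "M I AU", "L I ANG"]
frequencies = relative_percent(sample)
for symbol, percent in sorted(frequencies.items(), key=lambda item: -item[1]):
    print(f"{symbol:4s} {percent:6.2f}")
```

The same counting applies unchanged to syllables, tones, or symbol pairs once the input is tokenized accordingly.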
REMARKS ON MANDARIN/ENGLISH SOUNDS 
From Tables 1 and 2, one can make the following observations: 
(a) Mandarin consonants occur about 8 percentage points less often
than English consonants (46.78% against 54.73%)
(b) Semi-vowels are used about twice as often in Mandarin as in
English
(c) Vowels occur more frequently in Mandarin than in English 
(d) Chinese speakers use diphthongs more often than English
speakers
(e) Mandarin tones are not evenly distributed and the 4th tone 
occurs much more frequently than the others 
(f) Although both English and Mandarin have approximately 40 
phonemes, many Mandarin phonemes do not occur in the English 
language, especially the retroflex and dental sibilant sounds, 
and the round-lipped high front vowel, which together occur rather
frequently (about 12%) in Mandarin conversations
(g) There is considerable difference in the distribution of Mandarin 
and English diphones, triphones, etc. which affect significant- 
ly the formation of syllables in these two languages 
(h) Considerable difference also occurs between the syllabic 
structures of English and Mandarin 
(i) There are many more sound patterns in English (about one dis- 
tinct sound for one word) than in Mandarin (only about 1160 
distinct sounds in the entire language) 
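
Observations (a), (c) and (e) can be checked directly against the class totals in Tables 1 and 2. The sketch below is only an arithmetic check. The totals are those printed in the tables; the 1st and 5th tone values are read as 21.39 and 6.01, an assumption that makes the five tones sum to about 100. The names MANDARIN, ENGLISH, TONE_PERCENT and difference are illustrative.

```python
# Class totals in percent, as printed in Tables 1 and 2.
MANDARIN = {"consonants": 46.78, "vowels": 37.13}
ENGLISH = {"consonants": 54.73, "vowels": 34.10}

# Tone shares from Table 1(e); 21.39 and 6.01 are assumed readings.
TONE_PERCENT = {1: 21.39, 2: 20.40, 3: 17.75, 4: 34.46, 5: 6.01}

def difference(sound_class):
    """Mandarin minus English share, in percentage points."""
    return round(MANDARIN[sound_class] - ENGLISH[sound_class], 2)

most_frequent_tone = max(TONE_PERCENT, key=TONE_PERCENT.get)

print("consonants:", difference("consonants"))
print("vowels:", difference("vowels"))
print("most frequent tone:", most_frequent_tone)
```

The consonant difference is close to 8 percentage points, the vowel difference is positive, and the 4th tone has the largest share, in line with the three observations.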
Implications of the above on the learning of Mandarin by English 
speakers will be discussed. Their effects on computer synthesis 
and recognition of Mandarin speech will also be presented. If time 
permits, the author wishes to present his new phonetic system which 
will enable an English speaker to pronounce Mandarin sounds correctly 
and easily. Encouraging results of applying this new system in the 
learning situation will be discussed. 
ACKNOWLEDGEMENTS 
This research was supported by a grant from the Kung Chung Wo Co., 
Ltd. of Hong Kong. The encouragement of Mr. Peter K. L. Chan is
deeply appreciated. 
Table 1
Relative Percent Proportion of Phonemes in the Corpus: Classified
into Consonants, Semi-vowels, Vowels and Diphthongs.
((?) marks a symbol that is illegible in the source.)

(a) Consonants
                        Plosive
                  Unaspirated  Aspirated   Nasal     Lateral   Fricative            Total
Labial            b   1.97     p   0.40    m 1.49              f   1.03              4.89
Dental &          d   4.00     t   1.61    n 7.60    l 2.24                         15.45
  Alveolar
Guttural          g   1.91     k   0.78    ŋ 6.10              h   1.98             10.77
Palatal           (?) 2.56     (?) 1.23                        (?) 2.13              5.92
Retroflex         (?) 2.11     (?) 1.04                        (?) 3.13, (?) 0.78    7.06
Dental sibilant   dz  1.38     ts' 0.45                        s   0.86              2.69
Total                13.93         5.51     15.19      2.24        9.91             46.78

(b) Semi-vowels
[The semi-vowel frequencies are illegible in the source.]

(c) Vowels
Tongue
Position   Front                 Central                Back        Total
High       i 12.45, (?) 1.91                            u   4.43    18.79
Mid        (?) 3.15              (?) 0.40, (?) 6.19     (?) 1.93    11.67
Low                              (?) 6.67                            6.67
Total      17.51                 13.26                  6.36        37.13

(d) Diphthongs
(?) 1.69    ei 1.75    (?) 2.23    (?) 1.43    (?) 0.60    Total 7.70

(e) Tones
1st Tone   2nd Tone   3rd Tone   4th Tone   5th Tone
 21.39      20.40      17.75      34.46       6.01
Table 2
Relative Percent Frequency of Occurrence of Phonemes in English Speech.
((?) marks a symbol that is illegible in the source.)

(a) Consonants
                    Plosives                                      Fricatives
                Voiced    Unvoiced   Nasal     Lateral            Voiced    Unvoiced   Total
Labial          b  1.80   p  1.43    m 2.46                                             5.69
Labial-Dental                                                     v 1.52    f 1.42      2.94
Dental                                                            ð 2.78    θ 0.80      3.58
Alveolar        d  3.75   t  4.62    n 7.11    l 3.80, r 5.76     z 2.27    s 4.65     31.96
Palatal         dʒ 0.82   tʃ 0.44                                 ʒ 0       ʃ 0.45      1.71
Velar           g  1.23   k  2.90    ŋ 1.06                                             5.19
Glottal                   ʔ  2.03                                           h 1.63      3.66
Total              7.60     11.42     10.63        9.56             6.57      8.95     54.73

(b) Semi-Vowels
w 2.87    y 1.93    Total 4.80

(c) Vowels
Tongue
Position   Front                Central       Back                Total
High       i 3.77, I 5.11                     u 1.78, ʊ 0.47      11.13
Mid        e 1.55, ɛ 3.18       (?) 12.99     (?) 1.51            19.23
Low        æ 2.52               (?)  1.22                          3.74
Total      16.13                14.21         3.76                34.10

(d) Diphthongs
aI (bite) 3.19    aU (bout) 0.75    oI (boy) 0.09    oU (boat) 2.34    Total 6.37

REFERENCES 

Suen, C. Y., "N-gram statistics for language understanding and 
text processing," IEEE Trans. Pattern Analysis and Machine
Intelligence, (1979a), 164-172. 

Dewey, G., Relativ Frequency of English Speech Sounds, (Harvard 
University Press, Cambridge, 1923), 187 pp. 

Carterette, E. C. and Jones, M. H., Informal Speech: Alphabetic 
& Phonemic Texts with Statistical Analyses and Tables, 
(University of California Press, Berkeley, 1974), 646 pp. 

Suen, C. Y., "Computer synthesis of Mandarin," Proc. Internation- 
al Conf. on Acoustics, Speech and Signal Processing, (1976), 
698-700. 

Suen, C. Y., Computational Analysis of Mandarin, (Birkhauser 
Verlag, Basel-Stuttgart-Boston, 1979 b), 160 pp. 

Suen, C. Y., "A comparative study of Mandarin phonetic systems 
by computer," Proc. International Computer Conf., Hong Kong,(198~ 
7.3.1-7.3.15. 
