A CONSIDERATION ON THE CONCEPTS STRUCTURE AND LANGUAGE 
IN RELATION TO SELECTIONS OF TRANSLATION EQUIVALENTS OF VERBS IN MACHINE TRANSLATION SYSTEMS 
Sho Yoshida 
Department of Electronics, Kyushu University 36, 
Fukuoka 812, Japan 
ABSTRACT 
To give appropriate translation equivalents 
for target words is one of the most fundamental 
problems in machine translation systrms. 
Especially, when the MT systems handle languages 
that have completely different structures like 
Japanese and European languages as source and 
target languages. In this report, we discuss 
about the data strucutre that enables appropriate 
selections of translation equivalents for verbs 
in the target language. This structure is based 
on the concepts strucutre with associated infor- 
mation relating source and target languages. 
Discussion have been made from the standpoint of 
realizability of the structure (e.g. from the 
standpoint of easiness of data collection and 
arrangement, easiness of realization and compact- 
ness of the size of storage space). 
i. Selection of Translation Equivalent 
Selection of translation equivalent of a 
verb becomes necessary when, 
(1) the verb has multiple meanings, or 
(2) the meaning of the verb is modified under 
different contexts (though it cannot be 
thought as multiple meanigns). 
For example, those words '?~ ', '~9~;~ ', 
'~< ', '~', ' r~ ', '@~< ', ... 
are selectively used as translation equivalents of 
an English verb 'play' according as its context. 
i. play tennis : ~--- ~r~ 
2. play in the ground : ~ ~ ~ ~"C~ 
3. The children were playing ball (with each 
other) : -~/~,'--~rI~'g~t 
~. play piano : ~'T~r~( 
5. Lightning palyed across the sky as the storm 
began : ~:~ ~f~h 
In the above examples, they are not essential- 
ly due to multiple meanigns of 'play' but need to 
assign different translation euqivalents according 
as the differences of contexts in the case of 1. 
to 3., and due to multiple meanings in the cases of 
4. orS. 
A typical idea for selecting translation 
euqivalents so far is shown in the following 
example. 
Lets take a verb 'play'. If the object 
words of the verb belong to a category C play: ~ ~ obj 
we give a verb '?~ '(=do) as its appropriate 
translation equivalent. If the object words 
belong to a category CI~ : ~< , we give '~< ' 
as an appropriate translation equivalent of 
'play'. 
Thus, we categories words (in the target 
language) that are agent, object, -." of a given 
verb (in the source language) according as 
differences of its appropriate translation 
equivalents. 
In other words, these words are categorized 
according as "such expression as a verb with its 
case filled with these words be afforded in the 
target language or not", and are by no means 
categorized by their concepts (meaning) alone. 
For example, for tennis, baseball, ... E 
CPobl~: S~ =(tennis, baseball, card, ...}, trans- 
lation of 'play' are given as follows. 
play tennis : T--x~clt 
play baseball : ~ci~ 
play card : ~-- F~c?" 
To the words belonging to C play: 9~ ( = obJ 
{piano, violine, harp, -.. ), the translation 
equivalent of 'play' is given as follows. 
play piano : ~'TJ ~z~< 
play violine : ~4 ~ i) ~r~ 
pla~ harp : ~" -- / ~r ~ < 
Categories given in this way have a problem 
that not a small part of them do not coincide 
with natural categories of concepts. For example, 
members ' 7 ~ (ten/lid) ' and ' ~(baseball) ' of a 
category belong to a natural category 
of concepts ~(ball game), but ' ~--Y(card)' 
does'nt. Instead it belorEs to a conceptual 
category ~ (game in general). ~ is considered 
as a sub-category of ~ . Therefore, if we 
regard C play: ~ ~ obJ as ~ , then ~---~ (tennis), 
~-- ~" (card), 7 ~ ~ ~'--~ (football), ~7 (golf), 
--- can be members of it, but ~(go), ~;~(shogi) 
which also belong to the conceptual category ~, 
are not appropriate as members of ~obl~ : $ ~ 
('pl%y go : ~r~', 'play shogi : ~}~%~&' are 
not appropriate, instead we say 'pla~ go : ~r_~u 
_~ ', 'play shogi : ~_~._~') 
Therefore, cPla. y: $~ should be derided 
OD~ play" ~ & _~lay. ~ 
into two categories Cob j " and tobJ " @ 
The problem here is that, such division of 
categories do not necessarily coincide with 
natural division of conceptual categories. For 
167 
example, translation equivalent '~'' cannot be 
assigned to a verb 'play' when object word of it 
is ~ ~ ~ (chess), which is a game similar to ~ or 
~. Moreover, if the verb differs from 'play', 
then the corresponding structure of categories of 
nouns also differs from that of play. Thus we 
have to prepare different structure of categories 
for each verb. 
This is by no means preferable from both 
considerations of space size and realizability on 
actual data, because we have to check all the 
combinations of several ten thousands nouns with 
each verb. 
2. Concepts Structure with. Associated Information 
So we turn our standpoint and take natural 
categories of nouns (concepts) as a base and 
associate to it through case relation pairs of a 
verb and its translation equivalent. 
Let a structure of natural categories of 
nouns were given (independently of verbs). 
A part of the categories (concepts) structure 
and associated information (such as a verb and 
its translation equivalent pair through case 
relation etc.) is given in Fig.1. 
In Fig.l, verbs associated are limited to a 
few ones such as Do (obJ = musical instrument)~ 
Pla~ (obJ = musical instrument). Becsuse, from 
the definition of musical instrument :'an object 
which is played to give musical sound (such as a 
piano, a horn, etc.)", we can easily recall a 
verb 'play' as the most closely related verb in 
this ease. 
It can generally be said that the more the 
noun's relation to human becomes closer and the 
more the level of abstract of the noun becomes 
lower the numbers of verbs that are closely related 
to them ~id therefore have to associate to them 
(nouns) become large. And that the numbers of 
associated ideoms or ideom like parases become 
large, Therefore, the division of categories 
must further be done. 
The process of constructing this data 
structure is as follows. 
(1) Find a pair of verb and associated transla- 
tion equivalent (Do, Play : ~9-& ) that can 
be associated in common to a part of the 
structure of the categories as in Fig.l, and 
then find appropriate translation equivalents in 
detail at the lower level categories. 
(2) To each verb found in the process of the 
association, consults ordinary dictionary of 
translation equivalents and word usage of verbs 
and obtain the set of all the translation 
euqivalents for the verb. 
(3) Then find nouns (categories) related through 
case relation to each translation equivalent 
verb thus obtained by consulting word usage 
dictionary. Then check all the nouns belonging 
to nearby categories in the given concepts 
structure and find a nouns group to which we 
associate the translation equivalent. 
In this manner, we can find pairs of verb and 
its translation equivalent for any noun belonging 
to a given category. To summarize the advantage 
of the ls~ter method, (1) to (4) follows. 
(i) The only one natural conceptural categories 
structure should be given as the basis of this 
data structure. This categories structure is 
stable, and will not be changed basically, and 
is constructed independently from verbs. In 
other words, it is constructed indepndently 
from target language expression. 
(2) To each noun in a given conceptual category, 
,numbers of associated pairs of verb and its 
translation equivalent are generally small and 
can easily be found. 
(3) Association of the pair of verb and its trans- 
lation equivalent through case relation should 
be given to one category for which the associa- 
tion hold in common for any member of it. In 
cplay : ~ < Fig.l, a conceptual category -obJ is 
created from two categories ~ (keyboad 
musical instrument) and~ (string musical 
instrument) for this purpose. And then 
associate through case relation specific pair 
of verb and its translation equivalent to 
exceptional nouns in the category. 
(4) From (i) to (3), it follows that this data 
structure needs considerably less space and 
is more practical to construct than the former 
method.(chapter i) 
3. Concludin5 Remarks 
We proposed a data structure based on con- 
cepts structure with associated pairs of verb and 
its translation equivalent through case relations 
to enable the appropriate selections of transla- 
tion equivalents of verbs in MT systems. 
Additional information that should be 
associated to this data structure for the selec- 
tions of translation equivalents is ideoms or 
ideom like phrases. The association process is 
similar to the association process in chapter 2. 
0nly the selections of translation equiva- 
lents for English into Japanese MT have been 
discussed on the ass~nmption that the translation 
equivalents for nouns were given. 
Though the selection of translation equiva- 
lents for nouns are also important, the effect 
of application domain depeadence is so great 
that we strongly relied on that property at the 
present circumstances. 
There are cases that translation equivalents 
are determined by pairs of verbs and nouns to 
each other. So we need to study the problem of 
selection of translation equivalent also from 
this point of view. 
Reference 
(i) Sho Yoshida : Conceptual Taxonomy for Natural 
Language Processing, Computer Science & 
Technologies, Japan Annual Reviews in Electro- 
nics, Computers & Telecommunications, CH~HA 
& North-Hollg_ud, 1982. 
168 
/ ~ ~ ( :Keyboard instrument) ~ ~'T/ (:Piano) 
~~--~u~ y( : Organ) / ... /C obj Play:. < i 
~~(:String instrument) 
O (:Things) ~ (:Musical instrument) 
~ ~obj Do.Play: ~~ ~ -'<4~1)--~(:vi°line) 
J~D~° (@ W: n; : i~s<t r ume n t) Conc 
7~u-- F (:Flute) inlnglish 
~t -- ~,'m ( : Oboe ) ~/ 
C°ncept''''''''-! ~ (:Percussion inst~ume~t) 
Case ,obtDo ~Play:~/O~ .... Translation (Japanese) ~..~- ~ equivalent 
Associated verb~" F'~ (:Drum) 
l / 
Appropriate associated verb ~ 
Fig. 1 A Part of Concepts Structure with 
Associated Information 
169 
