1965 International Conference on Computational Linguistics
o,',-:,,-,~,~.L±,., UI,:-i'DS 
Wojciech Skalmowski
Department of General Linguistics
Jagellonian University
Skalmowski - I 
SUMMARY. A remarkable regularity of distribution of
Arabic verbal roots in the vocabulary is shown to
exist. The results presented suggest that similar regular
distributions of semantic units in other languages
may be found with the help of word formation rules
and vocabulary statistics. Possible applications in
approaching the problem of "true" multiple meaning
in MT are discussed.
The notion of "semantic unit" may be formulated in
several ways /1/, so that any use of this term makes its
explicit definition indispensable. The difficulties in
defining it seem to arise from the fact that, like most general
terms, it should be related to some definite theory. At present we
do not possess any sufficiently strong and general theory of the
semantics of natural languages, though important preliminary steps
in this direction have already been made /2/. For this reason most
semantic investigations of natural languages still preserve the
"artisanlike" character stressed by M. Coyaud, and all definitions
of the semantic notions remain rather tentative, as do all
the more general conclusions drawn from such investigations.
This also holds true for the present contribution, in which an
empirical fact is described and some remarks are made on its
possible applications to the problem of "true" multiple meaning.
For this paper it seems advisable to hold apart two
notions: that of the "concept" and that of the "semantic unit".
Given a generative descriptional device G /grammar/ and a
projective system S of the type proposed by Katz and Fodor
/semantics/, we can describe a semantic concept in a language L as
a set of n-tuples of symbols from G and S, ordered or partially
ordered by the relations which define the formal rules of
these systems, and having a common derivation in S. This broad
frame allows us to regard as a concept every dictionary entry -
except for the "grammatical words", which do not possess any
derivations in S - and leaves us a wide margin of freedom in
constructing arbitrary "concept-systems" with a priori
established features.
In a similar way we may describe a semantic unit
as a set of n-tuples of G-symbols, G-rules of word formation
and S-symbols, ordered or partially ordered by means of
relations which define the formal rules of these systems, and having
a common derivation in G from some G-symbol uniquely related
to some S-symbol. This allows us to relate to the notion of
a semantic unit the linguistic notions of morpheme /or more
strictly: semanteme/ and of "word family", defined in terms of
grammatical derivations.
The thesauric approach to the problem of meaning
in MT /s. e.g. 3/ pays tribute to the idea of ordering the symbols
within the concepts, but at the same time it brings to light
the problem of multiple meaning. This problem has been much
discussed already /s. e.g. 4/, but it is still far from being
solved in all its aspects. Generally speaking, the main difficulty
arises from the fact that the "concept-systems" of languages
are not isomorphic, and even if we manage to bring them closer
together there remains some amount of "looseness" within the
concepts themselves, giving rise to the problem of "true"
multiple meaning. The "contextual" multiple meaning may be resolved
- in principle, at least - by extending the notion of concepts
both in the source and in the target languages to whole sentences
or even larger utterances; this is allowed by our "broad"
treatment of this notion, which does not specify the maximal size of
the n-tuples of symbols. By this extension the inner structure
of concepts makes the relations defining the isomorphism of the
"concept-systems" more apparent; thus even such cases as the
adequate translation of the Russian ~@M£~e as the English
"changing /the order of integration/" and "varying /argument/"
are theoretically resolvable. Yet there exist instances where
the extension of the concept would have to go beyond all limits and
involve the whole language: these are cases of "stylistic"
difference in which there are no apparent reasons for choosing
one of the possible synonyms instead of the other but where the
difference is distinctly felt by competent bilingual speakers.
The problem is important for the translation of literary pieces,
especially poetry; at the present stage of MT it is still an
"academic" problem, of course, but it exists after all. It may
be best illustrated by the question whether there are "better"
and "worse" translations of nonsensical expressions, such as the
famous "furiously sleeping ideas". A negative answer would mean
that every translation is equally good, which in turn would mean
that only "meaningful" sentences are translatable; in that case
the MT problems would be "enriched" with the whole load of
philosophical questions - an embarrassing development, certainly.
Vaguely felt differences between the intrinsic
"semantic values" of different elements of language have given
rise to the notions of "size" or "content" of semantic elements
/5/ and several attempts - both to define these notions and to
furnish models of the underlying mechanism - have been made /5, 6/.
The main assumption - based on observations of Willis - was
that there existed a "natural hierarchy" of concepts in natural
languages, forming a tree or at least a lattice with some
definite statistical properties.
The present paper gives some results of an
investigation undertaken in order to test this hypothesis.
Because of the marvelous clarity of its grammatical structure
Arabic has been chosen as a "laboratory example". About 90% of
Arabic semantemes are verbal roots, with very few exceptions
consisting of three consonants C1-C2-C3; the usual dictionary
form is the 3rd pers. sg. masc. perf. of the form C1aC2aC3a, e.g.
kasara "to break" /lit. "he has broken"/. There are more than
ten different verbal stem-patterns, i.e. word formation rules,
modifying the basic meaning of the root in a specific way;
thus the stem-pattern II: C1aC2C2aC3a adds to the basic
meaning the shade of intensity, e.g. kasara "to break" - kassara
"to smash"; the stem-pattern III is conative, IV - causative,
etc.
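The way a stem-pattern acts as a word formation rule on a triliteral root can be sketched in a few lines of Python. This is only an illustration: the function name, the digit notation for the consonant slots and the restriction to patterns I and II (the only two spelled out in the text) are assumptions made for the example, not part of the investigation itself.

```python
def apply_stem_pattern(root, pattern):
    """Fill a three-consonant root into a pattern written with the digits 1-3."""
    if len(root) != 3:
        raise ValueError("expected a triliteral root C1-C2-C3")
    slots = {"1": root[0], "2": root[1], "3": root[2]}
    # Digits are replaced by the root consonants, all other characters are kept.
    return "".join(slots.get(ch, ch) for ch in pattern)


# Only the two patterns quoted in the text; the digit notation is hypothetical.
STEM_PATTERNS = {
    "I": "1a2a3a",    # basic form, C1aC2aC3a
    "II": "1a22a3a",  # intensive form, C1aC2C2aC3a
}

root = ("k", "s", "r")
for name, pattern in STEM_PATTERNS.items():
    print(name, apply_stem_pattern(root, pattern))
# I kasara
# II kassara
```

A root is then characterised by the set of patterns with which it actually occurs in the dictionary, which is the quantity counted in what follows.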
All the triliteral verbal roots in the Arabic
vocabulary have been divided into separate classes according to
their ability to form s = 1, 2, ..., n different stems; only
the number of stem-patterns was considered and further applicable
word formation rules /substantivisations, adjectivisations etc./
were disregarded, so that this classification is a very rough
approximation to the hypothetical underlying hierarchy. It has been
assumed that the number of stem-patterns defining a given class
may be approximately viewed as an exponent of the "content" or
"semantic value" of the semantic units belonging to this class
and that - if the hypothetical hierarchy was really based on
this principle - the number of roots with greater s should be
smaller than that with smaller s. Baranov's Arabic-Russian
Dictionary /7/ has been used for counting the roots and it has
been found that the relation between s /the number of
stem-patterns characterizing the given class/ and r /the number of
roots belonging to this class/ was not only inversely proportional
but also nearly functional, and that the distribution of roots in
the Arabic vocabulary may be described as a simple function
r/s/ = N/As² - Bs + C/, where N is the sum-total of roots and
A, B and C are specific constants. The goodness of fit has been
tested by the chi-square distribution and it has been found that
the differences between the empirical data and the theoretical
distribution - except for one value - do not exceed the 0.3
significance level.
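The classification step described above amounts to counting, for each root, the number s of stem-patterns it forms and then counting how many roots r fall into each class. A minimal sketch follows; the lexicon is a hypothetical toy with made-up root labels, not data taken from either dictionary, and the function name is likewise an assumption.

```python
from collections import Counter


def roots_per_class(stems_by_root):
    """Map s (number of stem-patterns of a root) to r (number of such roots)."""
    return Counter(len(set(stems)) for stems in stems_by_root.values())


# Hypothetical toy lexicon: root label -> stem-patterns listed for that root.
toy_lexicon = {
    "root_a": ["I"],
    "root_b": ["I"],
    "root_c": ["I", "II", "V"],
    "root_d": ["I", "II", "III", "IV", "VIII"],
}

r = roots_per_class(toy_lexicon)
for s in sorted(r):
    print(s, r[s])
```

Further word formation rules would simply add more labels per root; the sketch, like the investigation, counts stem-patterns only.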
In order to estimate the possible differences
between particular dictionaries - which could arise from
differences between the materials used for their compilation -
two samples of ca. 700 items each have been taken from two
different dictionaries /7, 8/ and the distribution of roots in
them compared with each other and with the over-all distribution.
All the distributions show a striking similarity, rendering
nearly identical chi-square values. x/
This result is a strong argument for the general validity of the
discussed distribution in Arabic - and this fact in its turn
speaks in favour of the existence of "natural hierarchies" of
the semantic units in general.
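For readers who wish to repeat the comparison, the chi-square value of an observed distribution against a theoretical one can be computed as below. This is a sketch of the usual Pearson statistic, on the assumption that this is the test meant; the counts in the usage lines are invented for illustration and are not the figures of the investigation.

```python
def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Invented counts, for illustration only.
observed = [50, 30, 20]
expected = [48, 32, 20]
print(round(pearson_chi_square(observed, expected), 4))  # 0.2083
```

Turning the statistic into a significance level additionally requires the chi-square distribution with the appropriate number of degrees of freedom, which is left out of this sketch.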
x/ The figures are as follows:

   s                     1    2    3    4    5    6    7    8    9      N

   r  Baranov's        988  714    ?    ?    ?    ?    ?    ?    ?   3209
      Dictionary
      theoretical      974  754  561  398  262    ?   76   26    4
      distribution
      sample           213  163  131   99    ?    ?    ?    3    1    708
      /Baranov/
      sample           229  163  117   99    ?   26   13    3    1    697
      /Wehr/

The constants for Baranov's Dictionary are: 
A = 0.004419, B = 0.082, C = 0.3812
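The theoretical distribution can be recomputed from these constants. The sketch below rests on two assumptions, both stated in the code: the function is read as r/s/ = N/As² - Bs + C/, with the linear term subtracted, and N = 3209 is taken from the N column of the table. Under this reading the computed values agree with the legible figures of the theoretical row.

```python
# Constants quoted for Baranov's Dictionary.
A, B, C = 0.004419, 0.082, 0.3812
# Assumption: sum-total of roots as read from the N column of the table.
N = 3209


def theoretical_roots(s, n=N, a=A, b=B, c=C):
    """r(s) = N * (A*s^2 - B*s + C); the minus sign is an assumed reading."""
    return n * (a * s ** 2 - b * s + c)


values = [round(theoretical_roots(s)) for s in range(1, 10)]
print(values)
# [974, 754, 561, 398, 262, 155, 76, 26, 4]
```

With the linear term added instead of subtracted, the value for s = 1 alone would already be about 1500, far from the tabulated 974, which is why the subtracted form is used here.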
It seems very probable that similar regular
distributions might be found in other languages, too - perhaps the
ensemble of the "semantic parameters" would have to be much
wider and the "trial and error" investigations would require
more time, but the whole work can be easily mechanised. The idea
of interconnections between the syntactic and semantic structures
of language is not new in structural linguistics /s. 9 and 10/
and investigations along these lines have already been conducted in
the domain of computational linguistics under the direction of
P. Garvin /11/. My suggestions go towards discovering such regular
distributions as would facilitate the task of finding stricter
correlations between the synonyms within particular concepts
on a computational basis. The underlying assumption is that
the "universes of discourse" in various languages are of about
the same "size" /whatever that may mean - but such an assumption
is tacitly made in every translation/, and that the semantic
units underlying the components of concepts are ordered
according to their "content", so that the problem of "true"
multiple meaning may in certain cases be solved by means of
matching the components of concepts of the source and target
languages on the basis of their "semantic value".
As an illustration let us consider a few equivalent
English verbs in two different translations /A. - 12, N. - 13/ of
the Koranic Sura 84, being translations of Arabic verbs derived
from roots all belonging to the same class /5 stem-patterns/,
i.e. according to our assumption having about the same
"semantic value". The "value" of the corresponding English verbs
has been tentatively estimated by the number of different
sub-entries in Chambers's 20th Century Dictionary /numbers in
brackets/:

   English /A./        Arabic      English /N./

   to split /16/       infatara    to sever /3/
   to deceive /5/      garra       to beguile /4/
   to shape /15/       sawiya      to fashion /11/
   to roast /9/        sala        to burn /30/

The applied "method" being unsystematic and ad hoc,
the example allows no generalisations, but it may illustrate our
argument that the problem of "true" multiple meaning arises in
cases of "expressive language" from the fact that even when the
concepts of source and target languages agree there is no
correlation between their respective components except for
differences between their "value", based on differences on the
paradigmatic level. Thus e.g. for the concept "applying heat on
something" two different semantic units could have been
arbitrarily chosen by the two interpreters, as they regarded the
subsets of synonyms within the concepts as unordered. My
suggestion is that these subsets might be at least partially
ordered by means of the intrinsic value of the semantic units
underlying them and that correlations between them might be
established in more objective terms of numeric measures of their
content.
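The suggested ordering of synonyms by a numeric measure of content can be sketched as follows. The two sub-entry counts are those quoted in the illustration for "applying heat on something"; the function names, the target value 12 and the assumption that source and target values lie on a common scale are hypothetical and serve only to show the mechanism.

```python
def order_by_value(synonyms):
    """Return the synonyms sorted from smallest to largest "value"."""
    return sorted(synonyms, key=lambda word: synonyms[word])


def closest_by_value(target_value, synonyms):
    """Pick the synonym whose "value" lies nearest to target_value."""
    return min(synonyms, key=lambda word: abs(synonyms[word] - target_value))


# Sub-entry counts as quoted in the illustration.
applying_heat = {"to roast": 9, "to burn": 30}

print(order_by_value(applying_heat))        # ['to roast', 'to burn']
# 12 is a made-up source-side value, assumed to be on the same scale.
print(closest_by_value(12, applying_heat))  # to roast
```

How the values of the two languages are to be brought onto a common scale is precisely the open question; the sketch takes it as given.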

References 

/1/ Coyaud M. - Quelques problèmes de construction d'un "langage
formalisé sémantique". La Traduction Automatique
1963, fasc. 2

/2/ Katz J.J., Fodor J.A. - The structure of a semantic theory.
Language 39 /2/, 1963

/3/ Sparck-Jones K. - Mechanised semantic classification. 1961
International Conference on Mach. Transl. and
Applied Language Analysis. London 1962. Vol. II

/4/ Janiotis A., Josselson H.H. - Multiple Meaning in Machine
Translation. ibid.

/5/ Herdan G. - Type-Token Mathematics, Mouton et Co. The Hague
1960

/6/ Mandelbrot B. - On the Language of Taxonomy: an Outline of a
"Thermostatistical" Theory of Systems of Categories
with Willis /Natural/ Structure.
Information Theory, ed. C. Cherry, London 1956

/7/ Baranov X.K. - Arabsko-russkij slovar' /2nd ed./, Moskwa 1958

/8/ Wehr H. - Arabisches Wörterbuch für die Schriftsprache der
Gegenwart. O. Harrassowitz, Leipzig 1952

/9/ Kurylowicz J. - Dérivation lexicale et dérivation syntaxique
/Contribution à la théorie des parties du discours/.
Bull. de la Soc. de Linguistique de Paris,
Vol. XXXVII, 1936

/10/ Kurylowicz J. - Zametki o značenii slova. Voprosy
Jazykoznanija, 1955, No 3.

/11/ Swanson D.R. - The Nature of Multiple Meaning. Proceedings
of the National Symposium on Machine Translation
/Los Angeles 1960/, ed. H.P. Edmundson

/12/ Arberry A.J. - The Koran Interpreted. Oxford Univ. Press 1964

/13/ Nicholson R.A. - A Literary History of the Arabs.
The Cambridge Univ. Press 1907
