GERT J. VAN DER STEEN
A TREATMENT OF INDEPENDENT SEMANTIC
COMPONENTS
1. A TREATMENT OF INDEPENDENT SEMANTIC COMPONENTS 
To distinguish things, we use terms which characterize them for us. 
For two balls it may be their color, for two people it may be their 
height, or their manner of speaking. To illustrate the differences in
meaning between many words, J. J. KATZ and J. A. FODOR (1963)
proposed the use of "semantic characteristics". They give an example
for the meanings of man and ball:
man -- ... -- (physical object) -- (human) -- (adult) -- (male)
ball_1 -- ... -- (social activity) -- (large) -- (assembly)
ball_2 -- ... -- (physical object)
D. BOLINGER (1965) proposed the systematizing of these character- 
istics, with hierarchic structures, so that the meaning of the word 
bachelor could be represented by a row of characteristics (fig. 1). 
bachelor
human -- male -- adult -- nonbecoming -- unmated
human -- male -- military -- hierarchic -- noble -- inferior -- dependent -- proximate -- young
human -- educand -- hierarchic -- permanent -- inferior
animal -- phocine -- hirsute -- male -- adult -- young -- nubile -- unmated
Fig. 1.
Here the meaning of a word is given by referring it to other words.
These words, in their turn, can be referred to other words. One has
the feeling that this leads to an endless chain of references.
Let us suppose that there are a number of elementary characteris- 
tics which can not be expressed in other characteristics. We shall call 
them $e_1 \ldots e_I$. The question of whether they correspond to any
existing word or expression is left open, as is the limit on $I$. We shall
represent the meaning of a word by the intensity of the presence of
specific characteristics $e_i$. If we construct a model in an $I$-dimensional
vector space with unit vectors $e_1 \ldots e_I$, we may represent the meaning
of a word or expression $W$ by the vector

$$W = w_1 e_1 + w_2 e_2 + \ldots + w_I e_I \quad \text{with } w_i \geq 0 \text{ for } i = 1 \ldots I.$$
What two words have in common in meaning is the sum of what they
have in common in each of the basis characteristics.
In our model this is, for the vectors

$$V = \sum_{i=1}^{I} v_i e_i \quad \text{and} \quad W = \sum_{i=1}^{I} w_i e_i : \qquad V \cap W = \sum_{i=1}^{I} \min(v_i, w_i)\, e_i$$
For I = 2 refer to fig. 2. 
Fig. 2. 
To determine the norm of the vectors we consider that the common
part of $V$ and $W$ is determined via their characteristics. Our
consciousness can evaluate the factors $v_i$ and $w_i$ only one by one.
Therefore we take as norm:

$$\|V\| = \sum_{i=1}^{I} v_i.$$

With this,

$$\|V \cap W\| = \sum_{i=1}^{I} \min(v_i, w_i)$$

is by definition called the measure of association between $V$ and $W$
(min is the minimum function, e.g. min(5, 7) = 5).
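The definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the two example vectors are assumptions made here, not part of the original text.

```python
def common(v, w):
    """Component-wise minimum: the part of the meaning shared by v and w."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of characteristics")
    return [min(vi, wi) for vi, wi in zip(v, w)]


def norm(v):
    """Norm used in the model: the plain sum of the (non-negative) components."""
    return sum(v)


def association(v, w):
    """Measure of association: the norm of the common part of v and w."""
    return norm(common(v, w))


# Hypothetical vectors over I = 2 characteristics.
v = [5, 2]
w = [7, 1]
print(common(v, w))       # [5, 1]
print(association(v, w))  # 6
```

With this norm the association of a word with itself equals its own norm, and the association of two words can never exceed the smaller of their norms.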
To test this model we designed two tests (G. J. VAN DER STEEN, 1971).
In the first test, individuals are asked to write down 12 words,
starting with the word bird, and to indicate the measure of the
association between each pair of them. This has to be a number
between 0 and 10: '0' for "no association", '10' for "synonym".
For an example, see table 1. For the words $W_1$ to $W_{12}$ in our
model, we use the equations:
$$\sum_{i=1}^{I} \min(w_{n_1,i}, w_{n_2,i}) = v_{n_1,n_2} \qquad (1)$$

for $1 \leq n_1 \leq N-1$,
$n_1 < n_2 \leq N$ (here $N = 12$)
wherein the numbers $v_{n_1,n_2}$ are given. From these equations the
unknowns $w_{n,i}$ ($i = 1, \ldots, I$; $n = 1, \ldots, N$) have to be solved.
At the same time, the number of characteristics $I$ has to be determined.
An upper limit for $I$ is the number of equations: in that case each
association runs over a separate characteristic.
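The upper limit can be made concrete with a small sketch. For N words there are N(N-1)/2 equations, that is 66 for N = 12. The construction below gives every nonzero association its own characteristic and checks that equations (1) are then satisfied. The function names and the six association values for four words are illustrative assumptions.

```python
def trivial_solution(assoc, n_words):
    """One characteristic per nonzero association.

    In the characteristic belonging to pair (a, b), words a and b both get
    the weight v and every other word gets 0.
    """
    vectors = [[] for _ in range(n_words)]
    for (a, b), v in sorted(assoc.items()):
        if v == 0:
            continue
        for n in range(n_words):
            vectors[n].append(v if n in (a, b) else 0)
    return vectors


def predicted(vectors, a, b):
    """Left-hand side of equation (1) for the pair (a, b)."""
    return sum(min(x, y) for x, y in zip(vectors[a], vectors[b]))


# Illustrative associations for 4 words (pairs are 0-based, n1 < n2).
assoc = {(0, 1): 3, (0, 2): 6, (0, 3): 8, (1, 2): 2, (1, 3): 4, (2, 3): 5}
vectors = trivial_solution(assoc, 4)
print(len(vectors[0]))  # 6
print(all(predicted(vectors, a, b) == v for (a, b), v in assoc.items()))  # True
```

This solution is always available but uses the largest possible number of characteristics, which is why a procedure that keeps I small is needed.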
We determine $I$ and the unknowns $w_{n,i}$ by an iteration process.
Suppose that the factors $w_{n,i}$ ($1 \leq n \leq N$; $1 \leq i < I_1$) are
determined ($I_1 = 1, 2, \ldots, I-1$). Then we may write:

$$\min(w_{n_1,I_1}, w_{n_2,I_1}) + v^{(I_1+1)}_{n_1,n_2} = v^{(I_1)}_{n_1,n_2} \qquad (1 \leq n_1 \leq N-1,\ n_1 < n_2 \leq N)$$

with $v^{(1)}_{n_1,n_2} = v_{n_1,n_2}$ and

$$v^{(I_1+1)}_{n_1,n_2} = \sum_{i=I_1+1}^{I} \min(w_{n_1,i}, w_{n_2,i}).$$
Let us denote the sum of all $v$'s in step $I_1 + 1$ by $S$, so

$$S = \sum_{n_1=1}^{N-1} \sum_{n_2=n_1+1}^{N} v^{(I_1+1)}_{n_1,n_2}$$
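A minimal sketch of one such step: given the residual associations of step I1 and a candidate set of weights on characteristic I1, it computes the residuals of step I1 + 1 and their sum S. The function name, the dictionary layout and the candidate weights are assumptions made for illustration.

```python
def residual_step(residual, w):
    """Subtract the contribution of one characteristic from every residual.

    residual: dict mapping a word pair (n1, n2), n1 < n2, to v^(I1)
    w:        list with the weight w[n] of each word on characteristic I1
    Returns the dict of v^(I1+1) values and their sum S.
    """
    nxt = {}
    for (n1, n2), v in residual.items():
        r = v - min(w[n1], w[n2])
        if r < 0:
            # The weights would explain more association than was given.
            raise ValueError(f"weights overshoot the association of pair {(n1, n2)}")
        nxt[(n1, n2)] = r
    return nxt, sum(nxt.values())


# Illustrative associations for 4 words and hypothetical weights.
assoc = {(0, 1): 3, (0, 2): 6, (0, 3): 8, (1, 2): 2, (1, 3): 4, (2, 3): 5}
nxt, S = residual_step(assoc, [8, 3, 2, 8])
print(nxt)  # {(0, 1): 0, (0, 2): 4, (0, 3): 0, (1, 2): 0, (1, 3): 1, (2, 3): 3}
print(S)    # 8
```

Comparing S for different candidate weight sets is one way to see which choice leaves the least association still to be explained by later characteristics.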
To minimize $I$ we try to solve the system with $S$ as small as
possible. By successively assuming that a specific $w_{n,I_1}$ is the
smallest of all $w_{n,I_1}$'s we can determine the sum $S$ for each of
these suppositions. We now choose the $w_{n,I_1}$ which belongs to the
smallest sum $S$. If several sums $S$ have this value, more refined
criteria are available. Suppose this is $w_{n_1,I_1}$. In the equation

$$\min(w_{n_1,I_1}, w_{n_2,I_1}) + v^{(I_1+1)}_{n_1,n_2} = v^{(I_1)}_{n_1,n_2}$$

we then choose $w_{n_1,I_1} = v^{(I_1)}_{n_1,n_2}$. With this,
$v^{(I_1+1)}_{n_1,n_2} = 0$. $w_{n_2,I_1}$ will be determined later.
In all equations in which $w_{n_1,I_1}$ appears, $v^{(I_1+1)}_{n_1,n_2}$ can now
be determined. To the remaining equations we apply the same process
until at last one equation remains, for instance

$$\min(w_{n_3,I_1}, w_{n_4,I_1}) + v^{(I_1+1)}_{n_3,n_4} = v^{(I_1)}_{n_3,n_4}$$
Here we choose $w_{n_3,I_1} = w_{n_4,I_1} = v^{(I_1)}_{n_3,n_4}$. This ends our
iteration step for $I_1$. When all $v^{(I_1+1)}_{n_i,n_j} = 0$ then $I = I_1$ and
the whole iteration process has come to an end.
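The overall shape of the iteration (determine the weights of one characteristic, subtract its contribution, repeat until every residual is 0) can be sketched as follows. Note the assumption: the weight-selection rule used here is a simplified greedy stand-in chosen for brevity, not the selection by smallest sum S described above, so it need not produce the same vectors or the same I. All names are hypothetical, and every pair of words is assumed to have a given non-negative association.

```python
def decompose(assoc, n_words):
    """Peel off one characteristic per pass until all residuals are 0.

    assoc maps every pair (n1, n2) with n1 < n2 to a non-negative association.
    Returns one weight vector per word.
    """
    residual = dict(assoc)
    vectors = [[] for _ in range(n_words)]

    def r(a, b):
        return residual[(min(a, b), max(a, b))]

    while any(v > 0 for v in residual.values()):
        w = []
        for n in range(n_words):
            # A word never needs more weight than its largest residual.
            limit = max((r(n, m) for m in range(n_words) if m != n), default=0)
            # Keep min(w[n], w[m]) <= residual of (n, m) for the words
            # whose weight on this characteristic is already fixed.
            for m, wm in enumerate(w):
                if wm > r(n, m):
                    limit = min(limit, r(n, m))
            w.append(limit)
        for pair in list(residual):
            residual[pair] -= min(w[pair[0]], w[pair[1]])
        for n in range(n_words):
            vectors[n].append(w[n])
    return vectors


# Illustrative associations for 4 words (pairs are 0-based).
assoc = {(0, 1): 3, (0, 2): 6, (0, 3): 8, (1, 2): 2, (1, 3): 4, (2, 3): 5}
vectors = decompose(assoc, 4)
for n, vec in enumerate(vectors):
    print(n, vec)
```

Each pass of this stand-in rule reduces at least one positive residual to exactly 0, so the loop ends after at most as many passes as there are nonzero associations.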
The process is illustrated in table 2 for 4 words with associations
$v_{1,2} = 3$, $v_{1,3} = 6$, $v_{1,4} = 8$, $v_{2,3} = 2$, $v_{2,4} = 4$ and
$v_{3,4} = 5$ (randomly chosen). By $s_i$ we denote the sum which belongs
to a $w_i$ that appears in a line with the lowest $v$.
With some small modifications this solution scheme can also be used
if some equations have been deleted at the start; in other words,
when some associations are not given. This is illustrated in table 3
with the same $v$'s as in table 2, except for $v_{1,3}$, which is omitted.
The small modification concerns the calculation of $s_i$: we divide $s_i$
by the number of lines in which $w_i$ appears, minus 1.
We now test our model by omitting some of the associations given by
one individual. Using the foregoing method we determine the number
of characteristics $I$ and the vector representations of the words.
From them we calculate the omitted associations with the aid of
formula (1). For the discrepancies between the predicted and the
omitted associations we can then compute a statistical estimate.
If the associations are given at random, the mean and the standard
deviation of the discrepancies do indeed agree with their calculated
values. If the associations are given by test persons, these numbers
are significantly lower.
There are interesting discrepancies if associations are left out which 
express an extra aspect of meaning. In a specific case the words bird, 
leg, table and chair were given among others. If the association between 
leg and bird was left out an association of 0 was predicted, as it should 
be. The number of characteristics was decreased by one. 
The evaluation of associations between given words by test persons
is subjective. This, however, plays no role here: what matters are the
relations between a number of contents of consciousness. If the test
individual is not consistent in his evaluations, the discrepancies
between the given and the predicted associations become greater. In
practice there seems to be a good correlation between consistency in
evaluation and the intelligence level of the test person.
On a more reliable level are the extended observations of the lan- 
guage expressions of a test-individual. For this purpose a second test 
was designed. As source material 20 pages of the novel De verliezers 
of Anna Blaman were taken. With the aid of the programming language 
SNOBOL a frequency table was made for all words in that piece of the
text. From them 20 words with a high frequency were chosen which
are relevant with regard to each other. Then it was determined how
many times each of the 20 words was found together with each of the
other words in the same sentence. This number was taken as a measure
of the association between the words. If we leave out the
0-associations and predict them by the method of the first test,
the model is not contradicted if it predicts the value 0.
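The counting step of this second test can be sketched as follows. This is an illustration only: the original work used SNOBOL, the sentence splitting rule here is a crude assumption, a sentence is counted once per pair however often the words occur in it, and the sample text and word list are invented.

```python
import re
from itertools import combinations


def cooccurrence(text, words):
    """Count, for each pair of chosen words, the sentences containing both."""
    chosen = set(words)
    counts = {pair: 0 for pair in combinations(sorted(chosen), 2)}
    # Crude sentence split on '.', '!' and '?' (an assumption).
    for sentence in re.split(r"[.!?]+", text.lower()):
        present = set(re.findall(r"\w+", sentence)) & chosen
        for pair in combinations(sorted(present), 2):
            counts[pair] += 1
    return counts


# Hypothetical sample text; the chosen words must be given in lower case.
sample = "My hand is cold. Your hand is warm. My house is your house."
counts = cooccurrence(sample, ["hand", "house", "my", "your"])
for pair, n in counts.items():
    print(pair, n)
```

Pairs whose count is 0, such as hand and house in this sample, are the 0-associations that are left out and then predicted.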
Indeed, 0's are predicted, except for associations between nouns and
the words mine and your, which give values of 2 and 3, and for some
randomly distributed exceptions. The method of predicting
0-associations was chosen to avoid depending on the rather crude
measure of association. This measure was used because of the absence
of a well-defined method to detect coherent subphrases. If two words
never occur together in a phrase they will certainly never appear in
the same subphrase.
It will be interesting to try this model on the common usage of 
languages. The obtained vector-representations can be transferred to 
other characteristic systems by means of matrix-manipulations. A fur- 
ther extension lies in the determination of the representation of words 
in different natural languages with a mutual comparison and eventual 
transformation of the representations. The measure of association is 
critical here. This should be refined by using more knowledge about 
the syntactic structure of the sentences. 
A further restriction lies in the number of developed characteristics. 
For the first test this was approximately 13, for the second approximately 
76. From these 76, the first 20 were the most relevant. The remaining 
characteristics served more to compensate for several small
discrepancies. With a larger amount of language information the
number of relevant characteristics will naturally increase. 1
1 Programming languages used: PL/I, FORTRAN and SNOBOL.
Machines used: IBM 360/65 and PDP-9.
A FORTRAN-II program for determining the semantic components from given
associations is available upon request.

REFERENCES 

D. BOLINGER, The atomization of meaning, in «Language», XLI (1965) 4,
pp. 555-573.

J. J. KATZ, J. A. FODOR, The structure of a semantic theory, in
«Language», XXXIX (1963) 2, pp. 170-210.

G. J. VAN DER STEEN, Semantic processes in artificial intelligence
systems (in Dutch), Delft, 1971.
