1965 International Conference on 
Computational Linguistics 
MODELS OF LEXICAL DECAY 
D. Kleinecke 
TECHNICAL MILITARY PLANNING OPERATION 
GENERAL ELECTRIC COMPANY 
735 State Street (P. O. Drawer QQ) 
SANTA BARBARA, CALIFORNIA 93102
Abstract 
Lexical decay is the phenomenon underlying the dating tech-
niques known as "glottochronology" and "lexicostatistics." Much of
the controversial nature of work in this field is the result of extremely
imprecise foundations and lack of attention to the underlying statistical
and semantic models.
A satisfactory semantic model can be found in the concept of se- 
mantic atom. Notwithstanding a number of philosophical objections, 
the semantic atom is an operationally feasible support for a lexicon 
which is a semantic subset of all possible meanings and at the same 
time, exhausts the vocabulary of a language. Lexical decay is the 
process by which the lexical item covering an atom is replaced by 
another lexical item. 
Exponential lexical preservation is, in this model, directly 
analogous to decay phenomena in nuclear physics. Consistency re- 
quires that the decay process involved in exponentially preserved 
vocabularies be a Poisson process. This shows how to form test 
vocabularies for dating and proves that presently used vocabularies 
are not correctly formed. 
Dialectation studies show that historically diverging populations 
must be modelled by correlated Poisson processes. Definitive sta- 
tistical treatment of these questions is not possible at this time, but 
much desirable research can be indicated. 
Introduction 
This paper is an attempt to establish the method of dating by 
lexical decay upon an adequate theoretical foundation. The method 
discussed is that invented by Swadesh (1) over a decade ago and 
usually known as glottochronology or lexicostatistics. In the inter- 
vening years it has been widely applied, but often to the accompa- 
niment of much confusion and controversy. It seems that much of
the confusion can be removed by a rigorous treatment of the phenom-
enological model and careful application of statistics. The controversy
can be removed only by the completion of a sufficient number of
supporting studies. Rigorous formulation permits us to pinpoint
what studies are needed and what conclusions are being sought. 
Granting (as not everyone seems willing to do) that the basic 
fact of "uniform" lexical decay occurs, the problem to be attacked 
is that of correctly formulating models for lexical decay and of 
correctly deriving statistical consequences from these models. In 
what follows, we will construct a set of models which seem to fit 
the needs of the method of dating by lexical decay. Our approach
is strictly pragmatic, that is, we construct the model we need with- 
out concerning ourselves about its a priori reasonableness. Later 
we try to assemble some arguments which justify the model. In no 
sense is this an approach from first principles.
The analogy between lexical decay and the decay phenomena 
of nuclear physics has been often noted and dismissed. In the pre- 
sent paper, we insist that this analogy is much more than an analogy; 
it is, on the first level, an identity. The only alternative to this 
hypothesis seems to be a kind of mystic faith that the decay occurs 
but without palpable manipulable principles. The burden of the proof 
that the identity is false lies with the doubter and we will make no 
further demonstration of its validity. 
Decay phenomena in nuclear physics are governed by relatively 
simple, well understood principles. To apply these results to lexical 
decay we first establish the concepts of a semantic atom and a set of 
independent semantic atoms. The observed fact of exponential decay 
of vocabulary then is accounted for by assuming that the lexical item 
covering an atom decays according to a Poisson process. Generally
speaking, the converse of this is also true, and only a Poisson process
would produce exponential decay. From these considerations, we 
can draw many conclusions about how to and how not to construct test 
vocabularies for dating purposes. 
With this model in hand, we can draw conclusions of a statistical 
nature. For example, we can develop formulas for the proper method 
of dating the split between three or more languages and for good esti- 
mators in more complex situations. 
We can construct an imprecise heuristic model for the dynamic
semantics underlying the Poisson process. So long as the first order 
theory is adequate, this is much in the nature of a curiosity. It seems, 
however, that first order theory is not adequate. Actually, such a 
conclusion is really premature because the kind of verification studies 
needed have not been made. Assuming the pessimistic conclusion, we 
have to construct second (or higher) order theories to account for the 
inadequacies of first order theory. At the moment, we have no useful 
results in this direction--the problem merges into the problem of 
dialectation. Probably the most important service we can render is 
to indicate exactly what kind of detailed studies are needed. 
Semantic Atoms 
It is very easy to raise objections of a philosophical nature to 
the concept of a semantic atom. In this paper we will simply ignore 
these objections and define semantic atom in an operational way. 
There are also operational difficulties, but these seem to be sur- 
mountable. 
A semantic atom is a completely defined unit concept suffi-
ciently specified to remove all ambiguity. For example, in an
anthropological context, we might have "sun, as pointed at by a male 
anthropologist at high noon in the middle of summer on an average 
day among a group of young men with plenty to drink". The kind of 
subtleties needed to complete the definition reminds one of Korzybskian
General Semantics, but the intention is not the same. We seek to 
remove ambiguity but we must have a non-unique concept--one that is 
always present. 
Certainly there has been little use of semantic atoms anywhere 
in the past. Those interested in semantics for its own sake will reject
them as useless or meaningless; lexicographers deal in more generalized
concepts. It would be hard to argue that they have general utility, but 
they are precisely what is needed for studying lexical decay. 
Each semantic atom, in any speech at any time, is assumed to 
be covered by some lexical item. That is, there is some word whose 
meaning includes that of the atom. Thus, vocabularies can be formed 
over any set of semantic atoms by listing the covering lexical item 
for each atom. The kind of decay being studied is that where the cov- 
ering lexical item is replaced by another item. The replaced word 
only rarely immediately disappears from the language as a whole, but 
it has disappeared from the semantic atom. 
An independent set of semantic atoms is a set of atoms all of 
which differ among themselves enough to make the decay at any atom 
completely independent of that at any other atom. Thus, only one from 
sets of words, like numerals or pronouns, with habitual interrelations 
can appear in the set. Independent sets are useful because in 
them, problems of inter-atom correlations need not be considered. 
Before passing on, we should say a few words as to the prac- 
tical use of semantic atoms. There does not seem to be any doubt 
that the collectors of vocabularies want to work with semantic atoms-- 
even if their results are completely unsuccessful. In an entry "dog =
Hund" they would like to say that there is a semantic atom and its cover
in English is "dog", in German, "Hund". The pitfalls of this sort of
thing are well-known. Some care in defining atoms might make it 
feasible if we require not complete identity of the English and German 
semantics, but rather the existence of some concrete concept where 
both the English and German words are appropriate. Clearly this much 
weaker requirement will be easier to satisfy, so we adopt it. 
We conclude that, with adequate precautions, semantic atoms 
can be operationally feasible even if true rigor is impossible. In the 
case of little-known languages, there is much more chance for error. 
We should encourage collectors of vocabularies to improve the pre- 
cision of their definitions so that the atom in question can be identified. 
Decay Process 
We assume that lexical decay, for a set of independent semantic 
atoms, is a Poisson process. That is, it satisfies three conditions:

1. Each atom decays independently of all the other atoms.

2. Each atom decays independently of its history of earlier
decay.

3. There is a constant λ such that for each atom the prob-
ability of one decay in a short time interval Δt is λΔt,
and the probability of more than one decay is negligible.
It is rather easy to deduce that for longer time intervals t,
the probability of not decaying is exp(-λt), and if there are N atoms,
the expected number of undecayed atoms after time t is N exp(-λt).
This formula is the usual formula for lexical decay. It should 
be pointed out that it was tested, statistically, in the first publication 
by Swadesh, and it failed to pass. The difficulty is probably due to 
the word list used which is not an independent set of atoms. 
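The exponential law is easy to check by simulation. The sketch below (Python; the rate λ = 1/5000 per year is the figure discussed in the text, and the function name is our own) draws exponential waiting times for independent atoms and compares the surviving fraction with exp(-λt):

```python
import math
import random

def simulate_survivors(n_atoms, lam, t, rng):
    """Simulate independent Poisson decay: an atom's original cover
    survives to time t iff its first decay falls after t."""
    survivors = 0
    for _ in range(n_atoms):
        # the waiting time to the first decay is exponential with rate lam
        if rng.expovariate(lam) > t:
            survivors += 1
    return survivors

rng = random.Random(42)
lam = 1 / 5000          # one decay per 5000 years per atom
t = 2000                # years elapsed
n = 100_000             # large N so the sample fraction is stable
frac = simulate_survivors(n, lam, t, rng) / n
print(frac, math.exp(-lam * t))  # both close to 0.670
```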
If we examine the assumptions made so far, we see that any 
list of semantic atoms can be used if they are: (1) independent; and 
(2) assured of existence throughout the time in question. There is no 
satisfactory a priori basis for assuming that some kinds of semantic 
atoms decay at different rates than other kinds, and it is doubtful if 
enough historical evidence can be collected to make such a conclusion 
statistically significant.
The question whether λ is a universal constant, a constant
within any one language but possibly differing between languages, or
a variable, is easier to discuss. So far, indications are that λ is
about equal to 1/5000 per year. Now this means that over the span of
most historic evidence, exp(-λt) will be greater than about 0.60.
There is a great deal of scatter to be expected in the results because
N exp(-λt) is an expectation, not an exact prediction.
There have been a number of studies of the exponent of expo- 
nential decay. All of them are too superficial to be conclusive (2).
An adequate study in any one language would have to meet several 
criteria which make it into a major research effort. A set of inde- 
pendent semantic atoms must be selected--selected prior to detailed 
study--and no atoms, however difficult, dropped without complete 
explanations (3). Then the history of each atom must be traced through 
the historical record to locate the lexical item covering the atom at 
each point in time. In reporting the study, all of this should be fully 
documented in detail. Each instance of decay can then be recognized 
and tallied. Statistical tests should be applied to see whether or not 
the model is satisfied and to estimate λ. For example, if there are
100 semantic items, there should be about one decay every 50 years
uniformly spread through time. These things can be checked statis-
tically. We hope that scholars will undertake definitive studies of this 
type for as many cases as possible (4). 
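The "one decay every 50 years" figure follows from pooling: 100 independent atoms, each decaying at an assumed λ = 1/5000 per year, produce events at a combined rate of 100/5000 = 0.02 per year, i.e. a mean spacing of 50 years. A quick simulation sketch:

```python
import random

rng = random.Random(7)
lam = 1 / 5000               # per-atom decay rate (per year)
n_atoms = 100
pooled_rate = n_atoms * lam  # 0.02 decays per year across the whole list

# intervals between successive decays in the pooled Poisson process
intervals = [rng.expovariate(pooled_rate) for _ in range(20_000)]
mean_gap = sum(intervals) / len(intervals)
print(round(mean_gap, 1))    # close to 50 years
```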
Until the results of the kind of research just mentioned are avail- 
able, the status of λ is unsure. We anticipate it will be recognized
as a universal constant. 
There remains the problem of making a Poisson process a rea- 
sonable assumption. In other words, we need to describe some sort 
of mechanism which makes words slip off semantic atoms independently 
of how long they have been covering the atom, and at a constant rate 
per unit time, at least over short time intervals. Incidentally, 
since λ is on the order of 1/5000 per year, 50 years is a short time
interval. Since the speakers of normal languages are not historians, 
the independence from history seems easy to accept. 
The constant rate is harder to accept. First of all we have to 
account for an identical figure in populations, literate and illiterate, 
and between a handful of speakers and half a billion speakers. The 
decay effect must be independent of the number of speakers, hence it 
must be operative at the level of the single isolated speaker. This is 
satisfactory since, by and large, the amount of speech reaching an 
individual does not seem to have changed much throughout history and 
does not vary much between cultures at the present day. 
But why does a speaker decide to change an occasional lexical
item--about one in a hundred in his lifetime--and maintain the rest?
hypothesis we have been able to construct is that all words are always 
under pressure--perhaps from several semantic "directions" at the 
same time. Most atoms resist change most of the time, but some set 
of accidents (all very real events at the sociological and psychological 
levels, but random accidents in our context) weakens a few, and the 
lexicon decays. In other words, there is a constant dynamic move- 
ment among secondary and incidental covers of the semantic atom 
which threaten the principal cover. Usually the threatening lexical 
items recede, but occasionally, in a random way, about once every 
five thousand years the principal cover is displaced and a lexical de- 
cay occurs. 
The hypothetical mechanism advanced to explain lexical decay 
can be checked against history by case studies of semantic atoms. 
Each atom should show time periods when the principal word was 
nearly displaced. During these periods it is difficult to decide whether 
the old word or a new word is the principal cover. Usually the new 
word will pass away again, but sometimes it will displace the old word. 
A very tentative guess based on a casual examination of one hundred 
current English words suggests there are about four very heavily 
threatened words per hundred. Since we can expect about one word 
to be decaying at this moment, we conclude that about three out of 
four times the old word survives. All of this needs to be verified or 
disproven in detailed studies. 
Decay Statistics 
The statistical consequences of the model--the first order model 
described above--need to be explored. We cannot handle all possible 
situations, but the following examples should provide an adequate dem- 
onstration of technique so that any other problems which occur can be 
solved in the same manner. 
First, let us consider N languages deviating independently from 
a common parent which is not known to us. The following discussion 
is a bit more cumbersome than some alternative approaches, but it 
generalizes more easily. 
Let α be any set of the N languages and let P(α) be the pro-
bability that the given semantic atom is covered by the original lexical
item in exactly the languages of the set α. New covering words are
assumed to be different in each of the innovating languages. P(α) is
a function of time and satisfies the following differential equation:

    dP(α)/dt = - Σ_{i∈α} λ P(α) + Σ_{j∉α} λ P(α∪{j}),

where i and j are languages, ∈ and ∉ mean "belongs to" and
"does not belong to" respectively, and α∪{j} is the union of α and the
set containing only the language j.

Let |α| denote the number of languages in α. If |α| = N, the
equation is easy to solve:

    P(α) = exp(-Nλt),    |α| = N.

If a few cases--|α| = N-1, |α| = N-2, etc.--are solved, we are
led to hypothesize that

    P(α) = exp(-λt)^|α| (1 - exp(-λt))^(N-|α|).

This can be proven by induction on |α| from |α| = N downward since
|α∪{j}| = |α| + 1. Then
    dP(α)/dt = -|α|λP(α) + λ(N-|α|) exp(-λt)^(|α|+1) (1 - exp(-λt))^(N-|α|-1),

so that

    d/dt [P(α) exp(λt)^|α|] = (N-|α|)λ exp(-λt) (1 - exp(-λt))^(N-|α|-1),

    P(α) exp(λt)^|α| = (1 - exp(-λt))^(N-|α|),

and the hypothesis is proven by induction.

Thus, P(α) depends only on the value of |α| = n. We can rec-
ognize P(n) for n = 2, 3, ..., N, but P(0) and P(1) cannot be distin-
guished, so we combine these into P', which is obtained by

    P' = 1 - Σ_{n=2}^{N} [N!/(n!(N-n)!)] P(n) = P(0) + N P(1),

since there are N!/(n!(N-n)!) sets with |α| = n. Thus,

    P' = (1 - exp(-λt))^(N-1) (1 + (N-1) exp(-λt)).
Now suppose that from K semantic atoms we observe that k_N atoms
are covered by the same word in all languages, and k_{N-1} in all but
one, and so on to k_2, and there are k' atoms differently covered in
all languages. The probability of this occurring is
    P(N)^(k_N) P(N-1)^(k_{N-1}) ... P(2)^(k_2) P'^(k')
      = x^(2k_2 + 3k_3 + ... + Nk_N) (1 - x)^(NK - k' - (2k_2 + ... + Nk_N)) (1 + (N-1)x)^(k'),

where x = exp(-λt). A maximum likelihood estimate for x seems to
be the best single value we can assign to x. This is obtained by
setting the (logarithmic) derivative of the probability to zero, so that

    0 = (A - k')/x - (NK - A)/(1 - x) + (N-1)k'/(1 + (N-1)x),

where A = k' + 2k_2 + 3k_3 + ... + Nk_N. Or

    N(N-1)K x² - ((N-1)A - NK + k') x - (A - k') = 0.
If N = 2, x² = k_2/K, which is the well-known formula for the separa-
tion between two languages. For general N, x is the solution of the
quadratic equation given above. Note that the answer depends on the
statistic A which does not usually appear in discussions of lexical
dating.
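Given observed counts, the quadratic is easy to solve directly. A minimal sketch (Python; `estimate_x` and the counts dictionary are our own illustrative names) that also exhibits the N = 2 reduction x² = k_2/K:

```python
import math

def estimate_x(N, K, counts, k_prime):
    """Solve N(N-1)K x^2 - ((N-1)A - NK + k')x - (A - k') = 0 for the
    positive root, where counts[n] = k_n for n = 2..N and
    A = k' + sum(n * k_n)."""
    A = k_prime + sum(n * k for n, k in counts.items())
    a = N * (N - 1) * K
    b = -((N - 1) * A - N * K + k_prime)
    c = -(A - k_prime)
    disc = b * b - 4 * a * c
    # the root in (0, 1) is the one with the plus sign
    return (-b + math.sqrt(disc)) / (2 * a)

# N = 2 with 100 of 200 atoms sharing a cover: x^2 should equal k_2/K = 0.5
x = estimate_x(2, 200, {2: 100}, 100)
print(round(x * x, 3))  # 0.5
```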
An even more general difference between this treatment and 
usual treatment by pairs is found in the use made of the number of all 
the languages containing a certain lexical item as the cover of a se- 
mantic atom. This kind of count is almost never made in the literature 
on dating problems. 
Another case which constantly recurs in practice is that of three
languages: 1, 2 and 3, say. The pair 1 and 2 are more closely
related than language 3 is to either 1 or 2. Suppose t is the time
from the common ancestor of 1, 2 and 3, and t' the time
from the common ancestor of 1 and 2. Let x = exp(-λt),
x' = exp(-λt'), so that x/x' is the probability associated with the
time from the common ancestor of 1, 2 and 3 to that of 1 and 2.
We might observe any of five situations concerning the cover
of a semantic atom. It may be the same in all (1, 2, 3); or in any
pair (1, 2), (1, 3) or (2, 3); or different in each. The probability of
each of these events is

    x_123 = x (x/x') x' x' = x² x',

    x_13 = x_23 = x (x/x') x' (1 - x') = x² (1 - x'),

    x_12 = x'² [(1 - x/x') + (x/x')(1 - x)] = x' (x' - x²),

and the probability that all three covers differ is

    1 - x²x' - 2x²(1 - x') - x'(x' - x²) = 1 - 2x² - x'² + 2x²x'
      = (1 - x')(1 + x' - 2x²).
Suppose k_123, k_13, k_23, k_12 and k' of each of these is observed
when K atoms are considered. The total probability is

    x'^(k_12 + k_123) x^(2(k_123 + k_13 + k_23)) (1 - x')^(k_13 + k_23 + k') (x' - x²)^(k_12) (1 + x' - 2x²)^(k').

Maximum likelihood estimates for x and x' are gotten from the
equations obtained by setting the (logarithmic) derivatives by x and
x' to zero separately.
    2(k_123 + k_13 + k_23)/x = 2k_12 x/(x' - x²) + 4k' x/(1 + x' - 2x²),

    (k_12 + k_123)/x' + k_12/(x' - x²) + k'/(1 + x' - 2x²) = (k' + k_13 + k_23)/(1 - x').

These equations are best solved numerically for given values of k_123,
k_12, k_13, k_23 and k'.
The methodology is straightforward and there is no need to
multiply examples. In every case we obtain new formulas based on
maximum likelihood estimators. Another area in which these methods
could also be utilized is in the construction of significance tests and 
confidence bands. With this basis, most of the machinery of modern 
statistics would be available for use. 
Criticism of First Order Theory 
As we explained in discussing semantic atoms, we feel there is 
no adequate observational data to which to apply these formulas for a 
conclusive test of their value. We have made a few experimental 
applications using the unsatisfactory data available in the literature. 
Numerically, the time estimates we obtained, which we will 
not quote here, do not differ a great deal from those obtained by con- 
sidering pairs alone. This is to be expected if the phenomena are at 
all consistent. The value in the formulas derived above lies in the 
fact that they correctly combine the data from several pairs. 
The first-order method does have one very important difficulty 
which appears almost immediately if we try to treat more than three 
languages. This difficulty is in the family tree of the languages. 
In the entire first-order development, we have implicitly used 
the concept of a tree. Languages go together as a "common ancestor" 
until some point in time when they divide and become two separate 
languages. The tree is the first-order model of dialectation--it is 
known to be inadequate, at least in many situations. In spite of a 
century or so of studies, we simply do not understand how dialec- 
tation occurs. More study is greatly needed, especially in the con- 
struction of higher-order models, but the problem lies outside the 
scope of this paper. 
The difficulty with the tree arises in decay studies because only
splitting is compatible with our statistical model. We have no alter- 
native to constructing a family tree if we wish to apply the method 
outlined above. However, it seems to be easy to find examples which 
do not allow a tree to be constructed. Consider four languages: A,
B, C and D. Suppose one semantic atom has the same cover in A
and B, and another different cover in C and D. And at the same
time, some other atom has one cover in A and C, and a different
cover in B and D. We cannot fit this data into any family tree.
A little more specifically in the Romance languages, we find 
that the same innovation with respect to Latin is shared by several or 
all the later languages. Some of this can be explained by the colloquial 
versus learned speech theory, but no family tree can be constructed 
to explain all the combinations of innovations. If we had an adequate 
explanation of the phenomena involved in these shared innovations, it 
is quite possible that we could assume Romance was the direct descen-
dant of Imperial Latin without going back to Plautus or thereabouts,
as seems to be required by the first order theory.
A tentative beginning in this direction can be made by a second- 
order theory based on the dynamic model of lexical influence. 
Second-Order Lexical Decay 
The imprecise model of semantic pressures we formed to 
explain lexical decay suggests the following second-order model. 
For each semantic atom, we consider not only a covering 
lexical item as before, but also a potential covering item. The poten- 
tial cover is the source of pressure against the cover. When the 
cover decays, it is replaced by the potential cover. Naturally we 
also assume that the potential cover decays and is replaced by a 
new potential cover. In the interest of simplicity and because we
have no numerical data, we will assume both decays have the same
constant λ.
First, let us consider a single language. The situation at an 
atom can be of four types: (I) both the original cover and potential
cover remain; (II) the original cover remains, but the potential
cover has decayed; (III) the original cover has decayed and the poten-
tial cover has replaced it; (IV) the cover is now neither the original 
nor the potential cover. 
Let P_I and P_II be the probabilities of the first two situations.
Then

    dP_I/dt = -2λP_I,

    dP_II/dt = -λP_II + λP_I;

so that

    P_I = exp(-2λt),

    P_II = exp(-λt) (1 - exp(-λt)).

The original cover remains in these two cases only, so that the prob-
ability of it remaining is

    P_I + P_II = exp(-λt),

which is exactly the same as in first-order theory.
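That P_I + P_II collapses to the first-order curve can also be confirmed numerically; a small sketch, assuming the λ = 1/5000 per year figure used earlier:

```python
import math

lam = 1 / 5000  # assumed decay rate, per year
t = 2000        # years elapsed

p_one = math.exp(-2 * lam * t)                          # P_I: both covers intact
p_two = math.exp(-lam * t) * (1 - math.exp(-lam * t))   # P_II: potential decayed
# the sum reduces algebraically to the first-order exp(-lam * t)
print(round(p_one + p_two, 6), round(math.exp(-lam * t), 6))  # both 0.67032
```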
When the second-order theory is applied to N languages, the
results are quite complicated. We divide the languages into four sets
α, β, γ, δ depending on which situation holds in the language; in set α
situation I holds, and so on. Then we have the basic differential
equation

    dP(α,β,γ,δ)/dt = -(2|α| + |β| + |γ|) λ P(α,β,γ,δ)
        + Σ_{j∈β} λ P(α∪{j}, β-{j}, γ, δ) + Σ_{j∈γ} λ P(α∪{j}, β, γ-{j}, δ)
        + Σ_{k∈δ} λ P(α, β∪{k}, γ, δ-{k}) + Σ_{k∈δ} λ P(α, β, γ∪{k}, δ-{k}),

which has the solution

    P(α,β,γ,δ) = [exp(-λt)]^(2|α| + |β| + |γ|) [1 - exp(-λt)]^(|β| + |γ| + 2|δ|).

We have no way of recognizing the condition of the potential cover, so
sets α and β should be combined into a set η and

    P(η,γ,δ) = [exp(-λt)]^(|η| + |γ|) [1 - exp(-λt)]^(|γ| + 2|δ|).
Before we can actually apply the maximum likelihood technique to lan-
guages without known ancestors, we have to make some further com-
binations, because sets with |η| = 1 cannot be distinguished from
those with |η| = 0, or those with |γ| = 1 from those with |γ| = 0.
Moreover, we cannot distinguish original covers from potential covers,
so that the sets η and γ must be combined with the same sets in
the reverse order.
The general case is very complicated, so we restrict ourselves
to two languages. We then observe that the covers are either the
same or different. If they are the same, we have either |η| = 2 and
|γ| = |δ| = 0, or |γ| = 2 and |η| = |δ| = 0. Thus, the probability is

    [exp(-λt)]² + [exp(-λt)]² [1 - exp(-λt)]²
      = exp(-2λt) [1 + (1 - exp(-λt))²],

which differs from the first order theory by the term in the square
bracket.
The simplest case where the second-order theory is really re-
quired is that of four languages. We will illustrate the results by one
expression. If k_22 words are covered by two items both in two lan-
guages, k_4 words by one item in all languages, k_3 by one item in
three languages, k_2 by one item in two languages, and k' by no
common items, then the expression to be solved for maximum likelihood
is

    0 = (4k_22 + 4k_4 + 3k_3 + 2k_2)/p - (2k_22 + k_3 + 3k_2 + 5k')/(1 - p)
        + 4k_4 (1 - p)²/(2 - 4p + 6p² - 4p³ + p⁴)
        + k_3 (3 - 4p + 4p²)/(2 + 3p - 2p³ + p⁴)
        + k_2 (4 - 2p - 3p²)/(2 + 4p - p² - p³)
        + k' (5 + 6p + 9p²)/(2 + 5p + 3p² + 3p³),

where p = exp(-λt).
This second-order theory is unsatisfactory, not only because
it leads to very complex formulas, but also because it seems qualita-
tively inadequate. The formula for splitting between two languages
is not greatly modified except for very long times, and the change
does not seem to be enough to account for data showing short times
of division. It is hard to tell whether the formula for several lan-
guages including the quantity k_22 is any help--so far we have no
striking results to quote from its use.
A second-order theory where potential cover decayed at a 
different rate than the original cover might correct some of these 
defects, but we have no evidence upon which to estimate the decay 
rate in this case. It is more likely that a more elaborate mechanism
must be postulated--it need not lead to more elaborate results. The 
model must be based on a kind of dialectation study which seems to 
be absent as yet from the literature. 
Conclusion 
We have derived a number of formulas relating to the estimation 
of time depths by observations of lexical decay. The methods used 
can be applied to obtain many more similar formulas as required in 
studies of actual data. 
All of these formulas are based on models of lexical decay using 
the concept of semantic atoms and their lexical covers. Lexical decay 
is identified with a change in lexical cover. If the semantic atoms 
are sufficiently independent, the decay is a Poisson process. 
Probably the most important practical conclusion is the result 
that any set of semantic atoms can be used to evaluate lexical decay 
provided the set is made up of atoms: 
1. far enough removed in meaning from one another to assure
independence,

2. which represent concepts assured to have been in exis-
tence throughout the time period being studied.
End Notes

(2) There is no outstanding study of this problem. Attempts
to "improve" the test vocabulary by limiting it to meanings
which have behaved well in earlier studies are method-
ologically disastrous because they bias the value of λ.

(3) This requirement is also intended to remove bias from
the estimate of λ.

(4) This is a matter of classical philological research inde-
pendent of statistical syntheses made from the results.

References

(1) Robert B. Lees, "The Basis of Glottochronology,"
Language 29.113-27 (1953).
