<?xml version="1.0" standalone="yes"?>
<Paper uid="C73-1012">
  <Title>Hittite Phrygian Greek Macedonian Illyrian Old Slavonic, Old Czech, Russian, Ukrainian, Serbian, Middle Bulgarian (Slavonic group) Albanian</Title>
  <Section position="1" start_page="0" end_page="11" type="metho">
    <SectionTitle>
ENRICO CAMPANILE - ANTONIO ZAMPOLLI
PROBLEMS IN COMPUTERIZED HISTORICAL LINGUISTICS:
THE OLD CORNISH LEXICON *
</SectionTitle>
    <Paragraph position="0"> This work represents an attempt to utilize the computer in solving problems in historical linguistics.</Paragraph>
    <Paragraph position="1"> The corpus upon which it operates is not a language but a recently published etymological dictionary of Old Cornish. 1 Any observations regarding the scarcity or inaccuracy of the data utilized are, therefore, irrelevant, as far as the present paper is concerned.</Paragraph>
    <Paragraph position="2"> As the dictionary in question was compiled according to the usual methods employed with such works, a detailed explanation of methodology is unnecessary. It should also be noted that Old Cornish is known only through glosses to Latin words, and that in this case "Cornish gloss" is equivalent to "Cornish word".</Paragraph>
    <Paragraph position="3"> With the help of the computer, we have attempted to solve the following problems: a) To establish the percentage of words with and without Indo-European etymology in the Cornish lexicon. (Let us stress that this study concerns not a language but an etymological lexicon; hence, the presence or absence of Indo-European etymology should not be construed as a definitive characteristic of a Cornish word. Such statistics are, in fact, relevant only to the present state of research on the subject). b) To establish the degree of certainty concerning the material of Indo-European etymology.</Paragraph>
    <Paragraph position="4"> c) To evaluate the extent of the connection between elements of Indo-European etymology existing in the Cornish lexicon and the other Indo-European linguistic groups according to the degree of certainty of each individual etymology.</Paragraph>
    <Paragraph position="5"> d) To establish, on the basis of existing etymological studies of  The reader will observe that the first problem is purely statistical (though it has an obvious diachronic premise), that the second aims at attaining qualitative data (though they are expressed quantitatively), that the third concerns the area of Indo-European dialectology, and that the fourth has its own specific heuristic and methodological significance. In order to accomplish these goals, the contents of the etymological dictionary were put on cards, each of which contained the following entries: a) a non-Cornish word (with an indication of the language to which it belongs); b) the Cornish word related in the dictionary to the item under a); c) the type of relationship existing between item a) and item b), and whether this relationship is affirmed, denied or uncertain; d) the indication that item b) is or is not a nominal compound (this being the only type of compound found in Old Cornish); e) in the event that item b) is a nominal compound, a breakdown of the elements contained in it; 2 f) the page from which the foregoing material was taken.</Paragraph>
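    <Paragraph> The card layout described above can be pictured as a small record type. The following Python fragment is only a sketch under stated assumptions: the class name Card and all of its field names are hypothetical and do not come from the original work; only the correspondence with items a) to f) and the two sample cards (taken from the FORN entry) are based on the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Card:
    """One card: a non-Cornish word linked to a Cornish word (hypothetical layout)."""
    language: str        # item a): language of the non-Cornish word
    foreign_word: str    # item a): the non-Cornish word itself
    cornish_word: str    # item b): the related Cornish word
    rating: int          # item c): numerical rating of the relationship
    compound_sign: Optional[str] = None      # item d): set only for nominal compounds
    compound_analysis: Optional[str] = None  # item e): breakdown of the compound
    page: int = 0        # item f): page of the dictionary

    @property
    def is_compound(self) -> bool:
        # A card concerns a nominal compound when a compound sign is present.
        return self.compound_sign is not None


# Two cards transcribed from the FORN entry (page 47 of the dictionary).
cards = [
    Card("lat.", "furnus", "forn", rating=8, page=47),
    Card("irl.", "sorn", "forn", rating=9, page=47),
]
```

    Keeping the compound information in optional fields mirrors the two columns that the cards reserve for nominal compounds and that stay empty for simple words.</Paragraph>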
    <Paragraph position="6"> With regard to item c), the possible types of relationships have been described (see below) according to the information supplied, either explicitly or implicitly, by the etymological dictionary and have been rated according to the following numerical system:</Paragraph>
    <Paragraph position="8"> but the nature of the relationship cannot be determined exactly (that is, whether it is a matter of kinship or loan). 3 Every element has been given either in the Cornish form (if it is attested elsewhere in the text or if it is not attested only because of lack of documentation), or in the common Celtic form, or in the Indo-European form; certain diacritic signs indicate which possibility has been chosen.</Paragraph>
    <Paragraph position="9"> 3 The distinction between borrowed words and co-radicals is that provided by the etymological dictionaries and handbooks of historical linguistics. Since the difference</Paragraph>
  </Section>
  <Section position="2" start_page="11" end_page="11" type="metho">
    <SectionTitle>
SOME EXPERIMENTS IN HISTORICAL COMPUTATIONAL LINGUISTICS 163
</SectionTitle>
    <Paragraph position="0"> 9 = the Celtic co-radical of the Cornish word (this rating prevails over ratings 1, 2 and 3 because the prime object of the present research is Indo-European etymology rather than the Celtic connections of Cornish). To these eleven ratings will be added that of 0, which will not indicate the relationship between Cornish and non-Cornish words, as in the case of the other ratings, but will serve instead to distinguish the non-Cornish words (actually, Cymric) which, due to the various vicissitudes of the manuscript tradition, have crept into the authentic Cornish glosses and which, as such, do not form part of the present study. The following items, taken from the etymological dictionary, and their ratings illustrate the preceding principles: FORN gl. fornax l. clibanus 920. Come il bret. fo(u)rn (ant. bret. gufor(n) gl. clibani), il cimr. ffwrn e l'irl. sorn, è prestito dal lat. furnus. HV, 179; VG, 221; LH, 274; VB, 190.</Paragraph>
    <Paragraph position="1"> FRIIC gl. nasus 30. Formazione in -IC (con originario valore, forse, diminutivo), da compararsi con bret. fri «naso». Non è da escludersi un rapporto con formazioni (originariamente onomatopeiche) in *sr- designanti il russare e il naso; cfr. gr. ῥέγχω, arm. əngunk' etc. IEW, 1002.</Paragraph>
    <Paragraph position="2"> FROT gl. alueus 737. Identico a bret. froud «torrente», cimr. ffrwd «corrente», irl. sruth (gen. srotha) «fiume, corrente», gall. Φρουδις (leggi Φρουτυς), tutti da *sprutu-. Ma il confronto con lit. spriaũnas «fresco», ted. spröde «secco» non è semanticamente convincente. Il termine sopravvive anche nell'ital.</Paragraph>
    <Paragraph position="3"> dial. froda «torrente» (REW, 3545). VG, 35; Pokorny, Celtica 3, 1956, 308; LH, 541; Meid, IF 65, 1960, 39; IEW, 994. 4 between the two concepts exists only as a chronological distinction, the problem is, therefore, irrelevant. Cf. V. PISANI, Parenté linguistique, in "Lingua", (1952), p. 3 (or Saggi di linguistica storica, Torino, 1959, p. 29) and Variazioni sul problema indoeuropeo, in Lingue e culture, Brescia, 1969, p. 21.</Paragraph>
    <Paragraph position="4"> 4 FORN gl. fornax l. clibanus 920. Like the Breton fo(u)rn (OBr. gufor(n) gl. clibani), the Cymr. ffwrn and the Irish sorn, it was borrowed from the Latin furnus. HV, 179; VG, 221; LH, 274; VB, 190.</Paragraph>
    <Paragraph position="5"> FRIIC gl. nasus 30. A formation in -IC (originally, perhaps, diminutive), comparable to Breton fri "nose". A kinship with formations, originally onomatopoeic, in *sr- which designate both snoring and the nose cannot be excluded; cf. Gr. ῥέγχω, Arm. əngunk' etc. IEW, 1002.</Paragraph>
    <Paragraph position="6"> FROT gl. alueus 737. Identical to Bret. froud "brook", Cymr. ffrwd "stream", Irish sruth (gen. srotha) "river, stream", Gaul. Φρουδις (read Φρουτυς), all from *sprutu-. But the comparison with Lit. spriaũnas "cool", Germ. spröde "dry" is not semantically convincing. The term also survives in Italian (dial.) froda "brook" (REW, 3545). VG, 35; Pokorny, Celtica 3, 1956, 308; LH, 541; Meid, IF 65, 1960, 39; IEW, 994.</Paragraph>
    <Paragraph position="7"> These three paragraphs gave the following 15 cards:
    br. fo(u)rn 5 9 0 forn 47
    a. br. gufor(n) 9 forn 47
    cim. ffwrn 9 forn 47
    irl. sorn 9 forn 47
    lat. furnus 8 forn 47
    br. fri 9 friic 47
    gr. ῥέγχω 3 friic 47
    arm. əngunk' 3 friic 47
    br. froud 9 frot 47
    cim. ffrwd 9 frot 47
    irl. sruth 9 frot 47
    gall. Φρουτυς 9 frot 47
    lit. spriaũnas 5 frot 47
    ted. spröde 5 frot 47
    ital. dial. froda 1 frot 47
    All the words with an index of 0 were eliminated prior to the operation. The analysis of compounds was found to be a particular problem. When the rating was carried out, the section of the compound with a kinship with the non-Cornish word a) was indicated (and hence a numerical rating was given). For example, the following paragraph: HEWUIL gl. uigil 401. Composto dal prefisso celt. *so- «bene, buono» (ant. bret. ho-, hu-, he-, ant. cimr. hi-, he-, hu-, irl. su-, so-) simile ma non identico al scr. su-, gr. ὑ- (in ὑγιής da *su-gʷii̯ēs «che vive bene») e da *guil «veglia» (= cimr. gŵyl «festa», bret. goel «id.», irl. féil «id.», tutti dal tardo lat. u~lia, per uigilia). HV, 140; VG, 214; LH, 463 e 659. 7 yielded the following 14 cards:
    a. br. ho- = 9 hewuil (°°so °guil) 64
    a. br. hu- = 9 hewuil (°°so °guil) 64
    5 Column reserved for information concerning nominal compounds.</Paragraph>
    <Paragraph position="8"> 6 Column reserved for the analysis of nominal compounds.</Paragraph>
    <Paragraph position="9"> 7 HEWUIL gl. vigil 401. Composed of the Celtic prefix *so- "well, good" (Old Bret. ho-, hu-, he-, Old Cymr. hi-, he-, ho-, hu-, Irish su-, so-), similar but not identical to Scr. su-, Gr. ὑ- (in ὑγιής from *su-gʷii̯ēs "that lives well") and of *guil "vigil" (= Cymr. gŵyl "feast", Bret. goel "id.", Irish féil "id.", all from late Latin u~lia, equal to vigilia). HV, 140; VG, 214; LH, 463 and 659.</Paragraph>
  </Section>
  <Section position="3" start_page="11" end_page="200" type="metho">
    <SectionTitle>
SOME EXPERIMENTS IN HISTORICAL COMPUTATIONAL LINGUISTICS 165
</SectionTitle>
    <Paragraph position="0"> a. br. he- = 9 hewuil (°°so °guil) 64
    a. cim. hi- = 9 hewuil (°°so °guil) 64
    a. cim. ho- = 9 hewuil (°°so °guil) 64
    a. cim. hu- = 9 hewuil (°°so °guil) 64
    irl. su- = 9 hewuil (°°so °guil) 64
    irl. so- = 9 hewuil (°°so °guil) 64
    scr. su- = 5 hewuil (°°so °guil) 64
    gr. ὑ- = 5 hewuil (°°so °guil) 64
    cim. gŵyl ~ 9 hewuil (°°so °guil) 64
    br. goel ~ 9 hewuil (°°so °guil) 64
    irl. féil ~ 9 hewuil (°°so °guil) 64
    lat. volg. u~lia ~ 8 hewuil (°°so °guil) 64
    (Note: in the preceding table, the sign = indicates that the kinship of word (a) is with the first part of the Cornish compound; the sign - indicates that the kinship is with the second part; the sign °° indicates that the given form of the first member of the dissolved compound is referable to the common Celtic period; and the sign ° indicates that the word does not happen to be attested).</Paragraph>
    <Paragraph position="1"> But, from the point of view of historical linguistics, it is evident that, while guil has not been attested as an autonomous form merely because no documentation happens to be available on the subject, he- existed (and always has existed) only as a member of a compound. Nevertheless, while guil could possibly be included among the autonomous lexical elements of our text, he- could only be found among the morphemes. And finally, the compound hewuil, as a creation of the Cornish (or Celtic) age, has no precise equivalents in other Indo-European languages, and any equivalents that happen to exist may be considered a priori only the result of chance.</Paragraph>
    <Paragraph position="2"> The task of analyzing compounds is further complicated by the presence of words (the Latin credere, for instance) that from a diachronic point of view are compounds while from a synchronic point of view they are not.</Paragraph>
    <Paragraph position="3"> For the reasons just stated, we decided to eliminate the compounds from the present analyses and to make them the object of a separate study.</Paragraph>
    <Paragraph position="4"> Thus, in addition to the words with a rating of 0, entries containing the signs = and/or - have also been discarded.</Paragraph>
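    <Paragraph> The two elimination rules just stated (rating 0, and any card carrying a compound sign) amount to a simple filter. The fragment below is a minimal sketch, assuming that each card is held as a plain tuple; the tuple layout, the function name keep and the last sample card are hypothetical and serve only as illustration, while the other four cards are transcribed from the examples given earlier.

```python
# Each card: (language, foreign word, compound sign or None, rating, Cornish word, page).
cards = [
    ("lat.", "furnus", None, 8, "forn", 47),
    ("irl.", "sorn", None, 9, "forn", 47),
    ("a. br.", "ho-", "=", 9, "hewuil", 64),
    ("scr.", "su-", "=", 5, "hewuil", 64),
    ("cim.", "example", None, 0, "example", 0),  # hypothetical card with rating 0
]


def keep(card):
    """Keep a card unless it has rating 0 or carries a compound sign."""
    _, _, sign, rating, _, _ = card
    return rating != 0 and sign is None


surviving = [card for card in cards if keep(card)]
```

    With this sample only the two FORN cards survive: the hewuil cards are set aside for the separate study of compounds, and the rating-0 card is dropped as non-Cornish.</Paragraph>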
    <Paragraph position="5"> After the words with a rating of 0 and the nominal compounds were discarded, the surviving Cornish material consisted of 745 elements that, in relation to our first problem, were subdivided in the following way.</Paragraph>
    <Paragraph position="6">
    words of Indo-European etymology 8          284    38 %
    words borrowed from other languages 9       254    34 %
    calques from other languages 10               0     0 %
    uncertain kind of kinship 11                  0     0 %
    words without Indo-European etymology 12    207    28 %
                                                745   100 %
    With regard to the second problem, the 284 words of Indo-European etymology were divided according to the degree of probability. The breakdown is as follows: words of certain etymology 13; words of very probable etymology 14; words of probable etymology 15</Paragraph>
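    <Paragraph> The percentages given for the first problem follow directly from the counts. As a quick check, and not as part of the original procedure, they can be recomputed as below; the dictionary keys are shortened labels chosen here for illustration.

```python
# Counts for the first problem, as reported for the 745 surviving elements.
counts = {
    "Indo-European etymology": 284,
    "borrowed from other languages": 254,
    "calques from other languages": 0,
    "uncertain kind of kinship": 0,
    "without Indo-European etymology": 207,
}

total = sum(counts.values())

# Share of each class, rounded to a whole percentage as in the text.
shares = {label: round(100 * n / total) for label, n in counts.items()}
```

    The three non-empty classes come out at 38, 34 and 28 per cent of 745, in agreement with the figures reported.</Paragraph>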
    <Paragraph position="8"> In order to solve the third problem, all the entries containing non-Cornish words correlated to one of the 284 Cornish words of Indo-European etymology were taken into consideration. These entries (742 in all) were subdivided into 17 groups according to the linguistic kinship of the language to which the word in item (a) belongs:  The reader will notice that not all Indo-European languages are represented here. This is due to the fact that not all Indo-European languages are represented in the etymological dictionary that provided the material for the present work. On the other hand, there are two non-Indo-European languages in group 17 because one Cornish word is thought to have a kinship with non-Indo-European words.</Paragraph>
    <Paragraph position="9"> Each of the 742 words has an etymological kinship with Cornish words that is either certain (rating 1), very probable (rating 2) or probable (rating 3). These words were arranged into linguistic groups with the rank of 1 going to the group that had at least one exponent with a rating of 1, the rank of 2 going to the group with at least one exponent with a rating of 2, and the rank of 3 to the group with neither rating. Here are the results:</Paragraph>
    <Paragraph position="10">
             r. 1  r. 2  r. 3  tot.  %r. 1   %r. 2   %r. 3   % of tot.
    GR. 1      12     1     0    13  0.9231  0.0769  0.0     0.0175
    GR. 2      95     9     9   113  0.8407  0.0796  0.0796  0.1523
    GR. 3      24     0     3    27  0.8889  0.0     0.1111  0.0364
    GR. 4      11     0     1    12  0.9167  0.0     0.0833  0.0162
    GR. 5       0     0     0     0  0.0     0.0     0.0     0.0
    GR. 6      95     6     8   109  0.8716  0.0550  0.0734  0.1469
    GR. 7       0     1     0     1  0.0     1.0     0.0     0.0013
    GR. 8       1     0     0     1  1.0     0.0     0.0     0.0013
    GR. 9      45     4     2    51  0.8824  0.0784  0.0392  0.0687
    GR. 10     11     3     2    16  0.6875  0.1875  0.1250  0.0216
    GR. 11     71     9     5    85  0.8353  0.1059  0.0588  0.1146
    GR. 12    154     8     8   170  0.9059  0.0471  0.0471  0.2291
    GR. 13      1     0     0     1  1.0     0.0     0.0     0.0013
    GR. 14     10     0     0    10  1.0     0.0     0.0     0.0135
    GR. 15    108     6    12   126  0.8571  0.0476  0.0952  0.1698
    GR. 16      3     1     3     7  0.4286  0.1429  0.4286  0.0094
    GR. 17      0     0     0     0  0.0000  0.0     0.0000  0.0
    TOT.      641    48    53   742
    It will be observed that the Celtic group appears to be poorly represented in that the existing etymological kinship with Cornish words is normally expressed by the rating 9 (and not, therefore, 1, 2 or 3), while the rank of 1, 2 or 3 has been attributed only to those words which, though still within the Celtic group, are part of other linguistic traditions (words of the Celtic substratum in the Romance languages, for example).</Paragraph>
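    <Paragraph> The derived columns of the preceding table can be recomputed from the three count columns alone. The sketch below is offered as a consistency check rather than as the procedure actually used; the function name row_stats is hypothetical. It also shows that the three counts of group 15 add up to 126, the total that agrees with the printed share of 0.1698.

```python
# (r. 1, r. 2, r. 3) counts per group, transcribed from the table.
counts = {
    1: (12, 1, 0), 2: (95, 9, 9), 3: (24, 0, 3), 4: (11, 0, 1),
    5: (0, 0, 0), 6: (95, 6, 8), 7: (0, 1, 0), 8: (1, 0, 0),
    9: (45, 4, 2), 10: (11, 3, 2), 11: (71, 9, 5), 12: (154, 8, 8),
    13: (1, 0, 0), 14: (10, 0, 0), 15: (108, 6, 12), 16: (3, 1, 3),
    17: (0, 0, 0),
}

# Number of entries over all groups.
grand_total = sum(sum(row) for row in counts.values())


def row_stats(row):
    """Return (row total, shares within the row, share of the grand total)."""
    total = sum(row)
    within = tuple(round(n / total, 4) if total else 0.0 for n in row)
    return total, within, round(total / grand_total, 4)
```

    For example, row_stats(counts[12]) reproduces the line of group 12, and the grand total equals the 742 entries mentioned in the text.</Paragraph>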
    <Paragraph position="11"> An analogous operation was then carried out with all the material having a rating of 4, 5, 6 or 7 (that is, negative etymologies). The results are as follows: r. 4 r. 5 r. 6 r. 7 tot. %r. 4 %r. 5 %r. 6 %r. 7 % of tot.  From all this material it was possible to draw the following conclusions: 1) In the Cornish lexicon there are 254 (34 %) lexical loan-words, there are 284 (38 %) words with Indo-European etymologies, and 207 (28 %) without any known etymology.</Paragraph>
    <Paragraph position="12"> 2) The vast majority of the words with an Indo-European etymology (238 out of 284 = 84 %) have an etymology that is certain, as far as is known at the present state of research on the subject. Another 16 % have etymologies that are either very probable (23; 8 %) or merely probable (23; 8 %).</Paragraph>
    <Paragraph position="13"> 3) With regard to etymological kinships with non-Celtic Indo-European linguistic groups, the closest connections are with German (0.2291), Latin (0.1698), Indo-Aryan (0.1523), Greek (0.1469) and with Baltic (0.1146). Such results appear to be extremely important in that they confirm the innovative character of the occidental lexicon (kinships with German, Latin and, at least in part, Baltic) existing side by side with the preservation of archaic elements in lateral areas (kinship with Indo-Aryan), thereby showing strong kinships with the central area of the Indo-European world (Greek and, at least in part, Baltic) which have yet to be adequately assessed.</Paragraph>
    <Paragraph position="14"> 4) The highest percentages of now unacceptable relationships suggested by scholars in the past are those with Latin (0.2600), Greek (0.1350) and Indo-Aryan (0.1300). This, together with the fact that these same groups have also yielded a very high percentage of acceptable etymologies, suggests that these areas have been exhausted. As a working hypothesis, new etymological comparisons ought now to be considered particularly with German and Baltic, which combine a high yield with a more tolerable percentage of acknowledged errors (0.0850 and 0.0750 respectively).</Paragraph>
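    <Paragraph> The argument of conclusion 4) weighs the share of accepted etymologies of each group against its share of rejected ones. One possible way to make that comparison explicit is a simple quotient of the two shares. This quotient is an illustrative device introduced here, not a measure used in the study, and the function name yield_to_error is hypothetical; the figures are those quoted in conclusions 3) and 4).

```python
# (share of accepted etymologies, share of rejected etymologies) per group,
# copied from conclusions 3) and 4).
shares = {
    "German": (0.2291, 0.0850),
    "Latin": (0.1698, 0.2600),
    "Indo-Aryan": (0.1523, 0.1300),
    "Greek": (0.1469, 0.1350),
    "Baltic": (0.1146, 0.0750),
}


def yield_to_error(group):
    """Quotient of accepted share over rejected share (illustrative only)."""
    accepted, rejected = shares[group]
    return accepted / rejected


# Groups ordered from the most to the least favourable quotient.
ranked = sorted(shares, key=yield_to_error, reverse=True)
```

    On these figures German and Baltic come first and Latin last, which is in line with the working hypothesis stated in the text.</Paragraph>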
    <Paragraph position="15"> 5) Of the 745 Cornish words which have supplied the material for the present study, as many as 671, almost 90 %, bear at least an index of 9; that is, have one or more Celtic co-radicals. This confirms the "compact" character of the Celtic lexicon.</Paragraph>
    <Paragraph position="16"> Moreover, there are Cornish words which have one or more indices of 9 to the exclusion of any other index (139; 18 %). These are words that have co-radicals exclusively in the Celtic world. On the heuristic level, this verification gives rise to a question that is at the same time a working hypothesis: are they substratum words? The same question and the same working hypothesis also arise with the words where one or more indices of 9 accompany the indices 4, 5, 6, 7: these are words with Celtic co-radicals formerly thought to be of Indo-European etymology but now refuted in the dictionary. There are 60 of them.</Paragraph>
    <Paragraph position="17"> Our analysis, therefore, seems to suggest, too, that future linguistic research will find rich material for substratum studies in Cornish and, more generally, in Celtic.</Paragraph>
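    <Paragraph> The two classes of possible substratum words singled out in conclusion 5) can be obtained by collecting, for each Cornish word, the set of indices found on its cards. The fragment below is a minimal sketch of that idea. It assumes that the second class contains words bearing index 9 together with indices 4 to 7 and nothing else, which is one reading of the text; the function name classify and the words "alpha" and "beta" are hypothetical, while the three cards for frot come from the example given earlier.

```python
from collections import defaultdict

NEGATIVE = {4, 5, 6, 7}  # indices of rejected Indo-European etymologies
CELTIC = 9               # index of a Celtic co-radical


def classify(cards):
    """Group indices by Cornish word and return the two candidate classes."""
    ratings = defaultdict(set)
    for cornish_word, rating in cards:
        ratings[cornish_word].add(rating)
    # Words whose cards carry index 9 and no other index.
    only_celtic = sorted(w for w, r in ratings.items() if r == {CELTIC})
    # Words with index 9 plus at least one rejected etymology, and nothing else.
    celtic_and_refuted = sorted(
        w for w, r in ratings.items()
        if CELTIC in r
        and r.intersection(NEGATIVE)
        and r.issubset(NEGATIVE.union({CELTIC}))
    )
    return only_celtic, celtic_and_refuted


cards = [
    ("frot", 9), ("frot", 5), ("frot", 1),  # from the FROT entry
    ("alpha", 9), ("alpha", 9),             # hypothetical: Celtic co-radicals only
    ("beta", 9), ("beta", 5),               # hypothetical: Celtic plus a rejected etymology
]
only_celtic, celtic_and_refuted = classify(cards)
```

    In this sample frot falls into neither class, because one of its cards carries the index 1 in addition to 9 and 5.</Paragraph>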
  </Section>
</Paper>