Computers in the Yugoslav Serbo-Croat/English Contrastive 
Analysis Project 
Željko Bujas, Ph.D. 
Assistant Professor 
Department of English 
Zagreb University, Zagreb, Yugoslavia 
0.1. As far as the present writer is aware, the Yugoslav 
Serbo-Croat/English Contrastive Analysis Project¹ is 
the first contrastive analysis effort to use a large cor- 
pus of parallel texts. The corpus is made up of the Brown 
Corpus (reduced by 50%) with its Serbo-Croat translation, 
and a smaller Control Corpus (Serbo-Croat originals and 
English translation). A total, thus, of twice 500,000 
words plus twice 150,000 words, or a grand total of some 
1,300,000 words of running text. 
0.2. The Project, let us make it clear, is not exclu- 
sively based on this corpus. Compilation and confron- 
tation of grammatical statements by various authors, plus 
plain old intuition, figure prominently in the methodol- 
ogy. The insistence on a large corpus, however, is due to 
the conviction, prevailing among the Project workers, that 
only an extensive investigation of correspondences 
(original-language elements and their translations) can 
adequately reveal the less predictable patterns which tend 
to have a considerable contrastive analysis potential. 
0.21. The most productive method of obtaining correspon- 
dences from our corpus is to concordance separately its 
Serbo-Croat and English parts, then to merge the resulting 
KWIC concordances into a contrastive KWIC concordance 
(with English keywords and alternating English and Serbo- 
Croat lines). For the more promising patterns, the merging 
procedure will be used twice, with both English and Serbo- 
Croat keywords. 
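The procedure in 0.21 can be illustrated with a minimal sketch. It is not the Project's program: the function names, the context width and the two toy sentence pairs are assumptions made up for illustration, and the sketch pairs each keyword line with the whole same-numbered sentence of the other language rather than with a second KWIC line.

```python
from collections import namedtuple

# One concordance line: sentence number, keyword, and its left/right context.
KwicLine = namedtuple("KwicLine", "sentence_no keyword left right")

def kwic(sentences, width=30):
    """Build a forward KWIC concordance: one entry per token, sorted by keyword."""
    entries = []
    for number, text in sentences.items():
        words = text.split()
        for i, word in enumerate(words):
            left = " ".join(words[:i])[-width:]
            right = " ".join(words[i + 1:])[:width]
            entries.append(KwicLine(number, word.upper(), left, right))
    return sorted(entries, key=lambda e: (e.keyword, e.sentence_no))

def contrastive(kwic_entries, parallel_sentences):
    """Pair each keyword line with the same-numbered sentence of the other language."""
    return [(e, parallel_sentences[e.sentence_no]) for e in kwic_entries]

# Toy data (hypothetical), keyed by sentence number.
english = {2: "the heavy oak door with its cold latch",
           4: "he stopped in front of the locked door"}
serbo_croat = {2: "teszka hrastova vrata i hladna kvaka",
               4: "zastao je pred zakljucyanim vratima"}

door_lines = [e for e in kwic(english) if e.keyword == "DOOR"]
for entry, translation in contrastive(door_lines, serbo_croat):
    print(f"{entry.sentence_no:04d} {entry.left:>30} {entry.keyword} {entry.right}")
    print(f"{entry.sentence_no:04d} {translation}")
```

Running the same two functions with the Serbo-Croat dictionary as input and the English one as the parallel text gives the second, Serbo-Croat-keyed merge.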
0.22. In view of the size of the corpus, and the exten- 
sive concordancing required as a major procedure in the 
Project, the need for computer processing is obvious. It 
requires no undue strain on imagination to realize the 
soul-numbing effect of sheer physical handling of this 
mass of text if written out on slips. 
Even in its most efficient and flexible form of a 
manual concordance (a sentence-slip file with keywords 
underlined monolingually), without which no manual pairing 
of correspondences is possible, the manual handling of 
this 1,300,000-word corpus calls for a staggering amount 
of time and effort to prepare. According to our careful 
estimate, a total of 7,100 man-hours is required to make 
such a concordance (without the 1,900 hours of transla- 
tion from English to Serbo-Croat, and vice versa). 
0.23. The slip file thus obtained would, however, secure 
only a one-way approach: either from English or Serbo- 
Croat. A slip-file allowing a two-way approach would re- 
quire an additional effort of at least 4,500 man-hours. 
0.24. Finally, even these two manual concordances would 
still leave unfilled the need for reverse concordancing, 
so important for morphosyntactic research. To meet this 
need, two additional (though less ample) slip files would 
have to be established. 
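Reverse concordancing, as mentioned in 0.24, orders keywords by their endings rather than their beginnings. A minimal sketch, assuming the usual right-to-left sort key (the function name and word list are illustrative only):

```python
def reverse_sort(keywords):
    """Order keywords by their spelling read right to left, so shared endings cluster."""
    return sorted(keywords, key=lambda word: word[::-1])

words = ["MORNING", "DOOR", "LIVING", "LOCKED", "NOTHING", "PLAYED"]
print(reverse_sort(words))
# ['LOCKED', 'PLAYED', 'NOTHING', 'MORNING', 'LIVING', 'DOOR']
```

Words in -ED and -ING end up next to each other, which is what makes the listing useful for morphosyntactic work.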
1.0. In view of all this, the Yugoslav Serbo-Croat/ 
English Contrastive Analysis Project has from the outset 
linked the planning of its work to the services of a local 
computer, the City of Zagreb IBM 360/30 machine². 
1.1. Stage 1 of computer processing. The tape with the 
full text of the Brown Corpus (purchased from Brown Uni- 
versity, Providence, R.I., U.S.A.), which had been pre- 
pared on an IBM 7090 machine, had first to be converted 
from the density of 800 BPI to 1,600 BPI, required by the 
Zagreb computer. 
1.11. After this, a printout of the entire text was ob- 
tained on the Zagreb machine. The printing took about 
eight hours, with a special program³ restructuring the 
original format of the Brown Corpus text. This program 
left out the location-marker column on the right-hand mar- 
gin of the printout⁴, and added a sequence of sentence numbers 
(from 00001 to 52533) on the left. 
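The restructuring step in 1.11 can be pictured with a short sketch. The record layout here is an assumption made for illustration (one sentence per record, text in the first 70 columns, a location marker to the right); the paper does not describe the tape format, and the marker strings below are invented.

```python
def renumber(records, text_width=70):
    """Drop the right-hand location marker and prefix a 5-digit sequence number."""
    numbered = []
    for seq, record in enumerate(records, start=1):
        text = record[:text_width].rstrip()  # keep only the assumed text field
        numbered.append(f"{seq:05d} {text}")
    return numbered

# Hypothetical input records: text padded to 70 columns, then a made-up marker.
records = [
    "THE FIRST SENTENCE.".ljust(70) + "A01 0010",
    "THE SECOND SENTENCE.".ljust(70) + "A01 0020",
]
for line in renumber(records):
    print(line)
```

The five-digit width matches the range 00001 to 52533 given in the text.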
1.12. The full text of the Brown Corpus was now reduced 
by 50%, retaining, however, as closely as possible, the 
same proportions of the 15 genres (styles) contained in 
the Corpus. 
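A proportional reduction of this kind amounts to keeping the same fraction of samples in every genre. The sketch below is one way to do it, not the Project's actual selection procedure: the random choice, the seed, the genre labels and the sample counts are all assumptions.

```python
import random

def reduce_corpus(samples_by_genre, fraction=0.5, seed=0):
    """Keep the same fraction of samples in every genre (stratified selection)."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    reduced = {}
    for genre, samples in samples_by_genre.items():
        keep = round(len(samples) * fraction)
        reduced[genre] = sorted(rng.sample(samples, keep))
    return reduced

# Toy corpus (hypothetical genre names and sample identifiers).
toy = {
    "press": [f"A{i:02d}" for i in range(4)],
    "fiction": [f"K{i:02d}" for i in range(6)],
    "learned": [f"J{i:02d}" for i in range(10)],
}
reduced = reduce_corpus(toy)
print({genre: len(samples) for genre, samples in reduced.items()})
```

Because whole samples are kept or dropped, the proportions can only be matched "as closely as possible" when a genre has an odd number of samples.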
Printouts of the samples retained in this reduced 
version were then sent out to reliable translators, se- 
lected to be representative of the three major regional 
variants of Serbo-Croat (western, central and eastern). 
Their instructions were to translate at normal speed, and 
as carefully as when they do any other paid translation 
work. The only limitation imposed upon them was to observe 
the sentence limit in the original (English or, in the 
Control Corpus, in Serbo-Croat). They were not to split 
the English sentence into two or more Serbo-Croat senten- 
ces, nor were they allowed to combine two or more English 
sentences into one Serbo-Croat sentence. 
The reason for this was the need to secure a me- 
chanical pairing of the English (or Serbo-Croat) keyword, 
marked by its sentence number, with the same-numbered, 
parallel, Serbo-Croat (or English) sentence in the two- 
language concordancing planned for the later Project 
stages. 
1.2. Stage 2. A new magnetic tape will be prepared of 
the reduced Brown Corpus text, with the sentence se- 
quence numbers interpolated. This version will be used 
for all subsequent concordancing. 
1.3. Stage 3. Using this magnetic tape, the IBM 360/30 
will now prepare a full forward KWIC concordance of the 
reduced Brown Corpus text⁵. 
1.4. Stage 4. Now (while the reduced Brown Corpus is 
still being translated) we shall use the same tape to ob- 
tain a reverse KWIC concordance of the same text. Since 
all "function words" - such as of, had, most, those, did, 
etc. - were already isolated in the previous stage (in the 
forward concordance)⁶, this will further reduce the mass of 
text to be concordanced by one-half⁷. 
1.5. Stage 5. The Serbo-Croat translation of the reduced 
Brown Corpus, by now in an advanced stage, will be copied 
out on a Flexowriter in batches (as translators send in 
their typescripts), resulting in a paper tape. 
The same procedure can, at this stage, be applied 
to the 300,000 words of the Control Corpus. No time for 
translation has to be set apart here, since only already 
published English translations of Serbo-Croat originals 
are to be used. 
1.6. Stage 6. Although the Serbo-Croat paper tapes ob- 
tained in the preceding stage are immediately computer- 
processable, we shall convert them to a magnetic tape, be- 
cause this medium secures an incomparably speedier proces- 
sing on the computer. 
1.61. We hope that stages 2 to 6 will not take more than 
twenty weeks (if enough personnel can be hired simulta- 
neously). 
1.7. Stage 7. The Serbo-Croat magnetic tape will now be 
used for the preparation on the IBM 360/30 of a full for- 
ward KWIC concordance of the Serbo-Croat Corpus. 
1.8. Stage 8. Using the same tape, we now plan to pro- 
duce a reverse KWIC concordance of the Serbo-Croat text. 
This concordance will be selective in the same sense that 
the English reverse concordance was (cf. Stage 4). 
1.9. Stage 9. With the normal and reverse KWIC concord- 
ances of both the English and Serbo-Croat corpora now ob- 
tained⁸, we can move on to the final stage(s) of central 
importance to the Project, i.e. the merging of these mono- 
lingual concordances to get contrastive concordances (cf. 
0.21). We have planned four such concordances, and have 
attempted to illustrate them here by short simulated sam- 
ples. As at the time of writing this no concordances of 
the Brown Corpus text (either original or translation) 
were available, the text used for these samples is the 
Serbo-Croat original and its translation into English of 
the novel Povratak Filipa Latinovicza (The Return of 
Philip Latinovicz) by the contemporary Croat writer Miro- 
slav Krleža. 
1.91. Forward contrastive concordance⁹ 
(English to Serbo-Croat) 
0003 ND THE DOOR LOCKED, AND HIMSELF SHUT OUT IN THE STREET, AND EVER SINCE THEN
0003 ASZAO ZAKLJUCYANA VRATA I OSTAO NA ULICI, TE OTADA ZZIVI NA ULICI VECZ MNOG
~312 E TONGUE OF THE COKE FLAME FLICKERED OUT FROM UNDER THE PAINTED IRON STOVE/
3312 srl JEZICYAC KOKSOVOG PLAMENA POD ST&LkOH NASLIKANE GVOZDENE PATENT-PECZI/2
3~3 RIAN&S STATUE AND THEY NEVER GOT HIM OUT, AND THE WATER ABOVE HIM WAS STAIN
~3 ORIJANA I DA GA VISZE NIKADA NISU IZVUKLI, NEGO SE JE SAMO VODA ZAKRVARILA
~J4~ u WHEN HIS OWN MOTHER HAD TURNED HIM OUT INTO THE STREET IN MORAL INDIGNATI
~J~ JUTRA, KADA GA JE RODZENA MAJKA IZBACILA NA ULICU S MORALNIM ZGRAZZANJEM
~b~ HE DISTANCE, EVERYTHING WAS SWELLING OUT IN THE SILENT INSTRUMENTATION OF T
J~B~ U DALJINAMA, SVE JE RASLO KAO TIHA INSTRUMENTACIJA MODROG JUTARNJEG BUDZENJ
31~3 WITH ITS ROLLS OF ~LAD, -- ALL GAVE OUT THE ACRID AND PUNGENTLY ACRID SMEL
3L~ EMLJANA, KAO KOPRENA/1/ IZ SVEGA STRUJI OSZTAR I OSJETLJIVO VLAZZAN VONJ DU
~L~& ARLIER, THE STUFFING HAD BEEN COMING OUT, A MASS OF BANDS, CURLY FEATHERS A
~L~b ~I $~O[NA, PROVIRIVALA UTROBA, ISPUNJENA GURTAMA, PERASTIM KOLUTIMA I CYUPE
3ZIJ                           FIRE BLAZES OUT OF THE IRON THROATS AND THERE IS A
J~13 SUKLJA ~A~J IZ ZZELJEZNIH ZZDRIJELA I MIRISZE BARUTIII JE
~g¢~ A WAX CANDLE WAS BURNING OUT ON A MARBLE SQUARE OF THE CHURCH F
3Z~3 ~$3~]JBTALA JE NA MRAMORNOJ CYETVORINI CRKVENOG PODA JEDNA VOS
3ZIL RY HUMAN EYE, LIKE AN ANIMAL PEEPING OUT OF A CAGE/2/ HUMAN GESTURES ARE LI
321t JUDSKOM OKU IMA Iu~E, KAKVOM DOSADZ&JE PROMATRAJU ZZIVOTINJE IZ KAVEZA/2/ KR
Forward contrastive concordance 
(Serbo-Croat to English) 
2205 BUKVOM, GDJE SU SE BILI SKLONILI ONE BURNE NOCZI, POSLIJE ROKOVOG PROSZTENJ
2205 LD OAK-TREE WHERE THEY HAD FOUND SHELTER THAT STORMY NIGHT ON THEIR WAY BAC
2146 AJU LIJECYNICI U SVOJIM TAJANSTVENIM BURNUSIMA /SZTO IZGLEDAJU KAO STAROMOD
2146 MOVED PHYSICIANS IN THEIR MYSTERIOUS BURNOUSES LIKE OLD-FASHIONED NIGHTSHIR
0216 ISERA, NAKOSTRIJESZENA LAVLJA GRIVA, BURSKE BATERIJE PRED LADYSMITHOM, MARS
0216 RL-DIVERS, THE LION&S BRISTLING MANE, THE BOER BATTERIES AT LADYSMITH, THE
2166 MIRISZU JE, IMA LI U NJOJ KARAMELA, BUSZE MU PO ZUBIMA, MJERE MU TLAK KRVI
2166 S AND SMELLING IT TO FIND OUT WHETHER THERE WAS ANY SUGAR IN IT, DRILLING H
0986 &&PROSIM VAS, JAGO, BUTE SPAMETNII~,/
0986 &&PLEASE, YAGA, BE SENSIBLE/@/
1566 SMATO TALASANJE GUZOVA I LISNJACYA I BUTINA, DEBELIH MASNIH ZZENSKIH NOGU,
1566 HAIRY BUTTOCKS AND CALVES AND THIGHS, FAT WOMEN&S LEGS, ANKLES, JOINTS, SK
0210 AKAVA STEGNA KONJSKA, KRVAVE RANJENE BUTINE, UZNEMIRENE CRNE REPINE, RASKRV
0210 LANKS, BLOOD-STAINED AND WOUNDED, THEIR LONG BLACK LASHING TAILS, THEIR [:LL
0673 EKANE POJASE MESA OKO KUKOVA I IZNAD BUTOVA U CJELINI POTEZA, A OVAJ TU E.LE
0673 SOFT ROLLS OF FLESH ROUND THE HIPS AND ABOVE THE THIGHS, WHILE THIS FELLOW
1316 AVODLAKAVOJ OBLINI KONJSKIH STEGNA I BUTOVA, TO JE JEDINI VELIKI DOZZIVLJAJ
1316 INING HAIRY FLANKS AND HINDQUARTERS, HAD BEEN THE ONLY GREAT EXPERIENCE OF
0268 LAVE, ZZALOSNE PTICYJE OCYI, KRAVLJE BUTOVE, KONJSKA STEGNA, A SINOCZ JOSZ
0248 SE LEGS, WRETCHED BIRDS& WINGS, COWS& BUTTOCKS, HORSES& HAUNCHES, WHILE CJNL
Reverse contrastive concordance 
(English to Serbo-Croat) 
0002 WAS ALL STILL FAMILIAR TO HIM/1/ THE ROTTING, SLIMY ROOFS, THE ROUND BALL O
0002 NAO JE JOSZ UVIJEK SVE KAKO DOLAZI/1/ I TRULI SLINAVI KROVOVI I JABUKA FRAT
0003 NTY-THREE YEARS HAD PASSED SINCE THE MORNING WHEN HE HAD SLUNK UP TO THAT O
0003 DESET I TRI GODINE SU PROSZLE OD ONOG JUTRA, KADA SE DOVUKAO POD OVA VRATA
0003 EET, AND EVER SINCE THEN HE HAD BEEN LIVING IN THE STREET, AND NOTHING HAD
0003 LICI, TE OTADA ZZIVI NA ULICI VECZ MNOGO GODINA, A NISZTA SE NIJE PROMIJENI
0003 E HAD BEEN LIVING IN THE STREET, AND NOTHING HAD REALLY CHANGED.
0003 VI NA ULICI VECZ MNOGO GODINA, A NISZTA SE NIJE PROMIJENILO UGLAVNOM.
0004 DLY LOCKED DOOR AND, JUST AS ON THAT MORNING, HE COULD FEEL THE COLD, IRON
0004 M ZAKLJUCYANIM VRATIMA, I KAO I ONOG JUTRA IMAO JE OSJECZAJ HLADNOG, GVOZDE
0006 AS HE PUSHED IT, HOW THE LEAVES WERE QUIVERING IN THE UPPER BRANCHES OF THE
0006 NJEGOVOM RUKOM I ZNAO JE, KAKO SE LISZCZE MICZE U KROSZNJAMA KESTENOVA I CY
0006 AS IF IN A DREAM -- AS ON THAT OTHER MORNING -- /1/ HE WAS ALL DIRTY, TIRED
0006 ILO MU JE /ONOG JUTRA/ KAO DA SANJA/1/ BIO JE SAV CYADZAV, UMORAN, NEISPAVA
0006 RED, IN NEED OF SLEEP, HE COULD FEEL SOMETHING CRAWLING INSIDE HIS COLLAR -
0006 RAN, NEISPAVAN, OSJECZAJUCZI KAKO MU NESZTO PLAZI OKO OKOVRATNIKA/1/ PO SVO
0006 ED OF SLEEP, HE COULD FEEL SOMETHING CRAWLING INSIDE HIS COLLAR -- A BED-BU
0006 N, OSJECZAJUCZI KAKO MU NESZTO PLAZI OKO OKOVRATNIKA/1/ PO SVOJ PRILICI STJ
0005 ROt LAST DRUNKEN NIGHT, AND THE GREY MORNING.
0005 PIJANE, POSLJEDNJE, fREEZE NOCZI I ONOG SIVOG JUTRA -- DOK ZZIVI.
Reverse contrastive concordance 
(Serbo-Croat to English) 
0002 R~OREDA, MEDUZINA GLAVA OD SADRE NAD TESZKIM, OKOVANIM HRASTOVIM VRATIMA I
0002 LASTER HEAD OF MEDUSA SURMOUNTING THE HEAVY, IRON-BOUND OAK DOOR WITH ITS C
0002 MEDUZINA GLAVA OD SADRE NAD TESZKIM, OKOVANIM HRASTOVIM VRATIMA I HLADNA KV
0002 OF MEDUSA SURMOUNTING THE HEAVY, IRON-BOUND OAK DOOR WITH ITS COLD LATCH.
0002 GLAVA OD SADRE NAD TESZKIM, OKOVANIM HRASTOVIM VRATIMA I HLADNA KVAKA.
0002 USA SURMOUNTING THE HEAVY, IRON-BOUND OAK DOOR WITH ITS COLD LATCH.
0004 ZASTAO JE PRED STRANIM ZAKLJUCYANIM VRATIMA, I KAO I
0004 HE STOPPED IN FRONT OF THE UNFRIENDLY LOCKED DOOR AND, JUST AS ON T
0004 ZASTAO JE PRED STRANIM ZAKLJUCYANIM VRATIMA, I KAO I ONOG JUT
0004 HE STOPPED IN FRONT OF THE UNFRIENDLY LOCKED DOOR AND, JUST AS ON THAT MOR
0006 GDJE SE JE KAO MALI DECYKO IGRAO SA SVOJIM BIJELIM JANJCEM, STAJALO JE GRA
0006 ERE AS A BOY HE HAD PLAYED WITH HIS WHITE LAMB, THERE WAS A BUILDING-SITE W
0006 E JE KAO MALI DECYKO IGRAO SA SVOJIM BIJELIM JANJCEM, STAJALO JE GRADILISZT
0006 S A BOY HE HAD PLAYED WITH HIS WHITE LAMB, THERE WAS A BUILDING-SITE WALLED
0006 JE GRADILISZTE OBZIDANO KAO CYOVJEK VISOKIM ZIDOM I NA TOM VISOKOM ZIDU I~[
0006 -SITE WALLED IN LIKE A MAN BEHIND A HIGH WALL, AND ON THIS HIGH WALL THERE
0009 DUGO JE STAJAO POD VITKIM ZZENSKIM STEZNICIMA, A PRSTI SU
0009 RE FOR A LONG TIME UNDER THE SLIM CORSETS, AND HIS FINGERS WERE ALL DIRTY N
0009 DUGO JE STAJAO POD VITKIM ZZENSKIM STEZNICIMA, A PRSTI SU MU BIL
0009 A LONG TIME UNDER THE SLIM CORSETS, AND HIS FINGERS WERE ALL DIRTY WITH DUS
1.10. The reason why these four concordances have been 
presented under one processing stage (9) is that, first, 
we are not sure whether we can afford the computer for 
each of them, and, second, we do not, at this point, know 
how selective each of them is going to be. A considerable 
reduction of the text to be concordanced can be achieved 
in reverse concordancing if we restrict ourselves only to 
words ending in a characteristic morpheme with clearly 
foreseeable contrastive analysis potential (such as -ed, 
-ly, -est, -ing, -ness, -less, etc. in English, and -ao, 
-vši, -en, -sc_..~u, -o-~, -~ etc. in Serbo-Croat). 
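The restriction described in 1.10 can be sketched as a simple suffix filter applied before reverse concordancing. The suffix tuple below uses only the English endings named above; the function name and test words are illustrative assumptions.

```python
# English endings named in the text; a Serbo-Croat tuple would be built the same way.
ENGLISH_SUFFIXES = ("ed", "ly", "est", "ing", "ness", "less")

def select_keywords(words, suffixes=ENGLISH_SUFFIXES):
    """Keep only words whose spelling ends in one of the listed suffixes."""
    return [w for w in words if w.lower().endswith(suffixes)]

print(select_keywords(["morning", "door", "locked", "darkness", "the"]))
# ['morning', 'locked', 'darkness']
```

A purely orthographic test of this kind also lets through words such as "bed" or "nothing", where the ending is not a morpheme, so some manual weeding would still be needed.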
1.11. It may be pointed out here that, irrespective of 
how restrictive the selection of keywords for concordanc- 
ing may have to be, no concessions should be made in the 
principle of bilingual approach. Only if, in our investi- 
gation of the contrastive potential of individual ele- 
ments, we strictly observe the approach from both the 
English and the Serbo-Croat texts, can we be certain that 
we shall have covered all possible contrastive description 
patterns based on correspondences in both corpora. 
2.0. Once contrastive concordancing has been completed, 
we shall still be facing some practical technical problems. 
2.1. Project analysts, for instance, will often have to 
be provided with slips instead of computer printout sheets. 
Only if the material being analyzed is in the form of 
slips will they be able to classify and reclassify the key 
elements swiftly and flexibly (by putting together, break- 
ing up and re-establishing batches of slips). 
2.11. Cutting up the concordance printouts to get the 
slips is not very practical in view of the varying size of 
contrasted pairs of elements with their context (cf. n. 9, 
second half). The way around this, clearly, is to have the 
pairs printed out at regular intervals with sufficient 
blank space in between. This, however, would probably 
triple the amount of printout paper required. Also, this 
is complicated further by the need for a number of copies 
for each pair (slip), because of simultaneous demands that 
may often be made upon the same slip by several Project 
analysts, approaching the same element from various des- 
criptive levels. These copies could be secured by using 
special, multiple-carbon printout paper, but this might 
prove quite expensive. 
2.2. In view of all this, the Yugoslav Serbo-Croat/ 
English Contrastive Analysis Project has envisaged the use 
of a Flexowriter here as an alternative method. This ma- 
chine has already provided us with the paper tape of the 
Serbo-Croat translation of the reduced Brown Corpus, plus 
the tapes of Serbo-Croat originals and English transla- 
tions of the Control Corpus (cf. Stage 5). The missing 
paper tape of the English text of the Brown Corpus can be 
obtained on a magtape-to-papertape converter. Once both 
paper tapes are ready, running them through the Flexo- 
writer provides us with up to 13 (some claim 20) carbons 
of each contrasted pair. An additional advantage of using 
the Flexowriter for slip duplication is in the less awk- 
ward shape of slips. Paper tapes reproduce the text in 
60-character-wide lines of the original translators' type- 
script, as opposed to the 110 to 120-character streamers 
of normal computer printout (unless the concordance print- 
out was programmed for a narrower format, requiring con- 
siderably more paper). 
2.3. The resulting slip files of sentence-numbered 
English and Serbo-Croat texts, coupled with the Project's 
basic (monolingual - forward and reverse) concordances, 
can now be used as a replacement for contrastive concor- 
dances. It would work approximately like this: upon receiv- 
ing an analyst's request for examples of all corresponden- 
ces in the corpus of an element under analysis, the Project 
headquarters in Zagreb would look the element up in one of 
the basic concordances, record sentence numbers of all the 
occurrences, extract slips bearing these numbers from the 
Flexowriter-produced slip file, and forward them to the 
analyst for further research. 
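The lookup routine in 2.3 is a manual procedure, but its logic can be sketched in a few lines. Everything named here is hypothetical: the dictionaries stand in for a basic concordance and the two slip files, and the text fragments are toy data.

```python
def sentence_numbers(concordance, element):
    """Return the sorted, de-duplicated sentence numbers recorded for an element."""
    return sorted(set(concordance.get(element, [])))

def pull_slips(concordance, element, english_slips, serbo_croat_slips):
    """Collect the paired English and Serbo-Croat slips for every occurrence."""
    return [
        (n, english_slips[n], serbo_croat_slips[n])
        for n in sentence_numbers(concordance, element)
    ]

# Toy stand-ins: keyword -> sentence numbers, and sentence number -> slip text.
concordance = {"DOOR": [4, 2, 4], "MORNING": [3]}
english_slips = {2: "OAK DOOR WITH ITS COLD LATCH",
                 3: "SINCE THE MORNING",
                 4: "THE UNFRIENDLY LOCKED DOOR"}
serbo_croat_slips = {2: "HRASTOVIM VRATIMA I HLADNA KVAKA",
                     3: "OD ONOG JUTRA",
                     4: "PRED STRANIM ZAKLJUCYANIM VRATIMA"}

slips = pull_slips(concordance, "DOOR", english_slips, serbo_croat_slips)
for number, english, serbo_croat in slips:
    print(f"{number:04d} {english} | {serbo_croat}")
```

The sentence number is the only link between the two files, which is why the translators were held to the sentence limits of the original.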
Footnotes 
1. Launched in 1968, at the Institute of Linguistics, 
Faculty of Arts and Letters, Zagreb University. Direc- 
tor: Professor Rudolf Filipović, Ph.D. (Postal address: 
Jugoslavenski projekt za kontrastivnu analizu srpsko- 
hrvatskog i engleskog jezika, Institut za lingvistiku, 
Filozofski fakultet, Djure Salaja 3, Zagreb, Yugosla- 
via). Project analysts, numbering 20, are on English 
department staffs from all parts of Yugoslavia. 
2. Size of storage: 32K. Other equipment: three 2311 discs, 
two 2415/4 tape drives, one 2540 card reader, one 2671 
paper-tape reader, one 1403/2 printer. 
3. Written by Dipl.ing. Milutin Cihlar, Chief Programmer 
of the Zagreb system. 
4. Cf. Manual of Information (for the Brown Corpus), Brown 
University, 1964, p. 7. 
5. We hope to use forward and reverse concordancing prog- 
rams developed by a US project for an IBM 360/30, or a 
similar machine. 
6. In a total reverse concordance they would only appear 
in a different place: of under F, had under D, etc. 
7. Putting the top 100 words from the Brown Corpus Rank 
List on the exclusion list (compared to a total of some 
150 "function words", in the present author's estimate), 
would reduce the text by 47.4 per cent, while including 
only one morphologically marked word (YEARS) and two 
lexical words (NEW, TIME). Expanding the exclusion list 
to cover the top 200 words would probably not be econom- 
ical (though only two additional morphologically marked 
words would be included: UNITED and STATES), because 
the computer would be slowed down, whereas the textual 
mass would be reduced by only 6 more per cent (to 53.6 
per cent). 
8. Which may take between 40 and 60 computer hours, as op- 
posed to an estimated 2,350 hours of manual processing 
(for only the English forward concordance at that). 
9. In addition to being simulations, all these concordance 
samples are in an idealized format, with the correspon- 
dences spatially parallel to the keyword. In practice, 
however, it is impossible to achieve this ideal textual 
parallelism, because there are no other formal signals 
to govern it, except the sentence sequence number which 
can only mark the sentence as a whole. 
For this reason, the actual computer concordances will, 
when ready, have the correspondence to the keyword 
printed out with the whole sentence in which it occurs, 
under the single line with the keyword. This will, natu- 
rally, increase the size of the concordance, but not 
more than about 50 per cent in our estimate. This is be- 
cause only an approximate 40 per cent of all sentences 
in the original text of the Brown Corpus are in excess 
of 20 words (which can be accommodated by the average 
printout line). A mere 6 per cent of these sentences 
are longer than 40 words, requiring, consequently, more 
than two printout lines. 
