ANNOTATING 200 MILLION WORDS: 
THE BANK OF ENGLISH PROJECT 
Timo Jiirvinen 
Research Unit for Conq)ut;;Ltional l,inguistics 
Uniwersity of Itelsinki 
Abstract 
The llank of English is an international English hm- 
guage project sponsored by llarper-Collins Publish- 
ers, Glasgow, and conducl;ed by the COBUILD team 
at the University of Birrnhlgham, UK. The text hank 
will comprise some 200 million words of both written 
and spoken English. The whole 200 million word col 
pns is being annotated morphologically and syntacti- 
cally during 1993-94 at the Research Unit for Cor,,- 
Imtational Linguistics (IL/I(3L), University of Ilel- 
sinkl, using the Fmglish nmrphological analyser (ENC,- 
TW()I,) and English Constraint (:h'ammar (EN(:I(:.'(:~) 
parser. The first half of the texts (103 million words) 
has ah'eady been processed in 1993. The project is 
lead by Prof. 3ohn Sinchdr in Birmingham, and l'rof. 
Fred Karlsson in Ilelsinld. The present author is re- 
sponsible for conducting the annotation. 
In the introdnction of this paper the r,.:mtines Ibr 
dealing with htrge text corpora are presented and 
our analysing system outlined. Chapter 2 gives an' 
overlook how the texts are preprocessed. Chapter 
3 descrihes the lexicon updating, which is a prelim- 
inary step to the analysis. The last part presents the 
li;N(:'C(~ parser and the ongoing developtnel,t of it.s 
syntactic cornponent. 
1 INTRODUCTION 
Each rno,fl.h the (~OBUILI) tean-i supplies an apl)rox- 
imately 10 million word b;ttch of lw:trl~.tlp coded run- 
ning text (see Appendix A) in ASCII format. Every 
new batch is lh'st scanned by the EN(TI'WOI, lexi- 
eel and rnorphological analyser \[Koskentdetni, 1!.183\] 
in filtering rhode for the purpose of detecting words 
not inchaled in the present lexicon. This is followed hy 
a semi-autou-mtic tlpdal.illg of the lexicon. After these 
ad,iustments, the whole sysl.em is used for annol.at.ing; 
the data. 
()ur mmlyslng system, which is presented in del.ail 
in \[Karlsson, 19{),1\], consists of the following successive 
sLages: 
• preprocessing 
• EN(V\]'WOL lexical analysis 
• EN(:K~G morl~hological disambiguation 
• ENGC(.-I syntactic mapping and disamhiguation 
'\]'he main routines performed on the rnonthly data, 
includitlg constant monitoring of hoth inconting 
texts and ~malysed output and management (doc- 
umentation, backul~S ) are closely linked to the hi)- 
dating of the preprocessing module and the EN(L 
TWOL lexicon. 
2 PREPROCESSOR 
The preprocesshlg moduh,s stan,ihu.di:+e the runnh+g 
text and tokenise it into a, fc.z'm suitable for I,hc:~ 
EN(VI'WOI, lexica\] aualyser. 
I'~N(~I(,'G has been developed so that it, takes lute 
;tccotllll, Variolls Lexttl,'d codhlg cotwelfl;iofts \[l(al'\]s- 
son, 1994\]. "W'e In;ave develoF, ed i~reprocessing pro- 
cedures furthc:r to ca.ter %r the: dillbrc.nl, t.ypes of 
nmrkup c,w.tes systema+tica.lly. Since l.exl;s usually 
cofne frolft varJotls sollrces+ t\]lel'<~ frilly be tlrldocll- 
rnented i(\]\[osyllcra, cies or systelflat, ic cl'rops ill SOlfie 
S;II ill)leg. 
The inh)rmation cotweyed by the markup codes 
is lntilised iu the parshlg process. ,Ul-+lati,g the 
i~reprc.cesshtg module to achiew.~ t,he highest, possi- 
hie systematisat, iotl is therefore consldcred worth- 
while. The present system cau ,,teal with any 
code properly if it is used unambignously in either 
a selfl,encc~delirrtitlng function @4'; codes indicai,- 
ins headings, para.o'aph marke,'s), sentenlce-intertml 
function (e.g. I+cmt change codes)or w~,'d-iuternal 
(e.g. accent, codes) fmlction. 
Since I+relm~cessint,; is the lirst sl.ep helYu'e lexicat 
lilt.erht,e;, it. inldk'atcs the kinds el I dillk'ulties we are 
likc'ly to enccnmter. If error messages are produced 
at this st.aD~ , 1 do i.he nmcessa+ry adiustluenll.s l.c, the. 
preprof:f'SSOl' tlllLil iL 51!elll.q I,o iH'Odllce \[.he oIItpllt 
smoothly, l'h','ors in pr<'I~rocessing may occasio,mlly 
result in a tmmcation of lengthy passag;es of text, or 
even a crlls\]l. 
It is inq-wl.ant k~u' the utillsation of l, he corpus 
that no iuformatlolJ is lost during st.andardisal.iou. 
Therefore, we aim to marl+ all correcl;ious Jumle +.o 
the text. For exalul',le ~ I.he preprocessor hlsenq.s 
a code marking the era'reel.ion when it. sep;tr;tt.c's 
strings stnch as oJ'l\]Je "rod andthc. 
Most errol's are not corrected, StlCh ;IS COllfttsioll of 
selltence, Imundal'ies, trtlltCatlOll ofselll,el|ces dllO~ I,o 
running h(!&dillgS ()r page nmnlmrs, znisl+lacenJent 
or douhlint; of blocks of text, etc. 
565 
3 THE LEXICON 
Filtering produces a list of all tokenised word-forms 
in the input text which are not included in the cur- 
rent ENGTWOL lexicon. The most eon-nnon types 
are taken nnder closer scrutiny. It has to be decided 
whetlier these are genuine word forms or non-words 
(e.g. misspellings). 
At the begimfing, I used several (I;tys to update the 
lexical module for a new batch of text but experi- 
ence and increased coverage of the lexicon haw~ di- 
minished the time needed tbr this task considerably. 
\[ have added words above a certain frequency rou- 
tinely to the ENG'I'WOL lexicon. The fi'equency 
is no~ fixed but de~ermined by practical considera- 
tions. For instance, when the data contain a great, 
deal of duplication (as in l, he BBC material owing 
to the repetitive nature of daily broadcasting), sim- 
ple token fl.equency is a poor indicator of what is a 
suigable item to add to the lexicon. IIowever, sarn- 
piing methods have not heen developed t.o optimise 
the size of the lexicon, beea/ise it is not crucial for 
the present purpose. 
My lexieal practices differ sornewhat fi.om the 
updating procedure doeurnented in \[V'outilain,'n, 
1994\]. If our aim is to SUpl)ly every word in run- 
ning text; with all prol)er rnorphological and syn- 
tactic readings, we c,'Lnllot deprive frequent IlOll- 
standard words (e.g. htrn, veggie, wanna) of their 
obvious morphological readings because this might 
cause the whole sentence to be misanalysed. Siuce 
prescriptive considerations were not taken into ae- 
eotmt in the design of ENGTWOI,, many el~t.rles 
marked as informal' or lang' in conventional dic- 
tionaries were added to the lexicon. I have also 
included highly domain-specific entries into the lex- 
icon if they were frequent enough in certain types 
of data, especially when heuristics might produce 
erroneous or incomplete analyses for the word in 
question (e.g. species of fish which have the saule 
form in singular and plural: brill, chub, ¢laTfish) t . 
()he advantage of iucluding all frequetfl, gralfldeal 
words tO the lexicon is that EN(TFWOL filterlnl2; of 
incornitlg texts produces output which can he more 
reliably dealt with by autoulatic nleaus. \Vh,m :111 
frequent nonstandard and even foreign words are 
listed in the lexicon, the otttput can be used in a 
straightforward way for generating new entries. 
The procedure of adding new entries to the lexicon 
goes its follows: first, all words are classified aec¢~rd- 
ing to the part-of-speech they belong to. Second, 
new entries in the ENGTWOL format are generat.ed 
automatically from these word-lists using ready- 
made tools presented in \[Voul.ihfiuen, 1994\]. Lists or 
new entries are carefully checked up, and additiolml 
feat.ures (such as transiLivity and complemenl;ation 
IThe default category of morphological heuristics is :t 
singular Iioun. ht the case of a potential plural form (s- 
ending), au underspecified tag S(I/PL is given. 
Datures for verbs) a.re suPl)tied rnannally. In de- 
scribing the items, 1 h;we relied mainly oil Collins 
COIIUII, D \[)ietionary (i.¢)87) and Collins English 
Dictionary (19.¢11) which have been avaihible for us 
ill electronic form. Ilut when the usage a.nd dis- 
tribution seems to be /lllelea.r, \[ have generated an 
on-line concordance directly from the corpus..qlnee 
I have dealt with words which have a frequency of, 
say, at, least 10 tokens in the corpus, tJfis method 
seems to be quite reliable. 
We cannot detect errors ill the lexicon during the 
initial liltering phase. ()nee a certain string has had 
one or more entries in the lexicon, it is not present 
in the output of the filtering, and other potential 
uses might not. be added to the lexicon ~. And fi'e- 
qllent errors telld to get., corrected since all incorrect 
analyses detected during the manual iuspee(.ion a.re 
corrected directly it, the lexicon. 
The I'\]NCI'I'W()I, lexicon which is used in the 
Bank analyses contains al\]proxlnmLely 75,0{}{} en- 
tries. Morphological analysis caters for all inflected 
forlns el' the lexical items. 'Fhe coverage of I.he lexi- 
con bel'¢n'e updating is between 97% - 98% of all 
wo\['d-fclrtll toketls ill l'llllllitlg text. Al~pemlix A 
presents the nurnber of additional lexical eutries 
generated from each bateh of daLa. The cumula- 
lave treml shows that a w~ry small nurnber c+f new 
entries is needed when analysing the hll.ter half of 
the corpus. 
Morphc, logical heuristics is applied afl.er I'3NG- 
TWOI, analysis as a separate module (by Voul.i- 
laiuen, Tapanainen). It assigns reliable analyses to 
words which were not: iueluded iu I,he lexicon. This 
also coutril)utes to the fact; that lexicon updating 
will be a minor task ill the future. 
4 ENC.,CC, D\]SAMBICIUAT1ON 
AND SYNTAX 
English Constraiut (h'armuar is a rule-based roof 
phologieal and depeudency-~)riented surface syntae- 
tic ;ulalysor id" runnill,p; English text. 
Marl)hologieal dlsamhlguat.lon of rruiltilfle p:lrt-o\['- 
speech mid other inlh~cLi~m:d t.ags is carried out. he- 
\[',re syllt;l¢lrlC almlysls. M(wphc)logieat dls:lluhigua- 
t.ion reached a mature level well hef, re the begin- 
ning of t.his project. (see evaluation ill \[\Zoutihfiuen, 
19.9~\]). 
'\]'he morphological disand)iguatiou rules (same 1100 
ill the present grauunar) were writteu hy Atr. 
Voutilainen. The I~auk data is analysed using 
both gralnumr-hased' mid heurisl.ic' disambiguation 
rules. This leaves less morlqlological aulhiguity (be- 
low :1%), although the errm' rate is st~ill extremely 
low (belmv 0£%). 
2Although missing entries are possihle to tirol huli- 
rectly, e.g. Ju9 and -cd forms ill t.lle filtering output imli- 
cages that the base form is not. described in t.he lexicm~ as 
a verb 
566 
d.;\[ Curl:eiiL state of ENGCC., syntax 
The first, version of I~N(I~C(', syntax was written hy 
Arto Anttila \[Anttila, 1994\]. At the beginning of 
tile Bank project, new Constrahit {h'amnlar Parser 
irnlflementations for syntactic rnaplfing and disam- 
l)iguation were written by Pasi Tapanainen. These 
have been tested during the first, nlonths of this 
project. ,qome adjustineut to the syntax was needed 
to cater for new speeitications, e.g. in rule applica o 
ti<m order. 
\[ have tested all constraints extensively with differ-. 
ent types of text rroin the Bank. \[ have revised al-- 
rnost ;ill syntactic rules and written new ones. The 
current EN(IIC(I parser uses 282 syntactic niapl'lillg 
rules, 492 syntactic eonstraillts and 204 heuristic 
syrii, a.el, ic eonstra.h/ts. The lnapl~hlg rules should 
be l;he lrlOSl; relial)le, since they attach all possil)le 
syntactic alternatives to the inorpllologieally disaul- 
biguated olil,pllt. ~ylitaetie i'llleS I)rillle coiitextil- 
ally hiappropriate synt;ael.ie tags, or accept, ,iilSl; OllC 
conl;extuaily ;ipproprlal;e tlt\[r~, ~ynl.aetic and heuris- 
tic rule eanip(melits are \[}ll'liiiiily shnilar trill, th¢~y 
dilrer hi reliMfility. 1t. is Imssil)le ilot to use heurls- 
tie rules lit all if olie ahliS ill, lnaxhiially error-ri'ee 
out\]nit, hut the. eos~ is aii hlerease hi arul)iguity. 
I)urhig the project, the quallt.y of synDax ha.'; hn- 
proved considerably. '\]'\]ie itnrrei/t error i'ate~ ',vlieli 
pai'shlg~ new unresl.ricl,ed rliunhig text~ is appi'c~xi- 
Irl3tely 7(~ll i.e., 7 words (lilt or JO0 t,~r:t I.\]le Wl'Oll~ 
syntael.ic code. Ihit I,\]ie auihiy;uit.y rate is still fah'ly 
high, 16.d% in ~ 0.Sili word Samllh.~, which tile;illS 
thai, lfi w(n'ds olig o\[" i00 sl.iII hgve liiOl'l; l.htiil (Jilt? 
niorilhological or syntael, ic ;.llt0riiativo. Much o\[" the 
i'eiu,:lhlili<C~ arili)it,;uity is or the prepositional attaeli- 
lriei\]t type, This pa.rl, icLll\[u" tylle of ainl)ig;uil,y ae- 
eoiilll;s ror .:lpproxhriately Q()% el ~ all renlahliilg; alri- 
lligliil.y. More heuristic rules ar0 needed for prulihig 
the I'Olrlliiliill~_ ~, ambiguities. Of eoilrse> lllaily o\[" I.he 
rerrlahiilig alul)lguities (espechllly i'P al.tae\]ililel/I,) 
are \[r>enuh\]e and should I)e rel,ahied. 
The spee,.I ij\[' l.he whoh! sysgelll used hi Ill~.)l'l/h(li(it,~i- 
e;.ll and syntactic aniiotation is about 400 woi'ds per 
seeolid O11 iI, ~UN ,'ql)lil'e,~l,atioll |0/:l~(), 
4.2 DeveloI)hlg the syntax 
I,'aeilities J'l)r the fast eOml)ihli.ion of a parser with a 
iiow rule file and the spe(~d of the ana\]ysls iiiak(~s a 
very ~ood elivh'OilliieliL for the \]ingulst 1.o test new 
constl'ahit.s. 
A special debllgghig verslon of the parser can he 
used \['or testhig ptirpos(~s. The delJlig~giilg; W~l'SiOli 
(;akes tully disauibignated EN(\]~C(II texts as iullul.. 
Ideally, (':v0ry rill(: is tested against ~t represelltative 
SalYil)le frolri a, corpus. 'Fhis would sel; the require- 
iFleilt that the test eorp/ls should be lll~Lde of large 
randolri sauiples. Ilowever, it is l, hne-eolisunihi<t,~ 1.o 
prep.:tre li'la.i\]lially large ailiOllilts o1" corrected alid 
(lisainl.li<guatod da.La, 0veil frolll I"N(IC(~ ould)ut+ 
'l'\]iere\['ore, it very large test tin'pus i;~ I)eyond tile 
scope of this t)roject. 
The c.urrl~lit syntaeLic test, COl'pUs contains approxi- 
niately :10,000 words. It is large ellOligh r(n. tesl:hip: 
reliable syntactic i'il\]eS, but if we Walls to rate the 
aeceptahility or heurisl;ic syntael, ie rules> \[i larger 
Sylltaetic Cfll'jlilS wouhl lie lleeessal'y. The tesl, eor- 
l)tls efnlsist.s or 16 individual I,cxL sarriples \['roiri the 
flank o\[' \]'\]nglish da.ta. The texts have I)ee:n chosen 
so that; they l, ake text t, ype vari:.ltiou hire ;tccrltliit. 
All salullles but one are eontillu{)us, unedil.ed glll)- 
lml'ts oJ" the eorpllS. 
It seems WOl'thwhile to eoilthllle l\]reparhi<~ ;t disani- 
biguat,ed eorl)us rrolrl seleei,ed piee,,s or I;exl,. ()llee 
tww data is reccfived, it. is (!xpedielil, to ;i(l(i a rep- 
resent.;d.ive Salliiile I'rOlil it to the test COl'tillS. A 
Inauualiy (\[isarlil\]igjlatc(l Lest e(u'pus e(msl;itutes a 
very sl;raiyflitrorward dclcuirlental.ion (;)\[' 1.he apr, lied 
parsiug sclmnm (as described in \[,~anG~s,,u, 1987\]). 
5 C()NCJL(IS\]ON 
The analysin.c sy.~teni h:ls reached ~ irial.lire sl.age, 
WJlere ill\[ I.echnica\] pr(li)ielliS seOIll to lie s(llved. \'V,~ 
have develol>ed ilt(!l, ii(Itl~ deal\[ili-~ with I.lle dal.;i, wil.h 
a COllsideraiile de/.,jrce ()\[' ;lllttililatisa{.i(ili. I'\]Nt(\]~C(7~ 
it;IS pl'<wed Ixl Im a. fast ;lil(i ;icelll'\[li;e rtlJe-i)a:~ed .qys- 
telu f~u' anaiysliig~ illlrestricted I.exl;. 
\,Vrithig and dOCuluenthig Ii',N(~CC syntax will be 
the lliaili e(}licerll durhlg I.he \['clll()wiu<0; lil<}llttl.<J. ()llr 
i',art, or the i,roj(,ci, will lie O<>ll'll)\]el.ed hy March> 
1995. 
It is p(~.~sihle i.\]mi, the whMe 200--ulillion corpns will 
be aualyse(I al'resh uear I.he elm ,r tlw prctie(.t. 'l'\]lis 
wt)uld pill. I,o II.~ all I.he \[llllH'(ivelrlellts IlHl(h~ dlll'iM?~ 
tlw l,w()-year i}erio(\] and w~mh\[ Q;uar:ull;('e a umxilrml 
(Icgree or uuil'ormity aud tile ()verall aeelu'acy ~1' I,he 
allllOl.aJ ed (:or pits. 
(i ACKNOWLEDCI~MENTS 
.~lmehd Lhauks are due I,o llarper ColliNs Ihddish- 
ers, (i;lasgow, for l)ernlission to us,~ boi,h Collm:; 
C()IIUIM) aml (Mllhl.~; English \[)ieiJouary ill elee- 
tronic form. I%rs.nally, \[ am greatly imlebted to 
Pasi 'l'apalmln~m for solutions to :m hw:d('ulahle 
mmd;er of techuieal \]n'ol)lems and t(} Arm \/ou- 
tihdnen I(n' guidance and supcrvisiorl during t, hi.q 
project. \] wish to thallk also l)r(fl '. Fred I<arlss(>n, 
,\]uha II~fikld\]/i, I<ari Pitldhlen and Sari Salmi,~uo r~>r 
reviewing earlier drafts .r this paper. 
567 
A List of annotated Bank of English 
data 
data size additional 
in words lexical entries 
Today 10,019,195 6,540 
'Times 1,g37 
BBC 
10,090,991 
18,076,12,1 3,379 
The l;~conomist, WSJ 11,195,100 455 
British Books 1 9,232,527 1,488 
British 13ooks 2 1:t,925,852 1,961 
\[Independent, Magazines 1'0\]199,542 1,143 
Magazines 10,36,5,173 1 ,(159 
American books 10,532,267 972 
Total: 103,636,771 18,8:14 
The table above shows the size of the 11 batches 
ailnotll.ted so far in words and the lnlnlber of new 
lexieal enl,rles a derived fi'om them. 
B An Example of the EN(-ICG 
analysed sentence (from the 
American Books data) 
The original text: 
<g> 
The situation at Stangord, to be examined 
in more detail later, is hardly unique. 
Annot, ated text: 
<t> 
"<The>" 
"the" <*> <Def> DET CERTRAL ART SG/PL @DN> 
ii • • <sltuat fen>" 
"situation" N NON SG @SUBJ 
"<at>" 
"at" PREP ~<NOM 
"<StanTord>" 
"stanTord" <*> <Proper> N NOM SC @<P 
"<$, >- 
"<to>" 
"to" INFMARK> @INFMARK> 
"<be>" 
"be" <SV> <SVC/N> <SVC/A> V INF @-FAUXV 
<oxamlnod>" 
"examine" <SVO> <P/in> PCP2 @-FMAINV 
"<in>" 
"in" PREP @ADVL 
"<more>" 
"milch" <Quant> DET POST CMP SG OQN> 
"<detail>" 
"detail" N NOM SG @<P 
"<let er>" 
"late" ADV CMP @ADVL 
,'<$,>" 
"<is>" 
"be" <SV> <SVC/A> V PRES SG3 VFIN @+FMAINV 
"<hardly>" 
aThe Sallie VVSJ niaterlal fi'cmi A(I,'L has \[leen used ill 
updal.ing the EN(\]'I'W()L lexicon llefore this project 
"hardly" ADV @ADVL OAD-A> 
"<unique>" 
"unique" A ABS @PCOMPL-S 
,,<$.>,, 
Syntactic tags, listed in \[Tapa,uainen, 1994; 
Voutilainen, 1992\] are marked with an at-sign 
(@). The shallow syntax distinguishes faur wn'b 
chain hlbels and nominal head and nlodiller 
functioris. Modilier fimctions have ~ pointer (> or 
<) 1,o t, he head to the right or to the lefl;, 
respectively. PP and adverbial attachnmnt is 
solved when it can be done reliably. 
References 
\[Auttila, 1994\] Argo Anttila. 199'I. Ilow to recog- 
nise subjects in EnglMi. In I(arls.,Ion £4 al 1994. 
\[Garside, 1987\] ltoger (larside, (ih~ofl'rey /,eech and 
(iieoll'rey Sa.rnpsoli. 1987. The Corupul.ationaI 
AImlysis of l;;uglish - A Corpus-Based A l~proach. 
Lolldol/: \[,OI1~111\[111. 
\[l{arlsson, 1994\] I"red Karlsson. l\[194, l\[ohusl; pars- 
ing of unc, nstraiued t.ext, hi grelh!l:e Oost- 
dijL' and l"ieler de llaan (eds.), (.;'orpus-lmsed 
Research Inlo Langnage. , pp. 121-1,12, Kodopi, 
A rnsterdain-At.hmt a.. 
\[l(arlsson, 1119,1\] Fred Karlsson. 1994. The formal- 
isrn and l';nviroument of (:(ii I'arsiug. In l(arlsso',. 
/~!: al 1994. 
\[l(arlsson, 19!)'1\] Fred I(arlss~m, ,,\fro Vcmtihli- 
hen, .luha I1eikkilii mul Arto Ant.tile (eds.). 
1994. Constraint (\]ralrnmu': a. Language- 
hldel'lendent Sysl;ei'll l~ll' Parsing Ill,restricted 
Texl,. llerlhi/New York:Mot,tom de (\]ruyter. 
\[l(oskennieuli, 1983\] Khiuricl Koskennh'.ilii. 1983. 
Two-level lriorpholo.t~,y: a. geileral coinpul;aliioriM 
lflodol for word-forrri reco<c, jiiition and productioil. 
Pullllca!.ions llrO. 1 I. l)epl;, of (\]el,era\] l,in/.,;uls- 
I.ics, Uniw~rsit.y of Ilelsinkl. 1983. 
\[,%lUll~S<.l, 19<<47\] (h~,dl'r,Ly Sailll.a,ll, 1987, The 
,l~t'~lllLILl;~l.icll\] dat~d>ase aud \]liii'Sillg .~chelil~. lii 
Carside 1987, pp. 82-96. 
\['l'allau:iili,!il , 1!)!14\] I'asi 'l':llmnahlen iliu\[ Tiulll 
,lih'viueli. ~yul.acl.ic allalysis of iial,ill'al lailgila<tg! 
ushlg lillgilistic rlllC.s aw.\] corpus-imsed pa.tl,orrls. 
hi proceedings of COLIN(7-O/j. Kyoto, 1994. 
\[Voul.ilahien, 199'2\] ktro Voul,ilainen, 
,\] lille lleikkili{ and ArLo Antl.ila. 1992. (Jolisl;railit 
~l'lllflllltir of I;',il,glls\]i. A l>erfornianc.l>()rlenl;ed \[ll- 
troductlon. Publications No. ~l> I)el~artrrlent of 
C, ellerill l,inguistics, Universil.y o\[" Ih'lsinki. 
\[Volltil;ih/ell, 199,1\] At.re \lOlll,ilaillell lllld JIllia. 
IlelkldliL 199.'1. Coull)ilhi<g and te.qthit,; the lexi- 
coil. hi \](arlsso'n ~7 al 199d. 
568 
