An evaluation of the usefulness of machine translations
produced at the National Physical Laboratory, Teddington,
with a summary of the translation method.

Introduction
The machine translation project at the National Physical
Laboratory (NPL) has been terminated. It has always had as its
prime aim a demonstration of the practicability of translation by
computer of Russian scientific texts into English. In order to
test how far this aim has been fulfilled and, further, to provide
evidence to guide a potential agency interested in giving a machine
translation service, we have carried out an evaluation experiment
on our translations, the conditions of which as far as possible
emulated those of a translation service.
The results of this experiment are presented in this paper,
together with a summary of the translation methods used. The
paper as a whole will thus give an independent presentation of "what
methods produced what results". For a comprehensive account of the
NPL translation techniques, see reference 1.
Evaluation of Translations 
We have been concerned with the translation of scientific 
Russian texts only. In considering how we might evaluate the 
results of our work, the context of use of scientific translations 
imposed two main constraints. Firstly, in the vast majority
of cases we would expect readers of translations to be themselves
experts in the subject matter of the material translated, i.e. they
would be reading the translations because these reflect their main
professional responsibilities. We may then expect that the inherent
background knowledge of such readers will ensure a high impetus to
their comprehension of translations and help them through syntactic
awkwardnesses and multiple-meaning choices. We would also expect
that only a small percentage of these readers would have any
competence in Russian. Secondly, the items of translation being
read by the above typical readers will normally be whole information
units (journal article, chapter of book, abstract, review, &c.), and
they will have the freedom to ignore unimportant sections of such
units and to use sentence or paragraph context (or even remoter
references) to help elucidate obscure sections. More specifically,
a particular sentence may be poorly translated, but because the
reader can see that this is not an important sentence or because
the context of (hopefully, better-translated) neighbouring sentences
clarifies its meaning, that sentence may not affect at all an
adequate comprehension of the whole.
Both these constraints are reflected in our evaluation 
experiment. We ensured that our evaluators were expert in the 
field of the material they were evaluating, and also that they 
commented on the adequacy of an information unit as a whole, not 
on individual sentences. 
We have included in this paper (FIG. 2(A)) a short passage
from one of the evaluated translations, as the full translation is
inappropriate for this printed version. However, the full
translation will be available for inspection at the presentation
of the paper, or the full translation of another paper can be
examined in reference 1.
The evaluation experiment 
In order to fulfil the first constraint above, we invited
practising scientists to send in Russian papers, reflecting their
professional speciality and preferably in the fields of general
physics, electronics, or electrical engineering. Some papers
resulted from direct invitation, others resulted from an open
invitation published in our house journal, "NPL Quarterly". We
undertook to send them the machine translations of their papers in
return for their comments on how useful the results were. We also
obtained second opinions from other specialists in the subjects
concerned.
These evaluators were therefore as far as possible typical 
of the "customers" of a production MT service; in particular they 
had a personal interest in the subject matter and usually little 
if any knowledge of Russian. 
In all 44 papers were received in response to our invitation;
of these 28 were translated in full¹. Seven of these were
disregarded for various reasons², and the remaining 21 were included
in the evaluation.
38 comments were received on 19 of these 21 papers. Of these
two were rejected for vagueness, and three brief comments from one
group were treated as one, so in all the experiment produced 34
comments on 19 papers.
¹ The other 16 are accounted for as follows: 1 was on a remote
subject; 2 were deferred since we had already translated three
papers for the same 'customer'; 4 were withdrawn; 3 were
translated only in part; and 6 were not reached by the date our
computer was scrapped.
² 3 were on inappropriate subjects; 2 were translated only by an
earlier version of the program; and 2 were translated too late
for inclusion.
We had decided to give our evaluators a free hand in discussing
the usefulness to them of translations of this quality.
This meant that a scale had to be devised by which their comments
could then be graded by us. A scale recently published in the
U.S.A. (reference 2) was considered but not adopted since we felt
that for our purposes more space should be given to the middle
range of the scale. The following wording was adopted:
Fully adequate. Meaning immediately clear, even though 
not always conventionally expressed. 
Mostly very good. A few sentences obscure, so that something
essential may be lost, but normally clear enough.
Fair. Takes a good deal of time to extract meaning and 
even then there is no great confidence in it, resulting in a 
partial understanding. 
Poor. Could only be useful to someone prepared to struggle 
hard, and even he would often be disappointed. 
Useless. Although some semblance of meaning may appear 
occasionally, it would never be worth the trouble of finding 
it. 
The wording of this scale is not derived on any scientific basis,
but it has proved useful in practice, since when four of us came
to grade the comments by it independently, there was good agreement
between our markings. Our four individual ratings for each
comment were reduced to a single rating (normally the mean) after
discussion. The range of scores is shown in FIG. 1; the mean
score is 5.6.
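
The reduction of several independent gradings to one rating per
comment, and then to an overall mean, can be illustrated as follows.
This is only a sketch: the marks are invented for illustration and are
not data from the experiment, the function names are hypothetical, and
the discussion that preceded each final rating is not something code
can capture.

```python
from statistics import mean

def reduce_ratings(ratings):
    """Reduce the graders' marks for one comment to a single rating (the mean)."""
    return mean(ratings)

def overall_mean(per_comment_ratings):
    """Mean of the single ratings taken over all comments."""
    return mean(reduce_ratings(r) for r in per_comment_ratings)

# Hypothetical marks from four graders for three comments (illustration only).
example_marks = [
    [6, 6, 7, 5],
    [4, 5, 5, 4],
    [7, 8, 7, 6],
]

single_ratings = [reduce_ratings(r) for r in example_marks]
print(single_ratings)
print(round(overall_mean(example_marks), 2))
```

Taking the mean of the per-comment means, rather than pooling every
mark, keeps each comment's weight equal even if a grader abstains on
one of them.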
The spread is no doubt due to a real variation in the quality 
of the translations combined with the prejudices and degrees of 
patience of the evaluators. The lowest scores thus come from 
impatient professional translators dealing with a poorer-than- 
average text, while the highest ones are perhaps over-enthusiastic 
supporters dealing with a better-than-average text. 
The consensus, though, is that there is a real demand for
translations of this quality, and this result provides, we feel,
ample justification for mounting a broader evaluation exercise,
over a wider range of potential readers of such translations, to
strengthen, if possible, this verdict and make it possible to
decide on the viability of a production machine translation
service based on our system.

FIG. 1 Assessment of usefulness of N.P.L. MT output.
[Histogram: number of comments (scale 0 to 13) against grading of
usefulness; 34 comments; mean usefulness 5.6.]
Evaluators' criticisms
Apart from the opinions as to the general usefulness of
translation, evaluators' comments contained many particular points
of criticism which deserve discussion. We are able to comment
ourselves on some of these points from the position of having
done considerable development work, just short of full
implementation, on techniques designed to overcome the particular
translation faults. Full details of this further work are given
in reference 1, and specific points of reference are given below.
Most of these criticisms can be classified into three groups,
concerning respectively: (i) the English equivalents offered,
(ii) the syntactic resolution and (iii) the word order.
A frequent criticism concerned missing or inappropriate
equivalents. In addition to fully justified remarks of this
kind there were also cases in which the meaning proposed, or
preferred by the reader, was uncommon. Its absence from the
dictionary was the result of a preferential choice having been
made, a compromise between completeness and simplicity. The other
alternative, including all possible equivalents, would of course
drastically impair readability. The particular solution is often
very difficult and can only be achieved to a satisfactory degree
after long experience.
In other cases there is no obvious preference and the problem
is further aggravated by the very high frequency of occurrence of
the word. Here belong some special classes, for example all
prepositions and some very common words such as и, а, and
что. Prepositions can and should be resolved by considering
them together with either the governing word or the governed
complement (nominal or otherwise)¹. (For example, увеличить ... на ...,
'to increase ... by ...'.) For the awkward common words
specific syntactic sub-routines should be devised; in practically
all cases the solution is unique (see reference 1).
¹ On the lines already used for the recognition of idioms,
expanded to include non-adjacent words; see below in the
summary of methods.

Only two evaluators complained about the necessity of
selection among two or three equivalents. This is a matter of
preference, but it seems to us that for a bona fide reader an
additional possibility of meaning (if it is not carried too far)
is more an asset than a disadvantage, even if it impairs to some
extent smooth reading¹. Until a semantic analysis can be
achieved, multiple equivalents are bound to stay in MT.
A minor point, but nevertheless worth attention, was to the
effect that when multiple equivalents followed each other, the
difficulty in understanding increased out of proportion. For
example, случается при appears as 'occurs in' printed above
'results with', when the actual meaning is often 'results in'.
This was undoubtedly a real problem, which could perhaps be
helped by using a longer space between sets of multiple
equivalents in the output.
Complaints concerning un-idiomatic translations (e.g. 'period
of work' instead of 'life-time') would be allayed by more work
spent on our idiom list, which contained only about 540 items,
whereas 1,500 would be a more realistic figure.
Complaints about inadequate syntactic analysis, leading to
obscurities, ambiguities, and wrong resolutions, would have been
considerably reduced by a full implementation of the syntactic
routines described in reference 1. One of the minor but
annoying ambiguities, which had been resolved theoretically, but
only partially implemented, was that of adverb/short adjective.
Order of clause components was a frequent subject of criticism;
of course they can be re-arranged according to English usage
only after a complete analysis has been made.
Among other things criticized was an inadequate treatment
of abbreviations and abbreviated units, some of which were covered
by dictionary entries, while others were not, and this led to some
misunderstandings. Obviously this again is a matter for a more
complete dictionary². The most difficult case is "nonce"
abbreviations (we met, for instance, нейтр. for нейтронный
and produced 'non-itr.', which helped no one!). Here we see no
prospect of a solution.
Our "anglicizing" routine was criticized (while appreciating
the general idea) for unorthodox transliteration, which made it
more difficult to identify the word in a standard dictionary, if
necessary³. A partial solution may be to exclude certain word
classes, e.g. acronymic abbreviations, which are obviously not
suitable objects for the routine (they can be automatically
recognized as clusters of capital letters). Also, in our
prefix-recognizing routine there is an inherent danger that a
"not-in-dictionary" word may have a part of the stem identical
with an accepted prefix. This applies in particular to short
prefixes, like не-, in the above example of нейтр. There is
no general way of dealing with such words. The best solution,
in respect of both routines, seems to be, however, to include in
the output both the original (in Cyrillic, if possible) and the
synthetic equivalent for all "not-in-dictionary" words.

¹ Much can be said on this point. Readers, no doubt, will realise
how a velvet smoothness of translation may hide many a grievous
fault.
² With a few exceptions, however. Thus 'в' may be very
troublesome, as regards the choice between the preposition and the
abbreviated unit ("volt"), without a special syntactic
sub-routine.
³ This criticism clearly implied some knowledge of Russian.
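
The two suggestions made above (leaving clusters of capital letters
alone, and printing the original beside the synthetic equivalent) can
be sketched as follows. This is an illustration under stated
assumptions, not a routine from our programs: the function names, the
bracketed output layout and the rule "two or more Cyrillic capitals"
are all hypothetical choices.

```python
import re

# Assumed reading of "clusters of capital letters":
# two or more Cyrillic capitals and nothing else.
CAPITAL_CLUSTER = re.compile(r"[А-ЯЁ]{2,}")

def is_acronym(word):
    """True if the whole word is a cluster of Cyrillic capital letters."""
    return CAPITAL_CLUSTER.fullmatch(word) is not None

def render_unknown(original, synthetic):
    """Output for a not-in-dictionary word.

    Acronyms are passed through untouched; any other word is shown in the
    original followed by its synthetic (anglicized) equivalent.
    """
    if is_acronym(original):
        return original
    return f"{original} [{synthetic}]"

print(render_unknown("нейтр.", "non-itr."))
print(render_unknown("ГОСТ", "gost"))
```

Printing both forms lets a reader with a Russian dictionary ignore the
synthetic equivalent whenever it misleads.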
A few comments contained bouquets rather than brickbats.
One evaluator commented that the translation became easier to
read as he got used to the unusual 'style'; and another found
an instance where a slip in the published human translation had
reversed the intended meaning; our version of the passage, while
not perfect by any means, was certainly not misleading in this
way.
Finally, several evaluators commented that machine translations
would need to show advantages in cost and speed over
human translations in order for them to be attractive as well as
acceptable, and these are indeed criteria that we would ourselves
put forward without fear of contradiction. We have not included
a study of cost and speed within this evaluation experiment, as
we do not have the market data to prepare a translation service
specification that we could then refer such a study to. However,
it is evident that our machine equivalent of the human translator
(i.e. input punching, machine translation and output printing,
with no human post-editor) will show a clear advantage on both
these points. It would be essential to fit this component,
though, into an overall translation system which was specified
carefully to fit the translation market.
In FIG. 2(A) is shown a facsimile of a short passage of our
machine translation into English of a Russian text on electric
furnaces, completely non-post-edited. The vertical lists of two
or three words are to be read as alternative English correspondents
for the Russian word in that position. FIG. 2(B) is a facsimile of
the original Russian text.
A summary of the translation methods
Text Preparation and Dictionary Look-up
The dictionary used in the NPL machine translation system
was developed from an early version of the Harvard Russian-English
computer dictionary. Our dictionary contains about 48,000
entries (with additional cross-reference entries) covering the
fields of electronics and electrical engineering.
FIG. 2(A) English machine translation. [Facsimile of computer output.]

FIG. 2(B) Russian original text. [Facsimile of printed Russian text.]

We chose to organize the dictionary on a stem and suffix
basis, in which each entry contains a Russian stem together
with a coded list of suffixes which can combine with the stem.
This gave far fewer entries than would have been found in a
full-form dictionary covering the same words. Each entry contains
grammatical data and English equivalents of the Russian.
The stem and suffix organisation demanded that we create a
system of splitting Russian words consistently into stem and
suffix, fully described in Davies & Day (1961). The split is
made at the point determined by the maximum number of letters
which together form a Russian suffix or string of suffixes.
The maximum split technique sometimes causes too many letters
to be treated as part of the suffix; in other words, the split
is made too early in the word. Such words are provided with a
cross-reference dictionary entry which directs the search to an
entry in which the full information for the word is contained.
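
The maximum split rule and the role of cross-reference entries can be
sketched as below. This is a minimal illustration, not the Davies & Day
procedure itself: the suffix list is a tiny hypothetical sample in
Latin transliteration, and the layout of the dictionary and of the
cross-reference table is assumed for the purpose of the example.

```python
# Hypothetical sample of suffixes, transliterated; the real inventory is far larger.
SUFFIXES = ("a", "ami", "om", "y", "ymi")

def is_suffix_string(s):
    """True if s is empty or can be cut into one or more known suffixes."""
    if s == "":
        return True
    return any(s.startswith(x) and is_suffix_string(s[len(x):]) for x in SUFFIXES)

def maximum_split(word):
    """Split at the earliest point after which every letter belongs to a
    suffix or string of suffixes (at least one letter is left in the stem)."""
    for i in range(1, len(word) + 1):
        if is_suffix_string(word[i:]):
            return word[:i], word[i:]

# Assumed layout: stems map to entries; cross-references map a
# too-early split to the stem that carries the full information.
DICTIONARY = {"bel": {"english": ["white"]}, "dom": {"english": ["house"]}}
CROSS_REFERENCES = {("d", "omami"): "dom"}

def look_up_word(word):
    stem, suffix = maximum_split(word)
    stem = CROSS_REFERENCES.get((stem, suffix), stem)
    return DICTIONARY.get(stem)

print(maximum_split("belymi"))   # split falls where expected
print(maximum_split("domami"))   # "om" + "ami" is a suffix string: split too early
print(look_up_word("domami"))
```

In the second case the ending happens to decompose into two listed
suffixes, so the rule leaves only one letter in the stem; the
cross-reference redirects the search to the proper entry.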
The dictionary is recorded on two reels of magnetic tape,
the entries being arranged in alphabetical order. Time of
consultation of the full dictionary is from 12 minutes upwards,
depending on the number of entries being sought.
A text for translation is first punched on cards by an
operator who recognizes Cyrillic characters, though she cannot
read Russian. Symbols, punctuation marks and Cyrillic
characters are represented by one card column per character.
Provision is made for indicating a space to be left in the text
where an equation or group of symbols occurs. These will be
inserted in the translation by hand. The cards are treated as
a continuous medium, card boundaries being ignored. By this
means quite a long paper can be encoded on a relatively small
number of punched cards.
The text, now on cards, is fed into the computer. The 
first computing process gives a serial number to each text word 
and then splits the word into stem and suffix. When all text 
words have been subjected to this process, they are then sorted 
into alphabetical order. This is essential for optimum speed 
of look-up in our serially organised dictionary. 
The next programme in the translation sequence, the look- 
up programme, scans simultaneously through the dictionary and 
the sorted text, seeking dictionary entries corresponding to the 
text words. The programme allows for the occurrence of stem 
homographs and for the correct handling of cross-reference 
entries. The output of the programme (which we call, following 
Harvard, the augmented text) consists of the text words each 
with the relevant dictionary entries appended. 
Having obtained a set of augmented text entries, the 
translation sequence then sorts these back to text order, using 
the text serial number originally allocated to each text word.
The result of this series of operations is a text in the
original order, with dictionary entries appended to all but a
few of the items. Symbols and punctuation marks do not, of 
course, have corresponding dictionary entries, and there may 
be words in the text which are not represented in the computer 
dictionary. The latter are given special treatment in the 
syntactic routines and translation output. 
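
The sequence described above (serial numbering, sorting, a single
simultaneous scan of dictionary and text, and sorting back) can be
sketched as follows. This is an illustration only: the in-memory lists
stand in for magnetic tape, the stems are transliterated, and the data
and function name are hypothetical.

```python
def look_up(text_stems, dictionary):
    """text_stems: stems in text order.
    dictionary: list of (stem, entry) pairs sorted by stem, read once front to back.
    Returns the augmented text: (serial, stem, entries) in text order."""
    # 1. Give each text word a serial number.
    numbered = list(enumerate(text_stems))
    # 2. Sort the text words into alphabetical order.
    numbered.sort(key=lambda pair: pair[1])
    # 3. Scan dictionary and sorted text together; the dictionary pointer
    #    only ever moves forward.
    augmented = []
    d = 0
    for serial, stem in numbered:
        while d < len(dictionary) and dictionary[d][0] < stem:
            d += 1
        entries = []
        k = d
        while k < len(dictionary) and dictionary[k][0] == stem:  # stem homographs
            entries.append(dictionary[k][1])
            k += 1
        augmented.append((serial, stem, entries))
    # 4. Sort back to text order using the serial numbers.
    augmented.sort(key=lambda item: item[0])
    return augmented

# Hypothetical data: "mest" has two entries (a stem homograph), "zzz" has none.
DICTIONARY = [("bel", "white"), ("granic", "boundary"),
              ("mest", "place"), ("mest", "local")]
print(look_up(["mest", "zzz", "bel", "mest"], DICTIONARY))
```

Because the pointer `d` is not advanced past matching entries, a stem
that occurs several times in the text picks up the same entries each
time without any rewinding of the dictionary.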
Provision is made in the dictionary for the representation
of idioms, using a method analogous to that used in an ordinary
dictionary. A "key word" is chosen in the idiom (normally the
least frequently occurring word), the idiom being represented
in the dictionary entry of the key word. The representation
includes a list of the component words of the idiom, using which
the presence of an idiomatic text word sequence can be detected
before attempting any syntactic operations on the augmented text.
The dictionary entry including the idiom contains the preferred
English equivalent. The dictionary includes coding for 540
idioms.
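
Detection of an idiom through its key word can be sketched as below.
The idioms shown are ordinary Russian ones in transliteration, but the
entry layout, the function name and the restriction to adjacent words
are assumptions made for this illustration.

```python
# Idioms stored under their key word (assumed layout).
IDIOMS_BY_KEY = {
    "chisle": [{"words": ["v", "tom", "chisle"], "english": "including"}],
    "kak": [{"words": ["tak", "kak"], "english": "since"}],
}

def find_idioms(text_words):
    """Return (start, end, english) for each idiom present; end is exclusive.

    Only text words that are key words trigger a check, and the check
    compares the adjacent text words with the idiom's component list."""
    found = []
    for pos, word in enumerate(text_words):
        for idiom in IDIOMS_BY_KEY.get(word, []):
            words = idiom["words"]
            start = pos - words.index(word)
            if start >= 0 and text_words[start:start + len(words)] == words:
                found.append((start, start + len(words), idiom["english"]))
    return found

print(find_idioms(["tok", "v", "tom", "chisle", "tak", "kak"]))
```

Choosing the rarest component as key word keeps the number of checks
small, since common words such as prepositions never trigger one.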
Words not represented in the dictionary are given special
treatment, as mentioned above. All text words which commence
with one of a set of 137 Russian prefixes are looked up both
with and without prefix. If the prefixed form does not occur
in the dictionary, but the unprefixed form is found, then the
entry for the unprefixed form is included in the augmented text,
coupled with an English rendering of the Russian prefix. Despite
this provision, some text words will not intersect with the
dictionary. For these an attempt is made to determine part of
speech, case, number, etc., by an inspection of grammatical and
derivational suffixes. In the translation output the stem of
the not-in-dictionary word is transliterated, aiming to anglicize
as far as possible the original word. A derivational suffix
is given its English equivalent in the output rendering; any
prefix that was recognised is also given its English rendering.
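
The treatment of prefixed and not-in-dictionary words can be sketched
as follows. This is a simplified illustration, not our routine: the
prefix list, the one-entry dictionary and the transliteration table
are hypothetical samples, and suffix handling is left out.

```python
PREFIXES = {"не": "non-", "пере": "re-"}   # hypothetical sample of the prefix set
STEMS = {"линейн": ["linear"]}             # hypothetical one-entry dictionary
TRANSLIT = {"а": "a", "е": "e", "и": "i", "й": "i", "к": "k", "л": "l",
            "н": "n", "о": "o", "р": "r", "т": "t"}   # assumed table

def transliterate(stem):
    """Letter-by-letter transliteration; unknown letters pass through."""
    return "".join(TRANSLIT.get(ch, ch) for ch in stem)

def render(stem):
    """English rendering of a text stem."""
    if stem in STEMS:                       # found as it stands
        return STEMS[stem][0]
    for prefix, english in PREFIXES.items():
        rest = stem[len(prefix):]
        if stem.startswith(prefix) and rest in STEMS:
            return english + STEMS[rest][0]  # unprefixed form found
    for prefix, english in PREFIXES.items():
        if stem.startswith(prefix):          # prefix recognised, stem unknown
            return english + transliterate(stem[len(prefix):])
    return transliterate(stem)               # nothing recognised

print(render("нелинейн"))
print(render("нейтр"))
```

The second call shows the hazard of short prefixes: the opening
letters of an unknown stem are taken for a prefix and rendered as one.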
From an augmented text produced by the foregoing procedures
it would be a simple mechanical process to achieve a word-for-
word "translation". We felt this was not worthwhile, as the
application of relatively simple rules of grammar and syntax
greatly enhances the intelligibility of such a product.
Russian Analysis Algorithm 
In the first place we designed and implemented a system of 
noun blocking and a simple predicate analysis. The results 
obtained were not by any means ideal, but we were encouraged to 
extend and refine our syntactic processes. In our first 
attempt the functions of Russian analysis and English synthesis 
were closely interwoven. As our syntactic procedures were 
extended to cover more features it became evident that it was 
essential to separate the functions of analysis and synthesis. 
In order to make this possible the linguistic model described
in Yates (this conference) was developed. The model permits the
analysis routines to express the Russian syntax as far as
necessary and facilitates a transformation to the corresponding
English sentence structure.
The analysis routines operate in a succession of passes
through each sentence, defined by major punctuation mark
boundaries (full stop, question mark and semi-colon).
The functions of the successive passes are as follows:
1. A preliminary pass which establishes from the augmented
text the terminal element for each discrete member of the
sentence. Punctuation marks are indicated in the elements
for preceding or following sentence items, according to a
set of formal rules.
2. A pass whose prime concern is the determination of nominal
structures, i.e. nouns and words with which they are closely
connected, such as adjectives or prepositions.
3. A pass which establishes links between adjacent nominal
structures; the linked elements include genitive qualifiers
and prepositional group qualifiers.
4. A pass which searches for potential coordinating conjunctions
and examines the sentence elements or structures separated
by such conjunctions, setting up coordinate groups where
appropriate.
5. A pass which creates simple predicate structures, searching
for words with a verb role and then locating adjacent
sentence elements or structures acting as verb adjuncts.
6. A pass whose function is to examine the role of some of the
more "difficult" words such as the verb быть and its
inflected forms, and the personal/possessive pronouns его,
ее and их.
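
The overall control structure (sentences cut at major punctuation,
then a fixed succession of passes over each) can be sketched as below.
Only the scaffolding is shown: the passes are placeholders that record
that they ran, and every name here is hypothetical. A real splitter
would also have to cope with full stops inside abbreviations.

```python
import re

PASS_NAMES = ["preliminary", "nominal structures",
              "genitive and prepositional links", "coordinate groups",
              "simple predicates", "difficult words"]

def split_sentences(text):
    """Cut the text after each full stop, question mark or semi-colon."""
    parts = re.split(r"(?<=[.?;])\s+", text.strip())
    return [p for p in parts if p]

def make_pass(name):
    """Placeholder pass: records its name and returns the state unchanged."""
    def analysis_pass(state):
        state["applied"].append(name)
        return state
    return analysis_pass

def analyse(sentence, passes):
    """Run the passes in order; each receives the state left by the one before."""
    state = {"words": sentence.split(), "applied": []}
    for analysis_pass in passes:
        state = analysis_pass(state)
    return state

PASSES = [make_pass(n) for n in PASS_NAMES]
sentences = split_sentences("First clause; second clause. Third?")
results = [analyse(s, PASSES) for s in sentences]
print(sentences)
print(results[0]["applied"])
```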
A full description of these analysis routines is given in
reference 1. In the present paper we shall take a Russian
sentence and note the effect of each analysis pass on it.
The Russian sentence reads:

Белыми и черными цифрами и стрелками показаны
места записи границ.

The first analysis pass is not of particular interest in the
present context. Suffice it to say that a system of reference
addresses is set up which permits the scanning of the sentence
whilst its structure is in an incomplete state.
The state of the sentence diagram after the second pass
has been completed is:-

[Sentence diagram after the second pass: the words of the sample
sentence listed in a column, with connecting lines marking the
groups formed so far.]
One noun group has been formed, of which the modifier is a
coordinate group of adjectives. Each adjective is marked as
a member in the coordinate group, which itself assumes the
properties of an adjective.
The third analysis pass has the function of creating
genitive and prepositional links. Only the former are
concerned in our sample sentence:-

[Sentence diagram after the third pass: as before, with genitive
links added.]

Were there any prepositional groups following nouns, then the
prepositional groups would also be linked in as qualifiers.
The second analysis pass ignored all conjunctions which
did not occur explicitly within simple noun groups (i.e.
groups with a single noun as head). The fourth pass, however,
seeks to join to existing noun groups any other nouns linked by
coordinating conjunctions:-

[Sentence diagram after the fourth pass: as before, with the
coordinate noun group added.]

In addition the pass groups together in coordinate groups any
similar words joined by coordinating conjunctions, whatever their
part of speech. Intervening punctuation prevents the formation
of coordinate groups. The coordinate group, when formed, is
given the grammatical significance of its component parts.
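
The grouping of similar words joined by a coordinating conjunction can
be sketched as below. This is a deliberately reduced illustration: it
looks only at the single element on either side of each conjunction,
whereas the pass itself works on structures already built by earlier
passes. The category labels, English glosses and function name are
hypothetical.

```python
def coordinate_groups(elements):
    """elements: list of (word, category) in sentence order.
    Returns (left, conjunction, right) index triples wherever a coordinating
    conjunction stands directly between two elements of the same category.
    A punctuation mark on either side prevents the group from forming."""
    groups = []
    for i in range(1, len(elements) - 1):
        if elements[i][1] != "CONJ":
            continue
        left, right = elements[i - 1], elements[i + 1]
        if left[1] == right[1] and left[1] not in ("CONJ", "PUNCT"):
            groups.append((i - 1, i, i + 1))
    return groups

# Hypothetical element list using English glosses.
ELEMENTS = [("white", "ADJ"), ("and", "CONJ"), ("black", "ADJ"),
            ("figures", "NOUN"), ("and", "CONJ"), ("arrows", "NOUN"),
            ("shown", "VERB"), (",", "PUNCT"), ("and", "CONJ"),
            ("places", "NOUN")]

print(coordinate_groups(ELEMENTS))
```

The last conjunction forms no group, both because the categories on
either side differ and because a punctuation mark intervenes.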
The fifth analysis pass has little effect on our sample
sentence. The plural, short form participle is the sole
"verb" member of its verb group:-

[Sentence diagram after the fifth pass: as before, with the verb
group added.]

Were there adjacent adverbs or prepositional groups, these
would be included in the verb group with the role of adjunct.
The fifth pass also has provision for negative and conditional
predicate structures.
The function of the sixth pass is to try and resolve the
roles of certain more "difficult" words. (No instances occur
in the sample sentence.) For example, if one of the ambiguous
personal/possessive pronouns is encountered, a check is made to
see whether the following sentence element is nominal. If it
is, then the pronoun is joined in the element as a modifier, and
the pronoun is treated as possessive. Forms of the verb быть
which were not covered by the provisions of pass five are also
included in the sixth pass.
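
The pronoun check just described can be sketched as follows. The
element layout and function name are hypothetical, and treating the
pronoun as personal when the following element is not nominal is an
assumption of this illustration; the text above states only the
possessive case.

```python
AMBIGUOUS_PRONOUNS = {"его", "ее", "их"}

def resolve_pronouns(elements):
    """elements: list of dicts with keys 'word' and 'nominal' (bool).
    Returns one role per element: 'possessive', 'personal', or None for
    elements that are not ambiguous pronouns."""
    roles = []
    for i, element in enumerate(elements):
        if element["word"] not in AMBIGUOUS_PRONOUNS:
            roles.append(None)
            continue
        following = elements[i + 1] if i + 1 < len(elements) else None
        if following is not None and following["nominal"]:
            roles.append("possessive")   # joined to the nominal element as modifier
        else:
            roles.append("personal")     # assumed default, see note above
    return roles

EXAMPLE = [
    {"word": "их", "nominal": False},
    {"word": "свойства", "nominal": True},
    {"word": "измерили", "nominal": False},
    {"word": "их", "nominal": False},
]
print(resolve_pronouns(EXAMPLE))
```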
Having completed the sixth pass, no further analysis of the
Russian sentence is undertaken. The sentence structure
delineated by the analysis passes is not complete, since no
attempt is made to set up a clause structure. However, in order
to facilitate the task of the synthesis procedure, all the
separate group structures (and any remaining separate elements)
are arbitrarily connected together in one or more higher groups,
presenting the appearance of a unified whole to the synthesis
stage.
On the other hand, if further analysis passes were applied,
particularly with reference to clause delimitation (for example,
see Appendix I of reference 1), then the sample sentence would
appear as:-

[Sentence diagram: complete structure of the sample sentence.]

Thick lines indicate those connections which our analysis routines
have created, and the thin lines indicate those which would be
created by additional routines.
The translation sequence is completed by an English synthesis
process. This determines re-orderings, insertions, inflections
and selections of English equivalents, and, finally, the format
of the printed output, produced by the computer on paper tape and
printed on a flexowriter. This process is described in the
companion paper, which also includes an account of the descriptive
model mentioned above.

References 

1. McDANIEL, J., DAY, A.M., PRICE, W.L., SZANSER, A.J.,
WHELAN, S. and YATES, D.M. "Translation of Russian
scientific texts into English by computer -- a final
report". National Physical Laboratory, Autonomics
Division report 35, June 1967.

2. National Academy of Sciences/National Research Council,
"Language and machines; computers in translation and
linguistics". 1966.

3. DAVIES, D.W. and DAY, A.M. "A technique for consistent
splitting of Russian words". Proc. Intl. Conf. on
Machine Translation of Languages and Applied Language
Analysis, H.M. Stationery Office, 1962, I, 30-362.

4. YATES, D.M. "A computer model for Russian grammatical
description, and a method of English synthesis in
machine translation". This Conference.
The work described above was carried out at the National
Physical Laboratory.
