AUTOMATIC TRANSLATION THROUGH UNDERSTANDING AND 
SUMMARIZING 
N. N. Leont'eva
Vsesojuznyj centr perevodov
Kržižanovskogo 14, 117218 Moskva, USSR
The French-to-Russian automatic translation system being
developed at the All-Union Centre of Translation is conceived as
part of a multifunctional information processing system in
the sense that it should be able to use approaches and methods
proper to the information processing field, such as summariz- 
ing, abstracting, indexing, making inferences, etc. In such a 
system translation is realized through building text informat- 
ion representation (IR). The task requires two types of ana- 
lysis: linguistic analysis (LA) and information analysis (IA) 
working in interaction, the latter being, in particular, able 
to refer to the automatic thesaurus. The ultimate aim of LA is 
the building of sentence semantic representation (SR). It is 
important that for each individual sentence its SR is constructed
as a function of the IR of the whole text. (The current
version of the system does not operate on the whole
text but is limited, for each sentence, to its more or less
immediate context.) Linguistic analysis calculates morpholog- 
ical structure for words, syntactic and semantic structures 
for sentences. Each of these structures is determined by the 
appropriate language realities; any remaining obscurities
can be cleared only by referring to higher levels of analysis. 
An SR built for an individual sentence without regard to the
SR's of other sentences is normally incomplete (deficient,
ambiguous, incorrect, etc.). The incompleteness of an SR is
manifested by the incompleteness of its units. The construction
of the text IR requires operations that compare different SR
units with each other and with thesaurus units. As a rule, the
incompleteness proper to SR's is cleared only partially, which
calls for some external measures to ensure a formally correct
structure ready for the synthesis of the output text. The general
scheme of the system's functioning runs as follows:
input sentence
   -> 1. analysis (LA)            -> initial SR
   -> 2. reconstruction (LA-IA)   -> corrected SR
   -> 3. summarizing (IA)         -> compressed SR
   -> 4. synthesis                -> output sentence
Linguistic analysis contains a set of procedures aimed
at creating initial SR's in which all cases of incompleteness
are exposed. Reconstruction compares SR's with each other and
with the thesaurus and restores the missing parts of the SR's.
Summarizing means obtaining a kind of abstract from which
all obscure and incomplete parts are removed so that only
essential information is available.
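
The reconstruction and summarizing steps described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the slot-dictionary encoding of an SR unit, the function names `reconstruct` and `summarize`, and the toy thesaurus are hypothetical and are not taken from the system itself.

```python
from typing import Dict, List, Optional

# Hypothetical encoding: an SR unit is a dict of slots; None marks a
# slot that linguistic analysis has exposed as incomplete.
Unit = Dict[str, Optional[str]]


def reconstruct(units: List[Unit], thesaurus: Dict[str, Unit]) -> List[Unit]:
    """Fill empty slots from other units with the same head, then from the thesaurus."""
    corrected = []
    for unit in units:
        # Other SR units are consulted first, the thesaurus entry last.
        sources = [u for u in units if u is not unit and u["head"] == unit["head"]]
        head = unit["head"]
        if head in thesaurus:
            sources.append(thesaurus[head])
        filled = dict(unit)  # copy, so the initial SR is left untouched
        for slot in unit:
            if filled[slot] is None:
                for src in sources:
                    if src.get(slot) is not None:
                        filled[slot] = src[slot]
                        break
        corrected.append(filled)
    return corrected


def summarize(units: List[Unit]) -> List[Unit]:
    """Keep only the units in which no slot is still empty."""
    return [u for u in units if all(v is not None for v in u.values())]


initial_sr = [
    {"head": "translate", "agent": "system", "object": None},
    {"head": "translate", "agent": None, "object": "text"},
    {"head": "index", "agent": None, "object": "document"},
    {"head": "affect", "agent": None, "object": None},
]
toy_thesaurus = {"index": {"agent": "system"}}

corrected_sr = reconstruct(initial_sr, toy_thesaurus)
compressed_sr = summarize(corrected_sr)
```

In this toy run the two "translate" units complete each other, the "index" unit is completed from the thesaurus, and the "affect" unit stays incomplete and is dropped by `summarize`.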
Information processing plays an important role in the realization
of the scheme, as the system translates only what it
comprehends; thus the result may be called not a literal but
a "digested" translation. The information model of automatic
translation is based on the properties of the coherent text.
One of the main properties is that pieces of information
essential for the text are repeated there in many ways and by
various linguistic means. IA aims at identifying such information
and making it the basis for SR reconstruction. The level
of "information noise" in the synthesized text is expected
to be lower than in the classical approach to AT
(sentence-to-sentence translation through syntactic structures). The
degree of abstracting (summarizing) can vary depending on the
purpose: the system can be oriented at getting a translation
proper, a detailed or a brief abstract, a summary, or, finally,
a search pattern. The effect of such reproductions of the
input text with diminishing detail is reminiscent of an echo, which
gradually loses almost all original features while keeping the main
pattern to the end: no degree of abstracting should affect the
main content of the document.
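
The idea that essential information is repeated across the text, and that the degree of abstracting can be varied without losing the main content, can be illustrated with a small Python sketch. It rests on a strong simplifying assumption: counting how many sentences mention a notion stands in for information analysis. The function names and the `keep` parameter are hypothetical.

```python
from collections import Counter
from typing import List


def rank_notions(sentences: List[List[str]]) -> List[str]:
    """Order notions by the number of sentences they occur in, most repeated first."""
    counts = Counter(n for sent in sentences for n in set(sent))
    # Descending count, then alphabetical order to make the ranking deterministic.
    return sorted(counts, key=lambda n: (-counts[n], n))


def abstract(sentences: List[List[str]], keep: int) -> List[List[str]]:
    """Keep only the `keep` most repeated notions; a smaller `keep` gives a shorter abstract."""
    core = set(rank_notions(sentences)[:keep])
    return [[n for n in sent if n in core] for sent in sentences]


text = [
    ["system", "translation", "text"],
    ["system", "summary"],
    ["system", "text", "thesaurus"],
]
detailed = abstract(text, keep=2)
brief = abstract(text, keep=1)
```

Lowering `keep` removes detail step by step, while the most repeated notion survives at every setting, which mirrors the echo image used above.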
The information orientation of the system determines the choice
of linguistic means of analysis, mainly the structure and
units of syntactic and semantic representations. Two principles
can be formulated: "purity" of means at each level of
analysis and possibilities of interaction between levels. The
first principle makes it possible to use with maximum efficiency
the laws specific to each level and to certify the formal
correctness of the resulting structure. The second principle
implies a kind of hierarchical organisation of grammar:
if a unit of one level cannot be interpreted at a higher
level, it can be "generalized" (a lexeme can be generalized
to a semantic class, a labeled relation can be replaced by a
more general or even an unlabeled relation). Building a
structure at each level comprises at least two stages: creation
of the initial structure, which is permitted to be incomplete and
incorrect, and reconstruction of a more complete and correct
structure, after an interpretation of the initial structure by
means of the higher level (or levels).
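
The generalization step of the second principle can be sketched as a fallback chain. This is an illustration only: the tables `SEMANTIC_CLASS` and `RELATION_PARENT`, the function names, and the sample entries are hypothetical and do not reproduce the actual grammar of the system.

```python
from typing import Callable, Dict, Optional

# Hypothetical tables, invented for the example.
SEMANTIC_CLASS: Dict[str, str] = {"echo": "SOUND", "abstract": "DOCUMENT"}
RELATION_PARENT: Dict[str, str] = {"patient": "object", "cause": "circumstance"}
UNLABELED = "REL"  # stands for an unlabeled relation


def generalize_lexeme(lexeme: str, interpretable: Callable[[str], bool]) -> Optional[str]:
    """Return the lexeme if the higher level interprets it, otherwise its semantic class."""
    if interpretable(lexeme):
        return lexeme
    return SEMANTIC_CLASS.get(lexeme)  # None when no class is known


def generalize_relation(label: str, interpretable: Callable[[str], bool]) -> str:
    """Climb to more general labels until one is interpretable; else drop the label."""
    current: Optional[str] = label
    while current is not None:
        if interpretable(current):
            return current
        current = RELATION_PARENT.get(current)
    return UNLABELED


# What the (hypothetical) higher level is able to interpret.
known = {"object", "SOUND"}.__contains__
```

With these tables, `generalize_relation("patient", known)` climbs one step to `"object"`, while `generalize_relation("cause", known)` finds no interpretable label and falls back to the unlabeled relation.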
The division into levels is manifested not only by different
means of analysis but also by the different nature of the
units: nodes and relations. Nodes of syntactic representation
are words (differences of lexical meaning are disregarded),
nodes of semantic representation are lexical meanings, nodes
of IR are notions having denotative status. Relations of
syntactic structure are functional (from predicate to subject,
from predicate to direct or indirect object, attributive
relation, etc.). SR relations are of a semantic nature (cause,
time, patient, etc.); IR relations are mainly the same but
vary in their information value: some appear inside a notion
and are devaluated, others connect separate notions and
acquire denotative status.
Units of translation are represented by units of IR
having an explicit inner structure and liable to translation
either as a whole or by parts. They are formed in the
course of both linguistic and information analyses.
