TRADITIONAL MEANS IN MACHINE TRANSLATION 
Zdeněk KIRSCHNER 
Matematicko-fyzikální fakulta UK 
118 00 Praha 1, Malostranské náměstí 25 
Czechoslovakia 
Abstract: 
The chronic problems of machine
translation cannot be solved in a fully
automatic way. Human intervention is
inevitable. The development of
"traditional" means in connexion with
advances of computer technology
represents a most substantial
contribution to further progress in the
field of machine translation. Some of the
problems are illustrated using the
example of the APAC32 project.
1. The hopes for a successful solution of 
the chronic problems of machine transla- 
tion (MT) have long been set on two 
fruitful and mutually dependent pros- 
pects: research in artificial intelli- 
gence (AI) and advances in the computing 
technology. The importance of the latter 
contribution is beyond dispute. As re- 
gards the former domain, some reserva- 
tions must be voiced. 
1.1. It can be stated with some tolerance
that the missing information required
for automatic understanding (or
disambiguation) of natural language (NL)
is supposed to be supplied by a computer
model of the knowledge corresponding to
the universe of discourse. The context
of the analysed message constitutes an 
important part of this universe. There- 
fore, an essential component of such a 
model must draw on the texts processed. 
Thus, irrespective of the contingent 
form, organisation, etc., of the whole, 
the model would at least partially de- 
pend on the results of the analysis for 
which it is supposed to provide neces- 
sary information. This means that cir- 
cularity is imminent. Even if the al- 
most inevitable occurrence of elements 
not covered by any device in the sys- 
tem is disregarded, it is obvious that 
the model can be neither complete nor 
consistent. 
1.2. Since there will always remain threats
of failure caused not by accidental
factors but by the intrinsic inadequacy
of any system of MT, human intervention
is inevitable, and the ideal of "fully
automatic high quality translation"
(FAHQT) (which, we suspect, is no longer
believed to be attainable anyway) is
impossible. While not denying the
potential merits of the contribution of
AI, the above discussion should suggest
that the development of means called
"traditional" is equally important for
MT. An example of an approach based on
such means is our experimental system
APAC32.
2. If we refer to our system here, it is not
to boast that we have achieved any
extraordinary success or that we have
long duly appraised the above
conclusions and reacted to them in an
original way, etc. It is only to
illustrate our conviction that there is
still a fairly wide and long path open
ahead of us within the confines of the
traditional means. To tell the truth, it
has been our material situation that
forced us to rely exclusively on them
and to dispense with anything more
sophisticated. This had to be said to
clear us of a suspicion that we are
making a virtue of necessity.
2.1. APAC32 is a descendant of the Montreal
TAUM series. It has been implemented
on computers of the type of IBM 370 to
translate into Czech English abstracts
in microelectronics and, later, pumping
machinery. Using Colmerauer's Q-systems
the main part of the program builds
linearized rooted-tree-like structures,
which stepwise identify and interpret
elements or groups of elements of the
input units stating their character
and function, dependency relations and
position in the sentential context.
Strings with multiple interpretations
which had not been eliminated are
represented by parallel structures giving
multiple parses in the final stage of
the analysis, but not necessarily
multiple translations at the final output.
Basic or fully accomplished structures,
which resemble predicate calculus
patterns, have a finite verb at their root
and individual participants in
dependent positions. The sense (direction
to the left or right) of an oriented
edge (an arrow) representing a
dependency relation - an information
pertaining to the mutual projective
position of the incident nodes - as well as
the function of a dependent participant
are indicated in a way that simulates
the marking of edges in a graph. The
synthesis starts by disintegrating the
structures that result from the
analysis. At the output of this stage,
relatively simple trees representing
individual words appear, with all the
information necessary for generating
forms of the target language. This
proceeds in steps in which occasionally
additional target-language-specific
information has to be derived to render
the synthesized structures complete
and acceptable. Such adjustments are
usually connected with the operations
of transfer: while the action of its
general rules mostly coincides with
the opening phases of the synthesis,
the information concerning the
particular changes is contained in the
dictionaries to be exploited in the
concluding parts of the program.
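
The structures described in this
paragraph can be pictured with a small
sketch. It is not Q-systems notation and
not the APAC32 format: the class Node,
the function linearize and the labels
ACTOR, PATIENT and DET are assumptions
made for illustration only. The sketch
shows a finite verb at the root,
participants in dependent positions, and
each edge marked with its direction
(L or R) and the function of the
dependent, all written out as one
linearized string.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One word in a dependency structure (hypothetical representation)."""
    word: str
    function: str = "ROOT"  # role of the node with respect to its head (assumed labels)
    side: str = ""          # "L" or "R": position of the dependent relative to its head
    dependents: List["Node"] = field(default_factory=list)

    def add(self, word: str, function: str, side: str) -> "Node":
        """Attach a dependent and return it so that it can take dependents itself."""
        child = Node(word, function, side)
        self.dependents.append(child)
        return child


def linearize(node: Node) -> str:
    """Write the tree as a bracketed string; edge marks precede each dependent."""
    mark = f"{node.side}:{node.function}:" if node.side else ""
    if not node.dependents:
        return f"{mark}{node.word}"
    inner = ", ".join(linearize(d) for d in node.dependents)
    return f"{mark}{node.word}({inner})"


# "The pump delivers water": the finite verb is the root.
root = Node("DELIVERS")
pump = root.add("PUMP", "ACTOR", "L")
pump.add("THE", "DET", "L")
root.add("WATER", "PATIENT", "R")

print(linearize(root))
# DELIVERS(L:ACTOR:PUMP(L:DET:THE), R:PATIENT:WATER)
```

Keeping direction and function on the edge
rather than on the node mirrors the
remark that the marking simulates
labelled edges in a graph.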
2.2. The absence of any accomplished model
of the universe of discourse and the
temporary abandonment (for technical
reasons) of any device allowing the
involvement of hypersentential context
in the analysis have, of course,
endowed the system with a typical
probabilistic character. In this connexion,
especially the tactics occasionally
referred to as "preferential" must be
mentioned: some rules are applied
repeatedly in subsequent stages, each
time with conditions less rigid. The
combinatorial power of the Q-systems
had to be reduced by introducing
several stages - partial grammars -
operating before the syntactic analysis
proper. Thus, e.g., a (partial) analysis
of nominal complexes precedes that of
verbal structures. Therefore, a special
device registers schematically the
context of each element in the sentence.
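
The "preferential" tactics mentioned in
this paragraph, where the same rule is
tried again in later stages under less
rigid conditions, can be sketched as
follows. The rule name noun_group, the
keys head, agrees and in_dictionary and
the function parse_with_relaxation are
all hypothetical; the sketch shows only
the control pattern, not the actual
APAC32 rules.

```python
from typing import Callable, List, Optional, Tuple

# A rule is a name plus a condition on the analysed unit (both hypothetical).
Rule = Tuple[str, Callable[[dict], bool]]


def parse_with_relaxation(unit: dict, stages: List[List[Rule]]) -> Optional[Tuple[int, str]]:
    """Try the stages in order, strictest first.

    Returns (stage index, rule name) for the first rule whose condition
    holds, or None when no stage accepts the unit.
    """
    for level, rules in enumerate(stages):
        for name, condition in rules:
            if condition(unit):
                return level, name
    return None


# The same rule stated three times, each time with fewer conditions.
strict = [("noun_group", lambda u: u["head"] == "N" and u["agrees"] and u["in_dictionary"])]
relaxed = [("noun_group", lambda u: u["head"] == "N" and u["in_dictionary"])]
loosest = [("noun_group", lambda u: u["head"] == "N")]

unit = {"head": "N", "agrees": False, "in_dictionary": True}
result = parse_with_relaxation(unit, [strict, relaxed, loosest])
print(result)  # (1, 'noun_group'): accepted only at the second, less rigid stage
```

The stage index returned with the rule
name records how far the conditions had
to be relaxed, which is the kind of
information a later stage could use to
mark an output as less reliable.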
2.3. In simulating the functions of a model
of the universe of discourse, the
system of dictionaries represents the
most important tool.
2.3.1. The basic dictionary information in
APAC32 is a complex which consists of
two main parts: information concerning
the source language and that
pertaining to the target language. These
structures can be separated; they
have been put together whenever
possible with respect to the efficiency
of the system. The internal structure
of both these parts is almost the
same and can be briefly described as
follows: categorial information,
lexical value, paradigmatic information,
pointers to parallel meanings,
valency frame, combinatory frame
(prepositional, phrasal, special-liaison,
etc., patterns), terminological
specifications, special syntactic
information, semantic features. Extensive
though this apparatus may be, it
should be stated that there are still
possibilities - and a need, of course -
to add further data. For lack of
space, let us confine ourselves to
three points only.
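
The composition of an entry listed in
this paragraph can be sketched as a
record with a source-language part and a
target-language part of the same shape.
The class and field names, the field
types and the sample values (including
the Czech equivalent and its paradigm
label) are assumptions chosen for
illustration; the paper does not give
the actual storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LanguagePart:
    """Information for one language; field names follow the list in the text."""
    category: str                       # categorial information, e.g. "N"
    lexical_value: str
    paradigm: Optional[str] = None      # paradigmatic information
    parallel_meanings: List[str] = field(default_factory=list)
    valency_frame: List[str] = field(default_factory=list)
    combinatory_frame: Dict[str, List[str]] = field(default_factory=dict)
    terminology: List[str] = field(default_factory=list)
    syntactic_info: List[str] = field(default_factory=list)
    semantic_features: List[str] = field(default_factory=list)


@dataclass
class Entry:
    """One dictionary entry: the two parts are kept together but stay separable."""
    source: LanguagePart
    target: LanguagePart


# Illustrative entry only; the feature label and paradigm name are assumed.
entry = Entry(
    source=LanguagePart(category="N", lexical_value="PUMP", semantic_features=["DEVICE"]),
    target=LanguagePart(category="N", lexical_value="čerpadlo", paradigm="město"),
)

print(entry.source.lexical_value, "->", entry.target.lexical_value)
```

Using one type for both parts reflects
the statement that their internal
structure is almost the same.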
2.3.1.1. The apparatus of semantic features
consists of four classes of
features: a) features concerning the
text vs. metatext structure,
b) general semantic features,
c) domain specific features, and
d) features concerning
terminological status. The number of features
is limited for reasons of which the
most important is that excessive
detailedness leads to unwanted
rigidity. However, a number of
potentially very useful candidate
features can be added. Assigning weights
to features might be a solution to
this dilemma, especially in the
framework of the "preferential"
tactics.
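
The suggestion of weighted features is
only a possibility raised in the text,
so the following is a speculative sketch
of one way it could work, not a
description of APAC32. The function
weighted_match, the feature names, the
weights and the thresholds are all
assumptions. A candidate is scored by
the share of required feature weight it
satisfies, and the acceptance threshold
is lowered stage by stage in the spirit
of the "preferential" tactics.

```python
from typing import Dict, Optional, Set


def weighted_match(required: Dict[str, float], present: Set[str]) -> float:
    """Return the fraction of the required feature weight found on the candidate."""
    total = sum(required.values())
    if total == 0:
        return 1.0  # nothing is required, so anything matches
    matched = sum(weight for name, weight in required.items() if name in present)
    return matched / total


# Hypothetical requirement: DEVICE matters twice as much as the other features.
required = {"DEVICE": 2.0, "TERM": 1.0, "METATEXT": 1.0}
score = weighted_match(required, {"DEVICE", "TERM"})

# Each later stage accepts a lower score.
thresholds = [1.0, 0.75, 0.5]
accepted_at: Optional[int] = next(
    (stage for stage, limit in enumerate(thresholds) if score >= limit), None
)

print(score, accepted_at)  # 0.75 1
```

With weights, a missing low-weight
feature no longer blocks a rule outright,
which is how the rigidity of a detailed
feature set might be softened.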
2.3.1.2. Some classes of words have been
further classified to highlight
their intrinsic properties in the
translation environment. E.g., a
special classification of verbs
makes it possible to solve, at
least in part, the problems of
aspect in Czech in relation to
English verbal adjectives (-ED, -ING
forms). Much more can be done in
this direction. Unfortunately, this
will imply extensive empirical work
including excerption and, if
possible, organization of a
usage-panel-like inquiry.
2.3.1.3. As concerns combinatory frames,
more information will also be added
on the possibilities of adverbial
modification of nouns. Some changes
and additions to the present
organisation and contents of the
dictionary entries are considered with
a view to structures suggested in
the Mel'chuk-Apresyan model
"meaning - text".
2.3.2. A specific dictionary device has
been introduced in the terminological
section of the dictionary system.
Special rules control, or rather,
guide the analysis of terminological
complexes, making it possible to
decide frequent ambiguous structures
(e.g., INTEGRATED CIRCUIT SYSTEM as
((INTEGRATED CIRCUIT) SYSTEM) rather
than (INTEGRATED (CIRCUIT SYSTEM))).
In this way a partial quasi-model of
the specific domain can be formed
whose elements are capable of
recursive application to new combinations.
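
The bracketing decision in the example
above can be sketched with a small term
list. The function bracket, the greedy
left-to-right merging and the
right-branching fallback are assumptions
made for this illustration; the paper
does not state how the APAC32 rules are
formulated. Known terms are grouped
first, and a grouped term then behaves
as a single element in further
combinations.

```python
from typing import List, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def flatten(tree: Tree) -> Tuple[str, ...]:
    """Return the words of a (possibly nested) group in their original order."""
    if isinstance(tree, str):
        return (tree,)
    return flatten(tree[0]) + flatten(tree[1])


def bracket(words: List[str], terms: Set[Tuple[str, ...]]) -> Tree:
    """Group adjacent items that form a known term, then bracket the rest.

    The fallback for items not covered by any term is right-branching,
    chosen here only so that the effect of the term list is visible.
    """
    items: List[Tree] = list(words)
    merged = True
    while merged and len(items) > 1:
        merged = False
        for i in range(len(items) - 1):
            if flatten(items[i]) + flatten(items[i + 1]) in terms:
                items[i:i + 2] = [(items[i], items[i + 1])]
                merged = True
                break
    tree: Tree = items[-1]
    for item in reversed(items[:-1]):
        tree = (item, tree)
    return tree


terms = {("INTEGRATED", "CIRCUIT")}
print(bracket(["INTEGRATED", "CIRCUIT", "SYSTEM"], terms))
# (('INTEGRATED', 'CIRCUIT'), 'SYSTEM')
print(bracket(["INTEGRATED", "CIRCUIT", "SYSTEM"], set()))
# ('INTEGRATED', ('CIRCUIT', 'SYSTEM'))
```

Because merged groups are compared by
their flattened words, a term list that
also contains longer terms can build on
shorter ones already grouped.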
2.3.3. Another dictionary device deals with
unrecognized elements - the so-called
transducing dictionary (TD). TD
relies on derivational morphology,
assigning categorial information, and,
in some cases, semantic status and
other information to words hitherto
"unknown" to the system, on the basis
of their endings (e.g., -ING, -ED,
-ESS, -ITY, -ION, -LY, -WISE, -FY,
etc.); for some of them even
successful adaptation to the target language
is possible. The remaining
unrecognized elements are regarded as nouns:
first as proper, then, if this fails
to be confirmed, common. A more
versatile practice is planned, which
will take into consideration other
possible interpretations as well.
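
The behaviour of the transducing
dictionary can be approximated by a
suffix table. The category labels, the
minimum stem length and the
proper_confirmed flag are assumptions
introduced for this sketch, and only a
subset of the endings named in the text
is included; how APAC32 confirms a
proper noun is not described in the
paper.

```python
from typing import List, Tuple

# Endings taken from the text, longest first; the categories are assumed labels.
SUFFIXES: List[Tuple[str, str]] = [
    ("WISE", "ADV"),
    ("ING", "PARTICIPLE"),
    ("ESS", "N"),
    ("ITY", "N"),
    ("ION", "N"),
    ("ED", "PARTICIPLE"),
    ("LY", "ADV"),
]


def guess_category(word: str, proper_confirmed: bool = False) -> str:
    """Guess a category for a word missing from the dictionary.

    A suffix counts only if at least two letters of stem remain.
    Words with no listed ending are treated as nouns: proper when the
    caller has confirmed it, otherwise common.
    """
    upper = word.upper()
    for suffix, category in SUFFIXES:
        if upper.endswith(suffix) and len(upper) > len(suffix) + 1:
            return category
    return "N-PROPER" if proper_confirmed else "N-COMMON"


for sample in ["CLOCKWISE", "SPUTTERING", "RESISTIVITY", "KOVAR"]:
    print(sample, guess_category(sample))
```

A table this simple will misjudge words
such as SPEED, which is one reason to
consider several interpretations rather
than a single guess.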
2.3.4. TD, as well as some other devices and
rules, can also be regarded as special
fail-soft measures, though another
component called "emergency rules" is
included which performs this function
as a specialized set of rules
designed to reconstruct, complete or
integrate into a (would-be) meaningful
whole those structures that failed to
reach the stage of an accomplished
parse. In some respects, the role of
such measures is problematic in
relation to human intervention. Our
system offers possibilities to introduce
a special diagnostic device to
recognize and classify the symptoms of a
failure, so that more than the
present simple marking of "suspicious"
or "underdone" outputs can be
presented to aid the postedition.
2.4. Ambiguities are treated in the usual
way. It should be pointed out that in
the translation between the languages
in question, the principles of
agreement so widely applied in Czech
unmercifully reduce the chances to get over
some types of unsolved ambiguities in
an "imperceptible", i.e., accidental,
way. These principles, as a rule,
obstinately insist upon rendering
implicit information explicit. That is why
in some cases structures with ambiguous
reference are translated by equivalents
equally ambiguous or vague. E.g., with
some classes of verbs, (clausal)
participial modification with ambiguous
dependence is replaced by prepositional
or other constructions without any
direct dependence: e.g., USING -> WITH
USING, CAUSING -> WHICH (referring to
the whole of the preceding or pertinent
clause) CAUSES, etc.
2.5. This concerns also contrastive
ambiguities and other asymmetrical relations
between the two languages. In this
connexion, it should be pointed out that
one of the criteria for the
classification of English verbs is the
classification of their Czech counterparts. Thus,
e.g., the verb SUPPOSE must be assigned
information that the construction
SOMEONE IS SUPPOSED TO... must be
transformed to IT IS SUPPOSED (ABOUT SOMEONE)
THAT SOMEONE... to make it correspond
to the structure acceptable in Czech.
Similarly, constructions like SEAT SAT
ON BY... must be transformed with the
aid of corresponding relative clauses.
Much remains to be done for the domain
of conversion. Its productive aspects
pose serious problems.
3. To come back to the opening paragraphs:
the advances of computer technology,
while not offering an ultimate solution of
problems detrimental to the efforts to
achieve the ideals of FAHQT, will
undoubtedly liberate MT from the curse
entailed by its usually more or less
immediate subservience to various practical
applications - the strict limitations of
computer time and storage - which so
often represented the only obstacles in
introducing many a useful and, sometimes,
even very necessary device, process or
approach. Most of the prospective
extensions, innovations and other changes
require profound empirical examination and
more linguistic field-work than, up to
now, we were able to expend.

References

Kirschner, Z. (1982) A Dependency-Based 
Analysis of English for the Purpose of 
Machine Translation. Praha,
Matematicko-fyzikální fakulta UK.

Kirschner, Z. (1987) APAC3-2: An
English-to-Czech Machine Translation
System. Praha, Matematicko-fyzikální
fakulta UK.
