HANDLING SYNTACTICAL AMBIGUITY IN MACHINE TRANSLATION 
Vladimir Pericliev 
Institute of Industrial Cybernetics and Robotics 
Acad. O.Bontchev Str., bl.12
1113 Sofia, Bulgaria 
ABSTRACT 
The difficulties encountered in resolving
syntactical ambiguity in MT can be at least
partially overcome by preserving the syntactical
ambiguity of the source language in the target
language. An extensive study of the correspondences
between the syntactically ambiguous structures in
English and Bulgarian has provided a solid empirical
basis in favor of such an approach. Similar results
could be expected for other sufficiently related
languages as well. The paper concentrates on the
linguistic grounds for adopting the approach
proposed.
1. INTRODUCTION 
Syntactical ambiguity, as part of the ambiguity
problem in general, is widely recognized as a
major difficulty in MT. To solve this problem, the
efforts of computational linguists have been mainly
directed to the process of analysis: a unique
analysis is sought (semantical and/or world
knowledge information being basically employed to
this end), and only once such an analysis has been
obtained does the system proceed to synthesis.
On this approach, in addition to the well-known
difficulties of a general-linguistic and
computational character, two principal
embarrassments are encountered. It leaves us
entirely unable to process, first, sentences with
"unresolvable syntactical ambiguity" (with respect
to the disambiguation information stored), and,
secondly, sentences which must be translated
ambiguously (e.g. puns and the like).
In this paper, the burden of solution of the 
syntactical ambiguity problem is shifted from the 
domain of analysis to the domain of synthesis of 
sentences. Thus, instead of trying to resolve such 
ambiguities in the source language (SL), syntac- 
tically ambiguous sentences are synthesized in the 
target language (TL) which preserve their ambigui- 
ty, so that the user himself rather than the par- 
ser disambiguates the ambiguities in question. 
This way of handling syntactical ambiguity
may be viewed as an illustration of a more general
approach, outlined earlier (Penchev and Pericliev
1982, Pericliev 1983, Penchev and Pericliev 1984),
concerned also with other types of ambiguities in
the SL translated by means of syntactical, and not
only syntactical, ambiguity in the TL.
In this paper, we will concentrate on the
linguistic grounds for adopting such a manner of
handling syntactical ambiguity in an English into
Bulgarian translation system.
2. PHILOSOPHY 
This approach may be viewed as an attempt to
simulate the behavior of a human translator who is
linguistically very competent, but is quite
unfamiliar with the domain of the texts he is
translating. Such a translator will be able to say
which words in the original and in the translated
sentence go together under all of the syntactically
admissible analyses; however, he will, in general,
be unable to decide which of these parses "make
sense". Our approach is an obvious way out of this
situation, and it is in fact not infrequently
employed in the everyday practice of more "smart"
translators.
We believe that the capacity of such transla- 
tors to produce quite intelligible translations is 
a fact that can have a very direct bearing on at 
least some trends in MT. Resolving syntactical
ambiguity, or, to put it more accurately, evading
syntactical ambiguity in MT following a similar 
human-like strategy is only one instance of this. 
There are two further points that should be 
made in connection with the approach discussed. 
We assume as more or less self-evident that: 
(i) MT should not be intended to explicate 
texts in the SL by means of texts in the TL as 
previous approaches imply, but should only tran- 
slate them, no matter how ambiguous they might 
happen to be; 
(ii) Since ambiguities almost always pass un- 
noticed in speech, the user will unconsciously 
disambiguate them (as in fact he would have done,
had he read the text in the SL); this, in effect, 
will not diminish the quality of the translation 
in comparison with the original, at least insofar 
as ambiguity is concerned. 
3. THE DESCRIPTION OF SYNTACTICAL AMBIGUITY 
IN ENGLISH AND BULGARIAN 
The empirical basis of the approach is provided
by an extensive study of syntactical ambiguity
in English and Bulgarian (Pericliev 1983),
accomplished within the framework of a version of
dependency grammar using dependency arcs and
bracketings. In this study, from a given list of
configurations for each language, all logically
admissible ambiguous strings of three types in
English and Bulgarian were calculated. The first
type of syntactically ambiguous strings is of the
form:
(1) A --1,2--> B, e.g.
    The statistician studied(V) the whole year(PP),
      studied --> the whole year: adv.mod (how long?)
      studied --> the whole year: obj.dir (what?)
where A, B, ... are complexes of word-classes,
"-->" is a dependency arc, and 1, 2, ... are
syntactical relations.
The second type is of the form:
(2) A --1--> B <--2-- C, e.g.
    She greeted(V) the girl(N) with a smile(PP)
      greeted --> with a smile: adv.mod (how?)
      the girl --> with a smile: attrib (what?)
The third type is of the form:
(3) A --1--> B <--1-- C, e.g.
    He failed(V) entirely(Adv) to cheat(Vinf) her
      failed --> entirely: adv.mod (how?)
      to cheat --> entirely: adv.mod (how?)
It was found, first, that almost all logically
admissible strings of the three types are actually
realized in both languages (cf. the same result
also for Russian in Jordanskaja (1967)). Secondly,
and more important, there turned out to be a
striking coincidence between the strings in English
and Bulgarian; the latter was to be expected from
the coincidence of configurations in both languages
as well as from their sufficiently similar global
syntactic organization.
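The example sentences of this section can be encoded as
sets of alternative dependency arcs. The following is a
minimal sketch in Python, not the formalism of the study
itself: the names Arc and readings are assumptions
introduced here for illustration, and only the example
sentences and relation labels are taken from the text.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Arc:
    """One labelled dependency arc (illustrative encoding only)."""
    master: str    # governing word or phrase
    slave: str     # dependent word or phrase
    relation: str  # syntactical relation label


def readings(alternatives):
    """Enumerate the readings of a string.

    `alternatives` is a list of ambiguous points; each point is a list
    of mutually exclusive arcs. A reading picks one arc per point.
    """
    return [frozenset(choice) for choice in product(*alternatives)]


# One master, one slave, two possible relations.
type1 = [[Arc("studied", "the whole year", "adv.mod"),
          Arc("studied", "the whole year", "obj.dir")]]

# One slave, two possible masters.
type2 = [[Arc("greeted", "with a smile", "adv.mod"),
          Arc("girl", "with a smile", "attrib")]]
type3 = [[Arc("failed", "entirely", "adv.mod"),
          Arc("to cheat", "entirely", "adv.mod")]]

for name, alts in [("type 1", type1), ("type 2", type2), ("type 3", type3)]:
    print(name, "readings:", len(readings(alts)))
```

Each of the three strings has two readings under this
encoding; concatenating several ambiguous points
multiplies the number of readings.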
4. TRANSLATIONAL PROBLEMS 
With a view to the aims of translation, it
was convenient to distinguish two cases: Case A, in
which to each syntactically ambiguous string in
English there corresponds a syntactically ambiguous
string in Bulgarian, and Case B, in which some
English strings have no corresponding Bulgarian
ones. Case A provides a possibility for literal
English into Bulgarian translation, while there is
no such possibility for sentences containing
strings classed under Case B.
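The distinction can be pictured as a lookup over string
patterns. The sketch below is only an illustration under
stated assumptions: the table CORRESPONDENCES, the encoding
of a string as a tuple of word-class labels, and the
function classify are hypothetical, and the three entries
merely mirror examples (5), (8) and (10) of this paper.

```python
from typing import Dict, Optional, Tuple

Pattern = Tuple[str, ...]

# Hypothetical correspondence table: English ambiguous pattern ->
# Bulgarian ambiguous pattern, or None when no ambiguous Bulgarian
# string could be put into correspondence.
CORRESPONDENCES: Dict[Pattern, Optional[Pattern]] = {
    ("V", "N", "Adv"): ("V", "N", "Adv"),   # cf. example (5)
    ("V", "Vinf"): ("V", "da-constr"),      # cf. example (8)
    ("V", "N", "N"): None,                  # cf. example (10)
}


def classify(pattern: Pattern) -> str:
    """Classify an English pattern by the kind of translation it allows."""
    if pattern not in CORRESPONDENCES:
        return "not in table"
    target = CORRESPONDENCES[pattern]
    if target is None:
        return "Case B: no ambiguous correspondence"
    if target == pattern:
        return "Case A: literal"
    return "Case B: non-literal"


for p in CORRESPONDENCES:
    print(p, "->", classify(p))
```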
4.1. Case A: Literal Translation 
English strings which can be literally
translated into Bulgarian comprise, roughly
speaking, the majority and the most common of the
strings that appear in real English texts.
Informally, these strings can be grouped into
several large classes of syntactically ambiguous
constructions, such as constructions with
"floating" word-classes (Adverbs, Prepositional
Phrases, etc. acting as slaves either to one or to
another master-word), constructions with
pre-positional and post-positional adjuncts to
conjoined groups, constructions with several
conjoined members, constructions with symmetrical
predicates, some elliptical constructions, etc.
Due to space limitations, a few English phra- 
ses with their literal translations will suffice 
as an illustration of Case A. (Further on, syntac- 
tical relations as labels of arcs will be omitted 
where superfluous in marking the ambiguity): 
(4) a review(N) of a book(PP) [...](PP) ==>
    ==> retsenzija(N) [...](PP) [...](PP)
(5) I saw(V) the car(N) outside(Adv) ==>
    ==> Az vidjah(V) kolata(N) navan(Adv)
(6) very(Adv) modest(Adj) and reasonable(Adj) ==>
    ==> mnogo(Adv) skromen(Adj) i razumen(Adj)
(7) beautiful(Adj) women(N) and girls(N) ==>
    ==> krasivi(Adj) zeni(N) i momicheta(N)
4.2. Case B: Non-Literal Translation 
English strings which cannot be literally
translated into Bulgarian are strings which
contain: (i) word-classes (Vinf, Gerund) not
present in Bulgarian, and/or (ii) syntactical
relations (e.g. "composite": language <-- theory,
etc.) not present in Bulgarian, and/or (iii) other
differences (in global syntactical organization,
agreement, etc.).
It will be shown how certain English strings
falling under this heading are related to Bulgarian
strings preserving their ambiguity. A way to
overcome difficulties with (ii) and (iii) is
exemplified on a very common (complex) string, viz.
Adj/N/Prt+N/N's+N (e.g. stylish gentlemen's suits).
As an illustration, we confine ourselves here
to the problems met with (i), and, more concretely,
to English strings containing Vinf. These strings
are mapped onto Bulgarian strings containing a
da-construction or a verbal noun (Vinf generally
being translated either way). E.g. the Vinf in
(8) a. He promised(V) to please(Vinf) mother
         promised --> to please: obj.dir or adv.mod
(promised what or why?) is rendered by a
da-construction in agreement with the subject,
preserving the ambiguity:
    b. Toj obeshta(V) da zaradva(da-constr) majka
         obeshta --> da zaradva: obj.dir or adv.mod
In the string
(9) a. I have(V) instructions(N) to study(Vinf)
         instructions / to study: attrib or obj.dir
(what instructions or I have to study what?) Vinf
can be rendered alternatively by a da-construction
or by a prepositional verbal noun:
    b. Az imam(V) instruktsii(N) da ucha(da-constr)
         instruktsii / da ucha: attrib or obj.dir
    c. instruktsii(N) za uchene(PrVblN)
         instruktsii / za uchene: attrib or obj.dir
Yet in other strings, e.g. The chicken(N) is
ready(Adj) to eat(Vinf) (the chicken eats or is
eaten?), in order to preserve the ambiguity the
infinitive should be rendered by a prepositional
verbal noun: Pileto(N) e gotovo(Adj) za jadene
(PrVblN), rather than with the finite
da-construction, since in the latter case we would
obtain two unambiguous translations: Pileto e
gotovo da jade (the chicken eats) or Pileto e
gotovo da se jade (the chicken is eaten), and so on.
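The choice among candidate renderings can be stated as a
simple filter: keep only those target strings whose set of
readings equals that of the source string. The sketch
below is an illustration only; the reading labels, the
dictionary CANDIDATES and the function
ambiguity_preserving are assumptions made here, with the
reading sets filled in by hand from the discussion above.

```python
# Readings of the English source "The chicken is ready to eat".
SOURCE_READINGS = frozenset({"chicken eats", "chicken is eaten"})

# Candidate Bulgarian renderings with their hand-assigned readings.
CANDIDATES = {
    "Pileto e gotovo za jadene": frozenset({"chicken eats",
                                            "chicken is eaten"}),
    "Pileto e gotovo da jade": frozenset({"chicken eats"}),
    "Pileto e gotovo da se jade": frozenset({"chicken is eaten"}),
}


def ambiguity_preserving(source_readings, candidates):
    """Return the candidates that carry exactly the source readings."""
    return [text for text, target_readings in candidates.items()
            if target_readings == source_readings]


print(ambiguity_preserving(SOURCE_READINGS, CANDIDATES))
```

Under these assumptions only the rendering with the
prepositional verbal noun passes the filter.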
For some English strings no syntactically
ambiguous Bulgarian strings could be put into
correspondence, so that a translation with our
method proved to be an impossibility. E.g.
(10) He found(V) the mechanic(N) a helper(N)
       the mechanic: obj.dir, a helper: predicative
       the mechanic: obj.indir, a helper: obj.dir
(either the mechanic or someone else is the helper)
is such a sentence, because in Bulgarian two
non-prepositional objects, a direct and an indirect
one, cannot appear in one sentence.
4.3. Multiple Syntactical Ambiguity
Many very frequently encountered cases of mul- 
tiple syntactical ambiguity can also be handled 
successfully within this approach. E.g. a phrase 
like Cybernetical devices and systems for automatic 
control and diagnosis in biomedicine with more than
30 possible parsings is amenable to literal trans- 
lation into Bulgarian. 
4.4. Semantically Irrelevant Syntactical 
Ambiguity
Disambiguating syntactical ambiguity is an im- 
portant task in MT only because different meanings 
are usually associated with the different syntac- 
tical descriptions. This, however, is not always 
the case. There are some constructions in English 
the syntactical ambiguity of which cannot lead to 
multiple understanding. E.g. in sentences of the 
form A is not B (He is not happy), in which the ad- 
verbial particle not is either a verbal negation 
(He isn't happy) or a non-verbal negation (He's not 
happy), the different syntactical trees will be
interpreted semantically as synonymous: 'A is not B'
<==> 'A is not-B'.
We should not worry about finding Bulgarian
syntactically ambiguous correspondences for such 
English constructions. We can choose arbitrarily 
one analysis, since either of the syntactical des- 
criptions will provide correct information for 
our translational purposes. Indeed, the construc- 
tion above has no ambiguous Bulgarian correspon- 
dence: in Bulgarian the negating particle combines 
either with the verb (then it is written as a se- 
parate word) or with the adjective (in which case 
it is prefixed to it). Either construction, however,
will yield a correct translation: Toj ne e
radosten or Toj e neradosten.
4.5. A Lexical Problem 
Certain difficulties may arise even after
English syntactically ambiguous strings have been
mapped onto ambiguous Bulgarian ones. These
difficulties are due to the different behavior of
certain English lexemes in comparison to their
Bulgarian equivalents. This behavior is displayed
in the phenomenon we call "intralingual
lexical-resolution of syntactical ambiguity" (the
substitution of lexemes in the SL with their
translational equivalents from the TL results in
the resolution of the syntactical ambiguity).
For instance, in spite of the existence of
ambiguous strings in both languages of the form
Verb(tr/itr) --> Noun, with some particular lexemes
(e.g. shoot(tr/itr) ==> zastreljam(tr) or
streljam(itr)), in which to one English lexeme
there correspond two in Bulgarian (one only
transitive, and the other only intransitive), the
ambiguity in the translation will be lost. This
situation explains why it seems impossible to
translate ambiguously into Bulgarian examples
containing verbs of the type given, or verbal nouns
formed from such verbs, as is the case in The
shooting of the hunters. This problem, however,
could be generally tackled in the translation into
Bulgarian, since it is a language usually providing
a series of forms for a verb: transitive,
intransitive, and transitive/intransitive, which
are more or less synonymous (for more details, cf.
Penchev and Pericliev (1984)).
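The lexical check described in this section can be
sketched as a comparison of transitivity values. What
follows is a hedged illustration, not an actual lexicon:
the dictionaries EN_LEXICON and BG_EQUIVALENTS and the
function preserving_equivalents are assumptions, and the
entries x_tr, x_itr and x_tr_itr are placeholders standing
for a series of verb forms, not real Bulgarian words.

```python
# English lexeme -> transitivity values it admits (hypothetical lexicon).
EN_LEXICON = {
    "shoot": {"tr", "itr"},
    "verb_x": {"tr", "itr"},   # placeholder English verb
}

# English lexeme -> Bulgarian equivalents with their transitivity values.
BG_EQUIVALENTS = {
    "shoot": {"zastreljam": {"tr"}, "streljam": {"itr"}},
    # Placeholder series: transitive, intransitive, and both.
    "verb_x": {"x_tr": {"tr"}, "x_itr": {"itr"}, "x_tr_itr": {"tr", "itr"}},
}


def preserving_equivalents(en_lexeme):
    """Return the equivalents that admit every transitivity value of
    the English lexeme, i.e. those that keep the ambiguity."""
    needed = EN_LEXICON[en_lexeme]
    return sorted(bg for bg, values in BG_EQUIVALENTS[en_lexeme].items()
                  if needed <= values)


for lexeme in EN_LEXICON:
    print(lexeme, "->", preserving_equivalents(lexeme))
```

An empty result, as for shoot with only the two
equivalents listed, signals that the substitution of
lexemes resolves the ambiguity; a form admitting both
values keeps it.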
5. CONCLUDING REMARKS 
To conclude, some syntactically ambiguous
strings in English have literal correspondences in
Bulgarian, others non-literal ones, and still
others none at all. In summary, of a total of
approximately 200 simple strings treated in
English, more than 3/4 can, and fewer than 1/4
cannot, be literally translated; about half of the
latter strings can be put into correspondence with
syntactically ambiguous strings in Bulgarian
preserving their ambiguity. This gives quite strong
support to the usefulness of our approach in an
English into Bulgarian translation system.
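The proportions just given can be turned into rough
counts. The calculation below is only a worked
illustration: it assumes exactly 200 strings, exactly 3/4
literally translatable, and exactly half of the remainder
translatable non-literally, whereas the text states these
figures only approximately.

```python
from fractions import Fraction

total = 200                             # "approximately 200" in the text
literal_share = Fraction(3, 4)          # "more than 3/4", taken as exact
nonliteral_share_of_rest = Fraction(1, 2)   # "about half of the latter"

literal = total * literal_share                 # literally translatable
rest = total - literal                          # not literally translatable
nonliteral = rest * nonliteral_share_of_rest    # ambiguity kept non-literally
uncovered = rest - nonliteral                   # no ambiguous correspondence
coverage = (literal + nonliteral) / total       # share handled by the approach

print("literal:", literal)
print("non-literal:", nonliteral)
print("uncovered:", uncovered)
print("coverage:", float(coverage))
```

On these rounded assumptions about 150 strings are handled
literally, about 25 non-literally, and about 25 remain
outside the approach, i.e. roughly 7/8 of the strings are
covered.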
Several advantages of this way of handling of 
syntactical ambiguity can be mentioned. 
First, in the processing of the majority of 
syntactically ambiguous sentences within an En- 
glish into Bulgarian translation system it dispen- 
ses with semantical and world knowledge information 
at the very low cost of studying the ambiguity co- 
rrespondences in both languages. It could be expec- 
ted that investigations along this line will prove 
to be fruitful for other pairs of languages as
well. 
Secondly, whenever this way of handling
syntactical ambiguity is applicable, the
translation of sentences with unresolvable
ambiguity, or with verbal jokes and the like, which
was impossible for previous approaches, turns out
to be an easily attainable task.
Thirdly, the approach seems to have a very
natural extension to another principal difficulty
in MT, viz. coreference (cf. the three-way
ambiguity of Jim hit John and then he (Jim, John or
neither?) went away and the same ambiguity of toj
(=he) in its literal translation into Bulgarian:
Djim udari Djon i togava toj(?) si otide).
And, finally, there is yet another reason for 
adopting the approach discussed here. Even if we 
choose to go another way and (somehow) disambiguate
sentences in the SL, almost certainly their
translational equivalents will again be
syntactically ambiguous, and quite probably
preserve the very ambiguity we tried to resolve. In
this sense, for the purposes of MT (or other
man-oriented applications of CL) we need not waste
our efforts to disambiguate e.g. sentences like
John hit the dog with the long bat or John hit the
dog with the long wool, since, even if we have done
that, the correct Bulgarian translations of both
these sentences are syntactically ambiguous in
exactly the same way, the resolution of ambiguity
thus proving to be an entirely superfluous
operation (cf. Djon udari kucheto s dalgata palka
and Djon udari kucheto s dalgata valna).
6. REFERENCES 
Jordanskaja, L. 1967. Syntactical ambiguity in
    Russian (with respect to automatic analysis
    and synthesis). Scientific and Technical
    Information, Moscow, No.5, 1967. (in Russian).
Penchev, J. and V. Pericliev. 1982. On meaning in
    theoretical and computational semantics. In:
    COLING-82, Abstracts, Prague, 1982.
Penchev, J. and V. Pericliev. 1984. On meaning in
    theoretical and computational semantics.
    Bulgarian Language, Sofia, No.4, 1984. (in
    Bulgarian).
Pericliev, V. 1983. Syntactical Ambiguity in
    Bulgarian and in English. Ph.D. Dissertation,
    ms., Sofia, 1983. (in Bulgarian).
