Particle Homonymy and Machine Translation 
Károly Fábricz
JATE University of Szeged, 
Egyetem u. 2. 
Hungary H - 6722
Abstract 
The purpose of this contribution is to 
formulate ways in which the homonymy of so- 
called 'Modal Particles' and their etymons 
can be handled. Our aim is to show that not 
only a strategy for this type of homonymy 
can be worked out, but also a formalization 
of information beyond propositional content 
can be introduced with a view to its MT ap- 
plication. 
1. Introduction 
During the almost 40 years of its exist- 
ence machine translation has undergone a con- 
siderable refinement in the fields of both 
syntactic parsing and semantic representa- 
tion. The development of MT can be seen as a 
tendency to incorporate more and more
linguistic knowledge into the formalization of
translational processes. Formalization has 
thus become a keyword for MT and has had
several major implications. Firstly, it refers
to the hypothesis that everything related to
a given language is structured in
one way or another. Secondly, formalization
is an objective means of testing
the validity of the linguist's hypotheses
about linguistic phenomena. Thirdly, it
involves the linguist's hope that anything
that has to do with language can in
fact be formalized.
At present, there are several semantic 
theories which could be labelled "formal se- 
mantics". They are preoccupied with explo- 
ring the propositional content of different 
text-units and they do not deal with the 
phenomenon of "subjectivity". Subjectivity, 
or self-expression, as Lyons /1981, 240/ has
pointed out, "cannot be reduced to the ex- 
pression of propositional knowledge and be- 
liefs". If we think of MT in its ideal form, 
i.e. not as an abstracting device, but as a 
system producing automatic translation, then 
the inadequacy of restriction to
propositional content will be evident.
The present paper sets out to show that 
the expression of lexical subjectivity, con- 
veyed by modal particles, should, and can, 
be accounted for in the process of MT. 
2. Particle Homonymy
Let us consider the following pairs of 
sentences: 
1 a. There is only a little beer left.
b. I was only too pleased to leave that
place.
2 a. Nur ihn hatte man vergessen.
b. Wozu habe ich nur gelebt?
3 a. Vous partez déjà?
b. Comment vous vous appelez déjà?
4 a. Приходите к нам и завтра.
b. Я и не знаю, что сказать.
5 a. Anna is eljön hozzánk?
b. Hol is tartottunk?
The words underlined in the b. example of
each pair of sentences /only, nur, déjà, и,
is/ belong to a word-group now more or less
uniformly referred to as 'Modal Particles'
/cf. Arndt 1960/.
These words represent, in Arndt's term,
a grammatical no-man's-land, although in the
past ten years there has been considerable
interest in modal particles.
Words like the English only or the
German nur present two problems from the point
of view of machine translation. On the one
hand, they are ambiguous and their homonymy
must be resolved. On the other hand, when
such lexemes are used as modal particles,
their "translation" causes serious problems
since we can rarely translate the modal only
into German as nur, or, say, into Hungarian
as csak.
3. Resolution of Homonymy
As far as homonymy is concerned, clear- 
ly the task is to set up formal rules for 
the categorization of a given word as op- 
posed to its alternative morphological and 
syntactic status. 
The implication of the assignation of 
such homonymous lexemes to certain classes 
of words is by no means a matter of "simple" 
selection restriction at surface level. Each 
modal particle has preserved much of its 
etymon's syntactic and semantic properties. 
Given this, it follows that the ambiguity 
may be resolved by constructing small "sub- 
grammars" for each of these particles, so as 
not only to set them apart from their homo- 
nyms, but also to take into consideration 
the whole communicative content of the
sentence.
Thus, a subgrammar recognizing only,
either as a logical operator, with its
restrictive meaning, or as a modal particle,
with its vague and, in a sense, antonymous
meaning, would have to be capable of
manipulating information from different levels.
By comparing sentences /1a/ and /1b/ it could
be concluded that, say, only is an operator
when it precedes an NP /e.g. Det + Adj + N/
and is a particle when followed by too. But
this assumption can readily be proved faulty
by considering /6/:
/6/ If only you had come, you could have
saved me a lot of trouble.
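The failure of the surface rule just described
can be made concrete with a minimal sketch. The
token lists, the coarse tags and the function
name naive_only are illustrative assumptions,
not part of any existing parser; the rule looks
only at the token to the right of only.

```python
# Hypothetical hand-tagged tokens: (word, coarse part-of-speech tag).
# The tag "?" marks the homonym whose category is still unknown.
S1A = [("There", "EX"), ("is", "V"), ("only", "?"), ("a", "Det"),
       ("little", "Adj"), ("beer", "N"), ("left", "V")]
S1B = [("I", "Pron"), ("was", "V"), ("only", "?"), ("too", "Adv"),
       ("pleased", "Adj"), ("to", "To"), ("leave", "V"),
       ("that", "Det"), ("place", "N")]
S6 = [("If", "Conj"), ("only", "?"), ("you", "Pron"), ("had", "V"),
      ("come", "V")]


def naive_only(tokens):
    """Classify the first 'only' by inspecting its right neighbour."""
    words = [word.lower() for word, _ in tokens]
    i = words.index("only")
    if i + 1 >= len(tokens):
        return "undecided"
    next_word, next_tag = tokens[i + 1]
    if next_word.lower() == "too":
        return "particle"
    if next_tag == "Det":  # start of an NP such as Det + Adj + N
        return "operator"
    return "undecided"


for name, sentence in (("1a", S1A), ("1b", S1B), ("6", S6)):
    print(name, naive_only(sentence))
```

Under these assumptions the rule decides /1a/
and /1b/ but leaves /6/ undecided, since only is
there followed by a pronoun rather than by an NP
of the assumed shape or by too.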
It is commonly held that, in order to 
parse sentences, one needs strategies for 
locating verbs and their complements, assign- 
ing words to various categories, depending 
on context /Lehrberger 1982, 102/. The rec- 
ognition of particles can be done mainly by 
starting from semantic representations which 
should contain information concerning both 
the propositional content of sentences and 
their extrapropositional, or subjective
modal content. Thus, assigning only to
particles would imply an algorithm roughly
defined as: "If the lexeme only is used with a
word that has no restrictive component in
its meaning, then it is a particle;
otherwise it is an operator".
Parsing along these lines would mean a 
very complicated presentation of different 
parts of speech, including not only NPs, 
made up of adjectives, nouns, but also ad- 
verbs, pronouns and even phrases to account 
for only constructions like /6/. In addition,
a very sophisticated and precise definition 
of the restriction/non-restriction opposi- 
tion would have to be set up. 
Obviously, the difficulty of assigning 
homonymous lexemes to modal particles, on 
the one hand, and to operators, intensifiers, 
adverbs, conjunctions, and the like, on the
other, lies in the fact that the former bear 
a relationship to the overall meaning of the 
sentence, while the latter add their meaning 
to the global meaning only via some lower 
level of semantic structure. 
From the above consideration it follows 
that it would be a fairly tedious and prob- 
ably unreasonable task to attempt to resolve 
this kind of homonymy by the algorithmiza- 
tion of abstract sense-components. 
Instead, it might be sufficient to
construct a subgrammar that checks only and
other homonyms solely for their use as a
particle. One way to make the information
contained in the subgrammar available to the
parser may be to indicate, in the dictionary
entry of the homonym, all the cases in which
the given word could possibly appear as a
particle.
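A dictionary entry of this kind might be
sketched as follows. The class Entry, its field
names and the two listed contexts /"if only" and
"only too", taken from examples /6/ and /1b/
above/ are assumptions made for illustration;
a real entry would list every particle context
of the homonym.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical dictionary entry for a homonymous lexeme."""
    lemma: str
    default: str  # category assigned when no particle context matches
    # Each context is (side, neighbouring word), side = "left" or "right".
    particle_contexts: list = field(default_factory=list)

    def categorize(self, left, right):
        """Return 'particle' if a listed context matches, else the default."""
        for side, word in self.particle_contexts:
            neighbour = left if side == "left" else right
            if neighbour is not None and neighbour.lower() == word:
                return "particle"
        return self.default


# Illustrative entry; the context list is deliberately not exhaustive.
ONLY = Entry("only", "operator", [("left", "if"), ("right", "too")])

print(ONLY.categorize("If", "you"))   # context of /6/
print(ONLY.categorize("was", "too"))  # context of /1b/
print(ONLY.categorize("is", "a"))     # context of /1a/
```

The parser consults the entry only for particle
status; every other reading falls through to the
default category, which keeps the abstract
restriction/non-restriction opposition out of
the rules.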
In English, or French, the resolution 
of ambiguity would mean setting up as few as 
6-10 subgrammars, while in German, Russian 
or Hungarian there are scores of homonymous 
particles and, consequently, subgrammars. In 
addition, the latter languages make quite 
frequent use of particle combinations which 
do not, as a rule, derive their meanings 
from a complement of the two /or more/ par- 
ticles, but have some different meaning, cf. 
/7/ Csak nem fáztál meg?
/8/ Уж не простудился ли ты?
Nevertheless, there seems to be no reason 
why these combinations could not be included 
in the subgrammar under one or the other dic- 
tionary entry. 
4. Translation of Modal Particles 
Whilst intensifiers, conjunctions,
operators, pronouns, or adverbs have meanings
which may be considered more or less
"universal", the semantics of particles takes us
into a field specific to a particular
language. In other words, using only as an
operator is "almost" identical to using, say,
nur, or seulement, or csak etc. as an
operator in German, French or Hungarian
respectively. But when it comes to particles, we
may experience difficulties in preserving
the operator equivalent of only in the
translation of sentences like /1b/ into any other
language.
One possible solution, as with lots of 
different types of translation, would simply 
be to consider these words irrelevant from 
the point of view of propositional content 
matching. However, it would seem more plaus- 
ible to try to find equivalents to these par- 
ticles in the target language since, depend- 
ing on the type of context to be translated, 
the expression of subjectivity may play a 
major role in producing the actual
communicative message.
Functional equivalence is a notion fre- 
quently used in linguistic theory /Arnol'd 
1976; Sanders 1980/, and it can be applied 
as a yardstick in particle matching /Fig. 1/.
A study of modal particle translation is now 
being undertaken in Szeged University's Eng- 
lish-Hungarian MT project and it is based 
on functional equivalence. 
Those researchers who study MT in re- 
stricted semantic domains might overlook 
the problem of the subjectivity of the dif- 
ferent texts. It should be noted, however, 
that "most of the unexpected structures one 
finds in a sublanguage text can be associ- 
ated not so much with a shift in semantic 
domain as with a shift /usually quite tem- 
porary/ in the attitude which the text pro- 
ducer takes towards his domain of discourse" 
/Kittredge 1982, 135/. But even with aca- 
demic papers it happens to be the case that 
during their translation one should be aware 
of the appearance of some subjective over- 
tone lest some mistranslation should ensue. 
In this respect, consider the following two 
examples with only as a particle:
/9/ Only too often have far-reaching
conclusions been drawn from inadequate
data collected from a limited number
of languages. /Ullmann: Semantic
Universals, 1966, p. 218/
/10/ Similarly, it is only natural that
verbs for "snoring" should in many 
languages contain an /r/... /Op. cit. 
p. 225/ 
The foregoing considerations lead us 
to the following sketchy representation of 
only:
ONLY - MP if - preceded by if + optative         A
             - followed by too + adverb/adject.  B
             - followed by ADJ                   C
     - else LO                                   D

A translates as a: Bárcsak + cond.            if Simple Sentence
                b: Ha + [illegible] + cond.   if Complex Sentence
B translates as is
C translates as csak
D translates as csak

Fig. 1. Subgrammar of only based on its
Hungarian functional equivalents
MP = Modal Part. LO = Log. operator
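The decision table of Fig. 1 can be read as a
short cascade of tests. The sketch below is one
possible rendering, not the project's actual
implementation: the function name, its
parameters and the tag "ADJ" are assumptions,
the caller is assumed to supply the optative
and sentence-type information, and case B is
simplified to a test for a following too.

```python
def only_subgrammar(prev_word, next_word, next_tag,
                    optative=False, complex_sentence=False):
    """Return (category, case, Hungarian equivalent) for one 'only'.

    category is "MP" (modal particle) or "LO" (logical operator);
    case is the letter A-D used in Fig. 1.
    """
    prev_word = (prev_word or "").lower()
    next_word = (next_word or "").lower()
    if prev_word == "if" and optative:
        # Case A. The middle element of sub-case b is not legible in
        # the source figure, so it is left open here as "...".
        if complex_sentence:
            return ("MP", "A", "Ha + ... + cond.")
        return ("MP", "A", "Bárcsak + cond.")
    if next_word == "too":
        # Case B, simplified: the adverb/adjective after "too" is
        # not checked in this sketch.
        return ("MP", "B", "is")
    if next_tag == "ADJ":
        return ("MP", "C", "csak")
    # Everything else is treated as the logical operator (case D).
    return ("LO", "D", "csak")


# Contexts taken from examples /6/, /9/, /10/ and /1a/.
print(only_subgrammar("If", "you", "PRON", optative=True,
                      complex_sentence=True))
print(only_subgrammar(None, "too", "ADV"))
print(only_subgrammar("is", "natural", "ADJ"))
print(only_subgrammar("is", "a", "DET"))
```

The order of the tests matters: too is checked
before the adjective test so that "only too" is
never misread as case C.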
