MODALS AS A PROBLEM FOR MT 
by Bengt Sigurd & Barbara Gawrońska
Dept of Linguistics and Phonetics, Lund University, Sweden
e-mail: Bengt.Sigurd@ling.lu.se
Summary 
This paper demonstrates the problem of
translating modal verbs and phrases and shows
how some of these problems can be overcome
by choosing semantic representations which
look like representations of passive verbs.
These semantic representations suit alternative
ways of expressing modality by e.g. passive
constructions, adverbs and impersonal
constructions in the target language. Various
restructuring rules for English, Swedish and
Russian are presented.
Introduction 
Modal verbs are among the most frequent
English words. Clauses with modal
expressions make up a considerable part of the
clauses of any text, so any MT system
which is claimed to cover empirical texts with
reasonable quality must include solutions to the
problems discussed in this paper (cf. An et al.,
1993 for a corpus-based approach). One of the
problems connected with the analysis of such
clauses is the fact that the distinction between
auxiliaries and modals is not clear. Verbs like
ought to and dare (to) are often labelled
semi-auxiliaries, begin and continue are called
aspectual verbs etc. A common denominator of
auxiliaries, semi-auxiliaries, modals and several
perception verbs is their "auxiliary meaning",
including tense, modality and aspectual
perspective (cf. Gawrońska, 1993). In the
following, the term "auxiliary" will be used
even when referring to verbs traditionally
called modals and perception verbs.
The typical "auxiliary" meanings, e.g. 
modality, aspectual perspective and tense show 
great encoding variation between languages. 
And even within one language one may often 
choose between several different lexical- 
grammatical modes. In English one may 
choose between X may come, It is possible 
that X comes, and X possibly comes with only 
minor stylistic differences. 
Swedish kan has both a root meaning
(equivalent to is able to) and an epistemic
meaning (equivalent to may), while English
can has only the first meaning. The mode of
encoding auxiliary meanings may be even
more differentiated in other languages. Thus,
He must come has to be rendered by the
passive construction Il est obligé de venir or
the impersonal construction Il faut qu'il vienne
in French. In Russian one would have to
render this sentence using an adjective
(dolžen) or an adverb (nado, neobxodimo).
Japanese would have to use konakereba
narimasen (literally: "It won't do if X does not
come"). MT systems dealing with a certain
pair of languages may tailor the meaning
representations of auxiliaries ad hoc, but
multi-language systems such as Swetra, the Swedish
Computer Translation Project (Sigurd &
Gawrońska, 1990), must choose more
universal representations and suitable
restructuring transfer rules, as will be demonstrated.
The problem of modality is also of general
interest for linguistic, semantic and cognitive
theory (Sweetser, 1990).
English Verb Phrases 
There are two basic types of verb phrases in
English (cf. Sigurd, 1992), one (1) consisting
only of a finite main verb (with possible
complements), e.g. Bill jumps, the other (2)
consisting of a finite auxiliary verb followed
by a non-finite main verb (with possible
complements), e.g. Bill must jump. The
non-finite main verb in the second type may be in
the infinitive without to as illustrated, or in an
infinitive with to as in Bill began to jump. The
non-finite verb may also be a past participle as
in Bill has jumped or a present participle as in
Bill began jumping. The choice between
non-finite forms is an automatic consequence of the
preceding verb. We note that the have meaning
perfect tense takes the past participle as in
Bill has jumped, while the have which is an
equivalent of must takes the infinitive with to
as in Bill has to jump. The verb begin may take
an infinitive with to (Bill began to jump) or
alternatively a present participle (Bill began
jumping). The auxiliaries in other languages,
e.g. German and Swedish, display similar
combinatorial properties.
The second type of verb phrase mentioned may
be expanded to include further non-finite
auxiliaries, as illustrated in: Bill must dare to
begin to jump and Bill must dare to begin to be
able to jump. Occasionally there may be a short
adverb between the non-finite forms as in Bill
must dare not to jump and even after a to as in
Bill must dare to not jump, although the so-called
split infinitive construction is condemned by
prescriptive grammarians.
Semantic Representations of Verb
Phrases
The meaning complexes corresponding to the
verb phrases described may simply be
rendered as lists of the constituent verb
meanings, where the tense of the finite verb is
shown, but the particular form of non-finite
verbs and the occurrence or non-occurrence of
infinitive markers are not shown. This is the
approach taken by Swetra. The lexical meaning
representations or semantic markers in Swetra
are of the form m(S, G), where m denotes
meaning, S is the main meaning of the word
denoted by a kind of Machinese English and G
the grammatical meaning. The verb form
jumps has the representation m(jump, pres).
The meanings of the infinitive form (to) jump,
the past participle jumped and the present
participle jumping are all rendered as m(jump,
nonf). The following table shows some
preliminary meaning representations of verb
phrases, each printed under its phrase.
jumps
[m(jump, pres)]
began to jump
[m(jump, nonf), m(begin, past)]
began jumping
[m(jump, nonf), m(begin, past)]
dare begin to jump
[m(jump, nonf), m(begin, nonf), m(dare, pres)]
may be able to jump
[m(jump, nonf), m(able, nonf), m(may, pres)]
was painted
[m(paint, nonf), m(passive, past)]
The semantic representations illustrated have
the main verb first and the order of the verbs is
thus reversed if compared to English. The
order chosen is arbitrary. We have illustrated
the representation of a passive phrase was
painted as well. This representation is also
used for the Swedish morphological passive
which is målades (there is also a syntactic
passive in Swedish: blev målad).
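The mapping from a verb chain to its meaning list can be sketched in a few lines of Prolog. This is only an illustration, not Swetra's actual code: the predicates form/3, word_meaning/2 and vp_meaning/2 and the tiny lexicon are assumptions made for the example. The sketch drops the infinitive marker to, maps every verb form to an m(S, G) term and reverses the order so that the main verb comes first.

```prolog
:- use_module(library(apply)).
:- use_module(library(lists)).

% form(Word, MainMeaning, GrammaticalMeaning): hypothetical mini-lexicon.
% All non-finite forms share the grammatical meaning nonf.
form(jumps,   jump,  pres).
form(jump,    jump,  nonf).
form(jumping, jump,  nonf).
form(jumped,  jump,  nonf).
form(began,   begin, past).
form(begin,   begin, nonf).
form(dare,    dare,  pres).

% word_meaning(+Word, -Meaning): look up the m/2 term for one verb form.
word_meaning(Word, m(Stem, Gram)) :-
    form(Word, Stem, Gram).

% vp_meaning(+Words, -Meanings): skip the infinitive marker 'to',
% translate each verb form, and put the main verb first.
vp_meaning(Words, Meanings) :-
    exclude(==(to), Words, Verbs),
    maplist(word_meaning, Verbs, InSurfaceOrder),
    reverse(InSurfaceOrder, Meanings).
```

A query such as vp_meaning([began, to, jump], M) and the query vp_meaning([began, jumping], M) both bind M to [m(jump, nonf), m(begin, past)], which mirrors the point that the choice of non-finite form is not recorded in the representation.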
Passive Meaning Representations for
some Auxiliaries
Linguists have often talked about the meanings
of auxiliaries using words and concepts such
as: compulsion, obligation, permission,
ability, necessity, probability and possibility.
Words which can be associated with
compulsion and obligation, e.g. must, shall,
ought to, have been called deontic. A distinction
between a root meaning and an epistemic
meaning has been observed for a number of
auxiliaries, e.g. must, where the two meanings
are illustrated by the following two examples
(from Sweetser, 1990, p. 49).
Bill must be home by ten; Mother won't let
him stay out any longer
Bill must be home already; I see his coat
The epistemic meaning may also be rendered
by sentence adverbials as illustrated by: Bill is
evidently/seemingly home or an impersonal
expression with an adjective as in It is clear/
obvious that Bill is home. It would be an
advantage if the semantic representations of
auxiliaries could be related in a simple way to
the meaning representations of such equivalent
adverbs and adjectives.
The simplest way to represent the meanings of
auxiliaries is illustrated by m(can, pres). It is
then logical to choose m(can, nonf) for the
assumed infinitive be able to. Similarly, one
may represent the meaning of present must by
m(must, pres) and the meaning of the
corresponding infinitive be obliged to by
m(must, nonf). The phrase be obliged to can,
however, also be regarded as a passive, in
which case it would be represented as:
[m(oblige, nonf), m(passive, nonf)]. With this
representation in mind one may represent
present must as [m(oblige, nonf), m(passive,
pres)] instead and can as [m(able, nonf),
m(passive, pres)], which directly gives us the
synonym is able to.
There are further semantic arguments in favour
of representing constructions with modal verbs
in a way similar to passive clauses. The
referent of the subject in a sentence with a
modal verb is not as "agentive" as the referent
of the subject of a typical active content verb.
If the term agent is to be understood as the
element of the event-situation that is actively
involved in and responsible for the triggering
of the event-situation in question (Gawrońska
1993), it becomes clear that the subject of a
modal construction is not a typical agent. Its
responsibility for triggering the event-situation
is reduced by the obligation, allowance or
compulsion component. In Russian and
Polish, this property of the subject referent is
overtly expressed by the dative case in several
modal constructions. The equivalent of the
English or Swedish subject in Russian
sentences with nado ('it is obligatory'),
neobxodimo ('it is necessary'), nel'zja ('it is
not allowed') or Polish wolno ('it is allowed')
occurs in the dative, a case normally associated
with the semantic roles 'beneficiary' or
'experiencer'.
Passive representations of some 
English auxiliaries 
One may hesitate when choosing terms in the
meaning representations, as is obvious from
works on modals. The following are used by
Swetra.
/* allowance */
elex([m(allow, nonf), m(passive, pres)], v,
aux, fin, _, 1, inf, i, []) --> [may].
The semantic representation [m(allow, nonf),
m(passive, pres)] of the finite form may makes
it comparable to the meaning representations of
is/was allowed/permitted to and the infinitive
be allowed/permitted to. This meaning may be
illustrated by Bill may come (as he asked). The
form may may have another (epistemic)
meaning as well (below).
/* obligation */
elex([m(oblige, nonf), m(passive, pres)], v,
aux, fin, _, 1, inf, i, []) --> [must].
The representation [m(oblige, nonf),
m(passive, pres)] gives is obliged to as a
synonym, as is generally suggested in
grammars.
/* capability */
elex([m(able, nonf), m(passive, pres)], v,
aux, fin, _, 1, inf, i, []) --> [can].
This representation makes it possible to get is
able to as a direct synonym and the infinitive
be able to, which is desirable.
/* epistemic appearance */
elex([m(perceive, nonf), m(passive, pres)], v,
aux, fin, agr(pl .... ), 1, toinf, i, [])
--> [seem].
This analysis makes the phrase Bill is
perceived to come parallel to Bill seems to
come, which is reasonable, although the first
phrase seems to be too specific and implies a
latent agent. This epistemic meaning is also
expressed by grammarians by such words as:
inference, conclusion.
There are a number of epistemic expressions
which indicate the sense modality of the
perception more or less clearly, as illustrated
by: Bill is said/heard/felt to come.
/* epistemic possibility, probability, certainty */
elex([m(possible, nonf), m(passive, pres)], v,
aux, fin, _, 1, inf, i, []) --> [may].
elex([m(probable, nonf), m(passive, pres)], v,
aux, fin, _, 1, toinf, i, []) --> [ought].
elex([m(certain, nonf), m(passive, pres)], v,
aux, fin, _, 1, inf, i, []) --> [shall].
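Since the entries are written as definite clause grammar rules, the ambiguity of may can be made visible by collecting all meaning lists that the lexicon assigns to a word. The following is a minimal sketch under stated assumptions: the elex entries are copied from the text (the seem entry is left out because its agreement slot is not fully legible), the helper readings/2 is hypothetical, and nothing is assumed about the slots other than the first, which carries the meaning.

```prolog
% Lexical entries as given in the text. Only the first argument
% (the meaning list) is inspected by the helper below.
elex([m(allow, nonf),    m(passive, pres)], v, aux, fin, _, 1, inf,   i, []) --> [may].
elex([m(oblige, nonf),   m(passive, pres)], v, aux, fin, _, 1, inf,   i, []) --> [must].
elex([m(able, nonf),     m(passive, pres)], v, aux, fin, _, 1, inf,   i, []) --> [can].
elex([m(possible, nonf), m(passive, pres)], v, aux, fin, _, 1, inf,   i, []) --> [may].
elex([m(probable, nonf), m(passive, pres)], v, aux, fin, _, 1, toinf, i, []) --> [ought].
elex([m(certain, nonf),  m(passive, pres)], v, aux, fin, _, 1, inf,   i, []) --> [shall].

% readings(+Word, -Meanings): hypothetical helper that collects every
% meaning list the lexicon offers for a finite auxiliary, in clause order.
readings(Word, Meanings) :-
    findall(M,
            phrase(elex(M, v, aux, fin, _, _, _, _, _), [Word]),
            Meanings).
```

The query readings(may, Ms) yields two meaning lists, one built on allow and one on possible, while readings(must, Ms) yields a single list built on oblige. A transfer component would have to choose between the readings of may from context.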
The semantic representations of auxiliaries
must be sensitive to the presence/absence of
negation. The equivalent of English must,
when not negated, is the etymologically related
Swedish verb måste, whereas must not is not
to be rendered by måste inte ('must' not), but
by får inte ('is not allowed to'). The need for
attention to negation becomes even more
conspicuous when considering the effects of
the interplay between negation and aspect in
translation between Russian and English (cf.
Isačenko 1962: 198):
a. nado vernut' knigu
'must' return-perf book-acc
'one has to return the book/the book must be
returned'
b. nado vozvraščat' knigi
'must' return-imp books-nom/acc
'one ought to return books'
c. ne nado vozvraščat' ètu knigu
neg 'must' return-imp this book-acc
'one does not need/have to return this book'
The problem of translation between English
and Russian can be solved by lexically
encoded negation and aspect control, according
to patterns like the following:
elex([m(oblige, nonf), m(passive, pres)], v,
aux, fin, _, 1, inf, i, []) --> [must].
rlex([m(oblige, nonf), m(passive, pres)], adv,
mod, inf, [perf], 1 ..... [])
--> [nado].
The Russian lexical entry (rlex) contains the
information that an English modal verb with
the meaning code m(oblige, nonf),
m(passive, pres) is to be rendered by the
Russian modal adverb nado, provided that
nado is not negated and that it is combined
with a perfective infinitive clause [perf].
Negation would have shown in the slot now
marked [perf]. This pattern covers cases
exemplified by a. The other patterns are
handled in an analogous way.
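The dependence of the English rendering on negation and aspect in examples a-c can be summarised as a small lookup. This is a simplified sketch and not the rlex format itself: the predicates nado_reading/3, polarity/2 and english_gloss/3 are hypothetical, and the glosses are shortened versions of the translations given for a, b and c.

```prolog
% nado_reading(Polarity, Aspect, EnglishGloss): one fact per example.
nado_reading(pos, perf, 'has to').            % a. nado + perfective infinitive
nado_reading(pos, imp,  'ought to').          % b. nado + imperfective infinitive
nado_reading(neg, imp,  'does not need to').  % c. ne nado + imperfective infinitive

% polarity(+ModalWords, -Polarity): is the modal adverb negated?
polarity([nado],     pos).
polarity([ne, nado], neg).

% english_gloss(+ModalWords, +Aspect, -Gloss): choose the English
% rendering from the negation and the aspect of the infinitive.
english_gloss(ModalWords, Aspect, Gloss) :-
    polarity(ModalWords, Polarity),
    nado_reading(Polarity, Aspect, Gloss).
```

For instance, english_gloss([nado], perf, G) gives G = 'has to', while english_gloss([ne, nado], imp, G) gives G = 'does not need to'. No fact is given for negated nado with a perfective infinitive, since the examples do not cover that combination.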
English and Swedish equivalents 
It is evident from the analysis above that there
is a number of auxiliaries which can be
translated directly as a consequence of the
semantic representations suggested. Table 1
shows corresponding English and Swedish
auxiliaries, adverbs and adjectives which one
would also like to be able to translate between.

Table 1: English and Swedish equivalents

/* allowance */
[m(allow, nonf), m(passive, pres)]
English: may; is allowed to; is permitted to
Swedish: får, må; är tillåten att; är medgiven att; tillåts att; medges att

/* obligation */
[m(oblige, nonf), m(passive, pres)]
English: must; is obliged to; has to; obligatorily; compulsory
Swedish: måste; är tvungen att; tvångsvis; nödvändig

/* capability */
[m(able, nonf), m(passive, pres)]
English: can; is capable of
Swedish: kan; är i stånd att

/* epistemic appearance */
[m(perceive, nonf), m(passive, pres)]
English: seem to; appear to; is said to; may; should; seemingly; apparently; evidently; obviously; clearly; apparent; obvious
Swedish: verkar att; förefaller att; synes, tycks; ser ut att; sägs, lär; kan; torde; skall; synbarligen; till synes; uppenbarligen; tydligen; klart; uppenbart; tydligt

/* epistemic possibility etc */
[m(possible, nonf), m(passive, pres)]
English: may; possibly; possible
Swedish: kan; möjligen; möjligtvis; möjlig

[m(probable, nonf), m(passive, pres)]
English: should; probably; probable
Swedish: torde; sannolikt; sannolik

[m(certain, nonf), m(passive, past)]
English: must; certainly
Swedish: måste; säkert; säkerligen
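Direct translation through a shared meaning representation can be sketched as a lookup in two monolingual lexicons. The sketch below is an illustration only: the predicates lex/4, equivalent/2 and alternatives/2 are hypothetical, the word-class slot (aux, adv, adj) is an assumption added so that auxiliaries are matched with auxiliaries, and only a few rows of Table 1 are included.

```prolog
% lex(Language, Meaning, WordClass, Expression): a few rows of Table 1.
lex(english, [m(able, nonf),     m(passive, pres)], aux, [can]).
lex(english, [m(probable, nonf), m(passive, pres)], aux, [should]).
lex(english, [m(probable, nonf), m(passive, pres)], adv, [probably]).
lex(english, [m(probable, nonf), m(passive, pres)], adj, [probable]).
lex(swedish, [m(able, nonf),     m(passive, pres)], aux, [kan]).
lex(swedish, [m(probable, nonf), m(passive, pres)], aux, [torde]).
lex(swedish, [m(probable, nonf), m(passive, pres)], adv, [sannolikt]).
lex(swedish, [m(probable, nonf), m(passive, pres)], adj, [sannolik]).

% equivalent(?English, ?Swedish): same meaning and same word class.
equivalent(English, Swedish) :-
    lex(english, Meaning, Class, English),
    lex(swedish, Meaning, Class, Swedish).

% alternatives(+English, -Alts): every Swedish expression with the same
% meaning, tagged with its word class, for use when restructuring.
alternatives(English, Alts) :-
    findall(Class-Swedish,
            ( lex(english, Meaning, _, English),
              lex(swedish, Meaning, Class, Swedish) ),
            Alts).
```

The query equivalent([should], S) gives S = [torde], and alternatives([probably], A) lists the auxiliary, the adverb and the adjective that share the probability meaning, which is the set a restructuring rule could choose from.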
Deriving Parallel Constructions 
If there is no equivalent auxiliary in the target 
lexicon an advanced MT-system may try to 
find an equivalent by deriving parallel 
constructions e.g. with adverbials. It will then 
have to make changes in the functional 
representation and move parts of the meaning 
representations. The following is a general 
Prolog rule, which moves the epistemic 
meaning from the predicate to the adverb. The 
rule assumes the analysis of auxiliaries 
presented above and suitable lexical represent- 
ations. The rule states that if there is a structure 
such as A, there is also B. 
restruct(A, B) :-
A = [pred([m(X, nonf), m(Epist, _),
m(passive, T)]),
advl([])], % N may come
B = [pred([m(X, T)]),
advl(m(Epist, _))]. % N comes possibly
Note how the tense morpheme (T) is also
moved.
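A worked query may make the effect of the adverbial rule concrete. The sketch below restates the rule under the hypothetical name restruct_adv/2 so that it can be loaded on its own, and it assumes that the predicate slot holds a flat list with the main verb first, as in the meaning lists shown earlier.

```prolog
% restruct_adv(+A, -B): move the epistemic meaning from the predicate
% to the adverbial slot and give the main verb the tense T that was
% carried by the passive marker.
restruct_adv(A, B) :-
    A = [pred([m(X, nonf), m(Epist, _), m(passive, T)]),
         advl([])],                       % N may come
    B = [pred([m(X, T)]),
         advl(m(Epist, _))].              % N comes possibly
```

With the input [pred([m(come, nonf), m(possible, nonf), m(passive, pres)]), advl([])] the rule returns a structure whose predicate is [m(come, pres)] and whose adverbial carries the meaning possible. The grammatical meaning of the adverbial is left open, so a generator is free to pick the adverb form.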
The following rule illustrates how It is
possible that Bill comes can be derived. It
presupposes a certain syntactic analysis where
English it is represented as impers.
restruct(A, B) :-
A = [subj(N), pred([m(X, nonf), m(Epist, _),
m(passive, T)])], % N may come
B = [subj(impers), pred([m(Epist, _),
m(passive, pres)]),
obj([subj(N), pred(m(X, T))])].
% It is possible that N comes
In translation from and into Russian, there is a
need for rendering an impersonal construction,
like nado vernut' knigu, into an English or
Swedish construction with an overtly realized
subject (E. The book must be returned, S.
Boken måste lämnas tillbaka). In such cases,
the 'restruct' rules move the representation of
the Russian object into the subject slot of the
target representation and instantiate the mode
variable to the constant 'passive'. The
definiteness value of the subject is assumed to be, by
default, +definite, which generally holds for
perfective constructions with singular objects
and preverbal subjects.
restruct(A, B) :-
A = [subj([]), pred([m(Verb, [nonf, perf]),
Aux, Tense, Mode]), obj(m(Z, _))],
% nado vernut' knigu
B = [subj(m(Z, def)), pred([m(Verb, nonf),
Aux, m(passive, Tense)])].
% the book must be returned
As an alternative, the empty subject slot is
filled by a generic personal (genpers) pronoun
(E. one, S. man):
restruct(A, B) :-
A = [subj([]), pred([m(Verb, [nonf, perf]),
Aux, Tense, Mode]), obj(m(Z, _))],
% nado vernut' knigu
B = [subj(m(genpers, _)), pred([m(Verb, nonf),
Aux, m(passive, Tense)]), obj(m(Z,
dem))].
% one must return this book
Such rules may be used as transfer rules in MT
systems. They may also be used in order to
derive synonymous expressions in the same
language. Some subtle semantic and stylistic
differences between the target and the source
sentences sometimes occur. However, the
translations are generally comprehensible.
References 
An, D. U., Kim, G. C., Lee, J. H. & Muraki,
K. 1993. Corpus Based Modality
Generation for Korean Verbs in Pivot E/JK
System. Proc. NLPRS '93,
Fukuoka, Japan, 25-34
Coates, J. 1983. The Semantics of the Modal
Auxiliaries. Croom Helm
Gawrońska, B. 1993. An MT Oriented Model
of Aspect and Article Semantics. Lund:
Lund University Press
Isačenko, A. V. 1962. Die russische Sprache
der Gegenwart. Teil I: Formenlehre. Halle
(Saale): VEB Max Niemeyer Verlag
Sigurd, B. 1992. "A New Analysis for Machine
Translation of the Auxiliary and Main Verb
Complex". Studia Linguistica 46:1, 30-48
Sigurd, B. & Gawrońska, B. 1988. "The
potential of Swetra - a multilanguage
Translation System". Computers and
Translation 3, 237-250
Sweetser, E. 1990. From Etymology to
Pragmatics. Cambridge Studies in
Linguistics 54
