A Type-theoretical Analysis of Complex Verb Generation 
Satoshi Tojo 
Mitsubishi Research Institute, Inc. 
2-22 Harumi 3, Chuo-ku 
Tokyo 104, Japan 
e-mail: tojo@mrisun.mri.co.jp, m-tojo@icot.jp 
Abstract 
Tense and aspect, together with mood and modality, 
usually form the entangled structure of a complex verb. 
They are often hard to translate by machines, because 
of both syntactic and semantic differences between lan- 
guages. This problem seriously affects upon the gen- 
eration process because those verb components in in- 
terlingua are hardly rearranged correctly in the target 
language. We propose here a method in which each 
verb element is defined as a mathematical function ac- 
cording to its type of type theory. This formalism gives 
each element its legal position in the complex verb in 
the target language and certifies so-called partial trans- 
lation. In addition, the generation algorithm is totally 
free from the stepwise calculation, and is available on 
parallel architecture. 
1 The problem of complex 
verb translation 
1.1 The difference between languages 
In this section, we will briefly mention the difficulties 
of complex verb translation. The term complex verb 
refers to those verb structures that are formed with 
modal 'verbs and other auxiliary verbs in order to ex- 
press rnodMity or aspects, and which are attached to 
the original base verb. 
For example, in a complex verb: 
seems to have been swimming (1) 
'swim' is in both the progressive aspect and the past 
tense, upon which 'seem' is attached with the inflec- 
tion of agreement for the third person singular. Such a 
structure is so often hard to translate to other lan- 
guages both because surface structures become dif- 
ferent between languages, and because some aspec- 
tual/modal concepts cannot be found in the target lan- 
guage in the generation process. 1 
We admit that the most recalcitrant problem in verb 
phrase translation is that the differences of tense, as- 
pect, mood, and modality systems. Hence, we can 
XThe systems of tense, aspect, and modality are different 
from language to language, and their typology ha.s been dis- 
cussed in linguistics \[1\], \[21, \[9\]. 
hardly find a verb element in the target language, 
which corresponds with the original word exactly in 
meaning. However we also need to admit that the 
problem of knowledge representation of tense and as- 
pect (namely time), or that of mood and modality is 
not only the problem of verb translation, but that of 
the whole natural language understanding. In this pa- 
per, in order to realize a type calculus in syntax, we 
depart from a rather mundane subset of tense, aspects, 
and modMities, which will be argued in section 2. 
1.2 The difference in the structure 
In the Japanese translation of (1): 
oyoi-de-i-ta-rashii 
the syntactic main verb does not change from oyoi- 
(swim), which is the original meaning carrier; and the 
past tense is replaced by -ta- which is indistinguishable 
between past and perfect in Japanese. The feature of 
English verb complex derivation is the alternation of 
the syntactic main verb, described in fig. 1. On the 
(~ (b~ (swim))) +-- 'seem' 
past progressive 
(seem (to ~ (~ (swim)))) 
past progrcssive 
Figure 1: new verb arrival in English 
contrary in Japanese, new verb elements are aggluti- 
nated at the tail of verbs, as is shown in fig. 2. 
(~ ( ~-dei- (,~U,))) ~-- 'rashii(seem)' 
swim progressive past 
swim pro9 resslve past 
Figure 2: new verb arrival in Japanese 
1.3 The existence of external modal- 
ities 
In the history of machine translation, the analysis of a 
complex verb structure seems to have been rather ne- 
glected. Even in an advanced machine translation sys- 
1 353 
tem of CMU-CMT \[8\], a verb complex is represented 
conventionally like: 
(vp (root . ..) 
(negation +/-) 
(modality .... ) 
(tense .... ) 
(aspect ...) ) 
However this flat description fails in the following 
cases. 
First, certain modalities can have their own nega- 
tion. The sentence 
You must play. ( \[:l\[you play\]) 
can be negated in two ways: 
You must not play. 
(\[:3--\[you play\] = \[:3\[you don't play\]) 
You don't have to play. 
(-~D\[you play S 
The former is the negation of 'must' while the latter 
is the negation of the sentence, where in parentheses 
above \[\] is necessity and -~ is negation operator. 
A similar thing can be said for tense. A certain kind 
of complex verbs such as 'seem to be' can have two 
positions to be tensed as below: 
seemed to have been 
It has been discussed that there are so called ex- 
ternal modal expressions, that are embedded from the 
outside of the sentence (\[3\], \[4\], \[10\], \[11\], and \[14\]). 
The key issue is that those external ones concern only 
the speaker's attitude of the sentences in illocutionary 
situations in order to express deonticity, uncertainty, 
and so on, and are indifferent to the sentence subjects. 
Although there should be a controversial discussion as 
to whether a certain modal verb is connected to speak- 
ers or not 2, we will engage this matter in the following 
formalization of complex verb translation. 
2 Formalization of complex 
verb translation 
We will give a formal definition of modality as a func- 
tion here, and discuss the strategy of verb complex 
generation in which each modal element is used as a 
function itself. In this section we use the term 'modal 
functions' both for modal operators and aspectuM op- 
erators as long as there is no confusion. 
2Although the concept of modality is based on the deontic 
distinction/epistemic between possibility and necessity such as 
'can', 'may', 'must', tbe term is sometimes used in a broader 
sense inclusively 'will'(volitive) or 'can'(ability). We distin- 
guished the externality by its independent tense and polar- 
ity (positive/negative), so that broad-sense modal verbs such 
as volitive and ability are regarded as actually internal while 
narrow-sense ones are often external. 
354 
Externality To clarify the externality of modal 
functions mentioned in the previous section, we put 
the following definition: 
Definition of externality: We call those 
modal functions which can have their own 
tense and negation external; otherwise we call 
them internal. 
Interlingua Here we propose the simplified sets of 
modalities, aspects, and tenses, which are recognizable 
both in English and in Japanese, mostly based upon 
the work of \[7\]. 
tense = {past, present, future} 
aspect = {progressive, perfect, iterative, in- 
choative, terminative} 
external modality 
= {possible, necessaw}®{epistemic, deontic} 
or {hearsay, seem} 
internal modality = {able} 
The two dimensional analysis of eplstemic and deontic 
is considered, reflecting the duality in meaning of 'may' 
and 'must'. 
Verb elements as function We assume that the 
crude result of a parsing process for a complex verb is 
the list of verb elements such as: 
past1, seem, past2,progressive, swim,... (2) 
Our objective is: 
* to construct the interlingua expression from these 
verb elements such as: 
(q-ed(seem)) ((-t-ed(progres sive)) (swim)) 
• and to derive a surface structure of the target lan- 
guage from the interlingua expression. 
In order to do this, we will regard each verb element 
as a function. The main concern here is the domain of 
each function. For example, an external "past" oper- 
ator +ed should operate upon external verbs, not in- 
cluding internal verbs. We will realize this idea, being 
independent of each intra-grammar in complex verbs 
of various languages, in the following section. 
We express a verb complex as a composition of a 
root-word, tense, negation, and verb-complement, 
in which root and verb-form are included; veomp 
may be included recursively. 
Fig. 3 is an example of a ,k-function of tt perfect: 
`kxVe~b-,t~c~u~e.perfeet(x) 
which takes a verb structure as a parameter, and pro- 
duce a more complicated structure. 
(vp (roet; be) 
(tense negation) 
(vcomp (root swim) 
(vform ing) )) 
(~p O'oot h~ve) 
(tense negation) 
(vcomp (root be) 
(vrorm pp) 
(vcemp (root swim) 
(vform ing)))) 
Figm'e 3: perfect-ization in English 
3 Type inference for complex 
verb composition 
We will make use of polymorphic type theory \[6\] in the 
fnrther formalization. The reason we adopt the type 
formalism is to realize "the internality of concatenation 
rules"; namely each verb element shonld know which 
elements to operate upon, instead of being given a set 
of grammar on concatenation. Behind this purpose lie 
tile following two issues: 
• parallel computation: to realize a fas~ parallel 
computation on coming parallel architecture 
, partial translation: for machine translation to 
be robu:-'t for ill-founded information in the lexicon 
3.1 ~pes for verb phrase 
First we will set up several mnemonic types. They are 
not terminal symbols of type calculus, but mnemonic 
identifiers of certain types. 
A set of mnemonic types: 
{root, int, ext, teasel, negi, tense~, ne9~} 
For example, we can assume the following r' as a result 
of our parser. 
F = {seem : ext,+ed : tensei, swim : root, 
progressive : int} 
where a : a means that a is of type a. (In that parser, 
each syntactic category in the source language was re- 
placed by a mnemonic type that is actually regarded as 
a category of the interlingua.) We use cp as a type vari- 
able denoting 'verb phrase , . Because we can regard any 
verb element as a modifier from a verb phrase to an- 
other phrase, those mnemonic types are always unified 
with a type '~ -~ qo', except 'root' of type ~. The type 
<emposition is the process that specifies the internal 
variables of~ gradually. We introduce here the notion 
of 'most general unifier (mgu)' by Milner \[6\], which 
combines two type expressions and the result becomes 
a set of variable instantiations which were contained in 
dmse type expressions. 
3.2 A simple example 
Let us consider an example of concatenating seem with 
+ed(swim). First, assume that we can infer the type of 
+ed(swim) to be ~1 from F by a variable instantiation 
0, as is shown below in (3): 
r0 ~ +ed(s~im): V'~ (3) 
As for seem : ext, because 'ext' was a mnemonic for 
a certain modifier of a verb phrase, we can put ext = 
(~1 -~ 5~). Hence, for a new type variable 9a2, set r/1 
as (4): 
~q, = mgu(extO, W -4 p~) (4) 
Note that we have not specified the contents of 7 h 
yet; we will give them later. We can now infer the 
type of combined verb phrase with (4), as is shown in 
fig. 4 where we can use a canonical expression (see \[5\]) 
Ax.seem(x) instead of seem. 
\]-"/1 ~ seem: extT}l r0/\]l \[- +ed(swim): qCl~h (_+ E) 
F'igure 4: inference of seem(+ed(swim)) 
\Ve use '~' to denote that the type of the left-hand 
side is more instantiated than that of the right-hand 
side, so that: 
3.3 Full-fledged verb type 
In order to analyze the contents of 0 in (3) in detail, 
we need to clarify what tensel actually does. We will 
argue what each mnemonic means in this subsection. 
Concretely saying, we will specify what kind of internal 
variables can occur in various verb types. 
Our presupposition is that a 'full-fledged' complex 
verb contains 'external part' as well as 'internal part'. 
So that a verb type ~, is assumed to have two internal 
type variables ~oi,,t and ~oCxt. Each of them has a 'head' 
verb which incurs tense or negation if exists. We call a 
compound of tense and negation 'cap'. As for a cap, 
we assume a construction of: 
negation(tense(X)) a 
As a result, our full-fledged verb type becomes as fig. 5. 
aThis formalization is actually after the consideration of the 
following generation process. When we construct a past and 
negation of 'walk': 
walk ~ walked -, didn't walk 
is less natural than 
walk ~ don't walk -, didn't walk 
becanse it is 'don't' which incurs the operation of past. The 
exactly sinfilar thing happens also in Japanese. 
3 355 
neg --~ tense neg --~ tense 
cape eapi / / 
0 --,0-, O O 
head~ headl root 
external internal 
Figure 5: complex verb structure 
Let us get back to the type inference of +ed(swim) 
here, to see the basis of our formalism of type unifica- 
tion. +ed is of type tensei, and swim is of type root 
that was a mnemonic for simple T. 0 in (3) becomes 
as follows: 
0 = mgu(tensei,root ~ ~a) 
Because tensei itself is not a function, it must be qual- 
ified as of type capl to act as a function by itself. We 
call this qualification 'promotion', to mean that the 
component raises its type to connect with others. The 
similar thing can be said for root, which must be pro- 
moted to ~nt SO as to be in the domain of cap~. Fig. 6 
depicts the type promotion. 
tense; root 
J. J. promote 
capi\[¢/negl\] p,~,\[,'oot/head,\] 
% / 
capdPint) 
.~ promote 
I 4, 
q~,,\[root\[¢ /neg, 1/tense\]/ head,\] 
promote 
~\[¢/c2~xt, root\[C/ne9, 1/ tense\]/ headi\] 
Figure 6: type promotion 
A unifier, or what we have called a set of instanti- 
ation so far, like 0 or '7 is exactly a set of promotions 
where some missing verb elements (compared with full- 
fledged type) are ignored or replaced by other elements. 
For example, the contents of 0 becomes as follows: 
0 = \[¢/negl, tensei/capl,root/headi, 
he d,l  ,int, 
3.4 Component embedding 
In the type inference of fig. 4, we happened to choose 
two items of seer. and +ed(suim). Actually, we can 
combine any two items picked up from F. In this 
subsection we will show an example, in which we 
try to concatenate the consequence of fig. 4 with 
progressive : int. In the conventional generation, an 
internal verb element such as 'progressive' must be con- 
catenated to the structure, prior to an external element 
such as 'seem'. However, in our type expression, 'be 
+ing' can join tim 'seem(+ed(swim))' structure, and 
also correctly choose the target element 'swim' from 
the structure, instead of 'seem' which exists most ex- 
teriorly. 
If there is such a set of instantiation 772 that: 
712 = mgu(intOth, ~Th ~ ~3) (5) 
then we can validate the inference in fig. 7. However, 
because the domain of int must be 'cap'-less ~i,t, we 
cannot legalize the type inference of fig. 7 immediately. 
What we are required to do now is a type 'demo- 
tion', as opposed to the promotion. Roughly saying, 
a verb type of seem(+ed(swim)) is regarded as below: 
(6) 
though each of the right hand side of (6) is promoted 
to some qualified type. This means that the history of 
promotion must be included in the unifier 0. Hence, 
we can scrutinize the contents of 0 so that we may find 
where int can be embeddable. In this case, 
{ r oot / headi, headl/ ~p i~t } 
in 0 (viz. ~i~t ~ int) should be demoted, and we can 
redefine a verb type of (6) as: 
ten e,(root) ) --, tens ,(.d"'(root) ) ) 
Suppose that r/a replaces the history of promotions 
and demotions as below: 
(F U {~9:i,lt})0~71r/3 F seem(+ed(swim)): ~4 
then we can make the inference in fig. 8. In this case, 
{ r U q~ : int }O,h tl3 f- seem(+ed(swim)) : ~4 , ............ ~-~ I) 
\['0rl,rl3 I- ~.seem(+ed(~o(swim))): int,\]3 -~ ~4 
Figure 8: inference of A-abstraction 
there is only a place for int to be embedded in between 
+ed and swim, and int operates upon a root. 
This time, we can make an inference from abstracted 
verb structure, as is shown in fig. 9. 
3.5 Head shifting 
We mentioned that another superiority of type formal- 
ism is its partiality. Actually we can compose a verb 
stru'cture from any part of given interlingua set, and 
this feature is realized dynamically. The computation 
of verb complex generation may be stopped any time 
by ill-foundedness of machine translation system. Even 
in such a case, our formalism can offer a part of sur- 
face structure which had been partially completed so 
far. The type definition above gives us an important 
clue to how to compose verb elements. The algorithm 
necessarily becomes as follows: 
1. Pick up an original meaning base. 
4 356 
FOr/, 1- seem(+ed(sw£m)) : ~2r/, I" I- progressive:int(__+ E) 
V@lr/2 b progressive( seem( +ed( swim) ) ) : ~o3r12 
Figure 7: ill-founded type inference 
F b +ed : tensei F b swim : root 
F0 b +ed(swim) : ~1 
FF seem:ext 
F I- A<p.seem((p) : ¢Pl -9 ~2 
F0r\]I l- seem(+ed(swim)): ~2r/1 
F0rhr/s ? A~,int.seem(+ed(~(swim))): intu3 --, ~4 F K prog : int 
F0r/lr/3 1- seem(+ed.(prog(swim))): ~4 
Figure 9: inference with A-calculus 
2. Apply internal modal functions, while .. 
3. Shift the internal tense and negation to a newly 
applied modal function. 
4. Apply external modal functions, while .. 
5. Shift the external tense and negation to a newly 
applied modal flmction. 
The position shift of tense, caused by the arrival of 
a new internal verb, is diagramed in fig. 10. Fig. 11 
English 
(Vp new-root . . (tense).. (vcomp .. e.. ) 
k______j 
JRpanese 
(vp ..(w:omp ..e..(new-vcomp ..(tense)..))) 
q___ __9 
Figure 10: verb position shift 
shows that every step in the derivation process can 
offer the partial syntactic tree. 
4 Discussion 
We have shown a model of complex verb translation 
based upon type theory, in which verb elements in the 
interlingua are regarded as generation functions whose 
domain and range are elucidated so that each verb el- 
ement is certified to acquire a correct position in the 
target structure. Furthermore, the flexibility of type 
calculus was shown on the following two points. 
1. We don't need to specify all the type variables 
every time, so that all the information a type owns 
can always be regarded as partial. This means 
that we can translate partially what can be done. 
2. Because the order of calculation is not specified 
in the type expression, the verb structure can be 
composed in a way of self-organization, in tile 
meaning that tile structure is able to be decom- 
posed and to be reorganized in the process. 
In the case of complex verb translation, rephrasing to 
another part of speech is rather easier; we only need 
to 'kick out' the functional expression from the verb 
structure to point to another lexical item. Some ex- 
ternal expressions must be translated into a complex 
sentence as the feature of externality, as we have dis- 
cussed in section 1.3. We give here a definition of the 
identity in meaning between an external verb and its 
corresponding complex sentence in the type-theoreticM 
view as (7): 
<agent,qa(v) >: (7) 
..~ <affent, "u> , ~>" : O'comp 
where we denote a sentence by <agent, action(state)> 
informally, and cr is a type of sentence and cr~o~ v is a 
type of complex sentence. We can adopt (7) as the 
definition of an external verb; namely, we call such 
a verb ~ external when, for ~, there is another verb 
expression ~ which enables the type inference (7). 
Tile recent study of categorial grammar such as \[13\], 
as well as the historical feat of Montague semantics, 
claims the efficacy of type expression. The type calcu- 
lus is not specific to the complex verb nor the genera- 
tion in machine translation, so that it is applicable to 
any generation process of natural languages. Our next 
goal in due course is to apply this generation mecha- 
nism to the whole categorial grammar. 
5 Acknowledgement 
The work was originally done as a part of machine 
translation system in Center for Machine Transla- 
tion of Carnegie-Mellon University; an experimenta- 
tion system was developed on IBM-RT with Common- 
Lisp \[12\], based upon the interlingua mentioned in sec- 
tion 2. The author would like to thank the members 
5 
357 
of CMU-CMT and Institute for New Generation Com- 
puter Technology (ICOT) for many significant com- 
ments. 
root vform vcomp 
swim past 
+progressive 
vp 
root vform vcomp 
I I be p~ 
root vform vcomp 
I I swim -ing 
type demotion for an ext verb 
root vform vcomp 
I have to~ 
root vform vcomp 
I l 
be 
root vform vcomp 
I I swim -±ng 
+seem vp 
root vform vcomp I 
seem 
root vform vcomp 
I I have to~ 
root vform veomp 
f I be 
root vform vcomp I I 
swim -ing 
Figure lh trees in derivation 

References 

B. Comrie. Aspect. Cambridge University Press, 
1976. 

B. Corm'ie. Tense. Cambridge University Press, 
1985. 

O. Jesperson. The Philosophy of Grammar. Allen 
and Unwin, London, 1924. 

J. Lyons. Semantics vol.1,2. Cambridge Univer- 
sity Press, 1977. 

P. Martin-Loef. Intuitionistic Type Theory. Bib- 
liopolis, 1984. 

R. Milner. A theory of type polymorphism in pro- 
graImning. JCSS, 17(3), 1978. 

S. Naito, A. Shimazu, and H. Nomura. Classifi- 
cation of modality function and its application to 
japanese language analysis. A CL Annual Confer- 
ence Proceedings, 1985. 

S. Nirenburg, editor. Machine Translation, chap- 
ter Knowledge-based machine translation, the 
CMU approach, pages 68-89. Cambridge Univer- 
sity Press, 1987. 

F. R. Palmerl Mood and Modality. Cambridge 
University Press, 1986. 

N. Rescher. Topics in Philosophical Logic. Dor- 
drecht: Reidel, 1968. 

J.R. Searle. Expression and Meaning: studies in 
the theory of speech acts. Cambridge University 
Press, 1979. 

S. Tojo. A computational model of verb complex 
translation. Technical Report CMU-CMT-88-110, 
Center for Machine Translation, Carnegie-Mellon 
University, August 1988. 

J. van Bentham. Categorial grammar and type 
theory. Journal of Philosophical Logic, to appear, 
1989. 

E.H. von Wright. An Essay in Modal Logic. North 
Holland, Amsterdam, 1951. 
