Learning Tense Translation from Bilingual Corpora 
Michael Schiehlen* 
Institute for Computational Linguistics, University of Stuttgart, 
Azenbergstr. 12, 70174 Stuttgart 
mike@adler.ims.uni-stuttgart.de
Abstract 
This paper studies and evaluates disambigua- 
tion strategies for the translation of tense be- 
tween German and English, using a bilingual 
corpus of appointment scheduling dialogues. It 
describes a scheme to detect complex verb pred- 
icates based on verb form subcategorization and 
grammatical knowledge. The extracted verb 
and tense information is presented and the role 
of different context factors is discussed. 
1 Introduction 
A problem for translation is its context depen- 
dence. For every ambiguous word, the part of 
the context relevant for disambiguation must be 
identified (disambiguation strategy), and every 
word potentially occurring in this context must 
be assigned a bias for the translation decision 
(disambiguation information). Manual construction
of disambiguation components is quite
a chore. Fortunately, the task can be (partly) 
automated if the tables associating words with 
biases are learned from a corpus. Statistical 
approaches also support empirical evaluation of 
different disambiguation strategies. 
The paper studies disambiguation strategies 
for tense translation between German and En- 
glish. The experiments are based on a corpus 
of appointment scheduling dialogues counting 
150,281 German and 154,773 English word to- 
kens aligned in 16,857 turns. The dialogues were 
recorded, transcribed and translated in the Ger- 
man national Verbmobil project that aims to 
develop a tri-lingual spoken language translation
system. Tense is interesting, since it occurs
in nearly every sentence. Tense can be expressed
on the surface lexically as well as morphosyntactically
(analytic tenses).
* This work was funded by the German Federal Ministry
of Education, Science, Research and Technology
(BMBF) in the framework of the Verbmobil Project under
Grant 01 IV 101 U. Many thanks are due to G. Carroll,
M. Emele, U. Heid and the colleagues in Verbmobil.
2 Words Are Not Enough 
Often, sentence meaning is not compositional 
but arises from combinations of words (1). 
(1) a. Ich habe ihn gestern gesehen.
       I have him yesterday seen
       I saw him yesterday.
    b. Ich schlage Montag vor.
       I beat Monday forward
       I suggest Monday.
    c. Ich möchte mich beschweren.
       I 'd like to myself weigh down
       I'd like to make a complaint.
For translation, the discontinuous words must 
be amalgamated into single semantic items. 
Single words or pairs of lemma and part of 
speech tag (L-POS pairs) are not appropriate. 
To verify this claim, we aligned the L-POS pairs 
of the Verbmobil corpus using the completely 
language-independent method of Dagan et 
al. (1993). The results for sehen¹ (see), in order
of frequency, and some frequent alignments for
reflexive pronouns are given below.
 72  sehen:VVFIN  be:VBZ (aussehen)
 44  sehen:VVFIN  do:VBP (do-support)
 39  sehen:VVFIN  have:VBP (perfect)
 35  sehen:VVFIN  see:VB

176  wir:PRF   meet:VB (sich treffen)
 33  wir:PRF   we:PP
 30  sich:PRF  spell:VBN (sich schreiben)
 16  ich:PRF   forward:RP (sich freuen auf)
 14  wir:PRF   agree:VB (sich einigen)
 13  ich:PRF   myself:PP
¹ The prefix verb aus-sehen (look, be) is very frequent
in the corpus; it often occurs in questions. Present sehen
was frequently translated into perfect discover.
3 Partial Parsing 
A full syntactic analysis of the sort of unre- 
stricted spoken language text found in the Verb- 
mobil corpus is still beyond reach. Hence, we 
took a partial parsing approach. 
3.1 Complex Verb Predicates 
Both German and English exhibit complex verb 
predicates (CVPs), see (2). Every verb and verb 
particle belongs to such a CVP and there is only 
one CVP per clause. 
(2) He would not have called me up. 
The following two grammar fragments describe 
the relevant CVP syntax for English and Ger- 
man. Every auxiliary verb governs only one 
verb, so the CVP grammar is basically² regular
and implementable with finite-state devices.
English:
S  → ... VP ...
VP → hd:V (to) VP
VP → hd:V ... (Particle)

German:
S  → ... hd:Vfin ... (Refl) ... VC ...
S  → ... (Refl) ... VC ...
S  → ... VC hd:Vfin ... (Refl) ...
VC → (VC) (zu) hd:V
VC → SeparatedVerbPrefix
English CVPs are left-headed, while German 
CVPs are partly left-, partly right-headed. 
[Tree diagram: the CVP of the example below, with nested VPs]
Er wird es getan haben müssen
he will it done have must
He will have to have done it.
² The grammar does not handle insertion of CVPs into
other CVPs and partially fronted verb complexes (3).
(3) Versuchen hätte ich es schon gerne wollen.
    try 'd have I it liked to
    I'd have liked to try it.
3.2 Verb Form Subcategorization 
Auxiliary verbs form a closed class. Thus, the 
set sub(v) of infinite verb forms for which an 
auxiliary verb v subcategorizes can be specified 
by hand. English and German auxiliary verbs 
govern the following verb forms. 
English:
• infinitive, e.g. will
• to-infinitive (T), e.g. want
• past participle (P), e.g. get
• P ∨ T, e.g. have
• present participle ∨ P ∨ T, e.g. be

German:
• infinitive (I), e.g. müssen
• zu-infinitive (Z), e.g. scheinen
• perfect participle with haben (H), e.g. bekommen
• H ∨ I, e.g. werden
• H ∨ I ∨ Z, e.g. haben
• perfect participle with sein ∨ H ∨ I ∨ Z, e.g. sein
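The hand-specified sets sub(v) listed above could be stored as a simple lookup table. The following is a minimal sketch, not the paper's implementation: the dictionary names, the helper functions, and the labels "G" (English present participle) and "S" (German perfect participle with sein) are assumptions introduced for illustration; the other labels follow the list above.

```python
# Hypothetical encoding of sub(v); only the example auxiliaries from the
# list above are included. "G" and "S" are labels invented for this sketch.
SUB_EN = {
    "will": {"I"},            # infinitive
    "want": {"T"},            # to-infinitive
    "get": {"P"},             # past participle
    "have": {"P", "T"},
    "be": {"G", "P", "T"},    # G = present participle (assumed label)
}
SUB_DE = {
    "müssen": {"I"},          # infinitive
    "scheinen": {"Z"},        # zu-infinitive
    "bekommen": {"H"},        # perfect participle with haben
    "werden": {"H", "I"},
    "haben": {"H", "I", "Z"},
    "sein": {"S", "H", "I", "Z"},  # S = perfect participle with sein (assumed label)
}


def governs(sub, aux, form):
    """Return True if auxiliary `aux` subcategorizes for verb form `form`."""
    return form in sub.get(aux, set())


def count_distinct_sets(sub):
    """Number of distinct subcategorization sets, i.e. |{sub(v) : Verb v}|."""
    return len({frozenset(forms) for forms in sub.values()})
```

Counting distinct sets rather than verbs matters because auxiliaries that govern the same forms can share one parser state.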
3.3 Transducers 
Two partial parsers (rather: transducers) are 
used to detect English and German CVPs 
and to translate them into predicate argument 
structures (verb chains). The parsers presup- 
pose POS tagging and lemmatization. A data 
base associates verbs v with sets mor(v) of pos- 
sible tenses or infinite verb forms. 
Let $m = |\{mor(v) : \text{Verb } v\}|$ and $n = |\{sub(v) : \text{Verb } v\}|$.
Then the English CVP parser needs
n + 1 states to encode which verb forms, if 
any, are expected by a preceding auxiliary verb. 
Verb particles are attached to the preceding 
verb. The German CVP parser is more compli- 
cated, but also more restrictive as all verbs in 
a verb complex (VC) must be adjacent. It op- 
erates in left-headed (S) or right-headed mode 
(VC). In VC-mode (i.e. inside VCs) the order 
of the verbs put on the output tape is reversed. 
In S-mode, n + 1 states again record the verb
form expected by a preceding finite verb Vfin.
VC-mode is entered when an infinite verb form
is encountered. A state in VC-mode records the
verb form expected by Vfin (n + 1), the infinite
verb form of the last verb encountered (m), and
the verb form expected by the VC verb, if the
VC consists of only one verb (n + 1). So there
are $m \cdot (n + 1)^2$ states. As soon as a non-verb is
encountered in VC-mode or the verb form of the
previous verb does not fit the subcategorization
requirements of the current verb, a test is performed
to see if the verb form of the last verb in VC fits
the verb form required by Vfin. If it does or there is
no such finite verb, one CVP has been detected. Else
Vfin forms a separate CVP. In case the VC consists of
only one verb that can be interpreted as finite, the
expected verb form is recorded in a new S-mode state.
Separated verb prefixes are attached to the finite
verb, first in the chain.

[Figure 1: translation frequencies G→E (left: simple tenses, right: progressive tenses). Horizontal axis: German source tenses (pluperfect, preterite, perfect, present, future); vertical axis: frequency (logarithmic scale); one curve per English target tense (past perfect, past, future past, present perfect, present, future perfect, future).]

[Figure 2: translation frequencies E→G. Horizontal axis: English source tenses (past perfect, past, future past, present perfect, present, future perfect, future, with progressive variants); vertical axis: frequency (logarithmic scale); one curve per German target tense (pluperfect, preterite, perfect, present, future).]
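The English CVP parser of Section 3.3 can be sketched as a small state machine whose state is the set of verb forms expected by the preceding auxiliary (or none), which corresponds to the n + 1 states mentioned in the text. This is a simplified illustration, not the transducer used in the experiments: the token format, the `SUB` table, the form labels ("FIN", "I", "P", "T", "G") and the function name are assumptions, and clause boundaries are ignored.

```python
# Hypothetical sub(v) table for a few English auxiliaries (see Section 3.2).
SUB = {
    "will": {"I"},
    "want": {"T"},
    "get": {"P"},
    "have": {"P", "T"},
    "be": {"G", "P", "T"},
}


def english_cvps(tokens):
    """Group verbs and particles into verb chains.

    tokens: list of (lemma, kind, forms) where kind is "verb", "to",
    "particle" or "other", and forms is the set mor(v) of possible verb
    forms for verbs (an empty set otherwise).
    """
    chains = []
    expected = None  # forms expected by the preceding auxiliary, if any
    for lemma, kind, forms in tokens:
        if kind == "verb":
            if expected is not None and forms & expected and chains:
                chains[-1].append(lemma)   # verb is governed by the auxiliary
            else:
                chains.append([lemma])     # verb opens a new CVP
            expected = SUB.get(lemma)      # None after a full verb
        elif kind == "to" and expected is not None and "T" in expected:
            expected = {"I"}               # "to" must be followed by an infinitive
        elif kind == "particle" and chains:
            chains[-1].append(lemma)       # particles attach to the preceding verb
    return chains


# Example (2): He would not have called me up.
example = [
    ("he", "other", set()),
    ("will", "verb", {"FIN"}),
    ("not", "other", set()),
    ("have", "verb", {"I", "FIN"}),
    ("call", "verb", {"P", "FIN"}),
    ("me", "other", set()),
    ("up", "particle", set()),
]
```

Under these assumptions, `english_cvps(example)` yields a single chain containing will, have, call and up. Non-verbs are skipped rather than closing the chain, since English CVPs need not be adjacent.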
3.4 Alignment 
In the CVP alignment, only 78 % of the turns
proved to have CVPs on both sides, and only 19 %
had more than one CVP on some side. CVPs
were further aligned by maximizing the trans- 
lation probability of the full verbs (yielding 
16,575 CVP pairs). To ensure correctness, turns 
with multiple CVPs were inspected by hand. 
In word alignment inside CVPs, surplus tense- 
bearing auxiliary verbs were aligned with a 
tense-marked NULL auxiliary (similar to the 
English auxiliary do). 
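Aligning the CVPs of a turn by maximizing the translation probability of their full verbs can be illustrated with a brute-force search over one-to-one assignments. The paper does not say how the maximization was carried out, so this is only a sketch under stated assumptions: the function name, the probability table `prob` and the `floor` value for unseen verb pairs are hypothetical.

```python
from itertools import permutations
from math import prod


def align_cvps(src_verbs, tgt_verbs, prob, floor=1e-6):
    """Return index pairs (i, j) aligning source CVP i with target CVP j.

    src_verbs, tgt_verbs: full verbs of the CVPs found in one turn.
    prob: dict mapping (source verb, target verb) to a translation probability.
    """
    def p(i, j):
        return prob.get((src_verbs[i], tgt_verbs[j]), floor)

    n_src, n_tgt = len(src_verbs), len(tgt_verbs)
    if n_src <= n_tgt:
        # Choose a distinct target CVP for every source CVP.
        best = max(
            permutations(range(n_tgt), n_src),
            key=lambda perm: prod(p(i, j) for i, j in enumerate(perm)),
        )
        return list(enumerate(best))
    # More source than target CVPs: choose a source CVP for every target CVP.
    best = max(
        permutations(range(n_src), n_tgt),
        key=lambda perm: prod(p(i, j) for j, i in enumerate(perm)),
    )
    return sorted((i, j) for j, i in enumerate(best))
```

Exhaustive search is feasible here because turns rarely contain more than a handful of CVPs; surplus CVPs on the longer side simply stay unaligned in this sketch.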
3.5 Alignment Results 
The domain biases the corpus towards the fu- 
ture. So only 5 out of 6 German tenses and 
12 out of 16 English tenses occurred in the cor- 
pus. Both will and be going to were analysed as 
future, while would was taken to indicate con- 
ditional mood, hence present. 
German:
• present (15,710) • perfect (344)
• preterite (331) • pluperfect (49)
• future (150)
English:
• present (12,252; progressive: 358)
• past (594; progressive: 23)
• present perfect (227; progressive: 7)
• past perfect (1; progressive: 1)
• future (1,429; progressive: 23)
• future perfect (10)
• future in the past (3)
In some cases, tense was ambiguous when con- 
sidered in isolation, and had to be resolved 
in tandem with tense translation. Ambiguous 
tenses on the target side were disambiguated to 
fit the particular disambiguation strategy. 
• G present/perfect (verreist sein) (39) 
• G present/past (sollte, ging) (229) 
• E pres./present perfect (have got) (500)
• E pres./past (should, could, must) (1,218) 
4 Evaluation 
Formally, we define source tense and target 
tense as two random variables S and T. Disam- 
biguation strategies are modeled as functions tr 
from source to target tense. Precision gives
the proportion of source tense tokens $t_s$
that the strategy correctly translates to target
tense $t_t$; recall gives the proportion of
source-target tense pairs that the strategy finds.
(4) $\mathrm{precision}_{tr}(t_s, t_t) = P(T = t_t \mid S = t_s,\ tr(t_s) = t_t)$
    $\mathrm{recall}_{tr}(t_s, t_t) = P(tr(t_s) = t_t \mid S = t_s,\ T = t_t)$
Combined precision and recall values are formed 
by taking the sum of the frequencies in numer- 
ator and denominator for all source and target 
tenses. Performance was cross-validated with 
test sets of 10 % of all CVP pairs. 
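The combined figures can be computed by summing numerators and denominators over all tense pairs: the shared numerator is the number of correctly translated tokens, the precision denominator is the number of tokens for which the strategy proposes a tense, and the recall denominator is the number of all tokens. The sketch below assumes a hypothetical data format (pairs of context and gold target tense) and a strategy that returns `None` when it has no prediction.

```python
def combined_precision_recall(pairs, strategy):
    """Combined precision and recall of a disambiguation strategy.

    pairs: iterable of (context, gold_target_tense).
    strategy: function mapping a context to a target tense, or to None
    if the strategy makes no prediction for that context.
    """
    correct = predicted = total = 0
    for context, gold in pairs:
        total += 1
        guess = strategy(context)
        if guess is None:
            continue              # counts against recall only
        predicted += 1
        correct += guess == gold
    precision = correct / predicted if predicted else 0.0
    recall = correct / total if total else 0.0
    return precision, recall
```

A strategy that always proposes some tense has identical precision and recall under this definition, since both denominators then equal the number of tokens.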
4.1 Baseline 
A baseline strategy assigns to every source 
tense the most likely target tense ($tr(t_s) =
\arg\max_{t_t} P(t_t \mid t_s)$, strategy t). The most likely
target tenses can be read off Figures 1 and 2. 
Past tenses rarely denote logical past, since the
discussion centers on a future meeting event;
they are used rather for politeness.
(5) a. Ich wollte Sie fragen, wie das aussieht.
       I wanted to ask you what is on.
    b. übermorgen war ich ja auf diesem Kongreß in Zürich.
       the day after tomorrow, I'll be (lit: was)
       at this conference in Zurich.
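The baseline strategy t of this section amounts to a table of the most frequent target tense per source tense, estimated from aligned tense pairs by relative frequency. A minimal sketch follows; the function name and the toy counts in the usage note are invented for illustration and are not corpus figures.

```python
from collections import Counter, defaultdict


def train_baseline(tense_pairs):
    """Map every source tense to its most frequent target tense.

    tense_pairs: iterable of (source_tense, target_tense) from aligned CVPs.
    Maximizing the count is equivalent to maximizing P(t_t | t_s).
    """
    counts = defaultdict(Counter)
    for source, target in tense_pairs:
        counts[source][target] += 1
    return {source: c.most_common(1)[0][0] for source, c in counts.items()}
```

For instance, with toy data in which German present is paired three times with English present and once with English future, `train_baseline` maps present to present.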
4.2 Full Verb Information 
Three more disambiguation strategies condition
the choice of tense on the full verb in a CVP, viz.
the source verb ($tr(t_s, v_s) = \arg\max_{t_t} P(t_t \mid t_s, v_s)$,
strategy vs), the target verb ($tr(t_s, v_t)$, strategy vt),
and the combination of source and target verb
($tr(t_s, (v_s, v_t))$, strategy vst). The table below gives
precision and recall values for these strategies and
for the strategies obtained by smoothing (e.g.
vst, vs, vt, t is vst smoothed first with vs, then
with vt, and finally with t). Smoothing with t
results in identical precision and recall figures.
               G→E                      E→G
strategy       prec.  recall  …, t      prec.  recall  …, t
t              .865   .865    .865      .957   .957    .957
vs             .885   .854    .879      .970   .941    .965
vt             .900   .876    .896      .973   .933    .966
vst            .916   .819    .899      .979   .874    .965
vst, vt, vs    .902   .892    .900      .970   .956    .967
vst, vs, vt    .899   .889    .897      .971   .957    .967
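One way to read the smoothed strategies is as a backoff chain: the most specific table is consulted first, and a less specific one is used only if the context was not seen in training. The paper does not spell out the smoothing scheme, so treating it as backoff is an assumption, as are the record format (dictionaries with keys "ts", "vs", "vt", "tt") and all function names.

```python
from collections import Counter, defaultdict

# Context extractors for the four strategies (hypothetical record format).
KEYS = {
    "vst": lambda ex: (ex["ts"], ex["vs"], ex["vt"]),
    "vs": lambda ex: (ex["ts"], ex["vs"]),
    "vt": lambda ex: (ex["ts"], ex["vt"]),
    "t": lambda ex: ex["ts"],
}


def train(examples, key):
    """Most frequent target tense for every context produced by `key`."""
    counts = defaultdict(Counter)
    for ex in examples:
        counts[key(ex)][ex["tt"]] += 1
    return {k: c.most_common(1)[0][0] for k, c in counts.items()}


def make_backoff(examples, order):
    """Build a strategy that tries the tables in `order`, e.g. ["vst", "vs", "vt", "t"]."""
    tables = [(KEYS[name], train(examples, KEYS[name])) for name in order]

    def strategy(ex):
        for key, table in tables:
            context = key(ex)
            if context in table:
                return table[context]
        return None  # no table covers this context

    return strategy
```

Under this reading, an unsmoothed strategy such as vst returns no prediction for unseen verb pairs, which would explain its lower recall, while a chain ending in t always proposes a tense for any seen source tense.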
We see that inclusion of verb information im- 
proves performance. Translation pairs approx- 
imate the verb semantics better than single 
source or target verbs. The full verb contexts of 
tenses can also be used for verb classifications. 
Aspectual classification: The aspect of a 
verb often depends on its reading and thus can 
be better extrapolated from an aligned corpus 
(e.g. I am having a drink (trinken)). German 
allows punctual events in the present, English 
prefers present perfect (e.g. sehen, finden,
feststellen (discover, find, see), einfallen (occur,
remember); treffen, erwischen, sehen (meet)).
World knowledge: In many cases perfect 
maps an event to its result state. 
finish             ⇒ fertig sein
forget             ⇒ nicht mehr wissen
denken an          ⇒ have in mind
sich verabreden    ⇒ have an appointment
sich vertun        ⇒ be wrong
settle a question  ⇒ (the question) is settled
4.3 Subordinating Conjunctions
Conjunctions often engender different mood. 
• In conditional clauses English past tenses usu- 
ally denote present tenses. Interpreting hypo- 
thetical past as present increases performance 
by about 0.3 %. 
• In subjunctive environments logical future is
expressed by English simple present. The verbs 
vorschlagen (suggest) (in 11 out of 14 cases) and 
sagen (say) (2/5) force simple present on verbs 
that normally prefer a translation to future. 
(6) I suggest that we meet on the tenth. 
• Certain matrix verbs³ trigger translation of
German present to English future.
4.4 Representation of Tense 
Tense can not only be viewed as a single item 
(as sketched above, representation rt). In com- 
positional analyses of tense, source tense S and 
target tense T are decomposed into components
$(S_1, \ldots, S_n)$ and $(T_1, \ldots, T_n)$. A disambiguation
strategy tr is correct if $\forall i : tr(S_i) = T_i$.
One decomposition is suggested by the encoding
of tense on the surface ((present/past,
∅/will/be going to/werden, ∅/have/haben/sein,
∅/be), representation rs). Another widely
used framework in tense analysis (Reichenbach,
1947) ((E </~/> R, R </~/> S, ±progr),
representation rr) analyses English tenses as follows:
       R~S             R<S           R>S
E~R    present         past
E<R    present perf.   past perf.    fut. perf.
E>R    future          future past
A similar classification can be used for German 
except that present and perfect are analysed as 
ambiguous between present and future (E≥R~S
and E<R≥S).
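The Reichenbach-style representation rr can be written down as pairs of relations (E to R, R to S), with ambiguous German tenses receiving a set of readings. The English entries follow the table above and the German present and perfect follow the text; the remaining German entries (preterite, pluperfect, future) are assumptions made by analogy with English, the ±progr component is omitted, and all names are hypothetical.

```python
# (relation of E to R, relation of R to S); "~" means simultaneous.
REICHENBACH_EN = {
    "present": ("~", "~"),
    "past": ("~", "<"),
    "present perfect": ("<", "~"),
    "past perfect": ("<", "<"),
    "future perfect": ("<", ">"),
    "future": (">", "~"),
    "future past": (">", "<"),
}

# German tenses as sets of readings.
REICHENBACH_DE = {
    "present": {("~", "~"), (">", "~")},   # E≥R~S, from the text
    "perfect": {("<", "~"), ("<", ">")},   # E<R≥S, from the text
    "preterite": {("~", "<")},             # assumed by analogy with English past
    "pluperfect": {("<", "<")},            # assumed by analogy with past perfect
    "future": {(">", "~")},                # assumed by analogy with English future
}


def compatible_english_tenses(german_tense):
    """English tenses whose representation matches a reading of the German tense."""
    readings = REICHENBACH_DE[german_tense]
    return sorted(t for t, r in REICHENBACH_EN.items() if r in readings)
```

Under these assumptions, German present is compatible with English present and future, and German perfect with present perfect and future perfect, which mirrors the ambiguity described in the text.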
                G→E                      E→G
repr.  strat.   prec.  recall  …, t      prec.  recall  …, t
rt     t        .865   .865    .865      .957   .957    .957
rs     t        .859   .859    .859      .955   .955    .955
rs     vs       .883   .853    .876      .966   .938    .961
rs     vt       .894   .871    .890      .971   .933    .964
rs     vst      .912   .815    .894      .978   .874    .962
rr     t        .861   .861    .861      .964   .964    .964
rr     vs       .885   .855    .879      .973   .945    .970
rr     vt       .898   .875    .894      .977   .939    .972
rr     vst      .915   .817    .897      .982   .878    .970
The poor performance of strategy rs corroborates
the expectation that tense disambiguation
is helped by recognition of analytic tenses.
Strategy rr performs slightly worse than rt. The
really hard step with Reichenbach seems to be
the mapping from surface tense to abstract
representation (e.g. deciding if (polite) past is
mapped to logical present or past). In E→G, rr
performs slightly better, since the burden
of choosing surface tense is shifted to generation.
³ ausgehen von, denken, meinen (think), hoffen
(hope), schade sein (be a pity)
                G→E                      E→G
repr.  strat.   prec.  recall  …, t      prec.  recall  …, t
rr′    t        .861   .861    .861      .957   .957    .957
rr′    vs       .883   .853    .877      .968   .940    .963
rr′    vt       .895   .872    .891      .971   .933    .965
rr′    vst      .913   .816    .895      .979   .875    .964
5 Conclusion 
The paper presents a way to test disambigua- 
tion strategies on real data and to measure the 
influence of diverse factors ranging from
sentence-internal context to the choice of
representation. The pertinent disambiguation information
learned from the corpus is put into action
in the symbolic transfer component of the Verb- 
mobil system (Dorna and Emele, 1996). 
The only other empirical study of tense transla- 
tion (Santos, 1994) I am aware of was conducted 
on a manually annotated Portuguese-English 
corpus (48,607 English, 43,492 Portuguese word 
tokens and 6,334 tense translation pairs). It nei- 
ther gives results for all tenses nor considers dis- 
ambiguation factors. Still, it acknowledges the 
surprising divergence of tense across languages 
and argues against the widely held belief that 
surface tenses can be mapped directly into an 
interlingual representation. Although the find- 
ings reported here support this conclusion, it 
should be noted that a bilingual corpus can only 
give one of several possible translations. 

References 
Ido Dagan, Kenneth W. Church, and William A. 
Gale. 1993. Robust Bilingual Word Alignment for 
Machine-Aided Translation. In Proceedings of the 
Workshop on Very Large Corpora: Academic and 
Industrial Perspectives, pages 1-8. 
Michael Dorna and Martin C. Emele. 1996. 
Semantic-Based Transfer. In Proceedings of the 16th 
International Conference on Computational Lin- 
guistics (COLING '96), Copenhagen, Denmark. 
Hans Reichenbach. 1947. Elements of Symbolic 
Logic. Macmillan, London. 
Diana Santos. 1994. Bilingual Alignment and Tense. 
In Proceedings of the Second Annual Workshop on 
Very Large Corpora, pages 129-141, Kyoto, August. 
