Dialogue Translation vs. Text Translation 
-Interpretation Based Approach- 
Jun-ichi Tsujii, Makoto Nagao 
Dept. of Electrical Engineering, 
Kyoto University, 
Yoshida-honmachi, Sakyo, Kyoto, 606, JAPAN 
Abstract 
The authors discuss the differences of envi~on- 
meats whexe dialogue translation and textual trans- 
lation systems might be used. The differences are 
summarized as clear definition of information and 
active participations of speakers and hearers in di- 
alogue t~anslation. A new approach to MT, inter- 
pretation based approach, is proposed to take the 
advantages of dialogue translation environments. 
The approach introduces a layer o/ understand- 
ing to MT and can produce less structure bound 
translations than conventional approaches. 
1 Introduction 
Although we had been engaged in developing an MT sys- 
tem of texts for several years (Mu project \[Nagao85, Na- 
gao86\]), we were puzzle(| when we examined the data of 
dialogue translation gathered by the research group of 
ATR, which is a newly established research organization 
for translation of telephone dialogues and is now gather- 
ing dialogue translation data in various hypothetical sit- 
uations. 
The sample translations gathered by the ATR research 
group looked very difficult for machines, but we rarely 
found syntactic structures which make textual translation 
so difficult, such as long noun phrases or clauses, compli- 
cated conjuncted phrases, etc.(\[Tsujii84\] \[Tsnjii88\]). On 
the other hand, most of the translations of dialogues be- 
tween Japanese and English, which were produced by 
professional human interpreters, did not preserve syn- 
tactic structures of their original sentences at all. They 
were completely paraphrased in the target language and 
seemed very hard to be produced by conventional tech- 
niques developed for textual MT systems. 
Both translations, dialogue and textual translations, 
are difficult, but their difficulties are very different frqm 
each olher. 
We discuss in this paper the differences of dialogue 
translation systems and textual translation systems. Bcausel 
we do not know the difficulties of recognition of spoken ut- 
688 
terances, we will avoid the discussion about the difficulties 
of interfacing the speech recognition part and the linguis- 
tic processing part~ which we will certainly encounter in 
spoken dialogue translation systems. The dialogue trmm- 
lation in this paper is restricted to the translation of dia- 
logues through keyboards, on which ATR is now concern 
trated. 
The differences of these two translation systems mainly 
come from the fact that dialogues of certain types are 
more goal-oriented than ordinary texts. We will argue 
that the goal orientedness of dialogues makes dialogue 
translation systems more feasible than textual translation 
systems, though they are usually considered much harder. 
2 Differences of Environments 
In the current states of the art in machine translation, 
most researchers may agree that we cannot expect an ideal 
FAMT system which can translate any linguistic materi- 
als in any subject domains. So, at present, what should 
be discussed about MT systems have to be engineering 
problems. 
We should discuss problems from engineering points of 
view. That is, we should discuss, first of all, what types 
of systems or system organizations are economically and 
technically feasible in what situations of actual transla- 
tion, and what sorts of human aids can be expected in 
real application environments. 
The important consideration is how to design feasible 
MT systems which can be used in actual, rather specific, 
translation environments. Different application environ- 
ments require different technologies. Therefore, the ques- 
tions we would like to pose in this paper are: 
® Which is more feasible in actual application envi~ 
ronments, dialogu e translation systems or textual 
translation systems ? 
,., ()~m ,ee design a feasible dialogue translation system 
iu.~t by extending or modifying current MT tech- 
~,ologies dew'loped exclusively fur textual transla- 
tion ? 
Our an:~wer tothe first question, though it might sound 
sirs.age, is that diah)gue translation :ffstems of certain 
~ype.~; are more feasible than textual translation systems 
which are ( urrently developed aud connnereially a,vailable. 
~i, might b:~' the case that we imagin dialogue translation 
is ea:;ier, t ecause we have been engaged in developing a 
t( x£1~d ~,ranslatiou system ~md have recognized many, not 
o~dy diiiicl,lt but ala~ m~ty a~d dirty problems iu textual 
tra,slatiou systems (\[Nakamura86\]). 
l~ut ned only because of that, we believe dialogue trans- 
I~,tioi~ syst:..ms are more feasible, mainly because of the 
ba~sic diffe~eu(:es of environments where these two tyl)e~; 
of systems will be used. 
We can summarize the differences of environments in 
wMch th(',e two types of systeuts might be used as fbllows. 
~, Clear Deflnifiou of lnfo',malio'n : In certain types 
of di~,.logue translations, we can define rather clearly 
what iuformaliou should be transmitted tiom source 
se:atea(:es to target translations, while we generMly 
cannot in textual translation. 
By c~:rtain types of diMogues, we mean here the di- 
alogues such as dialogues for hotel reservation and 
coul:erence registration which are currently picked 
up by the ATR research group, dialogues between 
patients and doctors tried by the CMU group 
(\['romita86\]), 
etc. 
o Actiw~ Pavlicipalious o\] Speakers and Hearers : in 
most application environments of textual n~ansla - 
lion .,~ystems, they are supposed to be used by pTv= 
\]?ssio',al ~','a~tslalo'rs. We camtot have the writers 
of te~.:ts at the time of translatioxi, the persons who 
prel)~.rc texts and really want to comnmnicate ,~x)rn~ 
thing through the texts. The actual readers of traus- 
luted texts are not awdlable, either, at the time of 
the translation, who.really want to get messages or 
iufor, aatiou eucodM in the texts. 
On tim contrary, in diMogue translation, we have 
bo~l, ~he speakers (the senders of messages) and the 
hearers (the receivers of me, ages) at the time of 
translatiiig m('ssages. 
These two differences make, we claim, dialogue traits- 
lation systems more feasible in actual translation envi- 
ronments, if they are properly designed ibr taking these 
adwmtages. 
Our answer to the s(~ond question is directly derived 
ti-om the above discussion. That is, in ordel)to take the 
advantages of dialogue translation, the system organiza- 
tions should be different ti'om those for textual transla- 
tion. Mere extension of current Mr\[ ` tec|mologies for tex- 
tual tl:anslation will not result in high quality dialogue 
translation systems by which ordinary people (:a~t com- 
municate with eact, other. 
We will discuss what implications the basic differences 
of enviroments have in the design of dialogue transla- 
tion systems and, substantiate the conclusion that i\] they 
are properly designed, cerlain types of dialogue Iranslation 
syMems, are more feasible, tex:hnically at least, than the 
text translation systems which are currently available. 
3 What should be translated ? 
Fig. 1 shows a simplified framework of application sys- 
tems of natural language understanding (NLU) other than 
MT systems. In this framework, understanding of a ~eno 
leuce is regarded as a process of transformation from an 
input sentence, a linear sequence of words, into so-called 
the meaning represeulalion of lhe senleuce. 
Structural 
I)eser ipt ion; 
of Sentences 
t 
\[ s°.auti, I 
1 I Ilpll t Sen teuce 
Interpretation 
hltcrprctat ion 
t~esul tq \[Mean iltg 
}leoresen t a t ion\] ) 
lutcrllaI 
Processings 
(Problem Solving. 
Iii \[create, 
Data Base hcc\[:s~, 
etc.) 
Fig. I General OrgauizatiolL of 
NLII hppl lcat ion Systet~s 
Memdng representation in this framework is tile in- 
put to certain internal processings such as deductive in: 
ferences, problem solvings in certain restricted domains, 
data base accesses, etc., which are actually implemented 
as computer programs to carry out certain specific inter- 
nM tasks. 
Meaning of input sentences are defined in this fi'arnc~ 
work, relative to the internal tasks that the systems are 
6\[19 
expected to perfor m. In other words, what kinds of in- 
formation should be extracted from sentences are pre- 
defined, depending on the aims of the internal processings 
of the systems. Understanding is regarded as art extrac- 
tion process of information relevant to specific internal 
tasks. 
However, the internal task or the aim of translation is 
to re-express by using sentences of target lan- 
guages the information of all aspects conveyed 
by sentences of source languages, with as least 
distortion as possible(\[Tsujii86\]). 
The internal task of MT, by itself, does not define 
what information should be extracted from input texts. 
It is commonly recognized by linguists that all different 
surface linguistic expressions convey different meaning. 
MT systerns, at least textual translation systems, have 
to extract all the factors relevant to the determination of 
surface linguistic expressions. 
Most of the difficulties peculiar to MT, such as the so 
lection of appropriate target lexical items or expressions, 
etc. come from the fact that we cannot define in MT 
what aspects of information in source sentences are rele- 
vant to the determination of target expressions and should 
be extracted from source sentences• In general, we cannot 
establish a representational framework which is language 
universal and by which understanding results are repre- 
sented. 
As a consequence, most of the current systems use 
certain, linguistic levels of structural descriptions of source 
sentences, such as deep case structures in the Mu project, 
in order to calculate appropriate target descriptions. Be- 
cause the structures are far from representing understand- 
tug results and reflect the linguistic strutures of source 
sentences, their translation results are inherently struc- 
lure bound. 
On the other hands, in certain types of dialogues, we 
can define by the purpose of dialogues what is essential or 
important information conveyed by uttranees and should 
be transmitted to their translations. Here, we do not dis- 
cuss about the systems which are capable of translating 
arbitrary dialogues like chatterings among house wives 
without any purposes, but the systems which translate 
dialogues of certain restricted domains as already men- 
tioned, such as dialogues for hotel reservation, conference 
registration, etc. In such dialogues, we can define impor- 
tant information by referring to the aim of the dialogues. 
Such important information should be extracted kom 
690 
the input and properly transmitted to the target. So, 
the framework for dialogue translation becomes similar 
to that of the other applications of NLU illustrated in 
Fig. 1. We can introduce a layer of explicit understan& 
in~ to MT systems, to which important information of 
nntterances are related and so, in which results of under- 
standing can be represented in a language independent 
(but task dependent) way (\[Tsujii87\]). 
Some parts of utterances which convey information 
*mporlant for the purposes of the dialogues are related 
to this layer and interpreted. Because information is ex- 
pressed language-independently in this layer, we can ex- 
pect less structure bound translation results for the parts 
of utterances. On the other hand, the other parts which 
do not convey important information need not be related 
to this explicit understanding layer. They would be trans- 
lated by conventional MT technologies. 
Let me show you a simple example from hotel reserva- 
tion dialogues, which actually appears in the experiments 
conducted at ATR. 
lax 1\] 
\[Japanese\] hoteru( holel)-wa, tomodachi(friends)- 
to Disuko( discotheque)-ni ikitai(t0 waut to go 
)-node, Roppongi(Roppongi - the name of the 
place in Tokyo)-no chikaku(t0 be near )-ga iino(t0 
be good)-desuga ? 
\[Structure Bound English Translation\] As for 
hotel, because \[ I \] would like to go to Dis- 
cotheque with friends, to be near to Roppongi 
is good. 
\[English Translation\] Because I would like to 
go to discotheque with friends, I prefer to stay 
at a hotel near to Roppongi. 
In this example, we can divide the utterance into two. 
One is the part which contain important information for 
hotel reservation, and the other is the part which does not. 
Because the location of the hotel which the client wants 
to stay is important for the task of hotel reservation, the 
underlined part of the uttrance is important and should 
be translated as properly as possible. 
The other part of the utterance, which gives the l:eason 
why the client wants to stay at a hotel in a specific region 
of Tokyo (Roppongi), is less important. Our contention 
is that these two parts of the utterance should be treated 
differently in dialogue translation systems. 
Note that the English translation given above has a 
deep case structure completely different from that of the 
source sentence. The translation contains the verbs to 
prefer a~nd to slay whose corresponding Japanese verbs 
do not appear in the source sentence. 
4 Architecture of Dialogue Trans- 
la.l;ion Systems 
Fig. 2 shows a schematic view of a system which translates 
dialogues in a certain restricted domain. The translation 
system tmows in advance what kinds of information el 
concepts ~zre important for the natural flow of dialogues in 
thai. specific task domain, and also knows a set of surface 
linguistic expressions which may convey such important 
information . 
Structural 
Descriptions 
of Sollrce Sel|i 011COS 
i 
Xnowle(Igc of the Domain 
t i--r Inference F, llg i ne 
//~lnterprotatlon Result9 
/ 
\[/:terpret'-~at io~ Vr a;hr as i~ 
7 x 
e ~, //~ Strtlcttll'al 
Transfer In the \]. ..... ~ l)eaeriptiolls 
Conventional Mr j l°f the Taritet 
I 
Target Seat esces 
k 
1 Source SelLt 0aces 
........ For the laportant Parts 
...... For tile less important Parts 
I:ig. "2 fhe 0rgaaizati0n of a Dialogue Traaslatl0n 3yste~ 
By truing these kinds of knowledge, the system should 
be able to distinguish the parts which convey important 
informational contents extract them and relate them to 
the repres:entations of the explicit understanding layer. 
It is certainly difficult to capture the important parts 
of untter~mces and understand them, but if we confine 
ourselves to a certain restricted task domain, it is much 
easier than story understandings in general, which A1 re~ 
searchers have been interested in. 
Furthermore, it is easier than developing intelligent 
dialogue systems which make conversations with lmman 
users in restricted task domains, for example, to make 
appropriate hotel reservation. Although those intelligent 
systems should be able to understand fully the u~r's ut- 
terances, a dialogue translation system needs not. The 
hearer, the receiver of the translated messages may un~ 
derstand tile speaker's intention. A translation system 
is only required to provide information sufficient for his 
understanding. It is desirable but not inevitable for a dia- 
logue tarnslation system to have the ability of recognizing 
the speaker's plan. 
A translation system which extracts intporlaut in/of 
mutton from source utterances and re-expresses it in the 
target language can produce less structure bound trans- 
lations. It can reduce varieties of surface expressions to a 
single meaning representation, if they convey essentially 
the same information, the same from the view point of the 
purposes of dialogue. For example, the tbllowing Japanese 
expressions, which have quite different (deep case) struc- 
tures, may be reduced to a single representation and re- 
expressed by English expressions. The English expres- 
sions will be chosen independently of Japanese strueture.'~ 
but only by considering English contexts where the ex-- 
pressions are located. 
\[Ex 2\] 
\[Japanese\] Roppngi-no chikaku (to be near) - 
no hoteru (hotel) -gaii (to be good) wa. 
\[Structure bound translation\] A hotel near 
to Roppongi is good. 
\[Japanese\] Ropponi-atari (around) -no hoteru 
(hotel) -we onegaishimasu (please). 
---* \[Structure bound translation\] A hotel around 
Koppongi, please. 
\[Japanese\] hoteru (hotel)-ha roppongi-no chikaku 
(to be near) -ga iinodesuga (to be good) 
\[Structure bound translation\] As for hotel, 
to be near to Roppongi is good. 
\[Japanese\] tsugou-ga-iino (t0 be convenient)- 
ha roppongi-ni chikai (to be near) hoteru (ho~ 
tel) desuga. 
\[Structure bound translation\] What is con- 
venient is a hotel near to Roppongi. 
As an extreme, we can imagine a system which pro- 
duces fluent translations only for important parts of ut- 
terances but awkward ones for the other parts. 
691 
\[~:x a\] 
\[Because (to Go Discotheque) Friends\] I prefer 
to stay at a hotel near to Roppongi. 
Note that a dialogue translation system needs not un- 
derstand utterances completely, and so, it needs not un- 
derstand why the clause 'tomodachi-to disuko-ni ikitai'(I 
would like to go to discotheque with friends) can be the 
reason tbr staying at a hotel near to Roppougi. ~lb under- 
stand this, a system has to have a lot of real world knowl-- 
edge which is not so closely related with hotel reservation 
tasks, such as 
(1)Roppongi is a special region in ~lbkyo where 
many discothetques exist 
(2)In order to go to some place, it is preferable 
to stay at a hotel near to the place 
(3)If something is preferable, the client tend 
to ..... 
etc. 
A system which converses intelligently with human to 
make hotel reservation should have such knowledge and 
abilities of using it. However, a dialogue translation sys- 
tem has only to provide information to the human partic~ 
ipants who organize conversation intelligently. 
5 Active Participation of Speak- 
ers and Hearers 
What should be understood from texts is highly depen- 
dent on the intentions of actual writers and readers of 
texts, but r,either of them is available at the time of traus- 
lation in textual translation. 
The same texts would be read by different readers with 
different intentions who would like to get different sorts 
of information from the translated texts. 
Readers of translated texts are often irritated because 
they cannot get necessary information for them. We found 
that translated texts are irritating not only because trans- 
lations are awkward, but also because original texts tt,em.- 
selves do not contain in/ormation which actual readers 
would like to get. Furthermore, evaluating translations 
produced by MT systems is difficult, because the evalua- 
tion highly depends on both what readers want to know 
and what source texts really contain. MT systems cannot 
produce good translations from bad source texts. 
However, the environments of dialogue translation sys- 
tems, in which both actual writers and readers are avail- 
able at the time of translation, are much better than tex. 
tual translation. The readers can ask questions directly 
592 
to the writers in order to get necessary inforn|ation, when 
they cannot get it from the translated messages or whctt 
they cannot understand the translations. 
Furthermore, the translation system can also pose ques- 
tions to the writers (senders of messages) to clarify their 
intentions. We can expect an intelligent translation sys- 
tem to play a role of a coordinator of conversations by 
keeping track of exchanges of important in/srma~ion be- 
tween dialogue participants (see Fig.a). 
\[EX 41 
\[English participant\] In which region do you 
want to stay in Tokyo ? 
\[Japai, ese participant\] disuko~ni ikitai. (1 would 
like to go to disdotheqne) 
\[System's Question to the participant\] shitsumon-. 
ha anata-no kibou-suru hoteru-no basho desu 
? (The question is 'in which resgion do you 
want to stay in Tokyo ?'. Would you specify 
the place which you prefer to stay ?) 
\[Japanese participant\] disuko-ni ehikai hoteru- 
desu. (A hotel near to a discotheque) 
\[Translated reply to the English participant\] 1 
prefer to stay a hotel near to a discotheque. 
Note also that what is important in dialogue transla- 
tion is the exchange of information through translation 
but not translated texts obtained as the result. Transla.. 
tions are satisfactory when the participants achieve their 
goals, even if they are awkward. On the contrary, in tex- 
tual translation, translated texts themselves are impor- 
tant and they should be natural and cleat' enough in all 
aspects, because different readers with different intention 
will read them and be interested in different aspects of 
informtional conteflts of same texts. 
6 Conclusions 
We discussed in this paper the difl'erences of dialogue 
translation systems and textual translation systerns. Es- 
pecially, we emphasized the differences of environments 
where these two types of systems will be used, and dis. 
cussed what implications the differences have in the &> 
sign of feasible dialogue transla\[ion systems. The main 
differences are: 
Clear Definition of Information in Dialogue 3.¥ans- 
lation 
Active Participations of Speakers and tlearers in Di- 
alogue 'lYanslation 
We argued that, if they are properly designed to take 
these adval,tages of dialogue translation, dialogue transla.- 
tion systems can be more feasible than textuM translation 
systems. F,.~p(x~ially, we proposed a new approach to MT, 
called interpretation based approach, in which an explicit 
layer of uude'rslanding is introduced and parts of utter.- 
~nees conveying importanl information are inte~7)reled by 
being related to this layer. 
Though the approach produces le~s siructure bou'ad 
translations through understanding and paraphrasing, it 
is different fi:om the conventional pivot or interlingual ap- 
proach which claims their understanding results can be 
represented in the lbrms which are independent on both 
individual hmguages and tasks. Tim mtderstauding layer 
in the proposed approach, on the other hand, is language 
universal but highly dependent on specific tasks of dia- 
logues. In the proposed approach, we have to design ml 
internM meaning representation specific to the domain of 
the dialogues. 
The k)llowings are important in order to develop a 
ti~'asible dialogue translation system based on the inter- 
pretation based approach. 
o httegration of different layers of descril)tions: We 
have to devise technologies for integrating the d(. ~ 
scriptions of the understanding layer and the con- 
venti,)nal structural descriptions of source sentences 
to produce translations, because single utterances 
generally consist of the parts which convey impor- 
tant information and those which do not. The idea 
of sally net should be re-considered in this new con- 
text. 
Flexible interaction during translation: 2¥aditional 
post- and pre-editings by human are not the best 
way lo take the advantage of the availability of speak- 
ers and hearers in dialogue translation. We have 
to design much flexible interaction modes including 
clarificalion dialogues between the system and the 
dialogue participants. 
¢ Management of dialogue strnctnres: ht order to find 
imporlant information, a system should have the 
ability of managing the dialogue sturutures. It should 
be able to utilize various kinds of knowledge such 
as knowledge about surface clue expressions, task 
depeadent knowledge, discourse structures, etc. to 
recover the struetnres of omgoing dialogue. Espe- 
dally, a translation system as a coordinator of con- 
vers0.tions has to keep track of importanl informa.. 
lion exchanges through sequences of utterances. 
Acknowledgement 
The authors are grateful to the members of the study 
group on Spoken Dialogue at ATR. The discussions with 
them are very helpful to improve the paper. 

References 

\[Nagao85\] Nagao, M., Tsujii, J., Nakamura, J. : The 
Japanese Government Project for Machine Transla- 
tion, Computational Linguistics, Vol. 11, NO.2-3, 
1985. 

\[Nagao86\] Nagao, M., Tsnjii, J., Nakamura, J. : The 
Transfer Phase of the Mu Machine Translation Sys- 
tem, Proe. of Coling 86, 1986. 

\[Nakamura86\] Nakamura, J., Tsujii, J:, Nagao, M. : So- 
lutions for Problems of MT,Parser, Proc. of Coling 
86, 1986. 

\[Tomita86\] Tomita, M., Carbonell, J. : Another 
Stride Towards Knowledge-based Machine TRansla- 
tion, Proe. of Coling 86, 1986. 

\[Tsujii84\] Tsujii, J., Nakanmra, J., Nagao, M.: Analysis 
Grammar of Japanese in the Mu-projeet, Proc. of 
Coling 86, 1986. 

\[Tsnjii86\] Tsujii, J. : Future Directions of Machine 
Txanslation, Proc. of Coling 86, 1986. 

\[Tsujii87\] Tsujii, J. : What is PIVOT ? Prec. of MT 
Smnmit, Hakone, Japan, 1987. 
