Interactive Translation : a new approach 
Rémi ZAJAC 
GETA, UJF-CNRS, Domaine Universitaire, 
38041 Saint Martin d'Hères cedex 53X FRANCE 
ATR Interpreting Telephony Research Laboratories 
Twin 21, MID Tower, 2-1-61 Shiromi, Higashi-ku, 
Osaka 540, JAPAN 
Abstract 
A new approach for Interactive Machine Translation where the author 
interacts during the creation or the modification of the document is 
proposed. The explanation of an ambiguity or an error for the 
purposes of correction does not use any concepts of the underlying 
linguistic theory : it is a reformulation of the erroneous or ambiguous 
sentence. The interaction is limited to the analysis step of the 
translation process. 
This paper presents a new interactive disambiguation scheme based 
on the paraphrasing of a parser's multiple output. Some examples of 
paraphrasing ambiguous sentences are presented. 
Key-words 
Machine Translation, Interactive Translation, Intelligent Word 
Processor. 
A. THE PROBLEM 
Goals 
The main goal here is to resolve correctly, in every case, the 
ambiguities arising in natural language analysis. To date, this cannot 
be accomplished by any existing automatic MT system. Choosing the 
sentence structure that most accurately reflects the author's intended 
message therefore remains an unsolved yet important problem. 
Classical machine translation systems use heuristics based on 
statistical regularities in the use of language. Interactive systems ask 
questions directed at a specialist of the system (like ITS of BYU 
\[Melby & alii 80\]) and/or a specialist of the domain (like the TITUS 
system of Institut Textile de France \[Ducrot 82\]). There, the 
interaction is done purely at the syntactic level, much as a syntax 
directed editor for a programming language is used by a specialist of 
both the system and the language (see note 1). 
Models or projects using extralinguistic knowledge will not be able 
to solve ambiguities in every case: a document is generally supposed 
to provide some piece of new information that may not be coded in 
the knowledge base. 
The use of learning procedures is at present not effective. 
None of these approaches can resolve ambiguities correctly in 
every case. The problem is basically a matter of interpretation: only 
the author of the document himself can tell what he intended to say. 
Nevertheless, he is not supposed to have any knowledge of the target 
language and therefore, he should not be involved during the transfer 
phase (see note 2). 
In the case of interaction with the author, two problems arise: 
1. The author is supposed to write his document and not to solve 
weird linguistic problems. 
2. In all interactive systems, the system asks a specialist questions 
based on knowledge of the underlying linguistic theory. For 
interacting with the author, this approach is to be rejected: see 
examples of interaction with ITS \[Melby & alii 80\] or even Tomita's 
system \[Tomita 84\]. 
A proposal 
To solve these problems, we propose : 
- to integrate the interactive system as one function of a word 
processor, the interaction being initiated by the author; 
- to explain an ambiguity by presenting a set of paraphrases generated 
from the set of parse trees of the ambiguous sentence; 
- to explain an error (of spelling or of grammar) by presenting a 
"reasonable" correction and a comment on the error. This point will 
not be treated in this paper. See for example \[Jensen & Heidorn 83, 
Zajac 86b\]. 
Discussion 
The integration in a word processor allows the use of a "controlled 
language" where checking and correction are done during the creation 
or modification of a document. This can be viewed as an extension of 
the capabilities of a simple spellchecker, in the form of a toolbox of 
linguistic aids for the author, checking the spelling, the terminology, 
the grammar and the style. For the translation of technical material, 
the use of a normative grammar, imposing precise limitations on 
terminology and syntax, will entail more clarity and concision in 
expression, as argued by \[Elliston 79\] and \[Ruffino 82\], and will 
offer a convenient tool for standardizing documentation. 
In cases where a correct interpretation requires domain knowledge, 
supplying it interactively makes it possible to separate clearly the 
purely linguistic knowledge, to be coded in the analyser, from the 
extralinguistic knowledge (semantics of the domain). As a matter of 
fact, it is not always justified to integrate specific semantic 
categories in the grammar, as is done in the METEO system for 
example. This separation will allow us to enlarge the domain of 
applicability of a machine translation system, which could be 
extended, for example, to a personal translation system \[Tomita 84\]; 
this could be of interest when no translation service is available or 
when the quantity of translation does not justify using the services of 
a translator \[Kay 82\]. 
B. THE PROPOSAL 
The linguistic framework 
The linguistic treatment of ambiguities is based on the structure of a 
linguistic descriptor (labeled and attributed tree) defined in SCSL 
\[Zajac 86a\]. Let us recall briefly the multilevel linguistic theory of 
GETA \[Vauquois 78\]. There are four main levels of linguistic 
interpretation: 
1. categories : morphosyntactic categories (gender, number, class of 
verb,...), semantic categories (abstract, concrete,...), actualisation 
categories (perfective, imperfective,...), syntactic categories (noun, 
verb, valencies,...) and syntactic classes (sentence, verb phrase,...). 
2. syntactic functions : subject, object1, object2, attribute of the 
subject, attribute of the object, complement of noun or adjective, 
determiner, circumstantial complement,... 
3. logical relations : predicate-argument relations. 
4. semantic relations : causality, consequence, qualifier, qualified,... 
The geometry of the tree corresponds to a phrase structure : the 
labels of inner nodes are syntactic classes, the labels of leaves are 
lexical units. Additional information is coded in the attributes of each 
node. 
The morphological, syntactic and semantic categories are computed 
by a morphological analyser written in ATEF. The output of the 
morphological analyser will be the input of a structural analyser 
producing multiple outputs in ambiguous cases. 
Architecture of the interactive translation system 
A classical machine translation process in the ARIANE system 
\[Boitet & alii 82, 85\] uses a morphological analysis phase (MA) and 
an automatic structural analysis phase (SA, on the left of the figure). 
Here, the structural analysis phase is replaced with an interactive 
phase (in the middle). Disambiguation and correction dialogues make 
calls to paraphrasing and correcting modules. The remainder of the 
process uses classical automatic transfer steps (LT and ST) and 
generation steps (SG and MG). In the figure, existing modules are in 
bold outline, modules for which only a model exists are in normal 
outline, and specified modules are shaded grey. 
[Figure: architecture of the interactive translation system. Legible labels: source text, MA, descriptor of the source text, paraphrasing modules, ROBRA, descriptor of the target text.] 
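As a minimal sketch, and not the SCSL formalism itself, the descriptor recalled under "The linguistic framework" (a labeled and attributed tree) might be modelled as follows. The class name Node, its field names and the attribute keys are assumptions made for illustration; the sample tree is one reading of the sentence "Which author quotes this lecturer ?" discussed in section C.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a descriptor: a labeled and attributed tree (hypothetical model)."""
    label: str                                     # syntactic class (inner node) or lexical unit (leaf)
    attrs: dict = field(default_factory=dict)      # e.g. {"sf": "subj", "lr": "arg0"}
    children: list = field(default_factory=list)   # ordered daughters; empty for a leaf

    def leaves(self):
        """Return the lexical units of the tree in left-to-right order."""
        if not self.children:
            return [self.label]
        units = []
        for child in self.children:
            units.extend(child.leaves())
        return units


# One reading of "Which author quotes this lecturer ?":
# "which author" is the subject (sf=subj) and first argument (lr=arg0).
reading = Node("S", children=[
    Node("NP", {"sf": "subj", "lr": "arg0"}, [Node("which author")]),
    Node("quotes"),
    Node("NP", {"sf": "obj1", "lr": "arg1"}, [Node("the lecturer")]),
])

print(reading.leaves())
```

Keeping the syntactic function and the logical relation as attributes, rather than encoding them in the geometry, reflects the remark above that two readings may share the same phrase structure and differ only in their attributes.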
Strategy for interactive disambiguation 
The approach we propose is not to produce explanations using 
linguistic concepts of the linguistic model (as it has been done up to 
now, see \[Melby & alii 80, Tomita 84\]), but to produce paraphrases 
that make explicit the ambiguous relations. 
Lexical ambiguities are quite trivial to solve by presenting the 
definitions from a dictionary. In this paper, they are supposed to be 
already solved. Structural ambiguities are treated after a complete 
parse. In a practical setting, the best strategy would probably be to 
produce a complete parse, to solve lexical ambiguities and then to 
solve structural ambiguities for the remaining parses. 
We propose, for some types of ambiguities that can arise, 
paraphrastic transformations that make ambiguous relations explicit. 
Generation of paraphrases 
Each parse tree will be sent to a paraphrasing grammar, written in the 
ROBRA transformational system \[Boitet & alii 80\]. Then, each 
paraphrased tree will be sent to a generator to produce the 
corresponding string. The whole process is very similar to a second 
generation translation process, the transfer step being replaced by a 
paraphrasing step, the generation being for the same language as the 
source language. The process is illustrated below. 
[Figure: parse tree 1 and parse tree 2 each pass through the paraphrasing grammar (ROBRA), yielding paraphrase 1 and paraphrase 2.] 
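The disambiguation loop described in this section (parse, paraphrase each tree, generate a string for each, let the author choose) can be sketched as below. This is only an illustration under stated assumptions: the function disambiguate and the four callables it receives are hypothetical stand-ins, not the ROBRA or ARIANE interfaces, and the toy readings are borrowed from the third example of section C.

```python
def disambiguate(sentence, parse, paraphrase, generate, choose):
    """Return the parse tree selected by the author, or None if there is no parse.

    parse(sentence)           -> list of parse trees (several if ambiguous)
    paraphrase(tree)          -> paraphrased tree (stand-in for the paraphrasing grammar)
    generate(tree)            -> string in the source language
    choose(sentence, options) -> index of the paraphrase picked by the author
    """
    trees = parse(sentence)
    if not trees:
        return None
    if len(trees) == 1:
        return trees[0]            # unambiguous: no dialogue is needed
    options = [generate(paraphrase(tree)) for tree in trees]
    return trees[choose(sentence, options)]


# Toy stand-ins: trees are (subject, verb, object) triples.
def toy_parse(sentence):
    return [("the lecturer", "quotes", "the author"),
            ("the author", "quotes", "the lecturer")]


def toy_paraphrase(tree):
    return tree                    # already in active declarative order


def toy_generate(tree):
    return " ".join(tree)


shown = []                         # records what the author would be shown


def toy_choose(sentence, options):
    shown.extend(options)
    return 1                       # pretend the author picks the second paraphrase


chosen = disambiguate("Which author quotes this lecturer ?",
                      toy_parse, toy_paraphrase, toy_generate, toy_choose)
print(shown)
print(chosen)
```

Passing the components as parameters keeps the sketch independent of any particular parser or generator; only the selected parse tree would go on to the transfer steps.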
C. SOME EXAMPLES OF PARAPHRASTIC 
TRANSFORMATIONS 
1. Scope of coordination. The nominal phrase "perturbations in 
the atmosphere and radiation" may have two interpretations as shown 
below. 
[Figure: two parse trees. 
Tree 1: NP(perturbations, NP(in the atmosphere, NP sf=coord(and radiation))) 
Tree 2: NP(perturbations, NP(in the atmosphere), NP sf=coord(and radiation))] 
Presenting the phrase structure as a parenthesized structure, we may 
have : 
1. (perturbations (in the atmosphere (and radiation))) 
2. (perturbations (in the atmosphere) (and radiation)) 
This kind of presentation (or a similar projective scheme) is used in 
the DLT project of BSO (personal communication, 1987) and in 
\[Tomita 84\]. A coordinating conjunction can be used to "factorize" 
a phrase. The explanation of the scope of the coordination will be the 
"development" (expansion) and the permutation of the factorized 
terms. The presentation using the paraphrasing scheme would be as 
follows : 
> perturbations in the atmosphere and radiation 
1. perturbations in the radiation and perturbations in the atmosphere 
2. radiation and perturbations in the atmosphere 
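The "development" and permutation of factorized terms can be illustrated with a small sketch. The function name paraphrase_coordination, its parameters and the flat string representation are assumptions made for this example; an actual paraphrasing grammar would operate on trees.

```python
def paraphrase_coordination(head, prep, det, first, second, inside_pp):
    """Make the scope of "and" explicit for a phrase `head prep det first and second`.

    inside_pp=True : "and" coordinates two complements of the preposition (reading 1);
                     the factorized head and preposition are developed over both
                     conjuncts, which are then permuted.
    inside_pp=False: "and" coordinates the whole first phrase with `second` (reading 2);
                     the two terms are simply permuted.
    """
    if inside_pp:
        left = f"{head} {prep} {det} {second}"
    else:
        left = second
    right = f"{head} {prep} {det} {first}"
    return f"{left} and {right}"


reading_1 = paraphrase_coordination("perturbations", "in", "the",
                                    "atmosphere", "radiation", inside_pp=True)
reading_2 = paraphrase_coordination("perturbations", "in", "the",
                                    "atmosphere", "radiation", inside_pp=False)
print(reading_1)
print(reading_2)
```

The permutation matters: once the second conjunct is moved to the front, the surface string no longer admits the competing reading.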
2. AP as NP complement or VP complement: "Le magistrat 
juge les enfants coupables" 
[Figure: two parse trees. 
Tree 1: PHVB(GN(le magistrat), juge, GN(les enfants, GA sf=epit(coupables))) 
Tree 2: PHVB(GN(le magistrat), juge, GN(les enfants), GA(coupables))] 
Using explicit paraphrasing of the determination with a relative 
pronoun, we may have : 
> le magistrat juge les enfants coupables 
1. le magistrat juge les enfants qui sont coupables 
(the magistrate judges the children who are guilty) 
2. le magistrat juge que les enfants sont coupables 
(the magistrate judges that the children are guilty) 
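A sketch of this transformation, under simplifying assumptions: the function name paraphrase_adjective and the boolean flag epithet are hypothetical, and the copula "sont" is hard-coded for a plural object, whereas a real generator would compute agreement.

```python
def paraphrase_adjective(subject, verb, obj, adjective, epithet):
    """Make the attachment of an adjectival phrase explicit (French, plural object only).

    epithet=True : the adjective modifies the object noun phrase -> relative clause.
    epithet=False: the adjective depends on the verb             -> "que" clause.
    """
    if epithet:
        return f"{subject} {verb} {obj} qui sont {adjective}"
    return f"{subject} {verb} que {obj} sont {adjective}"


as_epithet = paraphrase_adjective("le magistrat", "juge", "les enfants",
                                  "coupables", epithet=True)
as_verb_complement = paraphrase_adjective("le magistrat", "juge", "les enfants",
                                          "coupables", epithet=False)
print(as_epithet)
print(as_verb_complement)
```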
3. Subject and object. The sentence "Which author quotes this 
lecturer ?" may have two interpretations. Here sf is the syntactic 
function, whose value may be the subject (subj) or the first object 
(obj1) of the governor of the sentence, "quotes". There is also an 
ambiguity in the argument place (arg0, arg1) for logical relations 
(lr). In this case, we may present the structures by normalizing the 
sentence to active declarative form. Note that the phrase structures in 
this example are identical. 
[Figure: two parse trees with identical phrase structure. 
Tree 1: S(NP sf=obj1 lr=arg1(which author), quotes, NP sf=subj lr=arg0(the lecturer)) 
Tree 2: S(NP sf=subj lr=arg0(which author), quotes, NP sf=obj1 lr=arg1(the lecturer))] 
> Which author quotes this lecturer ? 
1. the lecturer quotes the author 
2. the author quotes the lecturer 
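Since the two readings differ only in the sf attribute, the normalization to active declarative form can be sketched as a reordering driven by that attribute. The function name normalize_active and the (text, sf) pair representation are assumptions; the noun phrases are given already stripped of their interrogative and demonstrative determiners, as in the paraphrases above.

```python
def normalize_active(verb, phrases):
    """Order the noun phrases as subject, verb, first object.

    phrases: list of (text, sf) pairs, with sf either "subj" or "obj1".
    """
    by_function = {sf: text for text, sf in phrases}
    return f"{by_function['subj']} {verb} {by_function['obj1']}"


# Same word order in both readings; only the syntactic functions differ.
reading_1 = normalize_active("quotes", [("the author", "obj1"), ("the lecturer", "subj")])
reading_2 = normalize_active("quotes", [("the author", "subj"), ("the lecturer", "obj1")])
print(reading_1)
print(reading_2)
```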
4. A well known example. The sentence "Mary sees a man in 
the park with a telescope" may have five different interpretations as 
below. 
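[Figure: five parse trees, differing in the attachment of the two prepositional phrases. 
Tree 1: S(NP(Mary), sees, NP(a man), NP(in the park), NP(with a telescope)) 
Tree 2: S(NP(Mary), sees, NP(a man), NP(in the park, NP(with a telescope))) 
Tree 3: S(NP(Mary), sees, NP(a man, NP(in the park)), NP(with a telescope)) 
Tree 4: S(NP(Mary), sees, NP(a man, NP(in the park), NP(with a telescope))) 
Tree 5: S(NP(Mary), sees, NP(a man, NP(in the park, NP(with a telescope))))] 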
For paraphrasing, we have to move circumstantials ahead and, if there 
is more than one, to coordinate them. We also have to make noun 
phrase determinations explicit by using relative pronouns and, if 
there is more than one determination for the same noun phrase, to 
coordinate them. We should then have : 
> Mary sees a man in the park with a telescope 
1. with a telescope, in the park, Mary sees a man 
2. in the park which has a telescope, Mary sees a man 
3. with a telescope, Mary sees a man who is in the park 
4. Mary sees a man who has a telescope and who is in the park 
5. Mary sees a man who is in the park which has a telescope 
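The two rules just stated (move circumstantials ahead of the sentence, and turn noun phrase determinations into coordinated relative clauses) can be sketched as follows. Everything here is an illustrative assumption rather than the ROBRA grammar: the classes NP and PP, the tiny RELATIVE lexicon mapping a preposition to a verbal form, the who/which choice driven by a human flag, and the convention that the most recently attached phrase is rendered first.

```python
from dataclasses import dataclass, field


@dataclass
class NP:
    text: str
    human: bool = False
    modifiers: list = field(default_factory=list)   # PPs determining this noun phrase


@dataclass
class PP:
    prep: str
    np: NP


RELATIVE = {"in": "is in", "with": "has"}            # assumed lexicon for this example only


def render_np(np):
    """Render a noun phrase, making each determination an explicit relative clause."""
    pronoun = "who" if np.human else "which"
    clauses = [f"{pronoun} {RELATIVE[pp.prep]} {render_np(pp.np)}"
               for pp in reversed(np.modifiers)]
    if not clauses:
        return np.text
    return f"{np.text} {' and '.join(clauses)}"


def paraphrase(subject, verb, obj, circumstantials):
    """Move circumstantials ahead of the sentence, separated by commas."""
    fronted = [f"{pp.prep} {render_np(pp.np)}" for pp in reversed(circumstantials)]
    return ", ".join(fronted + [f"{subject} {verb} {render_np(obj)}"])


def man(*mods):
    return NP("a man", human=True, modifiers=list(mods))


def in_park(*mods):
    return PP("in", NP("the park", modifiers=list(mods)))


telescope = PP("with", NP("a telescope"))

# (object noun phrase, circumstantials attached to the sentence) for each reading
readings = [
    (man(), [in_park(), telescope]),
    (man(), [in_park(telescope)]),
    (man(in_park()), [telescope]),
    (man(in_park(), telescope), []),
    (man(in_park(telescope)), []),
]
paraphrases = [paraphrase("Mary", "sees", obj, circ) for obj, circ in readings]
for number, text in enumerate(paraphrases, start=1):
    print(f"{number}. {text}")
```

With these conventions the sketch reproduces the five paraphrases listed above; a different lexicon or ordering convention would of course yield different surface forms.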
Conclusion 
We have presented a new approach for interactive translation based 
on the paraphrasing of ambiguous sentences. Compared to other 
approaches \[Ducrot 82, Melby & alii 80, Tomita 84\], this proposal 
takes a step towards the user's level of understanding, transferring 
part of the burden of interaction from the human to the machine : no 
special linguistic knowledge is required, only the simple (!) everyday 
competence of the language user. This could be realized using only 
linguistic paraphrastic transformations on the output of the parser. 
Some simple examples have been presented using quite simple 
transformations : in the case of ambiguous PP attachment there are 
two possibilities : (1) the PP modifies a noun phrase and this could 
be made explicit by using a relative pronoun; (2) the PP modifies the 
sentence and it can be moved ahead of it. 
A set of paraphrastic transformations is now being developed to be 
able to write a transformational grammar that will allow experiments 
on a corpus. 
Notes 
1. In the case of technical documents, the operator (linguist, 
translator or documentalist) may not have enough knowledge to answer 
some questions. For example, in the sentence "the experiment requires 
carbon and nitrogen tetraoxyde" \[Gerber & Boitet 85\], the scope of 
"and" is ambiguous and we may read either "carbon tetraoxyde and 
nitrogen tetraoxyde" or "nitrogen tetraoxyde and carbon". To be able 
to choose correctly, we have to know that carbon tetraoxyde does not 
exist in ordinary chemistry. But again, this conclusion could be false 
in a very special setting, e.g. an experiment described by the text in 
which carbon tetraoxyde is being produced as an (unstable) 
intermediate product of the reaction! 
2. It may be possible to organise the interaction simply by presenting 
the set of definitions of the transfer dictionary for each unit having 
several equivalents in the target language, and asking the author to 
choose one of them. 

References 

\[Boitet & alii 80\] BOITET C., GUILLAUME P., QUEZEL-AMBRUNAZ M., 
Manipulation d'arborescences et parallélisme : le système ROBRA, 
COLING-80. 

\[Boitet & alii 82\] BOITET C., GUILLAUME P., QUEZEL- 
AMBRUNAZ M., ARIANE-78: an integrated environment for 
automated translation and human revision, COLING-82. 

\[Boitet & alii 85\] BOITET C., GUILLAUME P., QUEZEL- 
AMBRUNAZ M., A case study in software evolution : from 
ARIANE 78.4 to ARIANE 85, COLGATE-85. 

\[Carbonell & Tomita 85\] CARBONELL J.G., TOMITA M., New 
approaches to machine translation, COLGATE-85. 

\[Ducrot 82\] DUCROT J.M., TITUS IV, in Taylor P.J., Cronin B. 
(eds) Information management research in Europe, Proceedings of 
the EURIM 5 Conference, Versailles, 12-14 May, 1982, ASLIB, 
London. 

\[Elliston 79\] ELLISTON J.S.G., Computer aided translation - a 
business view point, in SNELL B.M., (ed) Translating and the 
computer, North-Holland, 1979. 

\[Heidorn & alii 82\] HEIDORN G.E., JENSEN K., MILLER L.A., 
BYRD R.J., CHODOROW M.S., The EPISTLE text-critiquing 
system, IBM Syst. Journal, 21/3, 1982. 

\[Gerber 84\] GERBER R., Etude des possibilités de coopération entre 
un système fondé sur des techniques de compréhension implicite 
(systèmes logico-syntaxiques) et un système fondé sur des 
techniques de compréhension explicite (système expert), Thèse de 
3ième cycle "informatique", INPG, 1984. 

\[Gerber & Boitet 85\] GERBER R., BOITET C., On the design 
of expert systems grafted on MT systems, Proc. of the Conf. on 
theoretical and methodological issues in Machine Translation of 
natural languages, 1985, Colgate University, Hamilton, N.Y. 

\[Jensen & Heidorn 83\] JENSEN K., HEIDORN G.E., The fitted 
parse : 100% parsing capability in a syntactic grammar of English, 
Proc. of the Conf. on Applied Natural Language Processing, pp 93- 
98, Santa-Monica, California, February, 1983. 

\[Kay 82\] KAY M., Machine Translation, AJCL 8/2, pp 74-78, April- 
June, 1982. 

\[Melby & alii 80\] MELBY A.K., MELVIN R., SMITH R., 
PETERSON J., ITS: Interactive Translation System, COLING-80. 

\[Ruffino 82\] RUFFINO J.R., Coping with machine translation, in 
LAWSON V., Practical experience of machine translation, 
North-Holland Pub. Co., 1982. 

\[Tomita 84\] TOMITA M., Disambiguating Grammatically 
Ambiguous Sentences by Asking, COLING-84. 

\[Tomita 85\] TOMITA M., Feasibility Study of Personal/Interactive 
Machine Translation Systems, COLGATE-85. 

\[Vauquois 78\] VAUQUOIS B., Description de la structure 
intermédiaire, communication presented at Luxembourg Meeting, 
April 17-18, 1978. 

\[Zajac 86a\] ZAJAC R., SCSL : a linguistic specification language for 
MT, COLING-86. 

\[Zajac 86b\] ZAJAC R., Etude des possibilités d'interaction 
homme-machine dans un processus de Traduction Automatique, Thèse de 
Doctorat en Informatique, Institut National Polytechnique de 
Grenoble, juillet 1986. 
