PRAGMATICS IN MACHINE TRANSLATION
Annely Rothkegel
Universität Saarbrücken
Sonderforschungsbereich 100
Elektronische Sprachforschung
D 6600 Saarbrücken
West-Germany

Abstract
TEXAN is a system of transfer-oriented text analysis.
Its linguistic concept is based on a communicative
approach within the framework of speech act theory.
In this view texts are considered to be the result of
linguistic actions. It is assumed that they control
the selection of translation equivalents. The transition
of this concept of linguistic actions (text acts) to the
model of computer analysis is performed by a context-free
illocution grammar processing categories of actions and a
propositional structure of states of affairs. The grammar,
which is related to a text lexicon, provides the connection
between these categories and the linguistic surface units
of a single language.
1. The Problem
One of the main tasks of machine translation, besides
the resolution of ambiguities and the generation of
appropriate structural analyses, is the selection of
adequate translation equivalents. It has been found
that even an analysis which produces unequivocal results
does not suffice for the production of pragmatically
adequate texts in the target language.

On the one hand, there are problems with respect to the
selection of appropriate lexemes, collocations and
idiomatic expressions. On the other hand, we have to
know what kind of syntactic patterns and anaphoric or
elliptical constructions are usually applied with
respect to the text type. What we need is information
on communicative norms. In addition to a syntactic
and/or semantic analysis we have to provide a pragmatic
component, especially in order to solve problems on the
level of transfer.

The notion that linguistic usage and the selection of
means of expression (lexis and syntax) are directed by,
or at least influenced by, communicative intentions has
received increasing attention with respect to problems
of translation. Recent research in this area includes
communicative grammars for foreign-language learning
(e.g. Leech/Svartvik 1975), but also more specific
studies which explicitly take account of text function
(Reiß/Vermeer 1984, Thiel 1980) and aspects of action in
texts (Hönig/Kußmaul 1982). These studies have influenced
the theoretical foundations of TEXAN to the extent that
we view communicative aspects as decisive for the
solution of translation problems.

Some short examples from our texts (interaction-regulating
texts, especially international treaties) may illustrate
this approach. We should know when a special pattern has
to be applied in different languages and when it has to
be changed. It has been found in these texts that there
is a special type of definition (DEFINE) which has lexical
restrictions and which is always realized by participle
constructions in English, German, French, Italian, etc.
A translation by a relative clause, e.g. in German, would
be wrong. In a different text type it may be right or
even must have this form. On the other hand, regulations
(REGULATE) differ in verb forms. Thus in German the
present tense is to be used, in English shall-forms, and
in French present and future may be alternatives. A
general principle is that the participants are never
pronominalized.

The question now is what kind of linguistic model can
help us to structure the relevant components of the
analysis system.

2. Concept of Text Acts (TA)
Our system needs a linguistic model in which content,
function and form of linguistic expressions in a text
are connected. We think that a good concept for this
purpose may be the concept of text acts (Rothkegel
1984). TA are speech acts in which texts are produced.
When we translate, we are producing a new text.
We follow Searle's analysis of speech acts into
illocutionary, propositional and locutionary parts and
assume, with respect to texts, the existence of three
parts of text acts (I: text illocution; T: thematic
specification of the propositional part; R: repertoires
of lexical and grammatical expressions which are
typically used for a specific communicative task).

TA : ( I, T, R )

Automatic procedures for the processing of speech acts
basically have to do with the selection and representation
of contextual factors. They determine the assignment of
illocutions to linguistic utterances (Gazdar 1981). What
the models developed for this purpose have in common is
the use of overall schemas within which the respective
speech acts can be interpreted. While Evans (1981) handles
general definitions of situation, Allen/Perrault (1980),
Cohen (1978) and Grosz (1982) use general action plans in
which the speech acts of interest are embedded. This
principle, which is applied to dialogues in the models
mentioned, we have applied to written texts in TEXAN
(example of an article in Fig. 1).
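
As an illustration only, the triple TA : (I, T, R) introduced in this
section could be held in a small record type. The sketch below is not
TEXAN's own data format; the class names, the field names and the
language codes "en" and "de" are assumptions made for the example. The
sample value encodes the observation from Section 1 that DEFINE is
realized by participle constructions in English and German.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Repertoire:
    """R for one language: typical lexical (l) and grammatical (g) means."""
    language: str
    lexical: Tuple[str, ...] = ()
    grammatical: Tuple[str, ...] = ()


@dataclass(frozen=True)
class TextAct:
    """TA : (I, T, R), with one repertoire per language."""
    illocution: str                            # I, e.g. "DEFINE"
    theme: str                                 # T, e.g. "text"
    repertoires: Tuple[Repertoire, ...] = ()   # R

    def repertoire_for(self, language: str) -> Optional[Repertoire]:
        """Return the repertoire for a language, or None if none is listed."""
        for repertoire in self.repertoires:
            if repertoire.language == language:
                return repertoire
        return None


# Hypothetical sample value built from the DEFINE observation in Section 1.
define_act = TextAct(
    illocution="DEFINE",
    theme="text",
    repertoires=(
        Repertoire("en", grammatical=("participle construction",)),
        Repertoire("de", grammatical=("participle construction",)),
    ),
)
```

Keeping R as a per-language collection mirrors the idea that the same
illocution and theme are shared across languages while the means of
expression differ.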
3. Model of Analysis 
The analysis of text acts is oriented conceptually in
a top-down fashion. In the context of machine processing,
however, we have to rely on the linguistic surface as
input data. TEXAN is a system which builds on other
programs already completed within our project. We use a
syntax parser (SATAN, cf. SALEM 1980), for instance,
which provides a description of constituent structure
and valencies. Furthermore, we use a program for
case-grammatical analysis (PROLID, cf. Harbusch/Rothkegel
1984) which provides a role interpretation on the
description of constituent structure. Input into TEXAN,
then, is a complete structural and case-relational
description of sentences. This determines to a large
extent the strategy of analysis within TEXAN.

[Fig. 1 (tree diagram): text-act structure of the example
article. Node labels, top to bottom:
REGULATE (case); GENERALIZE (general case); FIX (activity);
INDIVIDUALIZE (partner); CONCRETIZE (commerce);
SPECIFY (object,con); DEFINE (text); SPECIFY (object,abs);
DIFFERENTIATE (special case); FIX (condition);
SPECIFY (object,abs); CONCRETIZE (event); LOCALIZE (place);
INDIVIDUALIZE (partner); PERMIT; FIX (activity);
SPECIFY (object,abs); CONCRETIZE (commerce); FIX (condition);
SPECIFY (object,abs); CONCRETIZE (commerce);
DETERMINE (relationship); INDIVIDUALIZE (partner)]

In principle, the task here is to bundle the available
information on syntax, lexis and thematic roles in a form
suitable for the determination of the underlying
illocution. Nevertheless, the concept of text acts is the
basis for the structure of data. We distinguish the
following components (Fig. 2):
[Fig. 2 (diagram): recoverable labels are 'text' and
'text representation'.]
Fig. 2

[Fig. 1, example article, first sentence (English / German):]
The Community                     Die Gemeinschaft
shall not subject                 führt (ein)
imports of products               für die Einfuhr der
defined under Article I           in Artikel I genannten
                                  Erzeugnisse
to new quantitative restrictions, keine neuen mengenmäßigen
                                  Beschränkungen.
The components of the automatic analysis are GRILL
(grammar of illocutions), TEL (text lexicon) and TEF
(sequence of propositions of the text). INT (schema
of interpretation for the structure of states of affairs
and communicative tasks) and HAS (action structure of
the text) are preconditions for formulating the rules of
GRILL. 'text' represents the input structure. This means
that the sentences are syntactically analyzed and ordered
according to a propositional listing. 'text
representation' is the output in the form of Fig. 1.
In the following we will sketch the structure of the
components.
[Fig. 1, example article, second sentence (English / German):]
If
additional demand                 Tritt (auf)
should arise                      auf dem Gemeinschaftsmarkt
on the Community market,          eine zusätzliche Nachfrage,
the Community                     so
will not object                   hat ... nichts einzuwenden
                                  die Gemeinschaft, daß
to these quantitative limits      die vorgenannten Höchstmengen
being increased,                  überschritten werden,
on the understanding that         sofern
the additional quantities         die zusätzlichen Mengen
shall be determined               von den Vertragsparteien
on the basis of mutual agreement  einvernehmlich
between the Parties.              festgesetzt werden.
Fig. 1

INT represents the structure in which knowledge of
states of affairs is embedded into knowledge of
linguistic action. It consists of 4 parts which can be
combined. States of affairs (see Fig. 3):
(a) actions (a (x, (y), (z)))
states of affairs occur as actions/interactions
(a) of/between participants (x1, x2, ...) and refer
to a concrete object (y) or an abstract object (z)
or relate the two (y,z).
[Fig. 3 (diagram): schema of interpretation (INT).
Recoverable labels: norm, purpose, permission, condition;
TIME (action), DETERMINE (procedure),
DETERMINE (relationship), LOCALIZE (place),
RESTRICT (domain); commerce, contact,
willingness for com/cont; INDIVIDUALIZE (partner);
SPECIFY (obj.con), SPECIFY (obj.abs);
DEFINE (by text, place), DEFINE (by text)]
Fig. 3
(b) states of affairs occur as events concerning
abstract objects: b (z)
(c) situation (m,n,o,p, ...)
actions are embedded in a situation described by
parameters of time, location, personal relationship,
domain, procedures, etc.
(d) the verbalization of an action can be seen in the
status of condition, norm, purpose, permission, etc.
Linguistic actions: 
They are interpretations of states of affairs with
respect to communicative tasks and can be described as
predications on propositions. Thus we can add several
types of illocutions to (a)-(d). Examples are:
CONCRETIZE (a (x, (y), (z))) 
FIX (condition (b (z))) 
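
The idea of illocutions as predications on propositions can be sketched
as nested records. This is a minimal illustration under stated
assumptions, not the representation used in TEXAN: the class names, the
`render` helper and the sample fillers ("arise", "additional demand",
"subject", "Community", "imports"), which are borrowed loosely from the
example article, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Action:
    """(a) a (x, (y), (z)): action of a participant x on optional objects."""
    name: str
    participant: str                        # x
    concrete_object: Optional[str] = None   # y
    abstract_object: Optional[str] = None   # z


@dataclass(frozen=True)
class Event:
    """(b) b (z): event concerning an abstract object."""
    name: str
    abstract_object: str                    # z


@dataclass(frozen=True)
class Status:
    """(d) status of a verbalized action: condition, norm, purpose, ..."""
    kind: str
    content: Union[Action, Event]


@dataclass(frozen=True)
class Illocution:
    """Linguistic action: an illocution type predicated on a proposition."""
    itype: str
    proposition: Union[Action, Event, Status]


def render(node) -> str:
    """Print a structure in the bracket notation used in the text."""
    if isinstance(node, Illocution):
        return f"{node.itype} ({render(node.proposition)})"
    if isinstance(node, Status):
        return f"{node.kind} ({render(node.content)})"
    if isinstance(node, Event):
        return f"{node.name} ({node.abstract_object})"
    if isinstance(node, Action):
        args = [node.participant]
        args += [f"({obj})" for obj in (node.concrete_object,
                                        node.abstract_object) if obj]
        return f"{node.name} ({', '.join(args)})"
    raise TypeError(f"unsupported node: {node!r}")


fix_condition = Illocution(
    "FIX", Status("condition", Event("arise", "additional demand")))
concretize = Illocution(
    "CONCRETIZE", Action("subject", "Community", concrete_object="imports"))
```

Here `render(fix_condition)` follows the pattern FIX (condition (b (z)))
and `render(concretize)` the pattern CONCRETIZE (a (x, (y))).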
HAS (Fig. 4) represents the action structure of
'treaties of trade' in terms of text acts. Our example in
Fig. 1 shows a segment of REGULATE (case).
[Fig. 4 (diagram): action structure (HAS) of treaties of
trade. Recoverable labels, top to bottom:
goal (development of trade); situational problem solving;
preconditions; DETERMINE activities, ANTICIPATE
consequences, DELEGATE tasks; APPLY means, APPLY control;
REGULATE (case), two nodes with illegible indices;
GENERALIZE general case, DIFFERENTIATE special case 1,
DIFFERENTIATE special case 2]
Fig. 4
TEL represents the text lexicon. According to the two
tasks of TEXAN, TEL includes two sections of information:
an identification section concerning the text act
structure (TAS), which is described by types of
illocution and roles such as REGULATE (case), SPECIFY
(object), etc., and a selection section consisting of
lists of repertoires which belong to several single
languages (TAE: R (L1, ..., Ln)). As a third part a key (K)
is established which provides the connection between input
data and the TA-information. On the level of simple
illocutions the key is represented by the lemma of the
head of the respective phrase; on the level of complex
illocutions the key is the illocution of a lower level.
An entry of TEL has the following design:
TEL_i: 1. key (lemma or illocution_c)
       2. TAS (I/T)
       3. TAE (R (L1: l,g)
               R (L2: l,g)
               R (Ln: l,g))
It is possible that one key corresponds to several en- 
tries of TEL. This is the case if there are different TAS. 
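
A lexicon of this shape, in which one key may lead to several entries,
can be sketched as a mapping from keys to lists of entries. The sketch
is an assumption-laden illustration, not the TEL implementation: the
names `TelEntry` and `TextLexicon` are invented, and the second entry
for the key "subject" is a made-up example added only to show a key
with two different TAS. The German repertoire is taken from the
transfer example in Section 4.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TelEntry:
    key: str                       # 1. lemma or lower-level illocution
    tas: Tuple[str, str]           # 2. TAS as (I, T)
    # 3. TAE: language -> {"l": lexical means, "g": grammatical means}
    tae: Dict[str, Dict[str, List[str]]] = field(default_factory=dict)


class TextLexicon:
    """Maps a key to all entries that share it."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[TelEntry]] = defaultdict(list)

    def add(self, entry: TelEntry) -> None:
        self._entries[entry.key].append(entry)

    def lookup(self, key: str) -> List[TelEntry]:
        """Return every entry for the key; an empty list if unknown."""
        return list(self._entries.get(key, []))


tel = TextLexicon()
tel.add(TelEntry("inform", ("CONCRETIZE", "contact")))
tel.add(TelEntry(
    "subject", ("CONCRETIZE", "commerce"),
    {"d": {"l": ["einführen", "anwenden"], "g": ["finite verb"]}},
))
# Hypothetical second reading of the same key, for illustration only.
tel.add(TelEntry("subject", ("SPECIFY", "object")))
```

A lookup that returns a list leaves the choice between competing TAS to
a later step, for instance to the grammar rules that consume the entry.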
GRILL provides rules which represent the structure of
INT and HAS and which transform them into procedures.
GRILL (grammar of illocutions) has such a form that it
can be processed by a context-free grammar parser. A
parser has been developed according to the structure
of the programming language COMSKEE. Elements of the
TEF-component (listing of propositions of the text)
are integrated as parameters (F) into the rules.
a) rule (R10) for terminals (lexicon rule):
   I_e (T_i)/(F_i) := lemma_i, (T_i)/(F_i)
   e.g. CONCRETIZE (contact)/(F1) := "inform" (cont)/(F1)
b) rule for non-terminals (R1-R9), general form:
   I_c (T_j)/(F_{i-m}) :=
      I_i (T_f)/(F_i) + < [ I_g (T_h), R ]_n /(F_{o-p}) >
   [ ]_n  recursion
   <>     optional
   R      surface conditions
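
The two rule types can be illustrated with a small bottom-up sketch.
It is written in Python rather than COMSKEE and does not reproduce the
actual GRILL parser. The lexicon rule is the "inform" example from the
text; the complex rule FIX (activity) := CONCRETIZE (contact) +
< [SPECIFY (object)]_n > is invented for the example, as are all class
and function names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Label = Tuple[str, str]   # (illocution type I, thematic specification T)


@dataclass(frozen=True)
class Node:
    label: Label
    first: int   # first proposition index (F) covered
    last: int    # last proposition index (F) covered


def apply_lexicon_rule(lexicon: Dict[Tuple[str, str], Label],
                       lemma: str, theme: str, f: int) -> Optional[Node]:
    """Rule type a): assign an illocution to a lemma at proposition F."""
    label = lexicon.get((lemma, theme))
    return None if label is None else Node(label, f, f)


@dataclass(frozen=True)
class ComplexRule:
    lhs: Label                              # I_c (T_j)
    head: Label                             # obligatory first constituent
    tail: FrozenSet[Label] = frozenset()    # optional, repeatable part
    surface_ok: Optional[Callable[[Node], bool]] = None   # R


def apply_complex_rule(rule: ComplexRule,
                       nodes: List[Node]) -> Optional[Node]:
    """Rule type b): head plus any number of admissible tail nodes."""
    if not nodes or nodes[0].label != rule.head:
        return None
    last = nodes[0].last
    for node in nodes[1:]:
        if node.label not in rule.tail:
            break
        if rule.surface_ok is not None and not rule.surface_ok(node):
            break
        last = node.last
    return Node(rule.lhs, nodes[0].first, last)


LEXICON = {("inform", "cont"): ("CONCRETIZE", "contact")}
head = apply_lexicon_rule(LEXICON, "inform", "cont", 1)

# Hypothetical complex rule, for illustration only.
rule = ComplexRule(lhs=("FIX", "activity"),
                   head=("CONCRETIZE", "contact"),
                   tail=frozenset({("SPECIFY", "object")}))
result = apply_complex_rule(rule, [head, Node(("SPECIFY", "object"), 2, 2)])
```

The resulting node carries the proposition span it covers, which
corresponds to the F parameter attached to each rule side.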
4. Transfer 
On the basis of identified illocutions with respect to
L1 we have access to the lexical and grammatical
information of R with regard to L2, L3, etc. This
information is offered by TEL. We apply a further
assignment rule of the following type (e=English,
d=German, l=lexical inf., g=syntactic inf.):
   for 'lemma'(Lx), I_i (T_j) := R(l_i, g_k)(Ly)
   for I_c (T_j)(Lx)          := R(l_i, g_k)(Ly)
Examples: 
   for 'subject'(e), CONCRETIZE(commerce) :=
      R(l: 'einführen', 'anwenden'
        g: finite verb)(d)
   for GENERALIZE (case) (e) :=
      R(g: main clause, active,
           present tense)(d)
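
The two example rules can be read as entries in a lookup table. The
following sketch is only an illustration of that reading, not the
transfer component of TEXAN: the table layout, the function name
`transfer` and the fallback from a lemma-specific rule to a rule for
the illocution alone are assumptions.

```python
from typing import Dict, List, Optional, Tuple

Label = Tuple[str, str]                    # (I, T)
RuleKey = Tuple[str, Optional[str], Label]  # (source language, lemma, label)
Repertoire = Dict[str, List[str]]          # {"l": [...], "g": [...]}

# The two example rules from the text, e = English, d = German.
TRANSFER_RULES: Dict[RuleKey, Dict[str, Repertoire]] = {
    ("e", "subject", ("CONCRETIZE", "commerce")): {
        "d": {"l": ["einführen", "anwenden"], "g": ["finite verb"]},
    },
    ("e", None, ("GENERALIZE", "case")): {
        "d": {"l": [], "g": ["main clause", "active", "present tense"]},
    },
}


def transfer(source: str, target: str, label: Label,
             lemma: Optional[str] = None) -> Optional[Repertoire]:
    """Return R(l, g) for the target language, or None if no rule applies.

    A lemma-specific rule is tried first, then a rule for the
    illocution alone (an assumed ordering).
    """
    for key in ((source, lemma, label), (source, None, label)):
        rule = TRANSFER_RULES.get(key)
        if rule is not None and target in rule:
            return rule[target]
    return None


verbs = transfer("e", "d", ("CONCRETIZE", "commerce"), lemma="subject")
clause = transfer("e", "d", ("GENERALIZE", "case"))
```

The result is a repertoire rather than a finished translation, so a
human translator or a generation component still has to choose among
the offered means.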
The transfer part is to be seen as a kind of "helper"
for translation purposes. It may be used by human
translators as well as by systems generating the
complete target text.

References 

Allen, J.F./Perrault, C.R., 1980. Analyzing intention
in utterances. Art. Intell. Vol. 15, 3, 143-178.

Cohen, P.R., 1978. On knowing what to say: Planning
speech acts. Ph.D. Thesis, Dep. of Computer Science,
Univ. of Toronto.

Gazdar, G., 1981. Speech act assignment. In: Joshi, A.K./
Webber, B.L./Sag, I.A., Elements of discourse
understanding, 64-83. Cambridge, Univ. Press.

Grosz, B.J., 1982. Discourse Analysis. In: Kittredge,
R./Lehrberger, J. (ed.), Sublanguage, 138-174.
Berlin, de Gruyter.

Harbusch, K./Rothkegel, A., 1984. PROLID. Ein Programm
zur Rollenidentifikation. Ling. Arbeiten des
SFB 100, N.F. 8, Univ. Saarbrücken.

Hönig, H./Kußmaul, P., 1982. Strategie der Übersetzung.
Tübingen, Narr.

Leech, G./Svartvik, J., 1975. A communicative grammar 
of English. London, Longman. 

Reiß, K./Vermeer, H.J., 1984. Grundlegung einer
allgemeinen Translationstheorie. Tübingen, Niemeyer.

Rothkegel, A., 1984. Sprachhandlungstypen in
interaktionsregelnden Texten. In: Rosengren, I. (Hg.),
Sprache und Pragmatik, Lunder Symposium 1984, 255-278.
Stockholm, Almqvist & Wiksell Int.

Rothkegel, A., 1985. Text Acts in Machine Translation.
L.A.U.T. paper no. 133, Universität Trier.

SALEM. Sonderforschungsbereich 100 (Hg.), 1980. Ein
Verfahren zur automatischen Lemmatisierung deutscher
Texte. Tübingen, Niemeyer.

Thiel, G., 1980. Vergleichende Textanalyse als Basis
für die Entwicklung einer Übersetzungsmethodik. In:
Wilss, W. (Hg.), Semiotik und Übersetzen, 87-98.
Tübingen, Narr.
