Parameterization of the Interlingua in Machine Translation 
Bonnie Dorr* 
Institute for Advanced Computer Studies 
A.V. Williams Building, Room 3157 
University of Maryland 
College Park, MD 20742 
e-mail: bonnie@umiacs.umd.edu 
Abstract 
The task of designing an interlingual machine transla- 
tion system is difficult, first because the designer must 
have n knowledge of the principles underlying cross~ 
linguistic distinctions for the languages under consider- 
ation, and second because the designer must then be 
able to incorporate this knowledge effectively into the 
system. This paper provides a catalog of several types 
of distinctions among Spanish, English, and German, 
and describes a parametric approach that characterizes 
these distinctions, both at the syntactic level and at the 
lexical-semantic level. The approach described here is 
implemented in a system called UNITRAN, a machine 
translation system that translates English, Spanish, and 
German bidirectionally. 
1 Introduction 
What makes the task of designing an interlin- 
gual machine translation system difficult is the re- 
quirement that the translator process many types 
of language-specific phenomena while still main- 
taining language-independent information about the 
source and target languages. Given that these two 
types of knowledge (language.specific and language- 
independent) are required to fulfill the translation 
task, one approach to designing a machine trans- 
lation system is to provide a common language. 
independent representation tiiat acts as a pivot be- 
tween the source and target languages, and to pro- 
vide a parameterized mapping between this form 
and the input and output of each language. This 
is the approach taken in UNITRAN, a machine 
translation system that translates English, Spanish, 
*This paper describes research done at the Uni- 
versity of Maryland Institute for Advanced Computer 
Studies and at the MIT Artificial Intelligence Labo~ 
ratory. Useful guidance and commentary during the 
research and preparation of this document were pro- 
vided by Bob Berwick, Gary Coen, Bruce Dawson, Klau- 
dis Dussa-Zieger, Terry Gaasterland, Ken Hale, Mike 
Kashket, Jorge Lobo, Panla Merlo, James Pustejovsky, 
Jeff Siskind, Clare Vess, Amy Weinberg, and Patrick 
Winston. 
'7-r 
Figure 1: Overall Design of the UN|TRAN System 
and German bidirectionally. The pivot form that is 
used in this system is a lexical conceptual structure 
(henceforth, LCS) (see Jackendoff(1983, 1990), Hale 
& Laughren (1983), Hale & Keyser (1986a, 1986b), 
and Levin & Rappaport (1986)), which is a form that 
underlies the source- and target-language sentences. 
The pivot approach to translation is called in- 
tcvliugual because it relies on an underlying form 
derived from universal principles that hold across 
all languages. Within this framework, distinctions 
between languages are accounted for by settings of 
parameters associated with the universal principles. 
For example, there is a universal principle that re- 
quires there to be a conceptual subject for each pred- 
icate of a sentence. Whether or not the couceptual 
subject is syntactically realized is determined by a 
parameter associated with this principle: the null 
subject parameter. This parameter is set to yes for 
Spanish (also, Italian, Hebrew, etc.) but no for En- 
glish and German (also French, Warlpiri, etc.). The 
setting of the null subject parameter accounts for 
the possibility of a missing subject in Spanish and 
the incorrectness of a missing subject in English and 
German (except for the imperative form). 
This paper argues that, not only should the syn- 
tactic component of a machine translation system be 
parameterized, but other components of a machine 
translation system would also benefit from the pa- 
rameterization approach. In particular, the lexical- 
semantic component must be constructed in such a 
way as to allow principles of the lexicon to be pa- 
rameterized. Thus, UNITRAN uses two levels of 
processing, syntactic and lexical-semantie, both of 
which operate on the basis of language-independent 
ACRES DE COLING-92, NANTES, 23-28 AOt~r 1992 6 2 4 PROC. OF COLING-92, NANTES, AUO. 23-28, 1992 
knowledge that is parameterized to encode lauguage~ 
specific information (see figure 1). 
Within the syntactic level, the language- 
independent and language-speeilic information are 
supplied, respectively, by the principles and pa- 
rmnetets of government-binding theory (henceforth, 
GB) (see Chomaky (1981, 1982)). Within the 
lexical-semantie level, the language-independent and 
language-specific information are supplied by a set 
of general LCS mappings and the associated pa- 
rameters for each language, respectively. Tim inter- 
face between the syntactic and semantic levels allows 
the source-language structure to be mapped system- 
atically to the conceptual form, and it allows the 
targetdanguage structure to be realized systemati- 
cally from lexical items derived from the conceptual 
form. This work represents a shift away from coln- 
plex, language-specific syutactic translation without 
entirely abandoning syntax. Furthermore, the work 
moves toward a model that employs a well-defined 
lexieal conceptual representation without requiring 
a "deep" semantic conceptualizatiou. 
Consider the following example: 
(1) (i) I stabbed Jnhn 
(ii) Yo le di pufialadas a Juan 
'I gave knife-wounds to John' 
This example illustrates a type of distiuctiou (hence- 
forth called divergence as presented in Dorr (1990a)) 
that arises in machine translation: the source- 
language predicate, stab, is ,napped to more than 
one target-language word, dar puiialadas a. This 
divergence type is lezical in that there is a word 
selection variation between the source language and 
the target language. Such divergeuees are accounted 
for by lexical-semantie parameterization, as we will 
see in section 3. 
The following section of this paper will provide a 
catalog of syntactic divergences between the source 
and target languages. The set of parameters that 
are used to account for these divergences will be de- 
scribed. In the third section, we will exanfine the 
divergences that occur at tire lexical-semantie level, 
and we will see how the parametric approach ac- 
counts for these divergences as well. Finally, we will 
turu to the evaluation and coverage of tile system. 
2 Toward a Catalog of Syntactic 
Divergences 
Figure 2 shows a diagram of the UNITItAN syntac- 
tic processing component. The parser of this compo- 
nent provides a source-language syntactic structure 
to the lexical-semantic processor, and, after lexical- 
semantic processing is completed, the generator of 
this component provides a target-language syntac- 
tic structure. Both the parser and generator of this 
component have access to the syntactic principles 
of GB theory. These principles, which act as con- 
straints (i.e., filters) on the syntactic structures pro- 
Figure 2: Design of the Syntactic Processing Com- 
ponent 
duced hy the parser and the generator, operate on 
tim basis of parameter settings that supply certain 
lauguage-specific iulbrmation; this is where syntac- 
tic divergences are factored out from the lexical- 
semantic representation. 
The Gll principles and parameters are organized 
into modules whtme constraints are applied in the 
following order: (1) X, (2) Boundiug, (3) Case, (4) 
'iYace, (5) Ilinding, and (6) 0. A detailed descriw 
tiou of these modules is provided in Dorr (1987). 
We will look t, riefiy at a number of these, /hens- 
ing on how syntactic divergences are accounted for 
by this approach. Figure 3 smmnarizes the syntac- 
tic divergences that are revealed by the parametric 
variations presented here.l 
2.1 Principles and Parameters of the X 
Modal(," 
The X" constraiut module of the syntactic component 
provides the phrase-structure representation of sen- 
tenees. In particular, the fundamental principle of 
the X module is that each phrase of a sentence has 
a mazimal projection, X-MAX, lor a head of cate- 
gory X (see tigure 4). ~ In addition to the head X, a 
phrasal projection potentially contaius satellites c~1, 
a~, ill, f12, 71, and 72, where cq attd ~2 are any nuln- 
ber of maximally adjoined adjuncts positioned ac- 
curding to the adjuaclion parameter, fll aud f12 are 
arguments (subjects aud objects) ordered according 
to the constituent order parameter, and 71 and 72 
are any number of minimally adjoined adjuncts p~ 
sitioued according to the adjunctiou parameter. 3 
tThe syntactic divergences are enumerated with r~ 
spect to the relevant pasametera and modules of the 
syntactic component. The figure illustrates the effect of 
syntactic parameter settings on tile constituent structure 
for each language. (In this figure, E stands for English, 
G for German, S for Spanish, and I for Icelandic.) 
aThe possibilities for the category X are: (V)erb, 
(N)oua, (A)djective, (P)reptmition, (C)omplementizer, 
and (1)affection. 't'ite Complementizer corresponds to 
relative pronouns such as that in the matt that I saw. 
The IntlectionM category corresponds to modals such as 
would in 1 would eat cttke. 
3This is a revised version of the "X-Theory presented 
in Chomsky (1981). Tire adjunction par~ueter will not 
be discussed here, but see Dorr (1987) for details. 
ACrEs DE COTING-92, NANTES, 23-28 ^O~" 1992 6 2 S Paoc. OF COL1NG-92, NANYV.S, AUO. 23-28, 1992 
Syntactic Divergence Examples Parameter GB Module 
E, S: V preccd¢~ object constituent X G: 
V followe object order 
E: P stranding allowed proper Gov~t 
S, G: No P stranding allowed governors 
E, G: Fronted question word bounding Bounding beyond |ingle sentence nodes 
level not allowed S: Fronted quenion word 
beyond single sentence 
level allowed E, G: 
P not ¢~quired before type of verbal object anaoci- government 
ated with elitic S: p required before ver- 
bal object a~aociated with elitic 
E, G: Subject required in ms- null nub- trlx claule ject 
S: Subject not required in matrix clau~ 
E, S, G: Anaphor (e.g., him. governing Binding self) must have an- category 
tecedent inside near- eta dominating clauBe 
Anuphor (e,g. , siq) I: may have antecedent 
outside nearest domi- nating clause 
E: No empty pleonastics NDP 0 allowed 
S: Empty pleonaatica al- lowed 
G: Empty pleonastics in 
embedded claunes only 
Figure 3: Summary of Syntactic Divergences 
X-MAX 
./!\. ,A\,I 
Figure 4: Phrase-Structure Representation 
Given this general i phrase-structure representa- 
tion, we can now "fit" this template onto the phrase 
structure of each language by providing the appro- 
priate settings for the parameters of the X module. 
For example, the constituent order parameter char- 
acterizes the word order distinctions among English, 
Spanish and German. Unlike English and Spanish, 
German is assumed to be a subject-object-verb lan- 
guage that adheres to tim verb-second requirement 
in matrix clauses (see Safir (1985)). Thus, for the 
sentence 1 have seen him, we have the following con- 
trusting argument structures: 
(2) (i) I have seen him 
(ii) Yo he visto a dl 
'I have seen (to him)' 
(iii) Ich habe ihn gesehen 
'I have him seen' 
The X module builds the phrase-structure from 
the general scheme of figure 4 and the parameter 
settings described above. The principles and param- 
eters of the remaining modules are then applied as 
constraints to the phrase-structure representation. 
We will now examine each of the remaining modules 
in turn. 
2.2 Principles and Parameters of the 
Government Module 
Government Theory is a central notion to the Case 
and Trace modules. A familiar example of the gov- 
ernment principle in English is that a verb governs 
its object. 4 We will examine the effect of this module 
in sections 2.4 and 2.5. 
2.3 Principles and Parameters of the 
Bounding Module 
The Bounding module is concerned with the distance 
between pairs of co-referring elements (e.g., trace- 
antecedent pairs). The fundamental principle of the 
bounding module is that the distance between co- 
referring elements is not allowed to be more than one 
bounding node apart, where the choice of bounding 
nodes is allowed to vary across languages. 
The bounding nodes parameter setting accounts 
for a syntactic divergence between Spanish and En- 
glish (and German): 
(3) (i)* Whol did you wonder whether ti went to 
school? ~ 
(ii) LQui6n, crees tfi queti rue a la esenela? 
The reason (3)0) is ruled out is that the word who 
has moved beyond two bounding nodes. It turns 
out that the corresponding Spanish sentence (3)(ii) 
is well-formed since the choice of bounding nodes is 
different and only one bounding node is crossed. 
2,4 Principles and Paranaeters of the Case 
Module 
The Case module is in charge of ensuring that all 
noun phrases are properly assigned abstract case 
(e.g., nominative, objective, etc.). The Case Fil- 
ter rules out any sentence that contains a non-case- 
marked noun phrase. 
The notion of government is relevant to case as- 
signment since an element assigns case only if it is 
a governing case-assigner. Tile setting of the type 
of government parameter for English, Spanish, and 
German characterizes the following divergences: 
(4) (i) I saw Guile 
• I saw to Guile 
(ii)* Lo vi Guille 
Lo vi a Guile s 
(iii) lch sah Guile 
• lch sah zn Guille 
4See Dorr (1987) for a more formal definition of the 
government principle. 
sit who is spoken emphatically, this sentence can al- 
most be understood as an echo question corresponding to 
the statement I wondered whether John went to school. 
AcrEs DE COLING-92, NANteS, 23-28 ^O~" 1992 6 2 6 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 
2.5 Principles and Parameters of the Trace 
Module 
After case has been assigned, the Trace module 
applies the empty category principle (ECP) which 
checks for proper government of empty elements. 
The ECP is parameterized by means of the null sub- 
ject parameter. As discussed in section 1, the null 
subject parameter accounts for the null subject dis- 
tinction between Spanish, on the one hand, and F,n- 
glish and German on the other: 
(5) (i) Vo vi ellibro 
Vi el libro 
(ii) I saw the book 
* Saw the book 
(iii) Ich salt das Buch 
* Sah das Buch 
Art additional parameter that is relevant to the 
Trace module is the proper governors parame- 
ter. The choice of proper governor accounts for 
preposition-stranding distinctions in the three lan- 
guages: 
(t;) (i) \[mMxx What store\]i did John go to ti? r 
(fi)* \[N.IdaX Cu~I tienda\]i rue Juan a ti? 
(iii)* \[mMAX Welchem Geseha.ft\]i geht Johann 
zu ti? 
2.6 Principles and Parameters of the 
Binding Module 
The Binding module is tire final module applied be- 
fore thematic roles are assigned. This module is con- 
cerned with the coreference relations among noun 
phrases, and it is dependent on the governing cat- 
egory parameter, which specifies that a governing 
category for a syntactic constituent is (roughly) the 
nearest dominating clause that has a subject. This 
parameter happens to have the same setting for En- 
glish, Spanish, and German, but see Dorr (1987) for 
a description of other settings of this parameter (e.g., 
for Icelandic) based on work by Wexler & Manzini 
(1986), 
2.7 Principles and Parameters of the 0 
Module 
The 0 module provides the interface between the 
syntactic component and the lexical-aemantic com- 
ponent. In particular, the assignment of themalic 
roles (henceforth 0-roles) after parsing leads into the 
construction of the interlingual form. 
The fundamental principle of the 0 module is the 
O-Criterion which states that a lexical head must 
eAs noted in Jaeggli (1981), animate objects (e.g., 
Guille) are a~ociated with a clitic pronoun (e.9., Io) 
only in certain dialects such as that of the River Plate 
area of South America. 
7The t~ constituent is a trace that corresponds to the 
noun phrase that has been moved to the front of the 
sentence. 
2~ L./ __\ 
Figure ,5: Design of the Lexieal-Semantic Compo~ 
Ilent 
assign 0-roles in u unique one-to-one correspondence 
with the argument positions specified in thc lexical 
entry for the head. One of the parameters ax~oci- 
ated with the 0 \[nodule is the unto-drop paradigm 
(NDP) parameter (based on work by Safir (1985)). 
This parameter accounts for the distinction between 
English, on the one hand, and Spanish and German, 
on the other hand, with respect to the subject of an 
embedded clause: 
(7) (i) * 1 know that was dancing 
(ii) Yo sd que hahfa un halle 
'1 know that (there) was a dance' 
(iii) Ich weill, daft getanzt wurde 
'I know that (there) wa~ dancing' 
Ones all 0-roles are assigned, the lexical-semantic 
component of the translator composes the interlin- 
gual representation for the source and target lan- 
guage. The next section will describe the lexical- 
semantic component, and it will show how this com~ 
l)onent accounts for a number of divergences outside 
of the reahn of syntax. 
3 Toward a Catalog of 
LexicaI-Semantic Divergences 
Figure 5 shows a diagram of the UNITRAN lexical- 
semantic processing component. A detailed descrip- 
tion of the lexical conccplual structure (LCS) which 
serves as the interlingua is not given here, but see 
Dorr (1990b) for further discussion, s 
81n general, the LCS representation follows the for- 
mat proposed by Jackeudoff (1983, 1990) which views 
semautic representation as a subset of conceptual struc- ture. Jackeudoff's approach includes such notious as 
Event and State, which are specialized into primitives such as (30, STAY, BE~ GO-EXT, aud ORIENT. As an 
example of how the primitive GO is used to represent sentence semantics, consider the following sentence: 
(s) (i) The ball rolled toward Beth. 
(ii) \]Event GO (\[Thing BALI.l, 
\[Pith TO (lPolitioa AT 
(\[Thlas BALI,l, \[Thin~ BI~TII\])I)\])I 
This representation illustrate~ one dimension (i.e., the 
spatial dimension) of J~vckendoff's representation. An- 
other dimension is the causal dimension, which includes 
ACTES DE COLING-92. NANTES, 23-28 AOt'n' 1992 6 2 7 l'e, OC. OV COLING-92, NANTES, AUG. 23-28, 1992 
Diuergence Examples \[ (Parameter) 
E: enter: John entered Structural the house (*) 
S: entrar: Juan entr6 en 
la cMa G: 
(hinein)treten: J~ harm trat ina Haus 
hinein E: 
like: I like Mary Thematic S: 
gustar: Me gusts (:INT, :EXT) Maria 
G: 9efallen: Marie gefKIIt mir 
E: be: I am hungry Categorial S: 
tener: Yo tengo (:CAT) hambre 
G; hubert: lch habe Hunger 
E: like: 1 like eating Demotional S: 
gustar: Me gusta (:DEMOTE) comer 
G: gem: Ich ~ gem 
E: usually: John usu- Promotional ally goea home (:PROMOTE) 
S: soler: Juan auele ir a cMa 
G: gewJhnlich: Johann geht gewShnlich 
nach Hauae E: 
stab: I stabbed John Conflatioaal S: 
dar: Yo le di (:CONFLATED) pufialadaa a Juan 
G: erJtechen: lch er- atach Johann 
Linking rule 
l\]inking rule 
CS'R 
Linking rule 
Linking rule 
Linking rule 
Figure 6: Summaryof Lexical-Semantie Divergences 
What is important to recognize about tiffs pro- 
ceasing component is that, just as the syntactic 
component relies on parameterization to account 
for source-to-target divergences, so does the lexical- 
semantic component. The parameterization of this 
component is specified by means of language-specific 
lexical override markers associated with the LCS 
mapping betweeu the syntactic structure and the in- 
terlingua. 
We will look briefly at the principles and parame- 
ters of the lexical-semantic component, focusing on 
how a number of divergences are accounted for by 
this approach. Figure 6 summarizes the lexical- 
semantic divergences that are revealed by the para- 
metric variations presented here. 9 
the primitives CAUSE and LET. A third dimension is 
introduced through the notion of field. This dimension 
extends the semantic coverage of spatially oriented prim- 
itives to other domains such as Posssssional, Temporal, 
Identificational, Circumstantial, and Existeutial. 
9The divergences are enumerated with respect to 
the relevant principles and parameters of the lexical- 
semantic component. In contrast to the summary of syn- 
tactic divergences in figure 3, which enumerates the effect 
of syntactic pixameter settings on constituent structure, 
the list of divergences presented here is specified in terms 
of the effect of LCS parameter settings on the realization 
of specific lexical items. 
X-MAX X ~ 
o I X-MAX a 2 /\ 
i. Syntactic specifier (/gh C B1 0 D2) 
O l,ogical subject (filL) 
2. Syntactic complements ()ill O~2 - P~) O Logical arguments (B~U B~ - fl~.) 
S. Syntactic adjuncts (al O a 2 o "rl O "/2) 
O Logical Modifiers (a~ U a~ U ~i u 7~) 
4. Syn~tic head (X) O Logical head (X I) 
Figure 7: LCS Linking Rule Between the Syntactic 
Structure and the Interlingua 
yutactic 'atcgory 
~VZNT 
STATE 
THING 
PROPERTY 
PATH 
POSITION 
LOCATION 
TIME 
MANNER 
ADV 
ADV 
ADV 
INTENSIFIER ADV 
PURPOSE ADV 
Figure 8: C,S~ Correspondence Between LCS Type 
and Syntactic Category 
3.1 Principles and Parameters of the 
Lexical-Semantic Component 
The algorithm for mapping between the syntactic 
structure and the interlingua relies on the output 
of #-role assigmnent (in the analysis direction) and 
feeds into 0-role assignment (in the synthesis direc- 
tion). Tile 0-roles represent positions in the LCS 
representations of lexical entries associated with the 
input words. Thus, the construction of the interlin- 
gun is essentially a unification process that is guided 
by the pointers left behind by 0-role assignment. 
The mapping, or linking rule between the syn- 
tactic positions and the positions of the LCS rep- 
resentation is shown in figure 7. In terms of 0-role 
assignment, the phrasal head X assigns #-roles cor- 
responding to positions in the LCS associated with 
X j. For example, the syntactic subject Bk is assigned 
the logical subject position fl~ in the LCS. Once all 
these roles have been assigned, the interlingual rep- 
resentation is composed simply by reenrsively filling 
the arguments of tile predicate into their assigned 
LCS positions. 
ACRES DE COLING-92, NANTES, 23-28 AOtYr 1992 6 2 8 PROC. OF COLING-92, NANTES, AUO. 23-28, 1992 
Ill addition to tile LCS linking rule, there is 
another general rule associated with tile lexical= 
semantic component: the canonical syntaclie repre- 
schist\]on (CSW,.) function. This fmtction associates 
an LCS type (e.g., TIIIIII3) with a syntactic category (e.g., 
N-MAX) (see figure 8). 
The LCS Linking rule and the CS~ function 
are the two fundmnental principles of the lexical= 
semantic component. In order to account for lexical- 
semantic divergenc~, these principles nmst be pa- 
rameter\]zeal. In general, translation divergences oc- 
cur when there is all exception to one (or both) 
of these principles in one language, but not in the 
other. Titus, the lexical entries have bccn con- 
structed to support parametric variation that ac 
counts for such exceptions. The parameters are used 
in lexical entries as overrides for tile LCS linking rule 
and (JS~ function. We will now examine examples 
of how each parameter is used. 
3.1.1 %' Parameter 
The '*' parameter refers to an LG'S position that is 
syntactically realizable in the surfitce sentence. This 
parameter accounts for sSructural divergence: 
(9) (i) John entered the house 
(ii) Juan entr6 en la casa 
'John entered (into) the house' 
Here, the Spanish sentence diverges structurally 
from the English sentence since the noun phrase (the 
house) is realized as a prepositional phrase (en la 
cuss). In order to account for this divergence, the 
lexicon uses tile * marker ill the LCS representation 
associated with the lexical entries for enter and en- 
trnr. This marker specifies tim pbrasal level at whictl 
an argument will be projected: in tile Spanish lexical 
entry, the marker is associated with all LCS position 
that is realized at a syntactically higher phrasal level 
than that of tile English lexical entry. 
3.1.2 :INT and :EXT Parameters 
The :INT and :EXT paraineters allow tile I,CS 
linking rule to be overridden by associating a logical 
subject with a syntactic complement aud a logical 
argument with a syntactic subject. A t)o~iblc effect 
of using these parameter settings is that there is a 
subject-object reversal during translation. Such a 
reversal is called a thematic divergcuee: 
(10) (i) I like Mary 
(ii) Me gnats Maria 
'Mary pleases me' 
tlere, the subject of the source-language sentence, 
I, is translated into all object position, and the ob- 
ject of the source-language sentence Maria is trans- 
lated into a subject position. Ill order to accouut for 
this divergence, the lexicon uses the :INT and :EXT 
markers in the LCS representation associated with 
the lexieal entries for gustar. The English lexieal 
entry does not contain thesc markers since tile LCS 
linking rule does not need to be overridden in this 
case. 
3.1.3 :CAT Parameter 
The :(~AT lllarker provides a syntactic category 
for all LCS argument. Recall that the CS'K function 
maps all LCS type to a syntactic category (see fig- 
ure 8). When this mapping is to bc overridden by 
a lexicaI entry, the language-specific marker :CAT is 
used, 
This parameter accounts for categorial divergence: 
(11) (i) 1 am hungry 
(ii) ich hahe Hunger 
'l have hsnger ~ 
llere, not only are tl~e predicates be and hubert lexi- 
cally distinct, but the arguments of these two pred- 
icates are categorially divergent: ill English, the ur- 
gmnent is all adjectival phrase, and, ill German, the 
argument is a noun phrmse. ¢~'bc :CAT marker is 
used in the Gernmn definition to force the PROP- 
EWFY al'glnln~nt tO be realized as a norm rather than 
an adjective. Thus, the (2S~ function is overridden 
daring realization of tile word Hunger in this exam- 
pie. 
3.1.4 :DEMOTE and :PROMOTE 
~lar~tln(~t \[Jr s 
The :I)EMOTE and :PROMOTE markers, like 
the :INT and :EXT markers, allow the LCS linking 
rule to be overridden by iL~sociating a logical head 
with a syntactic adjunct or complement. These lla - 
t'ameters account, respectively, for demoiioual diver- 
gence: 
(12) (i) 1 like to eat 
(ii) lch ease gem 
'I eat likingly' 
and promotional divergence: 
(13) (i) John usually goes home 
(ii) Juan sselc ir a I:zLsa 
~Johlt teltds to go home' 
in the first case, thc English main verb like cor- 
responds to tile adjunct geru in German, and the 
embedded verb eat corresponds to the main verb 
essen in German. Ill the second case, tile English 
adjunct usually corresponds to the main verb soler 
in Spanish. 'Fhese "head switching" divergences are 
acconnnodated analogously: the :\])EMOTE marker 
is used in the lexical entry tot ger~t and the :PRO- 
MO'l'l,; ~.vker is used in the lexical entry for soler. 
3.1.5 :C.ONFLATED Parameter 
The sixth LCS parameter is tile :GONI"LATED 
marker. This marker is used tbr indicating that 
a particular argument need not be realized in tile 
surhtcc representation. This parameter accounts for 
couflational divergence aa in the sentence I stabbed 
John (see (1) from section 1). In this example, the 
AcrEs DE COL1NG=92, NANTES, 23-28 A¢)t~l 1992 6 2 9 I)ROC. OV COLING-92, NANTES, AUG. 23-28, 1992 
Ulass o? Verb Position 
~nge o o.,tlo~"-- 
I~irec ted Motion 
"Mo'tion with Manner Exchange 
Phy$icM State 
Ch'anse ot Phy*ical 
State /brient atlon 
gxittence 
Circumstance 
Rouse 
Intention 
Perteption and Communication 
Menial Proceu 
Coat Load/Spray 
Contact/Elect 
Lextcal Prsm|t|~es 9TAY.TEMP, STAY-LOC, BF.TgMP, 
DE.LOC ~Q-LOC, 
GO-TEMP 
GO-LOC, Go.PaSS 
GO.LOC CAUSE*EXCHANGE 
EE-IDENT, STAY-IDENT 
GO-IDENT 
O~IENT-LOC .... EE-\],~XIST, 
GO.EXIST, STAY-EXIST 
BE.CIRC, GO-CIRC, STAY-ClUe 
-G"c~- EXT- IDENT, OO-EX~,-- GO-EXT-LOC 
ORIENT-CIRC, ORIENT-TEMP 
B~POS9, STAY-POSS 
-- 7~E:IDENT 
HEAP~-P~RC, SEE-PERC 
B~- PEn.C, GO.PERC 
ORIENT-IDENT 
GO-LOC 
Figure 9: Coverage of LexicaI-Semantic Primitives 
argument that is incorporated in the English sen- 
tence is the \[IIIFE-I/0tlND argument since the verb 
stab does not realize this argument; by contrast, the 
Spanish construction dar pn~ialadas a explicitly re- 
alizes this argument as the word pnCLaladas. Thns, 
the :CONFLATED marker is associated with the 
I~IiIFE-WOUI~ argument in the case of stab, but not 
in the ease of dar. 
4 Evaluation and Coverage 
One of the main criteria used for evaluation of the 
parameterization framework described herc is the 
ease with which lexieal entries may be automatically 
acquired from on-line resources. While testing the 
framework against this metric, a number of results 
have been obtained, including the discovery of a fun- 
damental relationship between the lexical-semantie 
primitives and aspectnal information. This relation- 
stlip is crucial for demonstrating the snceess of the 
parameterization approach with respect to lexical 
acquisition. Details about the lexical acquisition 
model and results are presented in Dart (1992). 
We have already examined the syntactic and 
lexieal-semantic coverage of the system (see figures 3 
and 6 above). The linguistic coverage of the lexicon 
is summarized in figure 9. 
5 Conclusion 
The translation model described here is built on 
the basis of a parametric approach; thus, it is easy 
to change from one language to another (by set- 
ting syntactic and lexical switches for each language) 
without having to write a whole new processor for 
each language. This is an advance over other ma- 
chine translation systems that require at least one 
language-8pecific processing module for each source- 
language/targetolanguage pair. 
The approach is interlingual: an underlying 
language-independent form of the source language is 
derived, and any of the three target languages, Span- 
ish, English, or German, can be produced from this 
form. Perhaps the most important advance of UNI- 
TRAN is the mapping between the lexical-semantie 
level and the syntactic level. In particular, tile 8ys~ 
tern has been shown to select and realize the appro- 
priate target-language words, despite the potential 
for syntactic and lexical divergences. The key to be- 
ing able to provide a systematic mapping between 
languages is modularity: because the system has 
been partitioned into two different processing levels, 
there is a deeoupling of the syntactic and lexicalo 
semantic decisions that are made during the trans- 
lation process. Titus, syntactic and LCS parameter 
settings may be specified for each language without 
hindering the processing that produces, and gener- 
ates from, the interlingual form. 

References 

Chomeky, Noam A. (1981) Lectures an Government and Bind- 
ing, Forts Publieations~ Dordrecht. 

Choranky, Noam A. (1982) ~Some Concepts and Consequences of the Theory of Government and Binding," MIT PresL 

Dorr, Bonnie J. (1987) "UNITRAN: A Principle-Ba~d App~ach to Machine Translation," AI Technical Report 1000, MMter 
of Science thesis, Department of Electrical Engineering and Computer Science, Mauachusetts Institute of Technology. 

Dorr, Bonnie J. (1990a) "Solving Thematic Divergences in Ma- chine Tranalationp" 
Proceedings of the £8th Annual Confer- ence 
of the Association for Computational Linguistics, Uni- verlity of Pittsburgh, Pittsburgh, PA, 127-134. 

Dart, Bonnie J. (1990b) "A Croda-Linguistic Approach to Ma- chine Translation," 
Proceedings o\] the Third International 
Conference on Theoretical and Methodological Issues in Ma. 
chine Translation of Natural Languages, Linguistics Research Center, The Univerlity of Texai, Austin, TX, 13-32. 

Dorr, Bonnie J. (1992) "A Parameterized Approach to \]ntegrat- in K Aspect with Lexical-Semantics for Mwehine Translation," 
Proceedings of the 3Oth Annual Conference of the Associa- 
tion for Computational Linguistics, University of Delaware, Newark, DE. 

Hale, Kenneth and M. Laughren (1983) "Warlpiri Lexicon Project: Warlpiri Diction~,ry Entries," Mm~achusetts Institute 
of Technology, Cambridge , MA, Warlpiri Lexicon Project. 

Hale, Kennetb and Jay Keyaer (1986a) "Some qYansitivity Al- ternations in English," Center for Cognitive Science, MEte. 
sachuaetts Institute of Technology, Cambridge, MA, Lexicon Project Working Papers #7. 

Hale, Kenneth and J. Keyler (1986b) "A View from the Middle," Center for Cognitive Science, Ma~achuaetts Institute of Tech- 
nology, Cambridge, MA, Lexicon Project Working Papern #10. 

Jackendoff, Ray S. (1983) Semantics and Cognition, MIT Pre~a~ 
Cambridge, MA. 

Jackendoff, Ray S. (1990) Semantic Structures, MIT preaa~ Cam- 
bridge, MA. 

Jaeggli , Osv~ldo Adolfo (1981) Topics in tlomanee Syntax, Forts Publications, Dordrecht, Holland/Cinnaminaon, USA. 

Levis, Beth and Malka Rappaport (1986) "The Forn\]ation of Ad- jectival Passives," 
Linguistic Inquiry 17, 623-662. 

S~flr, Ken (1985) "Missing Subjects in German," in Studies in 
German Grammar, Toman, Jindlfich (ed.), Foria Publications, Dordrecht, Holland/Cinnaminnon, USA, 19::1-229. 

Wexler, Kenneth and M. Rite Manzini (1986) "Parameters and 
l.earnability in Binding Theory," pr~ented at tbe Go#nitiue 
Science Seminar, MIT, September, Cambridge, MA. 

Zubiaarreta, Maria Luis~ (1987) Leveb of Representation in the 
Lexicon and in the Syntax~ Foris Publications, Dordrecht, Hol- land/Cinnaminmon, USA. 
