Using Lexicalized Tags for Machine Translation * 
Anne Abeilld 
University of Paris 7-Jussieu 
LADL 
2 place Jussieu, 75005 Paris France 
abeilleC~franz.ibp.fr 
Yves Schabes 
University of Pennsylvania 
Dept of Computer & Information Science 
Philadelphia PA 19104-6389 
schabes@linc.cis.upenn.edu 
Aravind K. Joshi 
University of Pennsylvania 
Dept of Computer & Information Science 
Philadelphia PA 19104-6389 
joshi@linc.cis.upenn.edu 
Abstract 
Lexicalized Tree Adjoining Grammar (LTAG) is an 
attractive formalism for linguistic description mainly 
because cff its extended domain of locality and its 
factoring recursion out from the domain of local 
dependencies (Joshi, 1985, Kroch and Joshi, 1985, 
Abeilld, 1988). LTAG's extended domain of locality 
enables one to localize syntactic dependencies (such 
as filler-gap), as well as semantic dependencies (such 
as predicate-arguments). The aim of this paper is to 
show that these properties combined with the lex- 
icalized property of LTAG are especially attractive 
for machine translation. 
The transfer between two languages, such as 
French and English, can be done by putting di- 
rectly into correspondence large elementary units 
without going through some interlingual representa- 
tion and without major changes to the source and 
target grammars. The underlying formalism for the 
transfer is "synchronous Tree Adjoining Grammars" 
(Shieber and Schabes \[1990\]) 1. Transfer rules are 
stated as correspondences between nodes of trees of 
large domain of locality which are associated with 
words. We can thus define lexical transfer rules that 
avoid the defects of a mere word-to-word approach 
but still benefit from the simplicity and elegance of 
a lexical approach. 
We rely on the French and English LTAG gram- 
mars (Abeille \[1988\], Abeille \[1.990 (b)\], Abeilld et 
al. \[1990\], Abeill6 and Schabes \[1989, 1990\]) that 
have been designed over the past two years jointly at 
University of Pennsylvania and University of Paris 
7-Jussieu. 
1 Strategy for Machine Trans- 
lation with LTAGs 
The idea of using grammars written with "lexical- 
ist" formalisms for machine translation is not new 
*This research was partially ftmded by ARO grant 
DAAG29-84-K-0061, DARPA grant N00014-85-K0018, and 
NSF grant MCS-82-19196 at the University of Pen nsylvania. 
We are indebted to Stuart Shieber for his valuable comments. 
We would like also to thank Marilyn Walker. 
1 In tlfis volume. 
and has been exemplified by Kaplan, et al., (1989) 
for LFG, Beaven etal. for UCG (1988), Dorr for GB 
(1989) and Arnold et al. for Eurotra (1986). tIow- 
ever, our approach is more radical in the sense that 
we associate with the lexical items structures that lo- 
calize syntactic and semantic dependencies. This al- 
lows for the possibility that an explicit semantic rep- 
resentation level can be avoided. 2 The claims about 
the advantages of an explicit semantic representation 
level need to be investigated again in the context of 
the approach proposed here. For examples, many 
traditionally difficult problems for machine transla- 
tion due to different divergence types (Dorr 1989) 
such as categorial, thematic, conflational, structural 
and lexical are not problems in the approach we sug- 
gest, Also contrary to UCG, but like LFG, we use 
grammars that have not been designed for the pur- 
pose of translation. 
The underlying formalism achieving the transfer 
of derivations is "Synchronous Tree-Adjoining Gram- 
mars" (as described in a companion paper by Shieber 
and Schabes \[1990\]). ~ The strategy adopted for 
machine translation consists of matching the source 
LTAG derivation of the source sentence to a target 
LTAG derivation by looking at a transfer lexicon. 
The transfer lexicon puts into correspondence a tree 
from the source grammar instantiated by lexical in- 
sertion (all its nodes and their attributes) with a tree 
from the target grammar. Although the approach is 
not inherently directional, for convenience we will 
call the English and French grammars, the source 
and target grammars. 
The translation proces.s consists of three steps in 
which the generation step is reduced to a trivial 
step. First the source sentence is parsed accord- 
ingly to the source grammar. Each elementary tree 
in the derivation is now considered with the features 
given from the derivation through unification. Sec- 
ond, the source derivation tree is transferred to a 
2The formalism of Synchronous Tree-Adjolning Grammar 
does not prevent constructing an explicit semantic represen- 
tation. In fact, in Shieber and Schabes (1990) it is shown how 
to construct a semantic representation, which itself is a TAG. 
3We assume that the reader is familiar with Tree Adjoining 
Grammars. We refer the reader to Joshi (1987) for an intro- 
duction to TAGs. We also refer the reader to the companion 
paper for more details on synchronous TAGs. 
1 1 
target derivation. This step maps each elementary 
tree in the source derivation tree to a tree in the tar- 
get derivation tree by looking in the transfer lexicon. 
And finally, the target sentence is generated from the 
target derivation tree obtained in the previous step. 
As an example, consider the fragment of the trans- 
fer lexicon given in Figure 1. 
o7) J hn J hn 
T 
NP0* VP f NP0+ LP \ 
XV NP1 ,\[, V PP, l -t,." t / 
\apparently 
Figure 1: 
lexicon 
apparemment \] 
Fragment of the English-French transfer 
The transfer lexicon consists of pairs of trees one 
from the source language and one from the target 
language. Within the pair of trees, nodes may be 
linked (thick lines). Whenever in a source tree, say 
Got, roe, adjunction or substitution is performed on a 
linked node (say nso~ is linked to n,~,t), the cor- 
responding tree paired with tsouree, ttaraet, operates 
on the linked node ntaraet. For example, suppose we 
start with the pair 7 and we operate the pair a on 
the link from the English node NPo to the French 
node NPI. This operation yields the derived pair 
o~1. 
o~ 1 
ve r 0+ ve /\r / 
\[J ohn V NPI$ V PP 
\ Inissos manque ~1 ~ 
h John 
Then, if the pair/3 operates on the NP1-NPo in 
~1, the following pair ~u is generated. 
or2 
\[ S S 
/I A I John V NP Mary V P_P1 
J I I I 
\ missesMary manque ~1 7 
h John 
Finally, when the pair 6 operates on the S-S link 
in a~, the pair a3 is generated. 
0/3 
$ s 
Adv S Adv S 
apparently NP VP apparemmcmt NP VP 
j MalT v 
John 
The source sentence is parsed accordingly to the 
source grammar, then the target derivation is gener- 
ated by tracing the pairs stated in the transfer lex- 
icon. The fragment of the transfer lexicon given in 
Figure 1 therefore enables us to translate: 
Apparently, John misses Mary 
Apparemment, Mary manque ~ John 
In most cases, translation can be performed incre- 
mentally as the input string is being parsed. 
The aim of this paper is to show that LTAG's local- 
ization of syntactic dependencies (such as filler-gap), 
as well as semantic dependencies (such as predicate- 
arguments) combined with the lexiealized property 
of LTAGs are especially attractive for machine trans- 
lation. 
We show how the transfer lexicon is stated. We 
motivate the need for mapping trees instantiated 
with words and with the value of their features ob- 
tained from the derivation tree corresponding to the 
parse of the source sentence. We also show that the 
transfer needs to be stated at different levels: match- 
ing tree families (trees associated to the same pred- 
icate), trees, nodes and therefore their attributes, 
since they are associated with a node. We show how 
not only subcategorization frames but also adjuncts 
are transferred, and how differences of syntactic and 
semantic properties are accounted for ill terms of 
structural discrepancies. Then we illustrate how the 
extended domain of locality enables us to deal with 
these structural discrepancies in the process of ma- 
chine translation. 
2 2 
2 Transfer Lexicon- matching 
two LTAG Lexicons 
The transfer is stated between the English and 
French LTAG grammars in a lexicon. We rely on 
grammars built from a monolingual perspective, but 
the match between them can be one to many, or 
many to one. 
2.1 Matching elementary trees 
Instead\[ of matching words, we match structures in 
which words have been already lexically inserted. 
This provides interesting disambiguations that could 
not be obtained by a morphological match. For ex- 
ample, there is one morphological English verb leave, 
but the structures associated with it disambiguate 
it between intransitive and transitive leave. Inter- 
estingly, these two predicates receive two different 
French translations: 4 
Or4 
O~ 5 
N2'0$ Ve 
 eave\ / 
The pairs a4 and c~5 will correctly give the follow- 
ing translations: 
Sohn John 
John l_efi Mary *-~ John a quitlg Mary 
By convention, in the elementary trees, the set of 
morphological flexions of a given word is written sur- 
rounded by baekslashes. For example, \leave\ stands 
for {leave, leaves, left, ...}. For each word in a mor- 
phological set attributes (such as mode and agree- 
ment) are also specified. When a word in a tree is 
not surrounded by backslahes, it stands for the in- 
fleeted form and not for a morphological set. 
Since lexieal items appearing in the elementary 
structures can be inflected words or a morphological 
set, lexieal items of the two languages are matched 
regardless of whether they exhibit the same morpho- 
4We use standard TAG notation: '1' stands for nodes to 
be substituted, '*' annotates the foot node of an auxiliary tree 
and the hadices shown on the nodes correspond to semantic 
functions. The trees are combined with adjunction and sub- 
stitution. 
Our approach does not depend on the specific representation 
adopted ha this paper. See Abeill6 1990 (b) for an Mternate 
representation. 
logical variations or not. For example, English adjec- 
tives lacking morphological variation appear as such 
in the syntactic and transfer lexicons, while their 
French counterparts are usually morphological sets. 
The word white is thus matched with \blanc\, stand- 
ing for {blanc, blanche, blancs, blanches). 
Words that are not autonomous entries in the En- 
glish syntactic lexicon (ex: complementizers, light 
verbs or parts of an idiomatic expression), are not 
considered as autonomous entries in the transfer lex- 
icon; for example, no rule needs to match directly 
take or pay with faire, or give with pousser, in order 
to get the right light-verb predicative noun combina- 
tions in the following sentences: 5 
John t.ool~ a walk 
John a fail une promenade (Danlos 1989) 
John pays court_ t ° Mary 
~, John fait ia court d Mary (Danlos 1989) 
John ~ ~-~ Jean a poussd un cri 
Some words existing as autonomous entries in the 
English syntactic lexicon do not appear as entries 
in the transfer lexicon because their French coun- 
terpart is a morphological flexion, not a word. For 
example, the future auxiliaries will or shall are not 
translated as such. The tense feature they contribute 
is transferred (as well other syntactic features) and 
the future tense French verbal form will be chosen. 
2.2 Matching nodes 
Matching predicates of the two languages as a whole 
is not sufficient. Correspondences between their ar- 
guments must be stated too as shown in the following 
example: 
~6 
~7 
$ VP NPa$ VP I. 
V .NPI$ V PP., / Ix. I 
7 
/NP0 S --'-'-'-'~ S / $ vP j---NP0$ VP 
/\At 
/  V NPI$ V PP, "k." i /x 
\ ~miss\ ~._~manqueA rl ~NPI~ 
John resembles Mary *-+ John ressemble d Mary 
John misses Mary *-+ Mary manque fi John 
5It has long been noticed that light-verb predicative noun 
combinations are highly langalage-idiosyncratic, and word-to- 
word transfer rules will inevitably lead to overgeneration or 
unnatural restrictions. 
3 3 
These examples also show that it is not correct 
to match trees where lexical insertion has not al- 
ready been made and therefore the correspondences 
between nodes cannot be made on the only basis of 
the subcategorization frame. 
Arguments are matched directly by the links exist- 
ing between them. Adjuncts are matched indirectly 
by the links existing on the nodes, at which they ad- 
join. For example, in the following correspondence, 
NP0 S-'- ~S 
v y 
~,/~e\ A PP ~aimeA I 
the AP node in the English tree is linked to the V 
node of the French tree to account for: 
John is fond of music 
John aime ia mnsique 
John is very fond of music 
John aime beaucoup la musique 
The adjective fond is associated with an AP-type 
auxiliary tree which is paired with a V-type auxiliary 
tree corresponding to the word beaucoup. 
2.3 Matching feature structures 
Some feature structures of the words appearing in the 
trees are transferred in the translation process, but 
with the value further specified from the derivation 
(and not with the one from the lexical entry which 
may not be as specific). For example, fish can be 
either singular or plural and is therefore stated as 
such in the lexicon. However, it can get its number 
from the verb-subject agreement constraints, as in 
the following sentences: 
The fish. swim in the pond 
*-+ Les poissons nagent dans I'dtang (plural) 
The fish is good 
Le poisson est bon (singular) 
Agreement features of nouns are lexically matched 
only in the case of two morphological sets. In the ease 
of one (or both) entry being a single inflected word, 
the agreement features depend only on the lexieal 
entry itself and are directly assigned in the transfer 
lexicon: 
\boy\,g \[hum=X\] ~ \gar~on\,N \[nura= X\] 
luggage, N \[hum=sing\] ~ bagages, N Inure = pl\] 
Because of these idiosyncrasies, agreement features 
of verbs are not matched. We will thus rightly have: 
My luggage .0. heavy (singular) 
*-* Mes bagages sont Iourds (plural) 
based on monolingual agreement constraints between 
subject and verb. 
Features assigned to the sententiai yoot node (ei- 
ther from lexieal insertion or from S dh~e adjoined ma- ,'.%," '3 
terial) are transferred or not depending on whether 
they are assigned autonomously !nthe target lan- 
guage or not. The feature tense for example is usu- 
ally transferred, but not the feature~:mode because 
the latter depends on the verb of the matrix sentence 
if the sentence is embedded: 
Jean wants Marie to leave 
~-* Jean veut que Marie parte (Danlos 1989) 
2.4 Matching tree families 
In order to transfer both the predicate-argument re- 
lations, and the construction types such as question, 
passive, topicalization etc., it is necessary to be able 
to refer to a specific tree in a tree family. This is done 
by matching the syntactic features by which the dif- 
ferent trees are identified within a tree family, for 
example <passive>, <relative, NPi > or <question, 
NP~ >.6 
As has been noted, transitivity alternations ex- 
hibit striking differences in the two languages. The 
trees in the two families will not necessarily bear the 
same syntactic features; corresponding tree families 
may not include the same number of trees. 
When a syntactic feature of a given tree family 
does not exist for the corresponding tree family in 
the target language, it will be ignored. English trees 
for prepositional passives will thus be matched with 
their corresponding declarative trees in French (un- 
less the English prepositional argument is matched 
with the French direct object): 
John was given a book by Mary 
Mary a donnd nn livre ~ Jean 
Similarly, the feature <question, NPi > will be 
transferred but not the feature differentiating be- 
tween pied-piping and preposition-stranding in En- 
glish, since French always pied-pipes: 
Who did Mary give a book to? 
~ Mary a-t-elle donnd un livre? 
When a certain syntactic feature exists for both 
tree families in the two languages, but not for both 
lexical items, it is ignored as well: 
Advantage was taken of this affair by John 
• Patti a ~td tird de cette affaire par Jean 
Jean a tird patti de cette affaire 
Such idiosyncrasies are in fact expected and han- 
dled in our grammars, since they have both their 
constituent structures and their syntactic rules iexi- 
calized (see Abeill6 \[1990 (a)\] for a discussion on this 
topic). 
6NPi refers to the noun phrase being extracted, usually 0 
for subject, 1 for first object etc.. 
4 4 
3 Dealing with Structural Dis- 
crepancies 
Units of a LTAG grammar have a large domain of lo- 
cality. Discrepancies in the internal structures being 
matched are in fact expected by our strategy, and no 
special meclhanism is required for them. 
fi;.1 Discrepancies in constituent 
structures 
I1. is not a problem when an elementary tree of a 
certain constituent structure translates into an ele- 
mentary tree with a different constituent structure 
ix:t the target language, provided they have a simi- 
l~p,r argument structure. For example: idiom ~-+ verb; 
idiom ~ different kind of idiom; verb ~ light-verb 
combination; VP-adverb ~-~ raising verb; S-adverb ¢~ 
matrix clause ... as in: 
The baby j_ust fell 
~-+ Le bdbd ~ tomber (I(aplan et ai. 1989) 
John is likely to come 
I1 est probable que Jean viendra 
.Iohn gave.. ~ ~-+ John 
0"9 
0"10 
VP"'--"~ VP / /N /~ 
VP* Ad V VP 
S 
ve ,/'~ ~.'Po vP 
V A I i ./~ 
I /N il V A 
~\ A vr, I ,/~ I /X ~tre', A s 
lflcely to VP* \] /~ probable C 
I st 
que 
0~11 
N, dve\ D N ~x)usserk 
I I 
a cough 
Links provide for simultaneous adjunction (or sub- 
stitution) of matching trees at the corresponding 
nodes. For example in the pair 0"11, adjunction of 
an adjective (on N) in the English tree corresponds 
to an adjunction on the French VP: 
John gave a weak cough 
John toussa faiblemenl 
Furthermore elementary structures of the source 
language need not exist in the target language as el- 
ementary structures. For example, there is no French 
counterpart to the English verb particle combination. 
John called Mary up ~ John a appeid Mary 
3.2 Discrepancies in syntactic prop- 
erties 
Some English predicates do not have the same num- 
ber of arguments as their corresponding French ones. 
In such cases, the pair does not consists of pairs of 
elementary trees but rather pairs of derived trees of 
bounded size. Since the match is performed between 
derived trees, no new elementary trees are introduced 
in the grammars. This addition of pairs of bounded 
derived trees is the only change we have to make to 
the units of the original grammars. 
For example, the adverb hopefully has an S argu- 
ment. Since there is no corresponding French adverb, 
the French verb espdrer (which has two arguments, 
an NP and an S) combined with on will be used: 
hopefully, John will work 
on espgre que Jean travaiilera 
.1 
O'12 
/ oo 
\hopefully I /~ 
'~e~sp6rerk ~ S; 
que 
In the pair 0"12, hopefully is paired with a derived 
tree corresponding to on esp~re. The English tree for 
hopefully is paired with the result of the substitution 
of on in the subject position of the tree for esp~rer. 
The right hand tree in 0"12 is a derived tree. 
Matching agentless passive with declarative trees 
is done with the same device: 
John was given a book 
~ a donnd un livre h John 
Similar cases occur for verbs exhibiting ergativity 
alternation in one language and but not in the other. 
In this case, a supplementary causative tree has to 
be used for the unaccusative language (see pair 0"13): 
The sun melt____.As the snow 
* le soleil fond la neige 
+-+ le soleil fair fondre la neige 
5 5 
0¢13 
S S A A 
se Se0  ve A 
~neltk I 
V I 
~fOndre\ 
The right hand tree in al3 is again a derived tree. 
Multicomponent TAG !Joshi \[1987\]) can also be 
used for resolving certain other discrepancies. This 
device is not a new addition, it is already a part of 
the Synchronous TAG framework. 
Conclusion 
By virtue of their extended domain of locality, Tree 
Adjoining Grammars allow regular correspondences 
between larger structures to be stated without a me- 
diating interlingual representation. The mapping of 
derivation trees from source to target languages, us- 
ing the formalism of synchronous TAGs, makes pos- 
sible to state such direct correspondences. By doing 
so, we are able to match linguistic units with quite 
different internal structures. Furthermore, the fact 
that the grammars are lexicalized enables capturing 
some idiosyncrasies of each language. 
The simplicity and effectiveness of the transfer 
rules in this approach shows that lexicMized TAGs, 
with their extended domain of locality, are very well 
adapted to machine translation. A detailed discus- 
sion of this approach will be provided in an expanded 
version of this paper which will include a discussion 
of the applicability of this method for other pairs of 
languages exhibiting some language phenomena that 
do not arise in the pair considered in this paper. 

References 

Abeill~, Anne. 1988. Parsing French with Tree Adjoin- 
ing Grammar: some Linguistic Accounts. In Proceed- 
ings of the 12 th International Conference on Computa- 
tional Linguistics (COLING'88). Budapest, Hungary. 

Abeill~, Anne. 1990 (a). Lexical and Syntactic Rules in a 
Tree Adjoining Grammar. In Proccedings of the 28 th 
Meeting of the Association for Computational Linguis- 
tics (ACL '90). Pittsburgh, PA. 

Abeill~, Anne. 1990 (b). A Lexiealized Tree Adjoining 
Grammar and its Relevance for Machine Translation. 
To appear in Machine Translation. 

Abeill6, Anne and Schabes, Yves. 1989. Parsing Idioms 
in Tree Adjoining Grammars. In Fourth Conference 
of the European Chapter of the Association for Com- 
putational Linguistics (EA CL '89). Manchester. 

Abeill6, Anne, Bishop, Kathleen M., Cote, Sharon, and 
Schabes, Yves. 1990. A Lexicalized Tree Adjoining 
Grammar for English. Technical Report, Department 
of Computer and Information Science, University of 
Pennsylvania. 

Abeill6, Anne and Schabes, Yves. 1990. Non Composi- 
tional Discontinuous Constituents in Tree Adjoining 
Grammar In Proceedings of the Symposium on Dis- 
continuous. Tilburg, Holland. 

Arnold, D., Krauwer, S., Rosner, M., Destombes, L. and 
Varile, G. 1986. The CAT Framework in Eurotra: a 
Theoretically Committed Notation for Machine Trans- 
lation. In Proceedings of the 11 th International Con- 
\]erence on Computational Linguistics (COLING'86). 
Bonn. 

Beaven, John and Whitelock, Pete. 1988. Machine 
Translation Using Isomorphic UCGs. In Proceedings 
of the 12 th International Conference on Computa- 
tional Linguistics (COLING'88). Budapest, Hungary. 

Danlos, Laurence. 1989. La traduction automatique. An- 
hales des Tglgcommunications 44(1-2). 

Dorr, Bonnie J. 1989. Conceptual Basis of the Lexicon 
in Machine Translation. MIT AI lab Memo No 1166. 

Joshi, Aravind K. 1985. How Much Context- 
Sensitivity is Necessary for Characterizing Structural 
Descriptions--Tree Adjoining Grammars. In Dowty, 
D., Karttunen, L., and Zwicky, A. (editors), Natu- 
ral Language Processing--Theoretical, Computational 
and Psychological Perspectives. Cambridge University 
Press, New York. (Originally presented in a Workshop 
on Natural Language Parsing at Ohio State Univer- 
sity, Columbus, Ohio, May 1983) 

Joshi, Aravind K. 1987. An Introduction to Tree Ad- 
joining Grammars. In Manaster-Ramer, A. (editor), 
Mathematics of Language. John Benjamins, Amster- 
dam. 

Kaplan, R., Netter, K., Wedekind, J., and Zaenen, A. 
1989. Translation by structural correspondences. In 
Fourth Conference of the European Chapter of the As- 
sociation for Computational Linguistics (EACL'89). 
Manchester. 

Kroch, Anthony and Joshi, Aravind K. 1985. Linguis- 
tic Relevance of Tree Adjoining Grammars. Technical 
Report MS-CIS-85-18, Department of Computer and 
Information Science, University of Pennsylvania. 

Schabes, Yves, Abeill~, Anne, and Joshi, Aravind K. 
1988. Parsing Strategies with ~Lexicalized' Grammars: 
Application to Tree Adjoining Grammars. In Proceed- 
ings of the 12 th International Conference on Computa- 
tional Linguistics (COLING '88). Budapest, Hungary. 

Shieber, Stuart and Schabes, Yves. 1990. Synchronous 
Tree Adjoining Grammars. in Proceedings of the 13 th 
International Conference on Computational Linguis- 
tics (COLING'90). Stockholm, Sweden. 
