Mapping Scrambled Korean Sentences into English Using 
Synchronous TAGs 
C Hyun S. Park 
omputer Laboratory 
University of Cambridge 
Cambridge, CB2 3QG, U.K. 
Hyun. Park~cl. cam. ac. uk 
Abstract 
Synchronous Tree Adjoining Grammars 
can be used for Machine Translation. How- 
ever, translating a free order language such 
as Korean to English is complicated. I 
present a mechanism to translate scram- 
bled Korean sentences into English by com- 
bining the concepts of Multi-Component 
TAGs (MC-TAGs) and Synchronous TAGs 
(STAGs). 
1 Motivation 
Tree Adjoining Grammars (TAGs) were first devel- 
oped by Joshi, Levy, and Takahashi (Joshi et al., 
1975). There are other variants of TAGs such as 
STAGs (Shieber and Schabes, 1990), and MC-TAGs 
(Weir, 1988). STAGs in particular can be used for 
machine translation and were applied to Korean- 
English machine translation in a military message 
domain (Palmer et al., 1995). 
Park (Park, 1995) suggested a way of handling 
Korean scrambling using MC-TAGs together with a 
priority concept. However, as scrambled argument 
structures in Korean were represented as sets using 
MC-TAGs, a mechanism to combine MC-TAGs and 
STAGs was necessary to translate Korean scrambled 
sentences into English. 
2 Korean-English Machine 
Translation Using STAGs 
STAGs are a variant of TAGs introduced to charac- 
terize correspondences between tree adjoining lan- 
guages. They can be used to relate TAGs for two dif- 
ferent languages for machine translation (Abeill6 et 
al., 1990). The translation process consists of three 
steps. The source sentence is parsed according to the 
source grammar. Each elementary tree in the deriva- 
tion is considered with the features given from the 
derivation through unification. Second, the source 
derivation tree is transferred to a target derivation. 
This step maps each elementary tree in the source 
derivation tree to a tree in the target derivation tree 
by looking in the transfer lexicon. And finally, the 
target sentence is generated from the target deriva- 
tion tree obtained in the previous step. 
The transfer lexicon consists of pairs of trees, one 
from the source language and the other from the 
target language. Within the pair of trees, nodes may 
be linked. Whenever adjunction or substitution is 
performed on a linked node in a source tree, the 
corresponding operation applies to the linked node 
in the target tree. 
i "-':1 , "--', 
"i i "° 
Fibre 1: The K-E Transfer Lexicon 
Canonical ordering of the arguments of transitive 
verbs in Korean is SOV. Whereas the case marker 
in English is implicit in the word, case markers are 
explicit in Korean. This is reflected in the transfer 
lexicon of Figure 1. So, the pair a in Figure 1 shows 
that Korean has an explicit subject case marker i, 
and the pair/~ shows that Korean has an explicit ob- 
ject case marker lul. Also, the pair 7 shows the links 
between SOV structure of Korean to SVO structure 
of English. 
K: Tom-i Jerry-lul ccossnunta. 1 Tom-NOM Jerry-ACC chase 
E: Tom chases Jerry. 
To translate sentence (1), we start with the pair 7 
in Figure 1, and we substitute the pair a on the link 
from the Korean node SP to the English node NP. 
Then, pair/~ is substituted into the NP-OP pairs in 
7, thus correctly transferring sentence (1). 
317 
3 Handling of Scrambling in Korean 
Using MC-TAGs 
TAGs and related formalisms, due to the extended 
domain of locality, can combine a lexical head and all 
of its arguments in a single elementary structure of 
the grammar. However, Becker and Rambow show 
that TAGs that obey the co-occurrence constraint 
cannot handle the full range of scrambled sentences 
(Becket and Rainbow, 1990). As a result, non-local 
MC-TAG-DL (Multi-Component TAG with Dom- 
inance Link) was proposed as a way of handling 
scrambling 1. Later, by adding a priority concept 
to MC-TAG-DL, Park (Park, 1995) suggested a way 
of handling scrambling in Korean. 
3.1 aAT~ & flAT~ structures 
I "IF ...\] *" Tom, No: " ,{ 
I 
-'C,,-,, '\] \[1 
,o I 
For handling scrambling, the multi-adjunction 
concept in MC-TAGs can be used for combining a 
scrambled argument and its landing site. For exam- 
ple, a subject (e.g., Tom) would have two Korean 
structures as above. For notational convenience, 
call the two structures, aAT~s~, and ~AT~Gs~, re- 
spectively. In general, aAT~G represents a canonical 
NP structure and flAT~G represents a scrambled NP 
structure. ~.A~s~, shows a pair of structures for 
representing the scrambled subject argument. Call 
the left structure of ~AT~GsT~, flAT~s~, and the 
right structure, ~AT~g~,. ~A~g~s~, represents a 
scrambled subject, and ~.AT~G~, is used for repre- 
senting the place where the subject would have been 
in the canonical sentence. Similarly, flAT~Go~, de- 
notes a pair of structures for representing a scram- 
bled object argument. 
The basic idea is that whenever an argument is 
not in a scrambled position, it should be substituted 
into an available empty slot using the aAT~ struc- 
ture. The fiAT~G structure will be used only when 
the argument is in a scrambled position so that the 
aAT~G structure cannot be used. 
3.2 An Example 
K: Jerry-lul Tom-i ccossnunLa. 2 Jerry-ACC Tom-NOM ehase-DECL 
E: Tom chases Jerry 
From the elementary trees in Figure 2, both sen- 
tences, (1) and (2) can be derived. For example, 
Figures 2(a), 2(b), and 2(d) can be used for sentence 
(1), to derive Figure 3(a). However, for sentence 
(2) where the order is OSV (the object argument is 
nAn additional constraint system called dominance links was 
added, thus giving rise to MC-TAG-DL. 
m u ° ; j o~, 0 j 
' I i 
(a) (b) (c)~AT~OoT~ (d) 
~i~ure 2: Elementary, Trees 
scrambled), Figures 2(a), 2(c), and 2(d) are used to 
derive Figure 3(b) (fl,4T~G~, is adjoined onto 5, and 
~,4T~G~ is substituted into OPl ~ node.). As the 
trace feature is locally set within each flAT~ struc- 
ture, two OP nodes in Figure 3(b) are co-referenced 
with the same variable, < 1 >, indicating where the 
object should have been in the canonical sentence. 
S 
A SP Vp 
A A 
NP I OP VP 
N NO ~1 V 
I I I 
I 
(a) Canonical 
!l " I 
\J ," ---. 
(b) Scrambled 
Fi~tre 3: Derived Trees 
Each elementary tree is given a priority. A higher 
priority is given to aAT~G structure over flAT~G. 
Generally, when a structure given a higher prior- 
ity over others can be successfully used for the final 
derivation of a sentence, the remaining structures 
will not be tried at all. Only when the highest pri- 
ority structure fails will the next available structure 
be tried 2. 
4 Using MC-TAGs in STAGs 
For mapping Korean to English, the simple object 
(NP) structure of English (e.g., the right structure of 
/3 pair in Figure 1) can be mapped to two structures, 
i.e., aA~o~, and ~AT~go~,, thus generating two 
possible lexical pairs. 
~As a way of implementing a verb-final condition in 
Korean,/KA'/'~s~, structure is dominated by fl.AT~s~,, and each S-type verb elementary tree will nave an A/'.A 
constraint on the root node, which guarantees that 
j3~4T~ type structure cannot be adjoined onto the par- tially derived tree unless its predicate structure (its S- 
type verb elementary tree) is already part of the partial derived tree up to that point. An example including 
long-distance scrambling is shown in (Park, 1995). 
318 
For translating sentence (1), the aA~Go~,-NP 
pair is used for Jerry (similar to the/~ pair in Figure 
1). However, in sentence (2), the/~AT~Go~,-NP pair 
should be used instead for translating the scrambled 
argument Jerry (i.e., Figure 4(a)). Thus, it is nec- 
essary that a Korean flA:RG structure (MC-TAG) 
be mapped to an English NP structure (TAG) to 
transfer a scrambled argument in Korean. I assume 
that there is one head structure for each MC-TAG 
structure, and that the/~A~G ~ (place holder struc- 
ture) is the head structure for each/~AT~G struc- 
ture. The root node of the head structure is al- 
ways mapped to the root node of the target (English) 
structure. 
Usually, the nodes in the source language should 
be linked to each relevant node in the target lan- 
guage, and vice versa (in STAGs). However, in the 
case that it is a multi-component structure (e.g., 
/~AT~), an adjunction node need not necessarily 
be linked to any node. If it is not linked to any 
node of the target language, the structure can be 
freely adjoined onto any available node of the par- 
tially derived tree of the source language, which is 
approximately what scrambling is about. However, 
substitution nodes will always be linked (the differ- 
ence between a substitution node and an adjunction 
node is that an adjunction node does not introduce 
a new structure to the partially derived tree whereas 
a substitution node always does). 
t~"- 
)'.,'." 
l" ..... }" 
(a)K - E Lexicon .,::"",,~ 
/oP..~.-.. ,~m ., .... - "kr - -.... 
~N ' ~p t " '11 " ' " -i i : ~:1 : ~) I .,~ I:! 
~ ~ 'i ": . k 2 r / V . " " k ~ \] 
"..../ I .JL...,, ~..1 Y'am 
(b)K - E DerivedTrees After Applying (a) 
Figure 4: K-E Transfer Lexicon and Derived Tree 
In Figure 4(a), the root node NP of an English 
TAG is mapped to the OP node of/~A~G~, of 
a Korean TAG which is a head structure. All 
the other nodes are mapped to each relevant node 
except S~. As it is not linked, /~AT~, can be 
adjoined onto any available node in the partially 
derived Korean tree. Actually, the restriction on 
whether flAT, GoLf, can be adjoined onto a certain 
node does not come from the formalism of Syn- 
chronous TAGs, but purely from the grammar of 
Korean TAGs. Figure 4(b) shows the final derived 
trees for both Korean and English after applying 
4(a) to the partially derived trees. 
5 Conclusion and Future Direction 
Using MC-TAGs allows the scrambled argument 
structure to be represented as a single (set) struc- 
ture. This makes possible the mapping of Korean 
scrambled m'gument structures into English argu- 
ment structures. The application of similar mech- 
anisms for other languages and for mapping quasi 
logical forms to logical forms (Alshawi et al., 1992) 
using STAGs is also being investigated. 

References 
Anne Abeilld, Yves Schabes, and Aravind K. Joshi. 
1990. Using Lexicalized TAGs for Machine Trans- 
lation. In Proceedings of the International Con- 
ference on Computational Linguistics (COLING 
'90), Helsinki, Finland. 
H. Alshawi, D. Carter, J. Eijck, B. Gamback, 
R. Moore, D. Moran, F. Pereira, S. Pulman, 
M. Rayner, and A. Smith. 1992. The Core Lan- 
guage Engine. MIT Press. 
Tilman Becker and Owen Rainbow. 
1990. Long-Distance Scrambling in German. 
Technical report, University of Pennsylvania. 
Aravind K. Joshi, L. Levy, and M. Takahashi. 1975. 
Tree Adjunct Grammars. Journal of Computer 
and System Sciences. 
Martha Palmer, Hyun S. Park, and Dania Egedi. 
1995. The Application of Korean-English Ma- 
chine Translation to a Military Message Domain. 
In Fifth Annual IEEE Dual-Use Technologies and 
Applications Conference. 
Hyun S. Park. 1995. Handling of Scrambling in 
Korean Using MC-TAGs. In Second Conference 
of Pacific Association for Computational Linguis- 
tics. 
Stuart Shieber and Yves Schabes. 1990. Syn- 
chronous Tree Adjoining Grammars. In Proceed- 
ings of the 13 th International Conference on Com- 
putational Linguistics (COLING'90), Helsinki, 
Finland. 
David J. Weir. 1988. Characterizing Mildly 
Context-Sensitive Grammar Formalisms. Ph.D. 
thesis, University of Pennsylvania. 
