Back Transliteration from Japanese to English 
Using Target English Context  
Isao Goto†, Naoto Kato††, Terumasa Ehara†††, and Hideki Tanaka†

† NHK Science and Technical Research Laboratories
1-11-10 Kinuta, Setagaya, Tokyo, 157-8510, Japan
goto.i-es@nhk.or.jp
tanaka.h-ja@nhk.or.jp

†† ATR Spoken Language Translation Research Laboratories
2-2-2 Hikaridai, Keihanna Science City, Kyoto, 619-0288, Japan
naoto.kato@atr.jp

††† Tokyo University of Science, Suwa
5000-1, Toyohira, Chino, Nagano, 391-0292, Japan
eharate@rs.suwa.tus.ac.jp
 
Abstract 
This paper proposes a method of automatic 
back transliteration of proper nouns, in which 
a Japanese transliterated-word is restored to 
the original English word. The English words 
are created from a sequence of letters; thus 
our method can create new English words that 
are not registered in dictionaries or English 
word lists. When a katakana character is converted into English letters, there are various candidate alphabetic characters. To ensure adequate conversion, the proposed method uses the target English context to calculate the probability of an English character or string corresponding to a Japanese katakana character or string. We confirmed the effectiveness of using the target English context in an experiment on personal-name back transliteration.
1 Introduction 
In transliteration, a word in one language is con-
verted into a character string of another language 
expressing how it is pronounced. In the case of 
transliteration into Japanese, special characters 
called katakana are used to show how a word is 
pronounced. For example, a personal name and 
its transliterated word are shown below.  
Cunningham → カニンガム (ka ni n ga mu)   [Transliteration]

Here, the letters in parentheses are romanized Japanese katakana characters.
New transliterated words such as personal 
names or technical terms in katakana are not al-
ways listed in dictionaries. It would be useful for 
cross-language information retrieval if these 
words could be automatically restored to the 
original English words.  
Back transliteration is the process of restoring 
transliterated words to the original English words. 
The following is an example of a back transliteration problem.
? (English word) ← クラッチフィールド (ku ra cchi fi – ru do)   [Back transliteration]
 
There are many ambiguities in restoring a transliterated katakana word to its original English word. For example, should "a" in "ku ra cchi fi – ru do" be converted into the English letter "a" or "u" or some other letter or string? Resolving this ambiguity is difficult, which makes back transliteration to the correct English word difficult as well.
Using the pronunciation of a dictionary or lim-
iting output English words to a particular English 
word list prepared in advance can simplify the 
problem of back transliteration. However, these 
methods cannot produce a new English word that 
is not registered in a dictionary or an English 
word list. Transliterated words are mainly proper 
nouns and technical terms, and such words are 
often not registered. Thus, a back transliteration 
framework for creating new words would be very 
useful.  
A number of back transliteration methods for selecting English words from an English pronunciation dictionary have been proposed. They include Japanese-to-English (Knight and Graehl, 1998)¹, Arabic-to-English (Stalls and Knight, 1998), and Korean-to-English (Lin and Chen, 2002).

¹ Their English letter-to-sound WFST does not convert English words that are not registered in a pronunciation dictionary.

There are also methods that select English 
words from an English word list, e.g., Japanese-
to-English (Fujii and Ishikawa, 2001) and Chi-
nese-to-English (Chen et al., 1998).  
Moreover, there are back transliteration methods capable of generating new words, e.g., methods for back transliteration from Korean to English (Jeong et al., 1999; Kang and Choi, 2000).
These previous works did not take the target 
English context into account for calculating the 
plausibility of matching target characters with the 
source characters.  
This paper presents a method of taking the tar-
get English context into account to generate an 
English word from a Japanese katakana word. 
Our character-based method can produce new 
English words that are not listed in the learning 
corpus.  
This paper is organized as follows. Section 2 
describes our method. Section 3 describes the 
experimental set-up and results. Section 4 dis-
cusses the performance of our method based on 
the experimental results. Section 5 concludes our 
research. 
2 Proposed Method 
2.1 Advantage of using English context 
First we explain the difficulty of back transliteration without a pronunciation dictionary. Next, we clarify the reason for the difficulty. Finally, we clarify the effect of using English context in back transliteration.
In back transliteration, an English letter or string is chosen to correspond to a katakana character or string. However, this decision is difficult. For example, the English letter "u" corresponds to the katakana "a" in some cases but not in others: "u" in Cunningham corresponds to "a" in katakana, whereas "u" in Bush does not. It is difficult to resolve this ambiguity without the pronunciation registered in a dictionary.
The difference in correspondence mainly 
comes from the difference of the letters around 
the English letter "u." The correspondence of an 
English letter or string to a katakana character or 
string varies depending on the surrounding char-
acters, i.e., on its English context.  
Thus, our back transliteration method uses the 
target English context to calculate the probability 
of English letters corresponding to a katakana 
character or string.  
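As a small illustration of why the surrounding letters matter, the following sketch estimates which katakana vowel the English letter "u" maps to, conditioned on the next English letter. The three observations are a hand-made toy list, not the learning corpus of this paper; the reading assumed for Bush (ブッシュ) is an assumption, since the text only states that its "u" does not correspond to "a". The function name `vowel_given_context` is hypothetical.

```python
from collections import Counter, defaultdict

# (English word, index of the letter "u", katakana vowel aligned with it)
OBSERVATIONS = [
    ("cunningham", 1, "a"),   # ka ni n ga mu, as given in the text
    ("crutchfield", 2, "a"),  # ku ra cchi fi - ru do, as given in the text
    ("bush", 1, "u"),         # assumed reading; the text only says "not a"
]

def vowel_given_context(observations):
    """Relative frequency of the katakana vowel given (letter, next letter)."""
    counts = defaultdict(Counter)
    for word, idx, vowel in observations:
        next_letter = word[idx + 1] if idx + 1 < len(word) else "$"
        counts[(word[idx], next_letter)][vowel] += 1
    return {
        ctx: {v: c / sum(cnt.values()) for v, c in cnt.items()}
        for ctx, cnt in counts.items()
    }

table = vowel_given_context(OBSERVATIONS)
print(table[("u", "n")])  # distribution for "u" followed by "n"
print(table[("u", "s")])  # distribution for "u" followed by "s"
```

With real data the same letter would show different distributions in different contexts, which is the effect the proposed models exploit.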
2.2 Notation and conversion-candidate 
lattice  
We formulate the word conversion process as a unit conversion process so that new words can be treated. Here, a unit is one or more consecutive characters that form part of the word.
A katakana word, K, is expressed by equation 2.1 with "^" and "$" added to its start and end, respectively.
\[ K = k_0^{m+1} = k_0 k_1 \ldots k_{m+1} \tag{2.1} \]
\[ k_0 = \text{\^{}}, \quad k_{m+1} = \text{\$} \tag{2.2} \]
where \(k_j\) is the j-th character in the katakana word, m is the number of characters except for "^" and "$", and \(k_0^{m+1}\) is the character string from \(k_0\) to \(k_{m+1}\).
We use katakana units constructed of one or more katakana characters. We denote a katakana unit as ku. For any ku, many English units, eu, can correspond to it as conversion candidates. The ku's and eu's are generated using a learning corpus in which bilingual words are separated into units and every ku unit is related to an eu unit.
\(L\{E\}\) denotes the lattice of all eu's corresponding to the ku's covering a Japanese word. Every eu is a node of the lattice, and each node is connected with the next nodes. \(L\{E\}\) has a lattice structure starting from "^" and ending at "$." Figure 1 shows an example of \(L\{E\}\) corresponding to the katakana word "キルシュシュタイン (ki ru shu shu ta i n)." In the figure, each circle represents one eu. A character string linking the individual character units along a path \(p_d \in (p_1, p_2, \ldots, p_q)\) between "^" and "$" in \(L\{E\}\) becomes a conversion candidate, where q is the number of paths between "^" and "$" in \(L\{E\}\).
We get English word candidates by joining eu's from "^" to "$" in \(L\{E\}\). We select a certain path, \(p_d\), in \(L\{E\}\). The number of character units except for "^" and "$" in \(p_d\) is expressed as \(n(p_d)\). The character units in \(p_d\) are numbered from start to end.

[Figure 1: a lattice running from "^" to "$", with one column of candidate English units (eu) for each of the katakana units キ (ki), ル (ru), シュ (shu), シュ (shu), タ (ta), イ (i), ン (n), and for the two-character unit タイ (tai). For example, the candidates for シュ (shu) include ch, che, chou, chu, s, sc, sch, schu, sh, she, shu, su, sy, and sz.]

Figure 1: Example of lattice \(L\{E\}\) of conversion-candidate units.
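The lattice construction can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate table `CANDIDATES` is a heavily pruned, partly hypothetical subset of conversion candidates (the entry "tei" for タイ in particular is assumed for illustration), katakana characters are written in romanized form, and the function names are hypothetical. The code enumerates every way of grouping the katakana characters into known units (ku) and then every combination of English units (eu) along each grouping, which corresponds to enumerating the paths between "^" and "$".

```python
from itertools import product

# Hypothetical, pruned table: romanized katakana unit -> candidate English units.
CANDIDATES = {
    "ki": ["ki", "chi"],
    "ru": ["r", "l"],
    "shu": ["sch", "s", "sh"],
    "ta": ["ta"],
    "i": ["i", "y"],
    "n": ["n"],
    "tai": ["tei", "ty"],  # unit made of the two characters "ta" and "i"
}

def segmentations(chars, table):
    """Yield every split of the katakana characters into units found in table."""
    if not chars:
        yield []
        return
    for size in range(1, len(chars) + 1):
        unit = "".join(chars[:size])
        if unit in table:
            for rest in segmentations(chars[size:], table):
                yield [unit] + rest

def candidate_words(chars, table):
    """Join the eu's along every lattice path into candidate English words."""
    words = set()
    for ku_seq in segmentations(chars, table):
        for eu_seq in product(*(table[ku] for ku in ku_seq)):
            words.add("".join(eu_seq))
    return words

chars = ["ki", "ru", "shu", "shu", "ta", "i", "n"]
words = candidate_words(chars, CANDIDATES)
print(len(words), "kirschstein" in words)
```

Even this tiny table yields over a hundred candidate strings, which is why a probability model and a pruned search are needed to pick the output.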
The English word, E, resulting from the conversion of a katakana word, K, for \(p_d\) is expressed as follows:
\[ K = k_0^{m+1} = k_0 k_1 \ldots k_{m+1} = \mathit{ku}_0^{n(p_d)+1} = \mathit{ku}_0 \mathit{ku}_1 \ldots \mathit{ku}_{n(p_d)+1}, \tag{2.3} \]
\[ E = e_0^{l(p_d)+1} = e_0 e_1 \ldots e_{l(p_d)+1} = \mathit{eu}_0^{n(p_d)+1} = \mathit{eu}_0 \mathit{eu}_1 \ldots \mathit{eu}_{n(p_d)+1}, \tag{2.4} \]
\[ k_0 = e_0 = \mathit{ku}_0 = \mathit{eu}_0 = \text{\^{}}, \quad k_{m+1} = e_{l(p_d)+1} = \mathit{ku}_{n(p_d)+1} = \mathit{eu}_{n(p_d)+1} = \text{\$}, \tag{2.5} \]
where \(e_j\) is the j-th character in the English word, and \(l(p_d)\) is the number of characters except for "^" and "$" in the English word. \(\mathit{eu}_0^{n(p_d)+1}\) for each \(p_d\) in \(L\{E\}\) in equation 2.4 becomes the candidate English word. \(\mathit{ku}_0^{n(p_d)+1}\) in equation 2.3 shows the sequence of katakana units.
2.3 Probability models using target Eng-
lish context 
To determine the corresponding English word for a katakana word, the following equation 2.6 must be calculated:
\[ \hat{E} = \arg\max_{E} P(E \mid K). \tag{2.6} \]
Here, \(\hat{E}\) represents an output result.
To use the English context for calculating the matching of an English unit with a katakana unit, the above equation is transformed into Equation 2.7 by using Bayes' theorem.
\[ \hat{E} = \arg\max_{E} P(E)\, P(K \mid E) \tag{2.7} \]
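The intermediate step between Equations 2.6 and 2.7 can be written out as follows. The denominator \(P(K)\) does not depend on \(E\), so it can be dropped without changing which \(E\) attains the maximum:

\[
\hat{E} = \arg\max_{E} P(E \mid K)
        = \arg\max_{E} \frac{P(E)\, P(K \mid E)}{P(K)}
        = \arg\max_{E} P(E)\, P(K \mid E).
\]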
Equation 2.7 contains a translation model in which an English word is a condition and katakana is a result.
The word in the translation model \(P(K \mid E)\) in Equation 2.7 is broken down into character units by using equations 2.3 and 2.4.
\[
\begin{aligned}
\hat{E} &= \arg\max_{E} P(E) \sum_{\mathit{eu}_0^{n(p_d)+1}} \sum_{\mathit{ku}_0^{n(p_d)+1}} P(K, \mathit{ku}_0^{n(p_d)+1}, \mathit{eu}_0^{n(p_d)+1} \mid E) \\
&= \arg\max_{E} P(E) \sum_{\mathit{eu}_0^{n(p_d)+1}} \sum_{\mathit{ku}_0^{n(p_d)+1}} \Big\{ P(K \mid \mathit{ku}_0^{n(p_d)+1}, \mathit{eu}_0^{n(p_d)+1}, E) \\
&\qquad \times P(\mathit{ku}_0^{n(p_d)+1} \mid \mathit{eu}_0^{n(p_d)+1}, E)\, P(\mathit{eu}_0^{n(p_d)+1} \mid E) \Big\}
\end{aligned}
\tag{2.8}
\]
\(\mathit{eu}_0^{n(p_d)+1}\) includes information of E. K is only affected by \(\mathit{ku}_0^{n(p_d)+1}\). Thus equation 2.8 can be rewritten as follows:
\[
\hat{E} = \arg\max_{E} P(E) \sum_{\mathit{eu}_0^{n(p_d)+1}} \sum_{\mathit{ku}_0^{n(p_d)+1}} \Big\{ P(K \mid \mathit{ku}_0^{n(p_d)+1}) \times P(\mathit{ku}_0^{n(p_d)+1} \mid \mathit{eu}_0^{n(p_d)+1})\, P(\mathit{eu}_0^{n(p_d)+1} \mid E) \Big\}.
\tag{2.9}
\]
\(P(K \mid \mathit{ku}_0^{n(p_d)+1})\) is 1 when the string of K and the string of \(\mathit{ku}_0^{n(p_d)+1}\) are the same, and the string of \(\mathit{ku}_0^{n(p_d)+1}\) for every path in the lattice is the same as the string of K. Thus, \(P(K \mid \mathit{ku}_0^{n(p_d)+1})\) is always 1.
We approximate the sum of paths by selecting the maximum path.
\[
\hat{E} \approx \arg\max_{E} P(E)\, P(\mathit{ku}_0^{n(p_d)+1} \mid \mathit{eu}_0^{n(p_d)+1}) \times P(\mathit{eu}_0^{n(p_d)+1} \mid E)
\tag{2.10}
\]
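We show an instance of each probability model with a concrete value as follows:

\[ P(E) = P(\text{\^{}Crutchfield\$}), \]

\[ P(K \mid \mathit{ku}_0^{n(p_d)+1}) = P(\text{\^{}クラッチフィールド\$} \mid \text{\^{}/ク/ラ/ッチ/フィー/ル/ド/\$}), \]
where the katakana is romanized as (ku ra cchi fi – ru do | ku/ra/cchi/fi –/ru/do),

\[ P(\mathit{ku}_0^{n(p_d)+1} \mid \mathit{eu}_0^{n(p_d)+1}) = P(\text{\^{}/ク/ラ/ッチ/フィー/ル/ド/\$} \mid \text{\^{}/C/ru/tch/fie/l/d/\$}), \]
where the katakana units are romanized as (ku/ra/cchi/fi –/ru/do),

\[ P(\mathit{eu}_0^{n(p_d)+1} \mid E) = P(\text{\^{}/C/ru/tch/fie/l/d/\$} \mid \text{\^{}Crutchfield\$}). \]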
 
We broke down the language model \(P(E)\) in equation 2.10 into letters.
\[
P(E) \approx \prod_{j=1}^{l(p_d)+1} P(e_j \mid e_{j-a}^{j-1})
\tag{2.11}
\]
Here, a is a constant. Equation 2.11 is an (a+1)-gram model of English letters.
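The letter (a+1)-gram of Equation 2.11 can be sketched as below. Note the assumptions: this sketch estimates the probabilities by relative frequency with add-one smoothing, which is a simple stand-in and not the maximum entropy estimation used in this paper, and the training list is a three-word toy list. The function names `train_letter_ngram` and `log_prob` are hypothetical.

```python
import math
from collections import Counter

def train_letter_ngram(words, a=3):
    """Count (context, letter) pairs, where the context is up to a letters."""
    context_counts, ngram_counts, alphabet = Counter(), Counter(), set()
    for word in words:
        padded = "^" + word.lower() + "$"
        alphabet.update(padded[1:])          # letters that can be predicted
        for j in range(1, len(padded)):      # j = 1 .. l+1, so "$" is included
            context = padded[max(0, j - a):j]
            ngram_counts[(context, padded[j])] += 1
            context_counts[context] += 1
    return context_counts, ngram_counts, alphabet

def log_prob(word, model, a=3):
    """log P(E) as the sum of log P(e_j | e_{j-a}..e_{j-1}), add-one smoothed."""
    context_counts, ngram_counts, alphabet = model
    padded = "^" + word.lower() + "$"
    total = 0.0
    for j in range(1, len(padded)):
        context = padded[max(0, j - a):j]
        numerator = ngram_counts[(context, padded[j])] + 1
        denominator = context_counts[context] + len(alphabet)
        total += math.log(numerator / denominator)
    return total

model = train_letter_ngram(["crutchfield", "cunningham", "bush"], a=3)
print(log_prob("bush", model) > log_prob("bhsu", model))
```

Working in log probabilities avoids numerical underflow when many factors are multiplied, which is a common design choice for such products.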
Next, we approximate the translation model \(P(\mathit{ku}_0^{n(p_d)+1} \mid \mathit{eu}_0^{n(p_d)+1})\) and the chunking model \(P(\mathit{eu}_0^{n(p_d)+1} \mid E)\). For this, we use our previously proposed approximation technique (Goto et al., 2003). The outline of the technique is shown as follows.
\(P(\mathit{ku}_0^{n(p_d)+1} \mid \mathit{eu}_0^{n(p_d)+1})\) is approximated by reducing the condition.
\[
\begin{aligned}
P(\mathit{ku}_0^{n(p_d)+1} \mid \mathit{eu}_0^{n(p_d)+1})
&= \prod_{i=1}^{n(p_d)+1} P(\mathit{ku}_i \mid \mathit{ku}_0^{i-1}, \mathit{eu}_0^{n(p_d)+1}) \\
&\approx \prod_{i=1}^{n(p_d)+1} P(\mathit{ku}_i \mid e_{start(i)-b}^{start(i)-1}, \mathit{eu}_i, e_{end(i)+1})
\end{aligned}
\tag{2.12}
\]
where start(i) is the first position of the i-th character unit \(\mathit{eu}_i\), while end(i) is the last position of the i-th character unit \(\mathit{eu}_i\); and b is a constant. Equation 2.12 takes the English context \(e_{start(i)-b}^{start(i)-1}\) and \(e_{end(i)+1}\) into account.
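To make the conditioning context of Equation 2.12 concrete, the sketch below extracts, for each English unit, the b preceding letters, the unit itself, and the one following letter. It only shows how the context is read off a segmented word; it does not implement the probability model. The function name `translation_contexts` is hypothetical, and truncating the left context at "^" is an assumption about how word boundaries are handled.

```python
def translation_contexts(eu_units, b=3):
    """Return (left context, eu_i, next letter) for each English unit eu_i.

    Positions follow the paper's convention: position 0 is "^", the letters
    of the word occupy positions 1..l, and position l+1 is "$".
    """
    word = "^" + "".join(eu_units) + "$"
    contexts = []
    start = 1                                  # start(1): first letter of eu_1
    for eu in eu_units:
        end = start + len(eu) - 1              # end(i): last letter of eu_i
        left = word[max(0, start - b):start]   # e_{start(i)-b} .. e_{start(i)-1}
        right = word[end + 1]                  # e_{end(i)+1}
        contexts.append((left, eu, right))
        start = end + 1
    return contexts

for ctx in translation_contexts(["C", "ru", "tch", "fie", "l", "d"], b=3):
    print(ctx)
```

For the unit "ru" in this example the context is the left string "^C" and the following letter "t", so the same unit in a different word would be scored differently.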
Next, the chunking model \(P(\mathit{eu}_0^{n(p_d)+1} \mid E)\) is transformed. Every chunking pattern of \(E = e_0^{l(p_d)+1}\) into \(\mathit{eu}_0^{n(p_d)+1}\) is specified by whether each of the \(l(p_d)+1\) points between the \(l(p_d)+2\) characters serves as a delimiter. \(\mathit{eu}_0\) and \(\mathit{eu}_{n(p_d)+1}\) are determined in advance, so \(l(p_d)-1\) points remain ambiguous. We represent whether the point between \(e_j\) and \(e_{j+1}\) is a delimiter or a non-delimiter by \(z_j\). We call \(z_j\) the delimiter distinction.
\[
z_j = \begin{cases} \text{delimiter} \\ \text{non-delimiter} \end{cases}
\tag{2.13}
\]
Here, we show an example of English units using \(z_j\).

Letters (e_1 .. e_11):        C  r  u  t  c  h  f  i  e  l  d
English units:                C / ru / tch / fie / l / d
Values of z_j (z_1 .. z_10):  1  0  1  0  0  1  0  0  1  1

In this example, a delimiter of \(z_j\) is represented by 1 and a non-delimiter is represented by 0.
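The correspondence between a chunking and its delimiter distinctions can be checked with a short sketch. The function names `delimiter_flags` and `units_from_flags` are hypothetical; the code only converts between the two representations and does not model their probability.

```python
def delimiter_flags(eu_units):
    """Return z_1 .. z_{l-1}: 1 if a unit boundary follows the letter, else 0."""
    flags = []
    for unit in eu_units[:-1]:
        flags.extend([0] * (len(unit) - 1))  # points inside the unit
        flags.append(1)                      # boundary after the unit
    flags.extend([0] * (len(eu_units[-1]) - 1))
    return flags

def units_from_flags(word, flags):
    """Rebuild the English units from the word and its z values."""
    units, current = [], word[0]
    for letter, flag in zip(word[1:], flags):
        if flag:
            units.append(current)
            current = letter
        else:
            current += letter
    units.append(current)
    return units

units = ["C", "ru", "tch", "fie", "l", "d"]
z = delimiter_flags(units)
print(z)
print(units_from_flags("Crutchfield", z))
```

Because the two functions are inverses, scoring a sequence of \(z_j\) values is equivalent to scoring a chunking of the word.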
The chunking model is transformed into a per-character process by using \(z_j\), and we then reduce the condition.
\[
\begin{aligned}
P(\mathit{eu}_0^{n(p_d)+1} \mid E)
&= P(z_0^{l(p_d)-1} \mid e_0^{l(p_d)+1}) \\
&= \prod_{j=1}^{l(p_d)-1} P(z_j \mid z_0^{j-1}, e_0^{l(p_d)+1}) \\
&\approx \prod_{j=1}^{l(p_d)-1} P(z_j \mid z_{j-c-1}^{j-1}, e_{j-c}^{j+1})
\end{aligned}
\tag{2.14}
\]
The conditional information of the English \(e_{j-c}^{j+1}\) is as many as c characters and 1 character before and after \(z_j\), respectively. The conditional information of \(z_{j-c-1}^{j-1}\) is as many as c+1 delimiter distinctions before \(z_j\).
By using equations 2.11, 2.12, and 2.14, equation 2.10 becomes as follows:
\[
\begin{aligned}
\hat{E} \approx \arg\max_{E}\; & \prod_{j=1}^{l(p_d)+1} P(e_j \mid e_{j-a}^{j-1}) \\
\times & \prod_{i=1}^{n(p_d)} P(\mathit{ku}_i \mid e_{start(i)-b}^{start(i)-1}, \mathit{eu}_i, e_{end(i)+1}) \\
\times & \prod_{j=1}^{l(p_d)-1} P(z_j \mid z_{j-c-1}^{j-1}, e_{j-c}^{j+1}).
\end{aligned}
\tag{2.15}
\]
Equation 2.15 is the equation of our back transliteration method.
2.4 Beam search solution for context 
sensitive grammar 
Equation 2.15 includes a context-sensitive grammar. As such, it cannot be evaluated efficiently. In decoding from the head of a word to the tail, \(e_{end(i)+1}\) in equation 2.15 becomes context-sensitive. Thus we try to get approximate results by using a beam search solution. To get the results, we use dynamic programming. Every eu node in the lattice keeps the N-best results evaluated by using the letter \(e_{end(i)+1}\) that gives the maximum probability among the possible next letters. When the results of the next node are evaluated for selecting the N-best, the accurate probabilities from the previous nodes are used.
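The idea of ranking with an optimistic next letter and then replacing it with the accurate value can be sketched as follows. This is a simplified illustration under several assumptions, not the authors' decoder: it keeps one global beam over a single fixed sequence of katakana units instead of N-best lists at every lattice node, and the scoring function `unit_logp(left, ku, eu, next_letter)` is a hypothetical stand-in into which the language, translation, and chunking scores would be folded. The toy scores in `toy_logp` are invented for the example.

```python
import math

def beam_search(ku_seq, table, unit_logp, beam_width=50):
    """Return (log score, word) pairs, best first, for a fixed unit sequence."""
    beam = [(0.0, ())]            # (exact score of settled units, eu's so far)
    scored = [(0.0, 0.0, ())]
    for i, ku in enumerate(ku_seq):
        # Letters that may follow the current unit: first letters of the
        # candidates of the next unit, or "$" at the end of the word.
        if i + 1 < len(ku_seq):
            followers = {eu[0] for eu in table[ku_seq[i + 1]]}
        else:
            followers = {"$"}
        scored = []
        for settled, units in beam:
            for eu in table[ku]:
                new_settled = settled
                if units:
                    # The next letter of the previous unit is now known,
                    # so its accurate score replaces the optimistic one.
                    prev_left = "^" + "".join(units[:-1])
                    new_settled += unit_logp(prev_left, ku_seq[i - 1],
                                             units[-1], eu[0])
                left = "^" + "".join(units)
                optimistic = max(unit_logp(left, ku, eu, c) for c in followers)
                scored.append((new_settled + optimistic, new_settled,
                               units + (eu,)))
        scored.sort(key=lambda h: h[0], reverse=True)
        scored = scored[:beam_width]
        beam = [(settled, units) for _, settled, units in scored]
    # At the last unit the only follower is "$", so the ranking score is exact.
    return [(total, "".join(units)) for total, _, units in scored]

def toy_logp(left, ku, eu, next_letter):
    """Invented scores; "ru" is preferred after the letter "c"."""
    probs = {("ku", "c"): 0.6, ("ku", "k"): 0.4,
             ("ra", "ru"): 0.3, ("ra", "ra"): 0.7}
    p = probs[(ku, eu)]
    if eu == "ru" and left.endswith("c"):
        p = 0.9
    return math.log(p)

table = {"ku": ["c", "k"], "ra": ["ru", "ra"]}
print(beam_search(["ku", "ra"], table, toy_logp, beam_width=10)[0][1])
```

A narrower beam trades accuracy for speed, since a hypothesis pruned early cannot be recovered later even if its accurate score would have been the best.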
2.5 Learning probability models based 
on the maximum entropy method 
The probability models are learned based on the 
maximum entropy method. This makes it possi-
ble to prevent data sparseness relating to the 
model as well as to efficiently utilize many con-
ditions, such as context, simultaneously. We use 
the Gaussian Prior (Chen and Rosenfeld, 1999) 
smoothing method for the language model. We 
use one Gaussian variance. We use the value of 
the Gaussian variance that minimizes the test 
set's perplexity.  
The feature functions of the models based on 
the maximum entropy method are defined as 
combinations of letters. In addition, we use 
vowel, consonant, and semi-vowel classes for the 
translation model. We manually define the combinations of the letter positions such as \(e_j\) and \(e_{j-1}\).
The feature functions consist of the letter combi-
nations that meet the combinations of the letter 
positions and are observed at least once in the 
learning data.  
2.6 Corpus for learning 
A Japanese-English word list aligned by unit was used for learning the translation model and the chunking model and for generating the lattice of conversion candidates. The alignment was done semi-automatically. A romanized katakana character usually corresponds to one or several English letters or strings. For example, a romanized katakana character "k" usually corresponds to an English letter "c," "k," "ch," or "q." With such heuristic rules, the Japanese-English word corpus was aligned by unit, and the alignment errors were corrected manually.
3 Experiment 
3.1 Learning data and test data 
We conducted an experiment on back translitera-
tion using English personal names. The learning 
data used in the experiment are described below. 
The Dictionary of Western Names of 80,000 People² was used as the source of the Japanese-English word corpus. We chose the names written in the letters A to Z and their corresponding katakana. The number of distinct words was 39,830 for English words and 39,562 for katakana words. The number of English-katakana pairs was 83,057³. We related the alphabet and katakana character units in those words by using the method described in section 2.6. We then used the corpus to make the translation and the chunking models and to generate a lattice of conversion candidates.
The learning of the language model was carried out using a word list that was created by merging two word lists: an American personal-name list⁴ and the English head words of the Dictionary of Western Names of 80,000 People. The American name list contains frequency information for each name; we also used the frequency data for the learning of the language model. A test set for evaluating the value of the Gaussian variance was created using the American name list. The list was split 9:1, and we used the larger part for learning and the smaller part for evaluating the parameter value.

² Published by Nichigai Associates in Japan in 1994.
³ This corpus includes many identical English-katakana word pairs.

The test data contained 333 katakana name words of American Cabinet officials and other high-ranking officials, as well as high-ranking governmental officials of Canada, the United Kingdom, Australia, and New Zealand (listed in the World Yearbook 2002 published by Kyodo News in Japan). The English name words that were listed along with the corresponding katakana names were used as answer words. Words that included characters other than the letters A to Z were excluded from the test data. Family names and first names were not distinguished.
3.2 Experimental models 
We used the following methods to test the indi-
vidual effects of each factor of our method.  
• Method A 
Used a model that did not take English context 
into account. The plausibility is expressed as fol-
lows: 
\[
\hat{E} = \arg\max_{E} \prod_{i=1}^{n(p_d)} P(\mathit{eu}_i \mid \mathit{ku}_i).
\tag{3.1}
\]
• Method B 
Used our language model and a translation model 
that did not consider English context. The con-
stant a = 3 in the language model. The plausibil-
ity is expressed as follows: 
\[
\hat{E} = \arg\max_{E} \prod_{j=1}^{l(p_d)+1} P(e_j \mid e_{j-3}^{j-1}) \prod_{i=1}^{n(p_d)} P(\mathit{ku}_i \mid \mathit{eu}_i).
\tag{3.2}
\]
• Method C 
Applied our chunking model to method B, with c 
= 3 in the chunking model. The plausibility is 
expressed as follows:
\[
\begin{aligned}
\hat{E} = \arg\max_{E}\; & \prod_{j=1}^{l(p_d)+1} P(e_j \mid e_{j-3}^{j-1}) \prod_{i=1}^{n(p_d)} P(\mathit{ku}_i \mid \mathit{eu}_i) \\
\times & \prod_{j=1}^{l(p_d)-1} P(z_j \mid z_{j-4}^{j-1}, e_{j-3}^{j+1}).
\end{aligned}
\tag{3.3}
\]

⁴ Prepared from the 1990 Census conducted by the U.S. Department of Commerce. Available at http://www.census.gov/genealogy/names/ . The list includes 91,910 distinct words.

• Method D 
Used our translation model that considered Eng-
lish context, but not the chunking model. b = 3 in 
the translation model. The plausibility is ex-
pressed as follows: 
\[
\begin{aligned}
\hat{E} = \arg\max_{E}\; & \prod_{j=1}^{l(p_d)+1} P(e_j \mid e_{j-3}^{j-1}) \\
\times & \prod_{i=1}^{n(p_d)} P(\mathit{ku}_i \mid e_{start(i)-3}^{start(i)-1}, \mathit{eu}_i, e_{end(i)+1}).
\end{aligned}
\tag{3.4}
\]
• Method E 
Used our language model, translation model, and 
chunking model. The plausibility is expressed as 
follows: 
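\[
\begin{aligned}
\hat{E} = \arg\max_{E}\; & \prod_{j=1}^{l(p_d)+1} P(e_j \mid e_{j-3}^{j-1}) \\
\times & \prod_{i=1}^{n(p_d)} P(\mathit{ku}_i \mid e_{start(i)-3}^{start(i)-1}, \mathit{eu}_i, e_{end(i)+1}) \\
\times & \prod_{j=1}^{l(p_d)-1} P(z_j \mid z_{j-4}^{j-1}, e_{j-3}^{j+1}).
\end{aligned}
\tag{3.5}
\]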
3.3 Results 
Table 1 shows the results of the experiment⁵ on back transliteration from Japanese katakana to
English. The conversion was determined to be 
successful if the generated English word agreed 
perfectly with the English word in the test data. 
Table 2 shows examples of back transliterated 
words.  
Method     A      B      C      D      E
Top 1     23.7   57.4   61.6   63.1   66.4
Top 2     34.8   69.1   72.4   71.8   74.2
Top 3     42.9   73.6   76.6   75.4   79.3
Top 5     54.1   77.5   79.9   80.8   83.5
Top 10    63.4   82.0   85.3   86.5   87.7
Table 1: Ratio (%) of including the answer word in high-ranking words.

⁵ For methods D and E, we used N=50 for the beam search solution. In addition, we kept paths that represented parts of words existing in the learning data.

Japanese katakana (romanized katakana)        Created English
アシュクロフト (a shu ku ro fu to)            Ashcroft
キルシュシュタイン (ki ru shu shu ta i n)     Kirschstein
スペンサー (su pe n sa -)                     Spencer
パウエル (pa u e ru)                          Powell
プリンシピ (pu ri n shi pi)                   Principi
Table 2: Example of English words produced.
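The measure reported in Table 1 can be computed as in the sketch below: a test word counts as a success at rank n if the answer word agrees exactly with one of the first n generated candidates. The function name `top_n_ratio` is hypothetical, and the ranked candidate lists are invented for illustration; they are not outputs of the experiment.

```python
def top_n_ratio(ranked_candidates, answers, n):
    """Percentage of test words whose answer is among the first n candidates."""
    hits = sum(
        1
        for candidates, answer in zip(ranked_candidates, answers)
        if answer in candidates[:n]
    )
    return 100.0 * hits / len(answers)

# Invented candidate lists, best candidate first.
ranked = [
    ["Ashcroft", "Ashcraft"],
    ["Powel", "Powell"],
    ["Spenser", "Spensar"],
]
answers = ["Ashcroft", "Powell", "Spencer"]

for n in (1, 2):
    print(n, round(top_n_ratio(ranked, answers, n), 1))
```

Because the match is exact, a candidate that differs from the answer by a single letter contributes nothing to the ratio.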
4 Discussion 
The correct-match ratio of method E for the first-ranked words was 66.4%. Its correct-match ratio for words up to the 10th rank was 87.7%.
Regarding the top 1 ranked words, method B, which used a language model, increased the ratio by 33.7 points over method A, which did not use a language model. This demonstrates the effectiveness of the language model.
Also for the top 1 ranked words, method C, which adopted the chunking model, increased the ratio by 4.2 points over method B, which did not adopt the chunking model. This indicates the effectiveness of the chunking model.
Method D, which used a translation model taking English context into account, had a ratio 5.7 points higher in the top 1 ranked words than method B, which used a translation model not taking English context into account. This demonstrates the effectiveness of the translation model that takes English context into account.
Method E gave the best ratio. Its ratio for the top 1 ranked words was 42.7 points higher than method A's.
These results demonstrate the effectiveness of using English context for back transliteration.
5 Conclusion 
This paper described a method for Japanese to 
English back transliteration. Unlike conventional 
methods, our method uses a target English con-
text to calculate the plausibility of matching be-
tween English and katakana. Our method can 
treat English words that do not exist in learning 
data. We confirmed the effectiveness of our 
method in an experiment using personal names. 
We will apply this technique to cross-language 
information retrieval.  

References  

Hsin-Hsi Chen, Sheng-Jie Huang, Yung-Wei Ding, 
and Shih-Chung Tsai. 1998. Proper Name Transla-
tion in Cross-Language Information Retrieval. 36th 
Annual Meeting of the Association for Computa-
tional Linguistics and 17th International Conference 
on Computational Linguistics, pp.232-236. 

Stanley F. Chen and Ronald Rosenfeld. 1999. A Gaussian Prior for Smoothing Maximum Entropy Models. Technical Report CMU-CS-99-108, Carnegie Mellon University.

Bonnie Glover Stalls and Kevin Knight. 1998. Trans-
lating Names and Technical Terms in Arabic Text. 
COLING/ACL Workshop on Computational Ap-
proaches to Semitic Languages. 

Isao Goto, Naoto Kato, Noriyoshi Uratani, and Teru-
masa Ehara. 2003. Transliteration Considering 
Context Information based on the Maximum En-
tropy Method. Machine Translation Summit IX, 
pp.125-132. 

Kil Soon Jeong, Sung Hyun Myaeng, Jae Sung Lee, 
and Key-Sun Choi. 1999. Automatic Identification 
and Back-Transliteration of Foreign Words for In-
formation Retrieval. Information Processing and 
Management, Vol.35, No.4, pp.523-540. 

Byung-Ju Kang and Key-Sun Choi. 2000. Automatic 
Transliteration and Back-Transliteration by Deci-
sion Tree Learning. International Conference on 
Language Resources and Evaluation. pp.1135-1411. 

Kevin Knight and Jonathan Graehl. 1998. Machine 
Transliteration. Computational Linguistics, Vol.24, 
No.4, pp.599-612. 

Wei-Hao Lin and Hsin-Hsi Chen. 2002. Backward 
Machine Transliteration by Learning Phonetic 
Similarity. 6th Conference on Natural Language 
Learning, pp.139-145.  

Atsushi Fujii and Tetsuya Ishikawa. 2001. Japa-
nese/English Cross-Language Information Re-
trieval: Exploration of Query Translation and 
Transliteration. Computers and the Humanities, 
Vol.35, No.4, pp.389-420. 
