A Comparative Study on Compositional Translation Estimation
using a Domain/Topic-Specific Corpus collected from the Web
Masatsugu Tonoike†, Mitsuhiro Kida†, Toshihiro Takagi†, Yasuhiro Sasaki†,
Takehito Utsuro††, Satoshi Sato†††
†Graduate School of Informatics, Kyoto University
Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan
††Graduate School of Systems and Information Engineering, University of Tsukuba
1-1-1, Tennodai, Tsukuba, 305-8573, Japan
†††Graduate School of Engineering, Nagoya University
Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan
Abstract
This paper studies issues related to the
compilation of a bilingual lexicon for tech-
nical terms. In the task of estimating bilin-
gual term correspondences of technical
terms, it is usually rather difficult to find
an existing corpus for the domain of such
technical terms. In this paper, we adopt
an approach of collecting a corpus for the
domain of such technical terms from the
Web. As a method of translation esti-
mation for technical terms, we employ a
compositional translation estimation tech-
nique. This paper focuses on quantita-
tively comparing variations of the compo-
nents in the scoring functions of composi-
tional translation estimation. Through ex-
perimental evaluation, we show that the
domain/topic-specific corpus contributes
toward improving the performance of the
compositional translation estimation.
1 Introduction
This paper studies issues related to the compilation
of a bilingual lexicon for technical terms. Thus
far, several techniques of estimating bilingual term
correspondences from a parallel/comparable cor-
pus have been studied (Matsumoto and Utsuro,
2000). For example, in the case of estimation from
comparable corpora, (Fung and Yee, 1998; Rapp,
1999) proposed standard techniques of estimating
bilingual term correspondences from comparable
corpora. In their techniques, contextual similarity
between a source language term and its translation
candidate is measured across the languages, and
all the translation candidates are re-ranked according
to their contextual similarities. However, only a
limited number of parallel/comparable corpora
are available for the purpose of estimating
bilingual term correspondences. Therefore, even
if one wants to apply those existing techniques to
the task of estimating bilingual term correspon-
dences of technical terms, it is usually rather dif-
ficult to find an existing corpus for the domain of
such technical terms.
On the other hand, compositional translation es-
timation techniques that use a monolingual corpus
(Fujii and Ishikawa, 2001; Tanaka and Baldwin,
2003) are more practical. This is because collecting a
monolingual corpus is less expensive than collect-
ing a parallel/comparable corpus. Translation can-
didates of a term can be compositionally generated
by concatenating the translation of the constituents
of the term. Here, the generated translation candi-
dates are validated using the domain/topic-specific
corpus.
In order to assess the applicability of the compositional
translation estimation technique, we randomly selected 667
Japanese-English technical term translation pairs from 10
domains of existing technical term bilingual lexicons. We then
manually examined their compositionality and found that 88% of
them are actually compositional, which is a very encouraging
result.
However, collecting a domain/topic-specific corpus is still
expensive. Here, we adopt an approach of using the Web, since
documents of various domains/topics are available on the Web.
Roughly speaking, there are two approaches to validating
translation candidates using the Web. In the first approach,
translation candidates are validated through the search engine
(Cao and Li, 2002). In the second approach, a domain/topic-specific
corpus is collected from the Web in advance and fixed before
translation estimation; the generated translation candidates are
then validated against the domain/topic-specific corpus (Tonoike
et al., 2005). The first approach is preferable in terms of
coverage, while the second is preferable in terms of computational
efficiency. This paper mainly focuses on quantitatively comparing
the two approaches in terms of coverage and precision of
compositional translation estimation.

[Figure 1: Compilation of a Domain/Topic-Specific Bilingual Lexicon
using the Web. Processes: collecting terms of a specific
domain/topic (language S), looking up the bilingual lexicon,
collecting a corpus (language T), estimating bilingual term
correspondences, validating translation candidates. Data: sample
terms, term set (language S), existing bilingual lexicon, $X_S^U$
(# of translations is one), $X_S^M$ (# of translations is more than
one), $Y_S$ (# of translations is zero), translation set $X_T^U$
(language T), domain/topic-specific corpus (language T), and the
compiled bilingual lexicon $X_{ST}^U$, $X_{ST}^M$, $Y_{ST}$.]
More specifically, in compositional translation
estimation, we decompose the scoring function
of a translation candidate into two components:
bilingual lexicon score and corpus score. In this
paper, we examine variants for those components
and define 9 types of scoring functions in total.
Regarding the above-mentioned two approaches to validating
translation candidates using the Web, the experimental results
show that the second approach outperforms the first when the
correct translation exists in the corpus. Furthermore, we examine
a method that combines two scoring functions based on their
agreement. The experimental results show that this method can
achieve precision much higher than that of any single scoring
function.
2 Overall framework
The overall framework of compiling a bilingual lexicon from the
Web is illustrated in Figure 1. Suppose that we have sample terms
of a specific domain/topic; then the technical terms that are to
be listed as the headwords of a bilingual lexicon are collected
from the Web by the related term collection method of (Sato and
Sasaki, 2003). These collected technical terms can be divided into
three subsets depending on the number of translation candidates
present in an existing bilingual lexicon: the subset $X_S^U$ of
terms for which the number of translations in the existing
bilingual lexicon is one, the subset $X_S^M$ of terms for which
the number of translations is more than one, and the subset $Y_S$
of terms that are not found in the existing bilingual lexicon
(henceforth, the union $X_S^U \cup X_S^M$ will be denoted as
$X_S$). The translation estimation task here is to estimate
translations for the terms of the subsets $X_S^M$ and $Y_S$. A new
bilingual lexicon is compiled from the result of the translation
estimation for the terms of the subsets $X_S^M$ and $Y_S$ as well
as the translation pairs that consist of the terms of the subset
$X_S^U$ and their translations found in the existing bilingual
lexicon.
For the terms of the subset $X_S^M$, an appropriate translation
must be selected from among the translation candidates found in
the existing bilingual lexicon. For example, as a translation of
the Japanese technical term “レジスタ,” which belongs to the logic
circuit domain, the term “register” should be selected but not the
term “regista” of the football domain. On the other hand, for the
terms of $Y_S$, translation candidates must be generated and
validated. In this paper, out of the above two tasks, we focus on
the latter: translation candidate generation and validation using
the Web. As introduced in the previous section, we experimentally
compare the two approaches to validating translation candidates.
The first approach directly uses the search engine, while the
second uses the domain/topic-specific corpus, which is collected
in advance from the Web. In the second approach, we use the terms
of $X_S^U$, each of which has only one translation in the existing
bilingual lexicon. The set of translations of the terms of the
subset $X_S^U$ is denoted as $X_T^U$. The domain/topic-specific
corpus is then collected from the Web using the terms of the set
$X_T^U$.
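The three-way split of the collected terms can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the dictionary-of-sets representation of the existing bilingual lexicon, and the toy entries are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def partition_terms(
    terms: List[str], lexicon: Dict[str, Set[str]]
) -> Tuple[List[str], List[str], List[str]]:
    """Split terms into (X_S^U, X_S^M, Y_S) by their number of lexicon translations."""
    unique, multiple, unknown = [], [], []
    for term in terms:
        n_translations = len(lexicon.get(term, set()))
        if n_translations == 1:
            unique.append(term)      # X_S^U: exactly one translation
        elif n_translations > 1:
            multiple.append(term)    # X_S^M: more than one translation
        else:
            unknown.append(term)     # Y_S: not found in the lexicon
    return unique, multiple, unknown


def seed_translations(unique: List[str], lexicon: Dict[str, Set[str]]) -> List[str]:
    """Build X_T^U: the single translation of every term in X_S^U."""
    return [next(iter(lexicon[term])) for term in unique]


# Hypothetical toy lexicon; the entries are illustrative only.
toy_lexicon = {
    "論理回路": {"logic circuit"},
    "レジスタ": {"register", "regista"},
}
x_u, x_m, y = partition_terms(["論理回路", "レジスタ", "応用行動分析"], toy_lexicon)
x_t_u = seed_translations(x_u, toy_lexicon)
```

In this sketch, `x_t_u` plays the role of the seed set used to collect the domain/topic-specific corpus, while `x_m` and `y` are the terms whose translations remain to be estimated.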
3 Compositional Translation Estimation
for Technical Terms
3.1 Overview
An example of compositional translation estimation for the
Japanese technical term “応用行動分析” is illustrated in Figure 2.
First, the Japanese technical term “応用行動分析” is decomposed
into its constituents by consulting an existing bilingual lexicon
and retrieving Japanese headwords.¹ In this case, the result of
this decomposition can be given as in the cases “a” and “b” (in
Figure 2). Then, each constituent is translated into the target
language. A confidence score is assigned to the translation of
each constituent. Finally, translation candidates are generated by
concatenating the translations of those constituents according to
word ordering rules that take prepositional phrase construction
into account.

[Figure 2: Compositional Translation Estimation for the Japanese
Technical Term “応用行動分析”
  Process: (1) decompose source term into constituents;
           (2) translate constituents into target language;
           (3) compositional generation of translation candidates.
  Decomposition a: 応用 / 行動 / 分析
    応用: application (1), practical (0.3), applied (1.6)
    行動: action (1), activity (1), behavior (1)
    分析: analysis (1), diagnosis (1), assay (0.3)
  Decomposition b: 応用 / 行動分析
    応用: application (1), practical (0.3), applied (1.6)
    行動分析: behavior analysis (10)
  Generated translation candidates:
    applied behavior analysis (17.6) = (1.6 × 1 × 1) + (1.6 × 10)
    application behavior analysis (11)
    applied behavior diagnosis (1)]
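The decomposition and generation steps can be sketched as follows. This is a simplified illustration under stated assumptions: the lexicon is a hypothetical dictionary keyed by tuples of source words, the input term is assumed to be already segmented into words, and candidates are formed by plain left-to-right concatenation, so the word ordering rules for prepositional phrases mentioned above are not modeled.

```python
from itertools import product
from typing import Dict, Iterator, List, Set, Tuple

Lexicon = Dict[Tuple[str, ...], List[str]]


def decompositions(words: Tuple[str, ...], lexicon: Lexicon) -> Iterator[List[Tuple[str, ...]]]:
    """Yield every split of `words` into consecutive constituents that are lexicon headwords."""
    if not words:
        yield []
        return
    for end in range(1, len(words) + 1):
        head = words[:end]
        if head in lexicon:
            for rest in decompositions(words[end:], lexicon):
                yield [head] + rest


def generate_candidates(words: Tuple[str, ...], lexicon: Lexicon) -> Set[str]:
    """Concatenate one translation per constituent, for every decomposition."""
    candidates: Set[str] = set()
    for segmentation in decompositions(words, lexicon):
        for choice in product(*(lexicon[c] for c in segmentation)):
            candidates.add(" ".join(choice))
    return candidates


# Toy lexicon mirroring the constituents shown in Figure 2.
toy_lexicon: Lexicon = {
    ("応用",): ["application", "practical", "applied"],
    ("行動",): ["action", "activity", "behavior"],
    ("分析",): ["analysis", "diagnosis", "assay"],
    ("行動", "分析"): ["behavior analysis"],
}
term = ("応用", "行動", "分析")
candidates = generate_candidates(term, toy_lexicon)
```

With this toy lexicon the term has two decompositions, corresponding to cases “a” and “b”, and the same candidate string can be reached through both of them.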
3.2 Collecting a Domain/Topic-Specific
Corpus
When collecting a domain/topic-specific corpus of the language T,
for each technical term $x_T^U$ in the set $X_T^U$, we collect the
top 100 pages obtained from search engine queries that include the
term $x_T^U$. Our search engine queries are designed such that
documents that describe the technical term $x_T^U$ are ranked
high. For example, an online glossary is one such document. When
collecting a Japanese corpus, the search engine “goo”² is used.
The specific queries that are used in this search engine are
phrases with topic-marking postpositional particles such as
“$x_T^U$とは,” “$x_T^U$という,” and “$x_T^U$は,” an adnominal
phrase “$x_T^U$の,” and the bare term “$x_T^U$.”
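The query design can be sketched as below. Everything here is illustrative: the particle strings are an assumption about the intended Japanese forms of the query patterns, `search` is a hypothetical callable standing in for a search engine client (no real API is implied), and applying the 100-page limit per query rather than per term is also an assumption.

```python
from typing import Callable, Iterable, List, Set

# Assumed Japanese forms of the query patterns: topic-marking particles,
# the adnominal particle, and the bare term (empty suffix).
QUERY_SUFFIXES = ["とは", "という", "は", "の", ""]
TOP_K = 100  # pages kept per query (assumption: the limit applies per query)


def build_queries(term: str) -> List[str]:
    """Return the query phrases for one target-language term."""
    return [term + suffix for suffix in QUERY_SUFFIXES]


def collect_page_urls(
    terms: Iterable[str], search: Callable[[str, int], List[str]]
) -> Set[str]:
    """Gather the URLs returned for every query of every seed term.

    `search(query, k)` is a hypothetical function returning up to k result URLs.
    """
    urls: Set[str] = set()
    for term in terms:
        for query in build_queries(term):
            urls.update(search(query, TOP_K))
    return urls
```

Passing the search function as a parameter keeps the sketch independent of any particular engine and makes it easy to test with a stub.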
3.3 Translation Estimation
3.3.1 Compiling Bilingual Constituents
Lexicons
This section describes how to compile bilingual constituents
lexicons from the translation pairs of the existing bilingual
lexicon Eijiro. The underlying idea of augmenting the existing
bilingual lexicon with bilingual constituents lexicons is
illustrated in Figure 3. Suppose that the existing bilingual
lexicon does not include the translation pair “applied : 応用,”
while it includes many compound translation pairs with the first
English word “applied” and the first Japanese word “応用.”³ In such
a case, we align those translation pairs and estimate a bilingual
constituent translation pair, which is then collected into a
bilingual constituents lexicon.

[Figure 3: Example of Estimating Bilingual Constituents
Translation Pair (Prefix)
    applied mathematics : 応用数学
    applied science : 応用科学
    applied robot : 応用ロボット
    ...
    ⇓ (frequency)
    applied : 応用 : 40]

¹ Here, as an existing bilingual lexicon, we use Eijiro
(http://www.alc.co.jp/) and bilingual constituents lexicons
compiled from the translation pairs of Eijiro (details to be
described in the next section).
² http://www.goo.ne.jp/
More specifically, from the existing bilingual lexicon, we first
collect translation pairs whose English terms and Japanese terms
both consist of two constituents into another lexicon $P_2$. We
compile the “bilingual constituents lexicon (prefix)” from the
first constituents of the translation pairs in $P_2$ and compile
the “bilingual constituents lexicon (suffix)” from their second
constituents. The numbers of entries in each language and of
translation pairs in these lexicons are shown in Table 1.
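The compilation of the prefix and suffix lexicons from $P_2$ amounts to counting aligned first and second constituents. The sketch below illustrates this under the assumption that each pair in $P_2$ is already segmented into exactly two constituents per language; the function name and the tuple layout are hypothetical, and the three toy pairs are the ones shown in Figure 3.

```python
from collections import Counter
from typing import Iterable, Tuple

# ((english_first, english_second), (japanese_first, japanese_second))
TwoConstituentPair = Tuple[Tuple[str, str], Tuple[str, str]]


def compile_constituent_lexicons(
    p2: Iterable[TwoConstituentPair],
) -> Tuple[Counter, Counter]:
    """Count first-constituent pairs (prefix) and second-constituent pairs (suffix)."""
    prefix: Counter = Counter()
    suffix: Counter = Counter()
    for (en_first, en_second), (ja_first, ja_second) in p2:
        prefix[(en_first, ja_first)] += 1    # frequency as first constituent
        suffix[(en_second, ja_second)] += 1  # frequency as second constituent
    return prefix, suffix


toy_p2 = [
    (("applied", "mathematics"), ("応用", "数学")),
    (("applied", "science"), ("応用", "科学")),
    (("applied", "robot"), ("応用", "ロボット")),
]
b_prefix, b_suffix = compile_constituent_lexicons(toy_p2)
```

The resulting counts correspond to the frequencies that later enter the confidence scores of constituent translation pairs.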
Our assessment reveals that only 48% of the 667 translation pairs
mentioned in Section 1 can be compositionally generated by using
Eijiro alone, while the rate increases to 69% when using both
Eijiro and the “bilingual constituents lexicons.”⁴
Table 1: Numbers of Entries and Translation Pairs in the Lexicons
lexicon   # of entries              # of translation
          English      Japanese     pairs
Eijiro    1,292,117    1,228,750    1,671,230
$P_2$       217,861      186,823      235,979
$B_P$        37,090       34,048       95,568
$B_S$        20,315       19,345       62,419
$B$          48,000       42,796      147,848
Eijiro : existing bilingual lexicon
$P_2$ : entries of Eijiro with two constituents in both languages
$B_P$ : bilingual constituents lexicon (prefix)
$B_S$ : bilingual constituents lexicon (suffix)
$B$ : bilingual constituents lexicon (merged)

³ Japanese entries are supposed to be segmented into a sequence of
words by the morphological analyzer JUMAN
(http://www.kc.t.u-tokyo.ac.jp/nl-resource/juman.html).
⁴ In our rough estimation, the upper bound of this rate is
approximately 80%. An improvement from 69% to 80% could be
achieved by extending the bilingual constituents lexicons.

3.3.2 Score of Translation Candidates
This section gives the definition of the scores of a translation
candidate in compositional translation estimation.
First, let $y_s$ be a technical term whose translation is to be
estimated. We assume that $y_s$ is decomposed into its
constituents as below:

$$ y_s = s_1, s_2, \cdots, s_n \tag{1} $$

where each $s_i$ is a single word or a sequence of words.⁵ For
$y_s$, we denote a generated translation candidate as $y_t$:

$$ y_t = t_1, t_2, \cdots, t_n \tag{2} $$

where each $t_i$ is a translation of $s_i$. Then the translation
pair $\langle y_s, y_t \rangle$ is represented as follows.

$$ \langle y_s, y_t \rangle = \langle s_1, t_1 \rangle, \langle s_2, t_2 \rangle, \cdots, \langle s_n, t_n \rangle \tag{3} $$
The score of a generated translation candidate is defined as the
product of a bilingual lexicon score and a corpus score as
follows.

$$ Q(y_s, y_t) = Q_{\mathrm{dict}}(y_s, y_t) \cdot Q_{\mathrm{corpus}}(y_t) \tag{4} $$

The bilingual lexicon score measures the appropriateness of the
correspondence between $y_s$ and $y_t$. The corpus score measures
the appropriateness of the translation candidate $y_t$ based on
the target language corpus. If a translation candidate is
generated from more than one sequence of translation pairs, the
score of the translation candidate is defined as the sum of the
scores of the sequences.
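The sum over sequences can be checked against the numbers in Figure 2 with a short sketch. The function name and the list-of-lists input are hypothetical, and the corpus score is fixed to 1.0 here on the assumption that the values printed in Figure 2 reflect lexicon scores only.

```python
from math import prod
from typing import List


def total_score(sequence_scores: List[List[float]], q_corpus: float) -> float:
    """Sum, over all sequences yielding the same candidate, of
    (product of constituent confidence scores) * (corpus score)."""
    return sum(prod(scores) * q_corpus for scores in sequence_scores)


# "applied behavior analysis": decomposition a (1.6, 1, 1) and b (1.6, 10).
applied = total_score([[1.6, 1.0, 1.0], [1.6, 10.0]], q_corpus=1.0)
# "application behavior analysis": decomposition a (1, 1, 1) and b (1, 10).
application = total_score([[1.0, 1.0, 1.0], [1.0, 10.0]], q_corpus=1.0)
```

Under these assumptions the two values agree with the scores 17.6 and 11 shown for these candidates in Figure 2.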
⁵ Eijiro has both single word entries and compound word entries.

Bilingual Lexicon Score
In this paper, we compare two types of bilingual lexicon scores.
Both scores are defined as the product of scores of translation
pairs included in the lexicons presented in the previous section
as follows.
• Frequency-Length

$$ Q_{\mathrm{dict}}(y_s, y_t) = \prod_{i=1}^{n} q(\langle s_i, t_i \rangle) \tag{5} $$

The first type of bilingual lexicon scores is referred to as
“Frequency-Length.” This score is based on the length of
translation pairs and the frequencies of translation pairs in the
bilingual constituent lexicons (prefix, suffix) $B_P$, $B_S$ in
Table 1. In this paper, we first assume that the translation pairs
follow certain preference rules and that they can be ordered as
below:
1. Translation pairs $\langle s, t \rangle$ in the existing
bilingual lexicon Eijiro, where the term $s$ consists of two or
more constituents.
2. Translation pairs in the bilingual constituents lexicons whose
frequencies in $P_2$ are high.
3. Translation pairs $\langle s, t \rangle$ in the existing
bilingual lexicon Eijiro, where the term $s$ consists of exactly
one constituent.
4. Translation pairs in the bilingual constituents lexicons whose
frequencies in $P_2$ are not high.
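As the definition of the confidence score $q(\langle s, t \rangle)$
of a translation pair $\langle s, t \rangle$, we use the
following:

$$ q(\langle s, t \rangle) = \begin{cases} 10^{(\mathrm{compo}(s)-1)} & (\langle s, t \rangle \text{ in Eijiro}) \\ \log_{10} f_p(\langle s, t \rangle) & (\langle s, t \rangle \text{ in } B_P) \\ \log_{10} f_s(\langle s, t \rangle) & (\langle s, t \rangle \text{ in } B_S) \end{cases} \tag{6} $$

where $\mathrm{compo}(s)$ denotes the word count of $s$,
$f_p(\langle s, t \rangle)$ represents the frequency of
$\langle s, t \rangle$ as the first constituent in $P_2$, and
$f_s(\langle s, t \rangle)$ represents the frequency of
$\langle s, t \rangle$ as the second constituent in $P_2$.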
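The confidence score of Eq. (6) can be written as a small function and compared with the values appearing in Figures 2 and 3. The function signature and the string labels for the lexicons are hypothetical choices made for this sketch.

```python
import math


def confidence_score(source_word_count: int, lexicon: str, frequency: int = 0) -> float:
    """Confidence score q of a translation pair, following Eq. (6).

    lexicon: "eijiro", "prefix" (B_P) or "suffix" (B_S); these labels are illustrative.
    frequency: frequency of the pair in P_2 (used only for B_P and B_S).
    """
    if lexicon == "eijiro":
        return 10.0 ** (source_word_count - 1)
    if lexicon in ("prefix", "suffix"):
        if frequency < 1:
            raise ValueError("frequency in P_2 must be at least 1")
        return math.log10(frequency)
    raise ValueError(f"unknown lexicon: {lexicon}")


two_word_eijiro = confidence_score(2, "eijiro")       # e.g. "behavior analysis"
one_word_eijiro = confidence_score(1, "eijiro")       # e.g. "application"
prefix_freq_40 = confidence_score(1, "prefix", 40)    # e.g. "applied" with frequency 40
```

These values are consistent with the scores 10, 1, and 1.6 shown in Figure 2. Note that a constituent pair with frequency 1 receives a score of 0 under this definition, which makes the product in Eq. (5) vanish for any sequence that contains it.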
• Probability

$$ Q_{\mathrm{dict}}(y_s, y_t) = \prod_{i=1}^{n} P(s_i | t_i) \tag{7} $$

The second type of bilingual lexicon scores is referred to as
“Probability.” This score is calculated as the product of the
conditional probabilities $P(s_i | t_i)$. $P(s|t)$ is calculated
using the bilingual lexicons in Table 1.

$$ P(s|t) = \frac{f_{\mathrm{prob}}(\langle s, t \rangle)}{\sum_{s_j} f_{\mathrm{prob}}(\langle s_j, t \rangle)} \tag{8} $$

$f_{\mathrm{prob}}(\langle s, t \rangle)$ denotes the frequency of
the translation pair $\langle s, t \rangle$ in the bilingual
lexicons as follows:

$$ f_{\mathrm{prob}}(\langle s, t \rangle) = \begin{cases} 10 & (\langle s, t \rangle \text{ in Eijiro}) \\ f_B(\langle s, t \rangle) & (\langle s, t \rangle \text{ in } B) \end{cases} \tag{9} $$

Note that the frequency of a translation pair in Eijiro is
regarded as 10,⁶ and $f_B(\langle s, t \rangle)$ denotes the
frequency of the translation pair $\langle s, t \rangle$ in the
bilingual constituent lexicon $B$.

Table 2: 9 Scoring Functions of Translation Candidates and their Components
score ID | bilingual lexicon score: freq-length, probability | corpus score: probability, frequency, occurrence | corpus: off-line, on-line (search engine)
A prune/final prune/final o
B prune/final prune/final o
C prune/final prune/final o
D prune/final prune o
E prune/final
F prune/final final prune o
G prune/final prune/final o
H prune/final final o
I prune/final final o
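The conditional probability of Eqs. (8) and (9) can be sketched as follows. The data layout (a set of Eijiro pairs and a frequency dictionary for $B$) is hypothetical, the toy entries are illustrative rather than taken from the lexicons, and treating a pair that appears in both lexicons as an Eijiro pair is an assumption, since the text does not state which case takes precedence.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]   # (source constituent s, target constituent t)
EIJIRO_FREQUENCY = 10    # constant assigned to every Eijiro pair in Eq. (9)


def f_prob(pair: Pair, eijiro: Set[Pair], b_freq: Dict[Pair, int]) -> int:
    """Frequency of a translation pair as defined in Eq. (9)."""
    # Assumption: a pair listed in both lexicons is treated as an Eijiro pair.
    if pair in eijiro:
        return EIJIRO_FREQUENCY
    return b_freq.get(pair, 0)


def conditional_probability(
    s: str, t: str, eijiro: Set[Pair], b_freq: Dict[Pair, int]
) -> float:
    """P(s|t) of Eq. (8): normalize over all source constituents paired with t."""
    sources = {src for (src, tgt) in eijiro | set(b_freq) if tgt == t}
    denominator = sum(f_prob((src, t), eijiro, b_freq) for src in sources)
    if denominator == 0:
        return 0.0
    return f_prob((s, t), eijiro, b_freq) / denominator


# Illustrative toy lexicons (not actual Eijiro or B entries).
toy_eijiro = {("応用", "application")}
toy_b = {("応用", "applied"): 40, ("適用", "applied"): 10}
p_applied = conditional_probability("応用", "applied", toy_eijiro, toy_b)
p_application = conditional_probability("応用", "application", toy_eijiro, toy_b)
```

In this toy setting, “applied” is shared by two source constituents with frequencies 40 and 10, so the probability mass is split 0.8 to 0.2 between them.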
Corpus Score
We evaluate three types of corpus scores as follows.
• Probability: the occurrence probability of $y_t$ estimated by
the following bi-gram model

$$ Q_{\mathrm{corpus}}(y_t) = P(t_1) \cdot \prod_{i=1}^{n-1} P(t_{i+1} | t_i) \tag{10} $$

• Frequency: the frequency of a translation candidate in a target
language corpus

$$ Q_{\mathrm{corpus}}(y_t) = \mathrm{freq}(y_t) \tag{11} $$

• Occurrence: whether a translation candidate occurs in a target
language corpus or not

$$ Q_{\mathrm{corpus}}(y_t) = \begin{cases} 1 & y_t \text{ occurs in a corpus} \\ 0 & y_t \text{ does not occur in a corpus} \end{cases} \tag{12} $$

⁶ It is necessary to empirically examine whether or not the
definition of the frequency of a translation pair in Eijiro is
appropriate.
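The three corpus scores can be sketched over a tokenized corpus as follows. The class name and the toy corpus are hypothetical, and the bi-gram probabilities use unsmoothed maximum-likelihood estimates, which is an assumption: the text does not specify how the model is estimated.

```python
from collections import Counter
from typing import List


class CorpusScorer:
    """Corpus scores over a tokenized target-language corpus (illustrative sketch)."""

    def __init__(self, sentences: List[List[str]]):
        self.sentences = sentences
        self.unigrams = Counter(w for s in sentences for w in s)
        self.bigrams = Counter(b for s in sentences for b in zip(s, s[1:]))
        self.total = sum(self.unigrams.values())

    def probability(self, candidate: List[str]) -> float:
        """Bi-gram probability P(t_1) * prod P(t_{i+1} | t_i), unsmoothed."""
        if not candidate or self.total == 0:
            return 0.0
        p = self.unigrams[candidate[0]] / self.total
        for prev, cur in zip(candidate, candidate[1:]):
            if self.unigrams[prev] == 0:
                return 0.0
            p *= self.bigrams[(prev, cur)] / self.unigrams[prev]
        return p

    def frequency(self, candidate: List[str]) -> int:
        """Number of times the candidate occurs as a contiguous word sequence."""
        n = len(candidate)
        if n == 0:
            return 0
        return sum(
            1
            for s in self.sentences
            for i in range(len(s) - n + 1)
            if s[i:i + n] == candidate
        )

    def occurrence(self, candidate: List[str]) -> int:
        """1 if the candidate occurs in the corpus, 0 otherwise."""
        return 1 if self.frequency(candidate) > 0 else 0


toy_corpus = [
    ["applied", "behavior", "analysis", "is", "useful"],
    ["behavior", "analysis", "of", "applied", "science"],
]
scorer = CorpusScorer(toy_corpus)
good = ["applied", "behavior", "analysis"]
bad = ["applied", "behavior", "diagnosis"]
```

On the toy corpus, the candidate `good` receives non-zero values from all three scores, while `bad` receives zero from all of them because its last bi-gram never occurs.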
Variation of the total scoring functions
As shown in Table 2, in this paper, we examine
the 9 combinations of the bilingual lexicon scores
and the corpus scores. In the table, ‘prune’ indi-
cates that the score is used for ranking and pruning
sub-sequences of generated translation candidates
in the course of generating translation candidates
using a dynamic programming algorithm. ‘Final’
indicates that the score is used for ranking the fi-
nal outputs of generating translation candidates.
In the column ‘corpus’, ‘off-line’ indicates that
a domain/topic-specific corpus is collected from
the Web in advance and then generated transla-
tion candidates are validated against this corpus.
‘On-line’ indicates that translation candidates are
directly validated through the search engine.
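The role of ‘prune’ can be illustrated with a small dynamic programming sketch. This is not the authors' algorithm, whose details are not given here: the chart layout, the `beam_width` parameter, and the `corpus_score` callable are assumptions, and the toy lexicon reuses the constituent scores shown in Figure 2.

```python
from typing import Callable, Dict, List, Tuple

ScoredLexicon = Dict[Tuple[str, ...], List[Tuple[str, float]]]


def generate_with_pruning(
    words: Tuple[str, ...],
    lexicon: ScoredLexicon,
    corpus_score: Callable[[str], float] = lambda candidate: 1.0,
    beam_width: int = 3,
) -> List[Tuple[str, float]]:
    """Return (candidate, lexicon score) pairs, ranked by lexicon score * corpus score."""
    # chart[i] keeps the best partial candidates that cover words[:i].
    chart: List[List[Tuple[str, float]]] = [[] for _ in range(len(words) + 1)]
    chart[0] = [("", 1.0)]
    for end in range(1, len(words) + 1):
        pool: Dict[str, float] = {}
        for start in range(end):
            for translation, q in lexicon.get(words[start:end], []):
                for partial, score in chart[start]:
                    candidate = (partial + " " + translation).strip()
                    # Sum over the constituent sequences that yield the same string.
                    pool[candidate] = pool.get(candidate, 0.0) + score * q
        ranked = sorted(
            pool.items(), key=lambda item: item[1] * corpus_score(item[0]), reverse=True
        )
        chart[end] = ranked[:beam_width]  # 'prune': keep only the top partial candidates
    return chart[len(words)]


toy_lexicon: ScoredLexicon = {
    ("応用",): [("application", 1.0), ("practical", 0.3), ("applied", 1.6)],
    ("行動",): [("action", 1.0), ("activity", 1.0), ("behavior", 1.0)],
    ("分析",): [("analysis", 1.0), ("diagnosis", 1.0), ("assay", 0.3)],
    ("行動", "分析"): [("behavior analysis", 10.0)],
}
term = ("応用", "行動", "分析")
narrow = generate_with_pruning(term, toy_lexicon, beam_width=3)
wide = generate_with_pruning(term, toy_lexicon, beam_width=30)
```

With a wide beam nothing is pruned and the scores match Figure 2 (17.6 and 11 for the top two candidates). With a narrow beam the top candidate is unchanged, but “application behavior analysis” scores 10 because one of its two sequences is pruned along the way, which shows how the pruning score can influence the final ranking.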
Roughly speaking, the scoring function ‘A’ cor-
responds to a variant of the model proposed by
(Fujii and Ishikawa, 2001). The scoring func-
tion ‘D’ is a variant of the model proposed by
(Tonoike et al., 2005) and ‘E’ corresponds to the
bilingual lexicon score of the scoring function ‘D’.
The scoring function ‘I’ is intended to evaluate the
approach proposed in (Cao and Li, 2002).
3.3.3 Combining Two Scoring Functions
based on their Agreement
In this section, we examine a method that combines two scoring
functions based on their agreement. The two scoring functions are
selected out of the 9 functions introduced in the previous
section. In this method, the confidence of the translation
candidates of a technical term is first measured by the two
scoring functions. Then, if the first-ranked translation
candidates of both scoring functions agree, this method outputs
the agreed translation candidate; otherwise, it outputs nothing.
The purpose of introducing this method is to prefer precision to
recall.
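The agreement-based combination reduces to a comparison of the two first-ranked candidates. In the minimal sketch below, the function name is hypothetical, and returning `None` when the two rankings disagree or when either ranking is empty is an assumption about how "no output" is represented.

```python
from typing import List, Optional


def combine_by_agreement(ranking_a: List[str], ranking_b: List[str]) -> Optional[str]:
    """Output the top candidate only if both scoring functions rank it first."""
    if ranking_a and ranking_b and ranking_a[0] == ranking_b[0]:
        return ranking_a[0]
    return None  # no output for this term


agreed = combine_by_agreement(
    ["applied behavior analysis", "application behavior analysis"],
    ["applied behavior analysis", "applied behavior diagnosis"],
)
disagreed = combine_by_agreement(
    ["applied behavior analysis"],
    ["application behavior analysis"],
)
```

Because terms on which the two functions disagree produce no output, the method answers fewer terms, which is what trades recall for precision.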
[Figure 4: Experimental Evaluation of Translation Estimation for
Technical Terms with/without the Domain/Topic-Specific Corpus
(taken from Figure 1). The diagram repeats the processes and data
of Figure 1.]
4 Experiments and Evaluation
4.1 Translation Pairs for Evaluation
In our experimental evaluation, within the framework of compiling
a bilingual lexicon for technical terms, we evaluate the
translation estimation portion that is indicated by the bold line
in Figure 4. In this paper, we simply omit the evaluation of the
process of collecting technical terms to be listed as the
headwords of a bilingual lexicon. In order to evaluate the
translation estimation portion, terms are randomly selected from
the 10 categories of existing Japanese-English technical term
dictionaries listed in Table 3, for each of the subsets $X_S^U$
and $Y_S$ (here, the terms of $Y_S$ that consist of only one word
or morpheme are excluded). As described in Section 2, the terms of
the set $X_T^U$ (the set of translations for the terms of the
subset $X_S^U$) are used for collecting a domain/topic-specific
corpus from the Web. As shown in Table 3, the size of the
collected corpora is 48MB on average. Translation estimation
evaluation is conducted for the subset $Y_S$. For each of the 10
categories, Table 3 shows the sizes of the subsets $X_S^U$ and
$Y_S$, and the rate of including correct translations within the
collected domain/topic-specific corpus for $Y_S$. In the
following, we show the evaluation results with the source language
S as English and the target language T as Japanese.
4.2 Evaluation of single scoring functions
This section gives the results of evaluating the single scoring
functions A ∼ I listed in Table 2.
Table 4 shows three types of experimental results. The column
‘the whole set $Y_S$’ shows the results against the whole set
$Y_S$. The column ‘generatable’ shows the results against the
translation pairs in $Y_S$ that can be generated through the
compositional translation estimation process. 69% of the terms in
‘the whole set $Y_S$’ belong to the set ‘generatable’. The column
‘gene.-exist’ shows the results against the source terms whose
correct translations exist in the corpus and can be generated
through the compositional translation estimation process. 50% of
the terms in ‘the whole set $Y_S$’ belong to the set
‘gene.-exist’. The column ‘top 1’ shows the correct rate of the
first-ranked translation candidate. The column ‘top 10’ shows the
rate of including the correct candidate within the top 10.
First, in order to evaluate the effectiveness of the approach of
validating translation candidates by using a target language
corpus, we compare the scoring functions ‘D’ and ‘E’. The
difference between them is whether or not they use a corpus score.
The results for the whole set $Y_S$ show that using a corpus score
improves the top-1 precision from 33.9% to 43.0%. This result
supports the effectiveness of the approach of validating
translation candidates using a target language corpus.
As can be seen from these results for the whole set $Y_S$, the
correct rate of the scoring function ‘I’, which directly uses the
web search engine in the calculation of its corpus score, is
higher than those of the other scoring functions, which use the
collected domain/topic-specific corpus. This is because, for the
whole set $Y_S$, the rate of including correct translations within
the collected domain/topic-specific corpus is 72% on average,
which is not very high. On the other hand, the results of the
column ‘gene.-exist’ show that if the correct translation does
exist in the corpus, most of the scoring functions other than ‘I’
can achieve precision higher than that of the scoring function
‘I’. This result supports the effectiveness of the approach of
collecting a domain/topic-specific corpus from the Web in advance
and then validating generated translation candidates against this
corpus.
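The ‘top 1’ and ‘top 10’ rates can be computed with a short function. The sketch below is illustrative: the function name, the dictionary layout, and the toy data are hypothetical, and the three column groups of Table 4 would correspond to restricting `gold` to the whole set, to the ‘generatable’ terms, or to the ‘gene.-exist’ terms.

```python
from typing import Dict, List, Set


def top_k_rate(ranked: Dict[str, List[str]], gold: Dict[str, Set[str]], k: int) -> float:
    """Fraction of evaluated terms whose correct translation is within the top k.

    ranked: source term -> ranked translation candidates (may lack some terms)
    gold:   source term -> set of correct translations (defines the evaluated set)
    """
    if not gold:
        return 0.0
    hits = sum(
        1
        for term, correct in gold.items()
        if any(candidate in correct for candidate in ranked.get(term, [])[:k])
    )
    return hits / len(gold)


# Hypothetical toy data: t4 has no generated candidates at all.
toy_gold = {"t1": {"a"}, "t2": {"b"}, "t3": {"c"}, "t4": {"d"}}
toy_ranked = {"t1": ["a", "x"], "t2": ["x", "b"], "t3": ["x", "y"]}
top1 = top_k_rate(toy_ranked, toy_gold, k=1)
top10 = top_k_rate(toy_ranked, toy_gold, k=10)
```

Terms without any generated candidate count as misses, which is why the rates over the whole set are lower than those over the ‘generatable’ subset.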
Table 3: Number of Translation Pairs for Evaluation (S=English)
dictionaries                          categories                       $|Y_S|$  $|X_S^U|$  corpus size  C(S)
McGraw-Hill                           Electromagnetics                    33       36         28MB       85%
McGraw-Hill                           Electrical engineering              45       34         21MB       71%
McGraw-Hill                           Optics                              31       42         37MB       65%
Iwanami                               Programming language                29       37         34MB       93%
Iwanami                               Programming                         29       29         33MB       97%
Dictionary of Computer                (Computer)                         100       91         67MB       51%
Dictionary of 250,000 medical terms   Anatomical Terms                   100       91         73MB       86%
Dictionary of 250,000 medical terms   Disease                            100       91         83MB       77%
Dictionary of 250,000 medical terms   Chemicals and Drugs                100       94         54MB       60%
Dictionary of 250,000 medical terms   Physical Science and Statistics    100       88         56MB       68%
Total                                                                    667      633        482MB       72%
McGraw-Hill : Dictionary of Scientific and Technical Terms
Iwanami : Encyclopedic Dictionary of Computer Science
C(S) : for $Y_S$, the rate of including correct translations within the collected domain/topic-specific corpus

Table 4: Result of Evaluating Single Scoring Functions
      the whole set $Y_S$     generatable            gene.-exist
      (667 terms ∼ 100%)      (458 terms ∼ 69%)      (333 terms ∼ 50%)
ID    top 1     top 10        top 1     top 10       top 1     top 10
A     43.8%     52.9%         63.8%     77.1%        82.0%     98.5%
B     42.9%     50.7%         62.4%     73.8%        83.8%     99.4%
C     43.0%     58.0%         62.7%     84.5%        75.1%     94.6%
D     43.0%     47.4%         62.7%     69.0%        85.9%     94.6%
E     33.9%     57.3%         49.3%     83.4%        51.1%     84.1%
F     40.2%     47.4%         58.5%     69.0%        80.2%     94.6%
G     39.1%     46.8%         57.0%     68.1%        78.1%     93.4%
H     43.8%     57.3%         63.8%     83.4%        73.6%     84.1%
I     49.8%     57.3%         72.5%     83.4%        74.8%     84.1%

4.3 Evaluation of combining two scoring
functions based on their agreement
The result of evaluating the method that combines two scoring
functions based on their agreement is shown in Table 5. This
result indicates that combinations of scoring functions with
‘off-line’/‘on-line’ corpora tend to achieve higher precision than
those with ‘off-line’/‘off-line’ corpora. This result also shows
that it is quite possible to achieve high precision even by
combining scoring functions with ‘off-line’/‘off-line’ corpora
(the pair ‘A’ and ‘H’). Here, of the two scoring functions ‘A’ and
‘H’, one is built from frequency-based scores and the other from
probability-based scores; hence, the two are quite different in
the design of their scoring functions.

Table 5: Result of combining two scoring functions based on their agreement
corpus               combination   precision   recall   $F_{\beta=1}$
off-line/on-line     A&I           88.0%       27.6%    0.420
off-line/on-line     D&I           86.0%       29.5%    0.440
off-line/on-line     F&I           85.1%       29.1%    0.434
off-line/on-line     H&I           58.7%       37.5%    0.457
off-line/off-line    A&H           86.0%       30.4%    0.450
off-line/off-line    F&H           80.6%       33.7%    0.476
off-line/off-line    D&H           80.4%       32.7%    0.465
off-line/off-line    A&D           79.0%       32.1%    0.456
off-line/off-line    A&F           74.6%       33.0%    0.457
off-line/off-line    D&F           68.2%       35.7%    0.469
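The $F_{\beta=1}$ column of Table 5 is the harmonic mean of precision and recall, which a short sketch makes explicit. The function names are hypothetical, and defining recall as the number of correct outputs divided by the number of all evaluated terms is an assumption, since the text does not spell out the denominator.

```python
from typing import Tuple


def precision_recall(n_output: int, n_correct: int, n_terms: int) -> Tuple[float, float]:
    """Precision over the terms that received an output; recall over all terms (assumed)."""
    precision = n_correct / n_output if n_output else 0.0
    recall = n_correct / n_terms if n_terms else 0.0
    return precision, recall


def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-measure; with beta = 1 this is the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall reported for the combination A&I in Table 5.
f_a_and_i = f_measure(0.880, 0.276)
```

For the combination A&I, this gives approximately 0.420, in line with the table. For other rows the recomputed value can differ in the last digit, presumably because the table was computed from unrounded precision and recall.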
5 Related Work
As a related work, (Fujii and Ishikawa, 2001) pro-
posed a technique for compositional estimation of
bilingual term correspondences for the purpose of
cross-language information retrieval. One of the
major differences between the technique of (Fu-
jii and Ishikawa, 2001) and the one proposed in
this paper is that in (Fujii and Ishikawa, 2001), in-
stead of a domain/topic-specific corpus, they use a
corpus containing the collection of technical pa-
pers, each of which is published by one of the
65 Japanese associations for various technical do-
mains. Another significant difference is that in
(Fujii and Ishikawa, 2001), they evaluate only the
performance of the cross-language information re-
trieval and not that of translation estimation.
(Cao and Li, 2002) also proposed a method
of compositional translation estimation for com-
pounds. In the method of (Cao and Li, 2002), the
translation candidates of a term are composition-
ally generated by concatenating the translation of
the constituents of the term and are validated di-
rectly through the search engine. In this paper,
we evaluate the approach proposed in (Cao and
Li, 2002) by introducing a total scoring function
that is based on validating translation candidates
directly through the search engine.
6 Conclusion
This paper studied issues related to the compilation of a
bilingual lexicon for technical terms. In the task of estimating
bilingual term correspondences of technical terms, it is usually
rather difficult to find an existing corpus for the domain of such
technical terms. In this paper, we adopted an approach of
collecting a corpus for the domain of such technical terms from
the Web. As a method of translation estimation for technical
terms, we employed a compositional translation estimation
technique. This paper focused on quantitatively comparing
variations of the components in the scoring functions of
compositional translation estimation. Through experimental
evaluation, we showed that the domain/topic-specific corpus
contributes to improving the performance of the compositional
translation estimation.
Future work includes integrating the proposed framework of
compositional translation estimation using the Web with other,
complementary translation estimation techniques. One of them is
based on collecting partially bilingual texts through the search
engine (Nagata et al., 2001; Huang et al., 2005). Another
technique that seems useful is the transliteration of names
(Knight and Graehl, 1998; Oh and Choi, 2005).

References
Y. Cao and H. Li. 2002. Base noun phrase translation using
Web data and the EM algorithm. In Proc. 19th COLING,
pages 127–133.
A. Fujii and T. Ishikawa. 2001. Japanese/English cross-language
information retrieval: Exploration of query translation and
transliteration. Computers and the Humanities, 35(4):389–420.
P. Fung and L. Y. Yee. 1998. An IR approach for translating
new words from nonparallel, comparable texts. In Proc.
17th COLING and 36th ACL, pages 414–420.
F. Huang, Y. Zhang, and S. Vogel. 2005. Mining key phrase
translations from web corpora. In Proc. HLT/EMNLP,
pages 483–490.
K. Knight and J. Graehl. 1998. Machine transliteration.
Computational Linguistics, 24(4):599–612.
Y. Matsumoto and T. Utsuro. 2000. Lexical knowledge ac-
quisition. In R. Dale, H. Moisl, and H. Somers, editors,
Handbook of Natural Language Processing, chapter 24,
pages 563–610. Marcel Dekker Inc.
M. Nagata et al. 2001. Using the Web as a bilingual dictio-
nary. In Proc. ACL-2001 Workshop on Data-driven Meth-
ods in Machine Translation, pages 95–102.
J. Oh and K. Choi. 2005. Automatic extraction of English-Korean
translations for constituents of technical terms. In
Proc. 2nd IJCNLP, pages 450–461.
R. Rapp. 1999. Automatic identification of word translations
from unrelated English and German corpora. In Proc.
37th ACL, pages 519–526.
S. Sato and Y. Sasaki. 2003. Automatic collection of related
terms from the web. In Proc. 41st ACL, pages 121–124.
T. Tanaka and T. Baldwin. 2003. Translation selection for
Japanese-English noun-noun compounds. In Proc. Machine
Translation Summit IX, pages 378–385.
M. Tonoike, M. Kida, T. Takagi, Y. Sasaki, T. Utsuro, and
S. Sato. 2005. Effect of domain-specific corpus in com-
positional translation estimation for technical terms. In
Proc. 2nd IJCNLP, Companion Volume, pages 116–121.
