A Method of Creating New Bilingual Valency Entries
using Alternations
Sanae Fujita Francis Bond
{sanae, bond}@cslab.kecl.ntt.co.jp
NTT Machine Translation Research Group
NTT Communication Science Laboratories
Nippon Telegraph and Telephone Corporation
Abstract
We present a method that uses alternation data
to add new entries to an existing bilingual valency
lexicon. If the existing lexicon has only one half of
the alternation, then our method constructs the
other half. The new entries have detailed infor-
mation about argument structure and selectional
restrictions. In this paper we focus on one class
of alternations, but our method is applicable to
any alternation. We were able to increase the
coverage of the causative alternation to 98%, and
the new entries gave an overall improvement in
translation quality of 32%.
1 Introduction
Recently, deep linguistic processing, which aims
to provide a useful semantic representation, has
become the focus of more research, as parsing
technologies improve in both speed and robust-
ness (Uszkoreit, 2002). In particular, machine
translation systems still mainly rely on large
hand-crafted lexicons. The knowledge acquisition
bottleneck, however, remains: precise grammars
need information-rich lexicons, such as valency
dictionaries, which are costly to build and extend.
In this paper, we present a method of adding new
entries to an existing bilingual valency dictionary,
using information about verbal alternations.
The classic approach to acquiring lexical information is to build
resources by hand. This produces useful resources but is expensive.
It is nevertheless still the approach taken by large projects such
as FrameNet (Baker et al., 1998) or OntoSem. Therefore, there is a
need to extend these hand-made resources quickly and economically.
Another approach is to attempt to learn information from corpora.
There has been much research based on this, but due to the
inevitable errors, there are few examples of lexicons being
constructed fully automatically. Korhonen (2002) reports that the
ceiling on the performance of monolingual subcategorization
acquisition from corpora is generally around 80%, a level that
still requires manual intervention. Yet another approach is to
combine knowledge sources: for example, to build a lexicon and then
try to extend it using corpus data, or to enrich monolingual data
using multilingual lexicons (Fujita and Bond, 2002).
The aim of this research is not to create a
lexicon from scratch, but rather to add further
entries to an existing lexicon. We propose a
method of acquiring detailed information about
predicates, including argument structure, seman-
tic restrictions on the arguments and transla-
tion equivalents. It combines two heterogeneous
knowledge sources: an existing bilingual valency
lexicon (the seed lexicon), and information about
verbal alternations.
Most verbs have more than one possible argument structure (subcat).
These can be regularized into pairs of alternations, where two
argument structures link similar semantic roles into different
subcats. Levin (1993) has identified over 80 alternation types for
English, and these have been extended to cover 4,432 verbs in 492
classes (Dorr, 1997). In this paper, we will consider alternations
between transitive (Vt) and intransitive (Vi) uses of verbs, where
the subject of the intransitive verb (S) is the same as the object
of the transitive verb (O) (e.g. the acid dissolved the metal ⇔
the metal dissolved (in the acid)) (Levin, 1993, 26–33). We call
the subject of the transitive verb A (ergative) and this
alternation the S=O alternation.
Figure 1 shows a simplified example of an alternating pair in a
bilingual valency dictionary (the valency lexicon from the
Japanese-to-English machine translation system ALT-J/E (Ikehara et
al., 1991)). This includes the subcategorization frame and
selectional restrictions. As shown in Figure 1, Japanese, unlike
English, typically morphologically marks the transitivity
alternation.
J-E Entry: 302116
  Japanese:  S  N1:⟨stuff⟩             ga  nom
             X  N3:⟨stuff⟩             ni  dat
                Vi  tokeru "dissolve"
  English:   S  N1  subject
                Vi  dissolve
             X  PP  in N3
J-E Entry: 508661
  Japanese:  A  N1:⟨people, artifact⟩  ga  nom
             O  N2:⟨stuff⟩             o   acc
             X  N3:⟨inanimate⟩         ni  dat
                Vt  toku "dissolve"
  English:   A  N1  subject
                Vt  dissolve
             O  N2  direct object
             X  PP  in N3
Figure 1: Vi tokeru "dissolve" ↔ Vt toku "dissolve"
We chose the S=O alternation because it is one of the most common
types of alternations, making up 34% of those discovered by Bond et
al. (2002), and has been extensively studied. The method we
present, however, can be used with any alternation for which lists
of alternating verbs exist.
2 Resources
We use two main resources in this paper: (1) a
seed lexicon of high quality hand-made valency
entries; and (2) lists of verbs that undergo one or
more S=O alternations.
The alternation list includes 449 native
Japanese verbs that take the S=O alterna-
tion, based on data from Jacobsen (1981), Bul-
lock (1999) and the Japanese/English dictionary
EDICT (Breen, 1995). Each entry consists of a
pair of Japanese verbs with one or more English
glosses. Expanding out the English results in 460
Japanese-English pairs in all. Some examples are
given in Table 1.
Intransitive            Transitive
Ja       En             Ja       En
tokeru   dissolve       toku     dissolve
naku     cry            nakasu   make cry
agaru    rise           ageru    lift
Table 1: Verbs Undergoing the S=O Alternation
As a seed lexicon, we use the valency dictio-
nary (Ikehara et al., 1997) from the Japanese-to-
English machine translation system ALT-J/E. It
consists of linked pairs of Japanese and English
verbs. There are 5,062 Japanese verbs and 11,214
entries (ignoring all idiomatic and adjectival en-
tries). Verb entries in both languages have infor-
mation about the argument structure (subcat) of
the verb. In addition to the core arguments, ad-
junct cases are added to many patterns to help
in disambiguation.1 The Japanese side has selectional restrictions
(SRs) on the arguments. The arguments are linked between the two
languages using case-roles (N1, N2, ...).
1 This is common in large NLP lexicons, such as COMLEX (Grishman et
al., 1998). For example, the COMLEX 3.0 entry for gather notes that
it co-occurs with PPs headed by around, inside, with, in and into.
The seed lexicon covered 381 out of the 449 linked Japanese pairs
(85%). In the next section, in order to examine the nature of the
alternation, we compare the case roles and translations of the
linked valency pairs.
3 The Nature of the S=O Alternation
3.1 Comparing Selectional Restrictions of A, O and S
In alternations, a given semantic role typically appears in two
different syntactic positions: for example, the dissolved role is
the subject of intransitive dissolve and the object of the
transitive. Baldwin et al. (1999) hypothesized that selectional
restrictions (SRs) stay constant in the different syntactic
positions. Dorr (1997), who generates both alternations from a
single underlying representation, implicitly makes this assumption.
In addition, Kilgarriff (1993) specifically makes the A
⟨+sentient, +volition⟩, while the O is
⟨+changes-state, +causally affected⟩.
However, we know of no quantitative studies of the similarities of
alternating verbs. Exploiting the machine translation lexicon for
linguistic research, we compare the SRs of S with both A and O for
verbs that take the S=O alternation.
The SRs take the form of a list of semantic classes, strings or *.
Strings only match specific words, while * matches anything, even
non-nouns. The semantic classes are from the GoiTaikei ontology of
2,710 categories (Ikehara et al., 1997). It is an unbalanced
hierarchy with a maximum depth of 12. The top node (level 1) is
noun. The lower the level, the more specialized the meaning, and
thus the more restrictive the SR.
We calculate the similarity between two SRs as the minimum distance
(MD), measured as links in the ontology. If the SRs share at least
one semantic class then the MD is zero. In this case, we further
classified the SRs which are identical into "0 (Same)". For
example, in Figure 1, the MD between S and O is "0 (Same)" because
they have the same SR: ⟨stuff⟩. The MD between A and S is two
because the shortest path from ⟨artifact⟩ to ⟨stuff⟩ traverses two
links (artifact ⊂ inanimate ⊃ stuff).2
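The minimum distance described above can be sketched as follows. This is only an illustration under stated assumptions: the small hierarchy in PARENT is a hypothetical stand-in for the GoiTaikei ontology (only the artifact, inanimate and stuff links are taken from the text), the function names are ours, and SRs containing strings or * are not handled.

```python
from itertools import product

# Hypothetical toy hierarchy (child -> parent). It is NOT the real
# GoiTaikei tree; only artifact/stuff under inanimate follows the text.
PARENT = {
    "concrete": "noun",
    "abstract": "noun",
    "agent": "concrete",
    "inanimate": "concrete",
    "people": "agent",
    "animal": "agent",
    "artifact": "inanimate",
    "stuff": "inanimate",
}


def ancestors(cls):
    """Return [cls, parent, grandparent, ..., root]."""
    path = [cls]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def links(a, b):
    """Number of links on the shortest path between two classes."""
    depth_from_b = {c: i for i, c in enumerate(ancestors(b))}
    for i, c in enumerate(ancestors(a)):
        if c in depth_from_b:  # lowest common ancestor reached
            return i + depth_from_b[c]
    raise ValueError("classes are not in the same hierarchy")


def minimum_distance(sr1, sr2):
    """MD between two SRs, each given as a set of semantic classes."""
    if set(sr1) == set(sr2):
        return "0 (Same)"
    # A shared class gives links(c, c) == 0, so the MD is zero.
    return min(links(a, b) for a, b in product(sr1, sr2))


# SRs from Figure 1: A = <people, artifact>, O = <stuff>, S = <stuff>
md_o_s = minimum_distance({"stuff"}, {"stuff"})
md_a_s = minimum_distance({"people", "artifact"}, {"stuff"})
```

With the SRs of Figure 1, md_o_s is "0 (Same)" and md_a_s is 2, matching the values given in the text. Returning a string for the identical case and an integer otherwise simply mirrors the two labels used in the paper; a real implementation might prefer a separate flag.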
Figure 2 shows the MD between O and S, and A and S. The selectional
restrictions are very similar for O and S: 30.1% have identical
SRs, the distance is zero for 27.5% and one for 28.3%. However, for
A and S, the most common case is distance one (26.7%) and then
distance two (21.5%). Although O and S are different syntactic
roles, their SRs are very similar, reflecting the identity of the
underlying semantic roles.
Figure 2: The Minimum Distance of Selectional
Restrictions
Next, we examine whether A, O, and S are ⟨+sentient, +volition⟩ or
not. In the GoiTaikei hierarchy, semantic classes subsumed by agent
are ⟨+sentient, +volition⟩. A was very agentive, with 60.1% of the
SRs being subsumed by agent. The most frequent SR for A is ⟨agent⟩
itself (41.4%). S and O are less agentive, with 13.9% and 14.1% of
their respective selectional restrictions being agentive. This data
supports the hypothesis in Kilgarriff (1993).
2 There is some variation due to lexicographers' inconsistencies.
For example, X's SR is ⟨stuff⟩ in the intransitive and ⟨inanimate⟩
in the transitive entry. It should be ⟨stuff⟩ in both entries.
In summary, the SRs of S and O are not identical, but very similar.
In comparison, A is more agentive, and not closely linked to
either.
3.2 Comparison of Japanese and English
From the point of view of constructing bilingual
lexical entries, if the English main verb can trans-
late both Japanese entries, then it is possible to
automatically construct a usable English transla-
tion equivalent along with the Japanese alterna-
tion. In order to see how often this is the case, we
compare Japanese and English alternations and
investigate the English translations in the alter-
nation list.
We divide the entries into five types in Table 2. The first three
are those where the main English verb is the same. The most common
of these (30.0%) is made up of English unaccusative verbs which
also undergo the S=O alternation [S=O]. The next most common
(19.8%) is entries where the Japanese intransitive verb can be
translated by making the transitive verb's English translation
passive [passive]. In the third type (6.5%) the English is made
transitive synthetically [synthetic]: a control verb (normally
make) takes an intransitive verb or adjective as complement. The
last two are those where either different translations are used
(42.8%), or the same English verb is used but the valency change is
not one of those described above (0.9%).
The first three rows of Table 2 show the verbs whose alternate can
be created automatically, 56.3% of the total. This figure is only
an approximation, for two reasons. The first is that the
translation may not be the best one: most verbs can have multiple
translations, and we are only creating one. The second is that this
upper limit is almost certainly too low. For many of the
alternations, although our table contained different verbs,
translations using identical verbs are also acceptable. In fact,
most transitive verbs can be made passive, and most intransitive
verbs can be embedded in a causative construction, so this
alternative is almost always possible (and is also possible for
Japanese). However, if the Japanese uses a lexical alternation, it
is more faithful to link it to an English lexical alternation when
possible.
Japanese            English Translation          English Structure              Type         No.   (%)
Vi        Vt        Vi            Vt             Vi           Vt
[…]       […]       S weaken      A weaken O     S Vi         A Vt O            S=O          138   30.0
[…]       […]       S be omitted  A omit O       S be Vt-ed   A Vt O            passive       91   19.8
naku      nakasu    S cry         A make O cry   S Vi/be Adj  A Vc O Vi/Adj     synthetic     30    6.5
nakunaru  nakusu    S pass away   A lose O       S Vi         A Vt O            Diff Head    197   42.8
[…]       […]       S play        A play with O  S Vi         A Vt prep O       Diff Struct    4    0.9
Vc is a control verb such as make, get, let, become. Many entries
also include information about non-core arguments/adjuncts.
Table 2: Classification of English Translations of the S=O Alternation List (Reference Data)
4 Method of Creating Valency Entries
In this section we describe how we create new alternating entries.
Given a verb, with dependents Ni, and an alternation that maps some
or all of the Ni, we can create the alternate by analogy with
existing alternating verbs. The basic flow of creating valency
entries is as follows.
• For each dependent Ni
    if Ni participates in the alternation
      if Ni has an alternate in the target then
        map to it
      else delete Ni
    else transfer [non-alternating dependent]
• If the alternation requires a dependent not in the source
    Add the default argument
We use the most frequent argument in existing valency entries as a
default. Specific examples of creating S=O alternations are given
in the next section.
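The flow above can be sketched in a few lines. This is a minimal illustration, not the implementation used with ALT-J/E: the function name, the dictionary layout and the reduction of each dependent to a bare list of SR classes are all assumptions made for the example.

```python
def create_alternate(source, mapping, required, defaults):
    """Build the dependents of the alternate entry.

    source:   role -> list of SR classes (dependents of the seed entry)
    mapping:  source role -> target role, or None when the role takes
              part in the alternation but has no alternate in the target
    required: roles that the target frame must contain
    defaults: role -> default SR list, used when the source lacks it
    (All four structures are hypothetical simplifications.)
    """
    target = {}
    for role, srs in source.items():
        if role in mapping:                # participates in the alternation
            new_role = mapping[role]
            if new_role is not None:       # has an alternate: map to it
                target[new_role] = list(srs)
            # otherwise the dependent is deleted
        else:                              # non-alternating: transfer as is
            target[role] = list(srs)
    for role in required:                  # dependent not in the source
        if role not in target:
            target[role] = list(defaults[role])
    return target


# Transitive toku (Figure 1) -> intransitive: O maps to S, A is deleted.
toku = {"A": ["people", "artifact"], "O": ["stuff"], "X": ["inanimate"]}
vi = create_alternate(toku, {"A": None, "O": "S"}, ["S"], {})

# Intransitive tokeru -> transitive: S maps to O, A is added by default.
tokeru = {"S": ["stuff"], "X": ["stuff"]}
vt = create_alternate(tokeru, {"S": "O"}, ["A", "O"], {"A": ["agent"]})
```

Here vi keeps S with the SR of the original O plus the untouched X, and vt gains a default A. Case markers and the English side are left out on purpose; they are covered by the rules in the following sections.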
Although we only discuss the selectional re-
strictions and subcat information here, we also
map the verb classes (given as verbal semantic at-
tributes (Nakaiwa and Ikehara, 1997)). The map-
ping for the dependents in the alternation can
be taken from existing lexical resources (Dorr,
1997), learned from corpora (McCarthy, 2000) or
learned from existing lexicons (Bond et al., 2002).
4.1 Target
In this experiment, we look at one family of al-
ternations, the S = O alternation. The candidate
words are thus intransitive verbs with no transi-
tive alternate, or transitive entries with no in-
transitive alternate. Alternations should be be-
tween senses, but the alternation list is only of
words. Many of the candidate words (those that have an entry for
only one alternate) have several entries. Only some of these are
suitable as seeds. We do not use entries which are intransitive
lemmas but have an accusative argument, which are intransitive (or
transitive) lemmas but have a transitive (or intransitive)
translation, or which have both topic and nominative, such as (1),
where the nominative argument is incorporated in the English
translation.
(1) N1:⟨animals⟩  ha   N3:"chikara"  ga   nukeru
    N1            top  N3:power      nom  lose
    N1 lose N1's energy
There are 115 entries (37 lemmas) which have
only intransitive entries and 81 entries (25 lem-
mas) which have only transitive entries which are
in our reference list of alternating verbs. We cre-
ate intransitive entries using the existing transi-
tive entries, and transitive entries using the ex-
isting intransitive entries.
4.2 Creating the Japanese subcat and
SRs
In creating the intransitive entries from the tran-
sitive entries, we map the O’s SRs onto the S’s
SRs, and change the case marker from accusative
to nominative. We delete the A argument, and
transfer any other dependents as they are.
In creating the transitive entries, we map the
intransitive S’s SRs onto the new O’s SRs, and
give it an accusative case-marker. If the in-
transitive entry has a demoted subject argument
(where the Japanese case-marker is ni and the
English preposition is by), we promote it to subject and use its SR
for A. Otherwise we add a causative argument as ergative subject
(A) with a default SR of ⟨agent⟩3 and a nominative case-marker. We
show an example in Figure 3.
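The creation of a transitive entry from an intransitive seed, including the promotion of a demoted subject, might look like the sketch below. The field names (sr, case, en_prep), the test for a demoted subject and the reading of the SR of odoroku as two classes are assumptions made for illustration; the seed lexicon's real format is not shown in this paper.

```python
def is_demoted_subject(arg):
    """Assumed encoding: dative (ni) in Japanese and 'by' in English."""
    return arg.get("case") == "dat" and "by" in arg.get("en_prep", ())


def make_transitive(vi):
    """Create the Japanese side of a Vt entry from a Vi seed entry."""
    promoted = next(
        (role for role, arg in vi.items()
         if role != "S" and is_demoted_subject(arg)),
        None,
    )
    vt = {}
    if promoted is not None:
        # Promote the demoted subject to A and reuse its SR.
        vt["A"] = {"sr": list(vi[promoted]["sr"]), "case": "nom"}
    else:
        # Otherwise add a causer with the default SR <agent>.
        vt["A"] = {"sr": ["agent"], "case": "nom"}
    for role, arg in vi.items():
        if role == promoted:
            continue                       # already used as A
        if role == "S":
            # The intransitive subject becomes the accusative object.
            vt["O"] = {"sr": list(arg["sr"]), "case": "acc"}
        else:
            vt[role] = dict(arg)           # transfer other dependents
    return vt


# Seed entry odoroku "be surprised" (entry 202204).
odoroku = {
    "S": {"sr": ["agent", "animal"], "case": "nom"},
    "X": {"sr": ["*"], "case": "dat", "en_prep": ("at", "by")},
}
odorokasu = make_transitive(odoroku)

# A seed with no demoted subject, modeled on tokeru in Figure 1.
tokeru = {
    "S": {"sr": ["stuff"], "case": "nom"},
    "X": {"sr": ["stuff"], "case": "dat", "en_prep": ("in",)},
}
toku_like = make_transitive(tokeru)
```

For odoroku the result has A with SR * and O with the SR of the original subject, and no X, which agrees with the new entry 760038. For the tokeru-like seed, A receives the default ⟨agent⟩ and X is carried over unchanged.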
3 ⟨agent⟩ is the most frequent SR for transitive verbs undergoing
this alternation, as shown in § 3.1.
Entry ID: 202204
  Japanese:  S  N1:⟨agent animal⟩  ga  nom
             X  N3:⟨*⟩             ni  dat
                Vi  odoroku "be surprised"
  English:   S  N1  subject
                Cop  be
                Participle  surprised
             X  PP  at/by N3
New Entry ID: 760038
  Japanese:  A  N1:⟨*⟩             ga  nom
             O  N2:⟨agent animal⟩  o   acc
                Vt  odorokasu "surprise"
  English:   A  N1  subject
                Vt  surprise
             O  N2  direct object
Figure 3: Seed: Vi odoroku "be surprised" ⇒ New entry: Vt odorokasu "surprise"
4.3 Creating the English Equivalents
The English translation can be divided into three types: S=O,
passive and synthetic. Therefore it is necessary to judge which
type is appropriate for each entry, and then create the English.
This judgement is shown in Figure 4. To judge whether an English
verb could undergo the S=O alternation we used the LCS Database
(EVCA+) (Dorr, 1997,
http://www.umiacs.umd.edu/~bonnie/LCS_Database_Documentation.html).
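The judgement in Figure 4 can be sketched as two small functions. This is a simplified illustration: the set passed as alternating stands in for a lookup in a resource such as the LCS Database, the past participle is supplied by the caller rather than generated, and the special There be rule for have is omitted. The function and parameter names are ours.

```python
CONTROL_VERBS = {"make", "have", "get", "cause"}


def to_intransitive(verb, alternating, complement=None, participle=None):
    """Return (pattern, type) for the intransitive English side.

    verb:        head of the transitive pattern "A verb O ..."
    alternating: verbs assumed to take the S=O alternation
    complement:  ("vi", word) or ("adj", word) when verb is a control verb
    participle:  past participle of verb, needed for the passive type
    """
    if verb in CONTROL_VERBS and complement is not None:
        kind, word = complement
        pattern = f"S be {word}" if kind == "adj" else f"S {word}"
        return pattern, "synthetic"
    if verb in alternating:
        return f"S {verb}", "S=O"
    if participle is None:
        raise ValueError("the passive type needs a past participle")
    return f"S be {participle}", "passive"


def to_transitive(kind, head, alternating, control="make"):
    """Return (pattern, type) for the transitive English side.

    kind is "vi" (S Vi), "adj" (S be Adj) or "passive" (S be Vt-ed);
    head is the verb lemma or the adjective.
    """
    if kind == "passive":
        return f"A {head} O", "passive"
    if kind == "vi" and head in alternating:
        return f"A {head} O", "S=O"
    return f"A {control} O {head}", "synthetic"


# Hypothetical list of alternating English verbs for the example.
ALTERNATING = {"turn", "spoil"}

examples = [
    to_intransitive("make", ALTERNATING, complement=("vi", "cry")),
    to_intransitive("turn", ALTERNATING),
    to_intransitive("injure", ALTERNATING, participle="injured"),
    to_transitive("vi", "spoil", ALTERNATING),
    to_transitive("vi", "rot", ALTERNATING),
    to_transitive("adj", "prosperous", ALTERNATING),
    to_transitive("passive", "defeat", ALTERNATING),
]
```

The inputs are the example verbs given in Figure 4, and each call yields the pattern and type shown there (for instance "A make O rot" as synthetic). Adjuncts such as in X are not modeled here; in the entries they are simply transferred.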
5 Evaluation
A total of 196 new entries were created for 62
verbs (25 Vi + 37 Vt) using the method outlined
in § 4. We evaluated the quality by using the new
entries in a machine translation system.
5.1 Translation-Based Evaluation
We evaluated the quality of the created entries in
a translation-based regression test. We got two
example sentences using each verb from Japanese
newspapers and web pages: this gave a total of
124 test sentences. We translated the test sen-
tences using ALT-J/E, both with (with) and
without (w/out) the new entries.
Translations that were identical were marked no change (the system
translates with a simple word dictionary if it has no valency
entry). Translations that changed were evaluated by people fluent
in both languages (two thirds by Japanese native speakers and one
third by an English native speaker, not the authors). The
translations were presented to the evaluators in random order,
labeled A and B, so the evaluators did not know whether a
translation was with or w/out. The translations were placed into
three categories: (i) A is better than B, (ii) A and B are
equivalent in quality, and (iii) A is worse than B. For example in
(2), the evaluation was (iii). In this case A is w/out and B is
with, so the new entry has improved the translation.
(2) Shioda  Kiyoko  san  wa,  moufu    ni  kurumari  nagara.
    Shioda  Kiyoko  Ms.  top  blanket  in  wrapped   while
    (A) Ms. Kiyoko Shioda is wrapped up to a blanket.
    (B) Ms. Kiyoko Shioda is wrapped in a blanket.
Table 3 shows the evaluation results, split into those for
transitive and intransitive verbs. The most common result was that
the new translation was better (46.0%). The quality was equivalent
for 13.7% and worse for 14.5%. The overall improvement was 31.5%
(46.0 − 14.5). Extending the dictionary to include the missing
alternations gave a measurable improvement in translation quality.
              Vi Created      Vt Created      Total
              No.      %      No.      %      No.      %
better         19   38.0       38   51.4       57   46.0
equivalent      5   10.0       12   16.2       17   13.7
no change      18   36.0       14   18.9       32   25.8
worse           8   16.0       10   13.5       18   14.5
Change             +22.0           +37.9           +31.5
Total          50  100.0       74  100.0      124  100.0
Table 3: Results of Translation-based Evaluation
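The Change row can be recomputed from the counts in Table 3, as in the short check below. The counts are taken from the table; the helper name is ours.

```python
# Sentence counts from Table 3.
COUNTS = {
    "Vi": {"better": 19, "equivalent": 5, "no change": 18, "worse": 8},
    "Vt": {"better": 38, "equivalent": 12, "no change": 14, "worse": 10},
}
# Column-wise sum of the two groups.
COUNTS["Total"] = {
    key: COUNTS["Vi"][key] + COUNTS["Vt"][key] for key in COUNTS["Vi"]
}


def improvement(counts):
    """Share of better translations minus share of worse ones, in %."""
    total = sum(counts.values())
    return 100.0 * (counts["better"] - counts["worse"]) / total


changes = {name: round(improvement(c), 1) for name, c in COUNTS.items()}
```

Computed from the raw counts, the change is 22.0 for the created Vi entries, 37.8 for the created Vt entries and 31.5 overall. The table's +37.9 for Vt is the difference of the two already rounded percentages (51.4 − 13.5).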
5.2 Lexicographer’s Evaluation
A manual analysis of a subset of the created en-
tries was carried out by expert lexicographers fa-
miliar with the seed lexicon (not the authors).
They found three major sources of errors. The first was that
alternation is a sense-based phenomenon. As we built alternations
for all patterns in the seed dictionary, this resulted in the
creation of some spurious patterns. An example of an impossible
entry is torawareru "be caught", translated as be picked up with
the inappropriate semantic restriction
⟨concrete, material-phenomenon⟩ on the subject. However, another
entry was created with the translation be caught and SRs
⟨people, animal, artifact⟩, and this was judged to be good.
Creating Intransitive Entries:
  if the original subcat has a control verb
  (Vc ∈ {make, have, get, cause})
    • A Vc O Vi/Adj ⇒ S Vi/be Adj [synthetic]
      (A make O cry ⇒ S cry)
  else (original head is Vt)
    • if Vt undergoes the S=O alternation
      – A Vt O ⇒ S Vi [S=O]
        (A turn O ⇒ S turn)
    • else
      – A Vt O ⇒ S be Vt-ed [passive]
        (A injure O in X ⇒ S be injured in X)
  We made a special rule for the English Vt have. In this case the
  intransitive alternation will be There is: for example,
  A have O on X ⇒ There be S on X.
Creating Transitive Entries:
  If the original subcat is:
    • S Vi
      – if Vi undergoes the S=O alternation ⇒ A Vt O [S=O]
        (S spoil ⇒ A spoil O)
      – else ⇒ A Vc† O Vi [synthetic]
        (S rot ⇒ A make O rot)
    • S be Adj ⇒ A Vc† O Adj [synthetic]
      (S be prosperous ⇒ A make O prosperous)
    • S be Vt-ed (by A) ⇒ A Vt O [passive]
      (S be defeated (by A) ⇒ A defeat O)
  † We use make as the control verb, Vc.
Figure 4: Method of Creating English Side
The second source of errors was in the selec-
tional restrictions. In around 10% of the entries,
the lexicographers wanted to change the SRs.
The most common change was to make the SR
for A more specific than the default of agent.
The third source of errors was in the English
translation, where the lexicographers sometimes
preferred a different verb as a translation, rather
than a regular alternation.
6 Discussion and Future Work
The above results show that alternations can be
used to create rich and useful bilingual entries.
In this section we discuss some of the reasons for
errors, and suggest ways to improve and expand
our method.
6.1 Rejecting Inappropriate Candidates
To make the construction fully automatic, a test
for whether the Japanese side of the entry is ap-
propriate or not is required.
One possibility is to add a corpus-based filter:
if no examples can be found that match the selec-
tional restrictions for an entry, then it should be
rejected. This could be done for each language
individually. The problem with this approach is
that many of the entries we created were for in-
frequent verbs. The average frequency in 16 years
of Japanese newspaper text was only 173, and 22
verbs never appeared, although all were familiar
to native speakers. We can, of course, use the
web to alleviate the data sparseness problem.
6.2 Improving the English Translations
In this section we compare the distribution of the different types
of translations for the reference data (§ 3.2) and the entries
created by our method (§ 4). The breakdown is shown in Table 4. The
first three rows show entries with the same English main verb.
One major discrepancy is in the frequency of the control verb
construction. For the created Vi entries, no original transitive
entry used control verbs. In general, when lexicographers create an
entry, they prefer a simple entry to a synthetic one. Looking at
the linguists' reference data, about 6.5% of the examples used
control verbs. In the constructed data, 66.1% (76 entries) of the
created Vt entries use the control verb make, more than any other
category. For example, when the original intransitive entry is N1
be exhausted, exhausted is defined as an adjective in the existing
dictionary. So we create a new entry N1 make N2 exhausted(adj).
However, there is a transitive verb exhaust, and it was preferred
by the lexicographers: N1 exhaust N2. The algorithm needs to
optionally convert adjectives to verbs in cases where there is
overlap between the adjective and past participle.
Type                 English Structure             Reference Data (Table 2)   Vi Created     Vt Created
                     Vi           Vt               No.    (%)                 No.    (%)     No.    (%)
S=O                  S Vi         A Vt O           138    30.0                  9   11.1      25   21.7
passive              S be Vt-ed   A Vt O            91    19.8                 71   87.7      14   12.2
synthetic            S Vi/be Adj  A Vc O Vi/Adj     30     6.5                  0    0.0      76   66.1
Different Head                                     191    41.5                  0    0.0       0    0.0
Different Structure                                 10     2.2                  1    1.2       0    0.0
Total                                              460   100                   81  100       115  100
Table 4: A Comparison of Reference Data with Created Alternations
Finally, we consider those Japanese alternations where the
transitive and intransitive alternatives need translations with
different English main verbs. A good example of this is Vi nakunaru
"S pass away" and Vt nakusu "A lose O".4 These are impossible to
generate using our method. Even with reliable English syntactic
data, it would be hard to rule out pass away as a possible
transitive verb or lose as an intransitive. They can only be ruled
out by using data linking the subcat with the meaning, and this
would need to be linked to the Japanese verbs' meanings. This may
become possible with larger linked multi-lingual dictionaries, such
as those under construction in the Papillon project,5 but is not
now within our reach.
In summary, we could improve the construction
of the English translations by using richer English
information, especially about past-participles or
verb senses.
6.3 Usage as a Lexical/Translation Rule
Although we have investigated the use of al-
ternations in lexicon construction, the algo-
rithms could also be used directly, either as lexi-
cal/translation rules or to generate transitive and
intransitive entries from a common underlying
representation. For example, Shirai et al. (1999) use existing
entries together with lexical rules to translate causatives and
passives (including adversative passives) from Japanese to English.
Trujillo (1995) showed a method of applying lexical rules to word
translation: the vocabulary is expanded using prepared lexical
rules for each language, and links for translation are created
between the lexical rules of a pair of
languages. Dorr (1997) and Baldwin et al. (1999)
generate both alternates from a single underlying
representation.
Our proposed method could partially be implemented as a lexical or
a translation rule. But not all the word senses alternate (§ 4.2),
and not all the target language entries are regularly translated by
the same head (§ 3). Further, many of the rules mix lexical and
syntactic information, making them quite complicated. Because of
that, it is easier to expand out the rules beforehand and enter
them into the system.
4 My friend passed away ↔ I lost my friend.
5 http://www.papillon-dictionary.org/
6.4 Further Work
In this paper, we targeted native Japanese verbs
only. ALT-J/E already has a very high coverage
of native Japanese verbs. However, even in this
case, we could increase the coverage of this alternation
from 85% to 98% (442 out of 449 alternation
pairs now in the dictionary). Most valency dictionaries
or new language pairs have lower coverage,
and so should benefit more. It is also possible
to use this method so as to only create half the
entries by hand, and then to automatically make
the alternating halves (although not all the cre-
ated entries will be perfect).
In addition to the native Japanese verbs, there are many
Sino-Japanese verbal nouns that undergo the S=O alternation (for
example, (3) ↔ (4)).
(3) mise  ga   seihin    o    kanbai-shita
    shop  nom  products  acc  sold out
    The shop sold out of the products.
(4) seihin    ga   kanbai-shita
    products  nom  sold out
    The products are sold out.
ALT-J/E’s Japanese dictionary has about
2,400 verbal nouns which have usage as both
transitive and intransitive. Of these only 536 are
in the valency dictionary. Our next plan is to
add them all to the valency dictionary, using alternations
to make the process more efficient and
consistent.
Another extension is to apply the method to
other alternations, using either linguists’ data or
automatically acquired alternations (Oishi and
Matsumoto, 1997; Furumaki and Tanaka, 2003;
McCarthy, 2000). In particular, S=O alternations make up only 34%
of those discovered by Bond et al. (2002); we intend to investigate
the alternations that make up the remainder.
7 Conclusion
We presented a method that uses alternation data
to add new entries to an existing translation lex-
icon. The new entries have detailed information
about argument structure and selectional restric-
tions. We were able to increase the coverage of
the S=O alternation to 98%, and the new entries
gave an overall improvement in translation qual-
ity of 32%.

References
Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The
Berkeley FrameNet project. In 36th Annual Meeting of the
Association for Computational Linguistics and 17th International
Conference on Computational Linguistics: COLING/ACL-98, Montreal,
Canada.
Timothy Baldwin, Francis Bond, and Ben Hutchinson. 1999. A valency
dictionary architecture for machine translation. In Eighth
International Conference on Theoretical and Methodological Issues
in Machine Translation: TMI-99, pages 207–217, Chester, UK.
Francis Bond, Timothy Baldwin, and Sanae Fujita. 2002. Detecting
alternation instances in a valency dictionary. In 8th Annual
Meeting of the Association for Natural Language Processing, pages
519–522. The Association for Natural Language Processing.
Jim Breen. 1995. Building an electronic Japanese-English
dictionary. Japanese Studies Association of Australia Conference
(http://www.csse.monash.edu.au/~jwb/jsaa_paper/hpaper.html).
Ben Bullock. 1999. Alternative sci.lang.japan frequently asked
questions. http://www.csse.monash.edu.au/~jwb/afaq/jitadoushi.html.
Bonnie J. Dorr. 1997. Large-scale dictionary construction for
foreign language tutoring and interlingual machine translation.
Machine Translation, 12(4):271–322.
Sanae Fujita and Francis Bond. 2002. A method of adding new entries
to a valency dictionary by exploiting existing lexical resources.
In Ninth International Conference on Theoretical and Methodological
Issues in Machine Translation: TMI-2002, pages 42–52, Keihanna,
Japan.
Hisanori Furumaki and Hozumi Tanaka. 2003. The consideration of
<n-suru> for construction of the dynamic lexicon. In 9th Annual
Meeting of The Association for Natural Language Processing, pages
298–301. (in Japanese).
Ralph Grishman, Catherine Macleod, and Adam Myers. 1998. COMLEX
Syntax Reference Manual. Proteus Project, NYU.
(http://nlp.cs.nyu.edu/comlex/refman.ps).
Satoru Ikehara, Satoshi Shirai, Akio Yokoo, and Hiromi Nakaiwa.
1991. Toward an MT system without pre-editing – effects of new
methods in ALT-J/E –. In Third Machine Translation Summit: MT
Summit III, pages 101–106, Washington DC.
(http://xxx.lanl.gov/abs/cmp-lg/9510008).
Satoru Ikehara, Masahiro Miyazaki, Satoshi Shirai, Akio Yokoo,
Hiromi Nakaiwa, Kentaro Ogura, Yoshifumi Ooyama, and Yoshihiko
Hayashi. 1997. Goi-Taikei: A Japanese Lexicon. Iwanami Shoten,
Tokyo. 5 volumes/CDROM.
Wesley Jacobsen. 1981. Transitivity in the Japanese Verbal System.
Ph.D. thesis, University of Chicago. (Reproduced by the Indiana
University Linguistics Club, 1982).
Adam Kilgarriff. 1993. Inheriting verb alternations. In Sixth
Conference of the European Chapter of the ACL (EACL-1993), pages
213–221, Utrecht. (http://acl.ldc.upenn.edu/E/E93/E93-1026.pdf).
Anna Korhonen. 2002. Semantically motivated subcategorization
acquisition. In Proceedings of the ACL Workshop on Unsupervised
Lexical Acquisition, Philadelphia, USA.
Beth Levin. 1993. English Verb Classes and Alternations. University
of Chicago Press, Chicago, London.
Diana McCarthy. 2000. Using semantic preferences to identify verbal
participation in role switching alternations. In Proceedings of the
first Conference of the North American Chapter of the Association
for Computational Linguistics (NAACL), Seattle, WA.
Hiromi Nakaiwa and Satoru Ikehara. 1997. A system of verbal
semantic attributes in Japanese focused on syntactic correspondence
between Japanese and English. Information Processing Society of
Japan (IPSJ), 38(2):215–225. (In Japanese).
Akira Oishi and Yuji Matsumoto. 1997. Detecting the organization of
semantic subclasses of Japanese verbs. International Journal of
Corpus Linguistics, 2(1):65–89.
Satoshi Shirai, Francis Bond, Yayoi Nozawa, Tomiko Sasaki, and
Hiromi Ueda. 1999. One method of fitting valency patterns to text.
In 5th Annual Meeting of the Association for Natural Language
Processing, pages 80–83. The Association for Natural Language
Processing.
Arturo Trujillo. 1995. Bi-lexical rules for multi-lexeme
translation in lexicalist MT. In Sixth International Conference on
Theoretical and Methodological Issues in Machine Translation:
TMI-95, pages 48–66, July.
Hans Uszkoreit. 2002. New chances for deep linguistic processing.
In 19th International Conference on Computational Linguistics:
COLING-2002, pages XIV–XXVII, Taipei.
