Proceedings of the Fourth International Natural Language Generation Conference, pages 41–43,
Sydney, July 2006. ©2006 Association for Computational Linguistics
Adjective-to-Verb Paraphrasing in Japanese
Based on Lexical Constraints of Verbs
Atsushi Fujita†, Naruaki Masuno††, Satoshi Sato†, Takehito Utsuro‡
† Graduate School of Engineering, Nagoya University
†† IBM Engineering & Technology Services, IBM Japan, Ltd.
‡ Graduate School of Systems and Information Engineering, University of Tsukuba
Abstract
This paper describes adjective-to-verb paraphrasing in Japanese. In this
paraphrasing, generated verbs require additional suffixes according to their
difference in meaning. To determine proper suffixes for a given adjective-verb
pair, we have examined the verbal features involved in the theory of Lexical
Conceptual Structure.
1 Introduction
Textual expressions that (roughly) convey the same meaning are called
paraphrases. Since generating and recognizing paraphrases has the potential to
contribute to a broad range of natural language applications, such as MT, IE,
and QA, many researchers have worked on automatic paraphrasing in the last
decade.
Most previous studies have addressed paraphrase phenomena where the syntactic
category is not changed: e.g., noun-to-noun (“document” ⇔ “article”) and
verb-to-verb (“raise” ⇔ “bring up”). In such inner-categorial paraphrasing,
only limited types of problems arise when replacing words or phrases with
their synonymous expressions. In contrast, this paper focuses on
inter-categorial paraphrasing, such as adjective-to-verb (“attractive” ⇔
“attract”), which leads to novel types of problems due to the prominent
differences in meaning and usage. In other words, calculating those
differences is crucial for determining whether and how expressions can be
paraphrased.
The aim of this study is to clarify what lexical knowledge is required for
capturing those differences, and to explore where such knowledge can be
obtained. Recent work in lexical semantics has shown that syntactic behaviors
and semantic properties of words provide useful information to explain the
mechanisms of several classes of paraphrases. More specifically, lexical
properties involved in the theory of Lexical Conceptual Structure (LCS)
(Jackendoff, 1990) appear to be beneficial because verbs do not function
idiosyncratically. However, in the literature, there have been fewer studies
on syntactic categories other than verbs. To the best of our knowledge, the
Meaning-Text Theory (MTT) (Mel’čuk and Polguère, 1987) is one of the very few
frameworks that address them. In MTT, lexical properties and inter-categorial
paraphrasing are realized with a unique semantic representation irrespective
of syntactic categories and with what are called lexical functions, e.g.,
S_0(receive) = reception.
To understand how the recent advances in lexical semantics for verbs can be
extended to other syntactic categories, we assess LCS for inter-categorial
paraphrasing. We choose adjectives as the counterpart in paraphrasing because
they behave relatively similarly to verbs compared with other categories: both
adjectives and verbs inflect and function as predicates, adnominal elements,
etc. Yet, we speculate that their differences in meaning and usage reveal
intriguing generation problems. To put it briefly, adjective-to-verb
paraphrasing in Japanese requires verbal suffixes such as “ta (past /
attributive)” in example (1)¹:
(1) s. furui otera-no jushoku-o tazune-ta.
be old temple-GEN priest-ACC to visit-PAST
I visited a priest in the old temple.
t. furubi-ta otera-no jushoku-o tazune-ta.
to olden-ATTR temple-GEN priest-ACC to visit-PAST
I visited a priest in the olden(ed) temple.
2 Preliminary investigation
To investigate the variation and distribution of required verbal suffixes, we
collected a set of paraphrase examples through the following semi-automatic
procedure:
Step 1. We handcrafted adjective-verb pairs based on JCore (Sato, 2004), which
classifies Japanese words into five levels of readability. Our 128 pairs (for
85 adjectives) contain only those sharing the first few phonemes (reading) and
characters (kanji), and in which either the adjective or the verb falls into
the easiest three levels.
Step 2. Candidate paraphrases for a given sentence collection are
automatically generated by replacing adjectives with their corresponding
verbs. Multiple candidates are generated for adjectives that correspond to
multiple verbs.
Step 3. The correctness of each candidate paraphrase is judged by two human
annotators. The basic criterion for judgement is that two sentences are
regarded as paraphrases if and only if they share at least one interpretation.
In this step, the annotators are allowed to revise candidates by (i) appending
verbal suffixes, (ii) changing case markers, and (iii) inserting adverbs.
Finally, candidates that both annotators judge correct qualify as paraphrases.

¹ For each example, “s” and “t” denote an original sentence and its
paraphrase, respectively.

Table 1: Distribution of verbal suffixes used.

Verbal suffix          C^ad_c   C^pr_1   C^pr_2
ru                          9       16        0
tei-ru                      5       42        0
re-ru                      14        8        0
re-tei-ru                   2        5        0
ta                         57        0        7
tei-ta                      2        0        2
re-ta                       6        0        1
re-tei-ta                   0        0        1
both ta and tei-ru          4        0        0
both ta and ru              1        0        0
tea-ru                      0        2        0
Total                     100       73       11

where
ru: base form
tei: progressive / perfective
re: passive / potential
ta: past / attributive
tea: perfective
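Steps 2 and 3 of the collection procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-entry adjective-verb list, the whitespace tokenization, and the function names are hypothetical, and the revision of candidates by annotators (e.g., appending “ta”) is not modeled.

```python
from typing import Dict, List

# Hypothetical one-entry adjective-verb list (Step 1); the pair echoes example (1).
ADJ_TO_VERBS: Dict[str, List[str]] = {"furui": ["furubiru"]}


def generate_candidates(tokens: List[str]) -> List[List[str]]:
    """Step 2: replace each listed adjective with each of its corresponding verbs."""
    candidates = []
    for i, token in enumerate(tokens):
        for verb in ADJ_TO_VERBS.get(token, []):
            candidates.append(tokens[:i] + [verb] + tokens[i + 1:])
    return candidates


def qualifies(judgement_a: bool, judgement_b: bool) -> bool:
    """Step 3: a candidate qualifies only if both annotators judge it correct."""
    return judgement_a and judgement_b


source = ["furui", "otera-no", "jushoku-o", "tazune-ta"]
print(generate_candidates(source))
```

An adjective with several corresponding verbs yields one candidate per verb, which is why the number of candidates can exceed the number of source sentences.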
Assuming that the variation and distribution of
verbal suffixes vary according to the usage of ad-
jectives, we separately collected paraphrase exam-
ples for adnominal and predicative usages.
Adnominal usages: For 960 sentences randomly extracted from a one-year
newspaper corpus, Mainichi 1995, we obtained 165 examples for 142 source
sentences. We then divided them into two portions: the examples for 12
adjectives that appeared only once, together with at least one example for
each of the other adjectives, were kept unseen (C^ad_o), while the remaining
examples (C^ad_c) were used for our investigation.
Predicative usages: For 157 example sentences within the IPAL adjective
dictionary (IPA, 1990), we generated candidate paraphrases. 84 candidates for
70 sentences qualified as paraphrases. They were then divided into two
portions according to the tense of adjectives: C^pr_1 consists of examples
where adjectives appear in base form and C^pr_2 is for the “ta” form (past
tense).
Table 1 shows the distribution of verbal suffixes used for given
adjective-verb pairs in each portion of the example collections. We confirmed
that the distributions differ considerably. In the remaining sections, we
focus on adnominal usages because examples of predicative usages have
displayed a degree of compositionality. Which of “ru” or “ta” must be used is
given by the input: if a given adjective is in the past tense, the resulting
verbal suffix is necessarily the one used for the present tense followed by
“ta.”
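The compositionality noted for predicative usages can be illustrated with a small sketch. It assumes one reading of Table 1, namely that the past-tense suffix is obtained by replacing the final “ru” of the present-tense suffix with “ta” (“tei-ru” becomes “tei-ta”, “re-ru” becomes “re-ta”); the function name and the hyphenated string notation are assumptions made for this example.

```python
def predicative_suffix(present_suffix: str, adjective_is_past: bool) -> str:
    """Return the verbal suffix for a predicative adjective.

    Assumption: the past form replaces the final "ru" of the present-tense
    suffix with "ta", as the suffix pairs in Table 1 suggest.
    """
    if not adjective_is_past:
        return present_suffix
    if not present_suffix.endswith("ru"):
        raise ValueError(f"expected a suffix ending in 'ru': {present_suffix!r}")
    return present_suffix[:-2] + "ta"


for suffix in ["ru", "tei-ru", "re-ru", "re-tei-ru"]:
    print(suffix, "->", predicative_suffix(suffix, adjective_is_past=True))
```

Under this reading, only the present-tense suffix has to be determined from lexical properties; the tense of the input adjective supplies the rest.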
3 Determining verbal suffixes
The task we address here is to determine verbal suffixes for a given input: a
pair consisting of an adnominal usage of an adjective in a certain context and
a candidate verb given by our adjective-verb list.
From the viewpoint of language generation, this task can be thought of as
generating verbal expressions where the options are already given in Table 1.
A straightforward way of determining verbal suffixes is to use lexical
properties of verbs as constraints on generation. To make these properties
explicit, in particular the aspectual properties involved in LCS, we first
designed seven types of linguistic tests, shown in Table 2. They are derived
from a classical analysis of verb semantics in Japanese (Kageyama, 1996) and
some ongoing projects on constructing LCS dictionaries (Kato et al., 2005;
Takeuchi et al., 2006). We then manually examined the 128 verbs from Section 2
under those tests. To determine the word sense in which the derivative
relationship holds, example sentences in the IPAL verb dictionary (IPA, 1987)
were used for each verb. For a verb that was not in the dictionary, we
manually supplied a sample sentence.
Since our aim is to explain why a certain verbal suffix is used for a given
input, we have not rushed to apply a machine learning algorithm to the task.
Instead, we have manually created a rule-based model, shown in Table 3, using
C^ad_c, where each if-then rule assigns one of the verbal suffixes in Table 1
to a given input based on the verbal features in Table 2 and some other
features below:
• D: affix pair of the adjective and the candi-
date verb: e.g., “A shii-V mu” for “kuyashii
(be regretful)” ⇔ “kuyamu (to regret)”
• N: disjunction of semantic classes in a thesaurus (The National Institute
for Japanese Language, 2004) for the modified noun
• C: whether the adjective is head of clause
4 Experiment and discussion
By conducting an empirical experiment with C^ad_c and C^ad_o, we evaluate how
well our model (RULE) determines verbal suffixes. A comparison with a simple
baseline model (BL) is also made. BL selects the most frequently used suffix
(in this experiment, “ta”) for any given input.
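The baseline can be sketched as a majority-class predictor. The counts below are copied from the C^ad_c column of Table 1; the variable and function names are assumptions introduced for this illustration.

```python
from collections import Counter

# Suffix counts for C^ad_c, copied from Table 1.
CAD_C_COUNTS = Counter({
    "ru": 9, "tei-ru": 5, "re-ru": 14, "re-tei-ru": 2,
    "ta": 57, "tei-ta": 2, "re-ta": 6, "re-tei-ta": 0,
    "both ta and tei-ru": 4, "both ta and ru": 1, "tea-ru": 0,
})


def baseline_suffix(_features=None) -> str:
    """BL: ignore the input and return the most frequently used suffix."""
    return CAD_C_COUNTS.most_common(1)[0][0]


print(baseline_suffix())
```

Because the prediction never depends on the input, any gain of RULE over BL has to come from the lexical features used in the rule conditions.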
Table 2: Linguistic tests for verbs derived from Lexical Conceptual Structure (Kageyama, 1996).

Label  Description
V_a    whether the verb allows accusative case
V_b    whether the verb can co-occur with a temporal adverb “ichi-jikan (for one hour)” or its variant
V_c    whether the verb can co-occur with a temporal adverb “ichi-jikan-de (in one hour)” or its variant
V_d    whether the verb can be followed by “tearu (perfective)” when its accusative case is moved to nominative
V_e    interpretation of the verb followed by “tei-ru (progressive / perfective)”
V_f    when followed by “ta,” whether the verb can have the perfective interpretation or just past tense
V_g    whether the verb can co-occur with a sort of adverb which indicates intention of the action: e.g., “wazato (purposely)” and “iyaiya (reluctantly)”
Table 3: The rule-set for determining verbal suffixes, where “(non)” indicates non-paraphrasable.

Order  Condition (conjunction of feature label = “value”)                  Verbal suffix
1      V_a=“yes” ∧ V_b=“yes” ∧ V_f=“no” ∧ N=“except Human (1.10)” ∧        re-ru
       D=“A ui–V bumu” ∨ “A i–V mu” ∨ “A asii–V u”
2      V_a=“yes” ∧ V_b=“yes” ∧ V_f=“no” ∧ V_d=“no” ∧                       ta
       N=“Mind: mind, attitude (1.303)”
3      V_a=“no” ∧ V_g=“yes”                                                ta
4      V_a=“no” ∧ V_f=“yes” ∧ D=“A i–V migakaru”                           ta / tei-ru
5      C=“clause” ∧ D=“A i–V maru”                                         ru
6      V_a=“no” ∧ V_f=“yes”                                                ta
7      V_a=“no” ∧ V_b=“yes” ∧ V_f=“no”                                     ta
8      V_b=“yes” ∧ V_f=“no” ∧ V_c=“yes” ∧ V_d=“yes” ∧                      tei-ru
       V_e=“progressive” ∧ N=“Subject (1.2)”
9      ∗                                                                   (non)
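The ordered rule-set in Table 3 can be read as a first-match decision list. The sketch below encodes it under several assumptions that are not stated in the paper: features are stored in a dictionary keyed "Va" to "Vg", "N", "D", and "C"; N is a single thesaurus class code matched by prefix; and the D condition of rule 1 is read as "D is one of the three affix pairs". All function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, str]


def has(f: Features, **expected: str) -> bool:
    """True if every listed feature is present with the expected value."""
    return all(f.get(key) == value for key, value in expected.items())


def in_class(f: Features, code_prefix: str) -> bool:
    """Assumption: N holds one class code, and classes nest by code prefix."""
    return f.get("N", "").startswith(code_prefix)


RULE1_AFFIXES = {"A ui–V bumu", "A i–V mu", "A asii–V u"}

# (condition, verbal suffix), tried in the order of Table 3.
RULES: List[Tuple[Callable[[Features], bool], str]] = [
    (lambda f: has(f, Va="yes", Vb="yes", Vf="no")
        and not in_class(f, "1.10") and f.get("D") in RULE1_AFFIXES, "re-ru"),
    (lambda f: has(f, Va="yes", Vb="yes", Vf="no", Vd="no")
        and in_class(f, "1.303"), "ta"),
    (lambda f: has(f, Va="no", Vg="yes"), "ta"),
    (lambda f: has(f, Va="no", Vf="yes", D="A i–V migakaru"), "ta / tei-ru"),
    (lambda f: has(f, C="clause", D="A i–V maru"), "ru"),
    (lambda f: has(f, Va="no", Vf="yes"), "ta"),
    (lambda f: has(f, Va="no", Vb="yes", Vf="no"), "ta"),
    (lambda f: has(f, Vb="yes", Vf="no", Vc="yes", Vd="yes", Ve="progressive")
        and in_class(f, "1.2"), "tei-ru"),
    (lambda f: True, "(non)"),  # rule 9: no paraphrase
]


def determine_suffix(f: Features) -> str:
    """Return the suffix of the first rule whose condition holds."""
    for condition, suffix in RULES:
        if condition(f):
            return suffix
    return "(non)"


# Feature values below are illustrative, not taken from the paper's annotation.
print(determine_suffix({"Va": "no", "Vf": "yes"}))
```

Rule order matters in this reading: an input with V_a=“no”, V_f=“yes”, and D=“A i–V migakaru” is caught by rule 4 before it can reach the more general rule 6.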
Table 4 shows the experimental results, where recall and precision are
calculated with regard to input adjective-verb pairs. Among the rules in
Table 3, rules 1 (for “re-ru”), 3, 6, and 7 (for “ta” where V_a=“no”)
performed much better than the other rules. This indicates that these rules
and the features in their conditions properly reflect our linguistic
intuition. For instance, rule 6 reflects that a change-of-state intransitive
verb expresses a resultative meaning, as adjectives do, when it modifies the
Theme of the event via “ta” (Kageyama, 1996), as shown in (1), and rule 1
reflects that a psychological verb modifies a noun with “re-ru” when the noun
arouses the specific emotion, such as regretting mistakes (e.g., “kuyashii (be
regretful)” ⇔ “kuyama-re-ru (be regretted)”). The aspectual properties
captured by the tests in Table 2 are used to classify verbs into these
semantic classes.
On the other hand, the rules for the other suffixes are immature due to a lack
of examples: we could not identify even the necessary conditions for selecting
“ru,” “tei-ru,” etc. Inducing proper conditions for these suffixes requires a
larger example collection, as well as the discovery of another semantic
property and a set of linguistic tests for capturing it.
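The RULE totals in Table 4 can be recomputed from the per-suffix counts, which also makes the definitions of recall and precision concrete. The sketch assumes that inputs for which no rule fired count toward the recall denominator; this assumption is not stated in the text, but it is the reading under which the per-suffix rows add up to the reported totals. Names are hypothetical.

```python
from typing import Dict, Tuple

# (correct, gold, predicted) per suffix, copied from Table 4.
Row = Tuple[int, int, int]
CAD_C: Dict[str, Row] = {
    'ta (V_a="yes")': (3, 13, 3), 'ta (V_a="no")': (42, 44, 63),
    "re-ru": (12, 14, 19), "ru": (3, 9, 6),
    "tei-ru": (1, 5, 7), "ta / tei-ru": (2, 4, 2),
}
CAD_O: Dict[str, Row] = {
    'ta (V_a="yes")': (1, 6, 1), 'ta (V_a="no")': (18, 18, 29),
    "re-ru": (7, 13, 11), "ru": (0, 2, 5),
    "tei-ru": (2, 8, 6), "ta / tei-ru": (1, 2, 1),
}


def totals(rows: Dict[str, Row], no_rule_inputs: int) -> Tuple[int, int, int]:
    """Return (correct, recall denominator, precision denominator)."""
    correct = sum(c for c, _, _ in rows.values())
    # Assumption: inputs with no applicable rule still count as gold inputs.
    gold = sum(g for _, g, _ in rows.values()) + no_rule_inputs
    predicted = sum(p for _, _, p in rows.values())
    return correct, gold, predicted


for name, rows, no_rule in [("C^ad_c", CAD_C, 11), ("C^ad_o", CAD_O, 7)]:
    correct, gold, predicted = totals(rows, no_rule)
    print(f"{name}: recall {correct}/{gold} = {correct / gold:.0%}, "
          f"precision {correct}/{predicted} = {correct / predicted:.0%}")
```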
5 Conclusion and future work
In this paper, we focused on inter-categorial paraphrasing and reported on our
study of an issue in adjective-to-verb paraphrasing. Two general-purpose
resources and a task-specific rule-set have been handcrafted to generate
proper verbal suffixes. Although the rule-based model has achieved better
performance than a simple baseline model, there is plenty of room for
improvement.
Table 4: Recall and precision of determining verbal suffix for given adjective-verb pairs.

                      C^ad_c                        C^ad_o
Verbal suffix         Recall        Precision       Recall        Precision
ta (V_a=“yes”)        3/13          3/3             1/6           1/1
ta (V_a=“no”)         42/44         42/63           18/18         18/29
re-ru                 12/14         12/19           7/13          7/11
ru                    3/9           3/6             0/2           0/5
tei-ru                1/5           1/7             2/8           2/6
ta / tei-ru           2/4           2/2             1/2           1/1
No rule               for 11 inputs                 for 7 inputs
Total (RULE)          63/100 (63%)  63/100 (63%)    29/56 (52%)   29/53 (55%)
BL                    57/100 (57%)  57/148 (39%)    24/56 (43%)   24/83 (29%)
Future work includes (i) enlarging our two resources, as in (Dorr, 1997;
Habash and Dorr, 2003), while developing an effective construction method,
(ii) intrinsic evaluation of those resources, and, of course, (iii) enhancing
the paraphrasing models through further experiments with a larger test set.

References
B. J. Dorr. 1997. Large-scale dictionary construction for
foreign language tutoring and interlingual machine trans-
lation. Machine Translation, 12(4):271–322.
N. Habash and B. J. Dorr. 2003. A categorial variation
database for English. In Proceedings of the 2003 Human
Language Technology Conference and the North Ameri-
can Chapter of the Association for Computational Lin-
guistics (HLT-NAACL), pages 17–23.
IPA. 1987. IPA Lexicon of the Japanese language for com-
puters (Basic Verbs). Information-technology Promotion
Agency. (in Japanese).
IPA. 1990. IPA Lexicon of the Japanese language for com-
puters (Basic Adjectives). Information-technology Pro-
motion Agency. (in Japanese).
R. Jackendoff. 1990. Semantic structures. The MIT Press.
T. Kageyama. 1996. Verb semantics. Kurosio Publishers.
(in Japanese).
T. Kato, S. Hatakeyama, H. Sakamoto, and T. Ito. 2005.
Constructing Lexical Conceptual Structure dictionary for
verbs of Japanese origin. In Proceedings of the 11th An-
nual Meeting of the Association for Natural Language
Processing, pages 871–874. (in Japanese).
I. Mel’čuk and A. Polguère. 1987. A formal lexicon in
meaning-text theory (or how to do lexica with words).
Computational Linguistics, 13(3-4):261–275.
S. Sato. 2004. Identifying spelling variations of Japanese
words. In Information Processing Society of Japan SIG
Notes, NL-161-14, pages 97–104. (in Japanese).
K. Takeuchi, K. Inui, and A. Fujita. 2006. Construction of
compositional lexical database based on Lexical Concep-
tual Structure for Japanese verbs. In T. Kageyama, editor,
Lexicon Forum No.2. Hitsuji Shobo. (in Japanese).
The National Institute for Japanese Language. 2004. Word
list by semantic principles, revised and enlarged edition.
Dainippon Tosho. (in Japanese).
