Text Simplification for Reading Assistance: A Project Note
Kentaro Inui Atsushi Fujita Tetsuro Takahashi Ryu Iida
Nara Advanced Institute of Science and Technology
Takayama, Ikoma, Nara, 630-0192, Japan
{inui,atsush-f,tetsu-ta,ryu-i}@is.aist-nara.ac.jp
Tomoya Iwakura
Fujitsu Laboratories Ltd.
Kamikodanaka, Nakahara, Kawasaki, Kanagawa, 211-8588, Japan
iwakura.tomoya@jp.fujitsu.com
Abstract
This paper describes our ongoing research
project on text simplification for congenitally
deaf people. Text simplification, as we pursue it, is
the task of offering a deaf reader a syntactic and
lexical paraphrase of a given text to help her/him
understand what it means.
In this paper, we discuss the issues we should
address to realize text simplification and re-
port on the present results in three different
aspects of this task: readability assessment,
paraphrase representation and post-transfer er-
ror detection.
1 Introduction
This paper reports on our ongoing research into
text simplification for reading assistance. Potential
users targeted in this research are congenitally deaf
people (more specifically, students at (junior-)high
schools for the deaf), who tend to have difficulties
in reading and writing text. We are aiming at the
development of the technology of text simplification
with which a reading assistance system lexically and
structurally paraphrases a given text into a simpler
and plainer one that is thus more comprehensible.
The idea of using paraphrases for reading as-
sistance is not necessarily novel. For example,
Carroll et al. (1998) and Canning and Tait (1999)
report on a project that addresses syntactic
transforms aimed at making newspaper text
accessible to aphasics. Following this line of
research, in this project we address four unexplored
issues, described below, in addition to the user- and
task-oriented evaluation of the overall system.
Before going into detail, we first clarify these four
issues in the next section. We then report on the
present results on three of the four, namely
readability assessment, paraphrase representation
and post-transfer error detection, in the subsequent
sections.
2 Research issues and our approach
2.1 Readability assessment
The process of text simplification for reading as-
sistance can be decomposed into the following three
subprocesses:
a. Problem identification: identify which portions of
a given text will be difficult for a given user to
read,
b. Paraphrase generation: generate possible candi-
date paraphrases from the identified portions, and
c. Evaluation: re-assess the resultant texts to choose
the one in which the problems have been resolved.
Given this decomposition, it is clear that one of the
key issues in reading assistance is the problem of
assessing the readability or comprehensibility (see
footnote 1) of text because it is involved in
subprocesses (a) and (c).
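The decomposition into subprocesses (a) to (c) can be sketched as a small control loop. This is only an illustration under stated assumptions: the function `simplify` and the three toy stand-ins (`toy_is_difficult`, `toy_generate`, `toy_readability`) are hypothetical names introduced here, and the word-count readability score is a placeholder, not the readability model discussed in this paper.

```python
from typing import Callable, List


def simplify(
    sentence: str,
    is_difficult: Callable[[str], bool],
    generate: Callable[[str], List[str]],
    readability: Callable[[str], float],
) -> str:
    """Run subprocesses (a)-(c) once on a single sentence."""
    # (a) Problem identification: leave easy sentences untouched.
    if not is_difficult(sentence):
        return sentence
    # (b) Paraphrase generation: collect candidate paraphrases.
    candidates = generate(sentence)
    # (c) Evaluation: re-assess every text, the original included,
    # and keep the one that the readability function scores highest.
    return max([sentence] + candidates, key=readability)


# Toy stand-ins (assumptions for illustration only).
def toy_is_difficult(s: str) -> bool:
    return s.startswith("It was ")  # crude cleft detector


def toy_generate(s: str) -> List[str]:
    table = {"It was a Honda that John sold to Tom.": ["John sold a Honda to Tom."]}
    return table.get(s, [])


def toy_readability(s: str) -> float:
    return -len(s.split())  # fewer words counts as more readable


result = simplify(
    "It was a Honda that John sold to Tom.",
    toy_is_difficult,
    toy_generate,
    toy_readability,
)
print(result)
```

Note that the readability function is called in both step (a), implicitly through the difficulty test, and step (c), which is why readability assessment is central to the whole loop.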
Readability assessment is doubtlessly a tough is-
sue (Williams et al., 2003). In this project, however,
we argue that, if one targets only a particular popu-
lation segment and if an adequate collection of data
is available, then corpus-based empirical approaches
may well be feasible. We have already shown that
one can collect such readability assessment data by
conducting survey questionnaires targeting teachers
at schools for the deaf.
Footnote 1: In this paper, we use the terms readability
and comprehensibility interchangeably, while strictly
distinguishing them from legibility of each fragment
(typically, a sentence or paragraph) of a given text.
2.2 Paraphrase acquisition
One useful finding from the aforementioned surveys
is that a broad range of paraphrases can improve the
readability of text. A reading assistance system
should therefore be able to generate a sufficient
variety of paraphrases of a given input. To create
such a system, one needs to feed it with a large
collection of paraphrase patterns. Fortunately, the
acquisition of paraphrase patterns has been actively
studied in recent years:
• Manual collection of paraphrases in the context of
  language generation, e.g. (Robin and McKeown,
  1996),
• Derivation of paraphrases through existing lexical
  resources, e.g. (Kurohashi and Sakai, 1999),
• Corpus-based statistical methods inspired by the
  work on information extraction, e.g. (Jacquemin,
  1999; Lin and Pantel, 2001), and
• Alignment-based acquisition of paraphrases from
  comparable corpora, e.g. (Barzilay and McKeown,
  2001; Shinyama et al., 2002; Barzilay and
  Lee, 2003).
One remaining issue is how effectively these meth-
ods contribute to the generation of paraphrases in our
application-oriented context.
2.3 Paraphrase representation
One of the findings obtained in the previous stud-
ies for paraphrase acquisition is that the automatic
acquisition of candidates of paraphrases is quite re-
alizable for various types of source data but acquired
collections tend to be rather noisy and need manual
cleaning as reported in, for example, (Lin and Pan-
tel, 2001). Given that, it turns out to be important to
devise an effective way of facilitating manual correc-
tion and a standardized scheme for representing and
storing paraphrase patterns as shared resources.
Our approach is (a) first to define a fully expressible
formalism for representing paraphrases at the level
of tree-to-tree transformation and (b) then to devise
an additional layer of representation on top of it that
is designed to facilitate the handcoding of
transformation rules.
2.4 Post-transfer text revision
In paraphrasing, the morpho-syntactic informa-
tion of a source sentence should be accessible
throughout the transfer process since a morpho-
syntactic transformation in itself can often be a mo-
tivation or goal of paraphrasing. Therefore, such
an approach as semantic transfer, where morpho-
syntactic information is highly abstracted away as
in (Dorna et al., 1998; Richardson et al., 2001),
does not suit this task. Even granting that the
morpho-syntactic stratum is the appropriate level of
abstraction for representing paraphrasing/transfer
patterns, one must recall that semantic-transfer
approaches such as those cited above were motivated
mainly by the need to reduce the complexity of
transfer knowledge, which can become unmanageable
in morpho-syntactic transfer.
Our approach to this problem is to (a) leave the de-
scription of each transfer pattern underspecified and
(b) implement the knowledge about linguistic con-
straints that are independent of a particular trans-
fer pattern separately from the transfer knowledge.
There is a wide range of such transfer-independent
linguistic constraints. Typical examples are
constraints on morpheme connectivity, verb
conjugation, word collocation, and tense and aspect
forms in relative clauses.
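The separation of underspecified transfer patterns from transfer-independent constraints can be illustrated with a deliberately simple English analogue. This is a sketch under strong assumptions: the article constraint stands in for constraints such as morpheme connectivity or verb conjugation in Japanese, and the names `replace_word`, `fix_articles` and `transfer` are hypothetical and are not part of KURA.

```python
import re
from typing import Callable, Iterable

Rule = Callable[[str], str]
Revision = Callable[[str], str]


def replace_word(old: str, new: str) -> Rule:
    """Build an underspecified lexical rule: swap one word, check nothing else."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return lambda sentence: pattern.sub(new, sentence)


def fix_articles(sentence: str) -> str:
    """Rule-independent constraint: 'a' becomes 'an' before a vowel letter."""
    return re.sub(r"\b([Aa]) (?=[aeiouAEIOU])", r"\1n ", sentence)


def transfer(sentence: str, rule: Rule, revisions: Iterable[Revision]) -> str:
    """Apply one transfer rule, then every rule-independent revision step."""
    output = rule(sentence)
    for revise in revisions:
        output = revise(output)
    return output


first = transfer("She ate a fruit.", replace_word("fruit", "apple"), [fix_articles])
second = transfer("He has a pear.", replace_word("pear", "orange"), [fix_articles])
print(first)
print(second)
```

Neither lexical rule mentions articles, yet the same revision step repairs the output of both, which is the intended benefit of keeping such constraints outside the transfer knowledge.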
These four issues can be considered as different
aspects of the overall question of how one can make
the development and maintenance of a gigantic
resource for paraphrasing tractable. (1) The
introduction of readability assessment would free us
from cares about the purposiveness of each
paraphrasing rule in paraphrase acquisition.
(2) Paraphrase acquisition is obviously indispensable
for scaling up the resource. (3) A good formalism for
representing paraphrasing rules would facilitate
their manual refinement and maintenance.
(4) Post-transfer error detection and revision would
make the system tolerant to flaws in paraphrasing
rules.
While many researchers have addressed the issue
of paraphrase acquisition reporting promising results
as cited above, the other three issues have been left
relatively unexplored in spite of their significance in
the above sense. Motivated by this context, in the
rest of this paper, we address these remaining three.
3 Readability assessment
To the best of our knowledge, there have been no
reports on research to build a computational model
of the language proficiency of deaf people, except
for the remarkable reports by Michaud and McCoy
(2001). As a subpart of their research aimed at
developing the ICICLE system (McCoy and Master-
man, 1997), a language-tutoring application for deaf
learners of written English, Michaud and McCoy de-
veloped an architecture for modeling the writing pro-
ficiency of a user called SLALOM. SLALOM is de-
signed to capture the stereotypic linear order of ac-
quisition within certain categories of morphological
and/or syntactic features of language. Unfortunately,
the modeling method used in SLALOM cannot be
directly applied to our domain for three reasons.
• Unlike writing tutoring, in reading assistance,
  target sentences are in principle unlimited. We
  therefore need to take a wider range of
  morpho-syntactic features into account.
• SLALOM is not designed to capture the difficulty
  of any combination of morpho-syntactic features,
  which it is essential to take into account in reading
  assistance.
• Given the need to consider feature combinations,
  a simple linear order model that is assumed in
  SLALOM is unsuitable.
3.1 Our approach: We ask teachers
To overcome these deficiencies, we took yet an-
other approach where we designed a survey ques-
tionnaire targeting teachers at schools for the deaf,
and have been collecting readability assessment data.
In this questionnaire, we ask the teachers to compare
the readability of a given sentence with paraphrases
of it. The use of paraphrases is of critical importance
in our questionnaire since it makes manual readabil-
ity assessment significantly easier and more reliable.
3.1.1 Targets
We targeted teachers of Japanese or English liter-
acy at schools for the deaf for the following reasons.
Ideally, this sort of survey would be carried out
by targeting the population segment in question, i.e.,
deaf students in our study. In fact, pedagogists and
psycholinguists have made tremendous efforts to ex-
amine the language proficiency of deaf students by
giving them proficiency tests. Such efforts are very
important, but they have had difficulty in capturing
enough of the picture to develop a comprehensive
and implementable reading proficiency model of the
population due to the expense of extensive language
proficiency testing.
In contrast, our approach is an attempt to model
the knowledge of experts in this field (i.e., teaching
deaf students). The targeted teachers have not only
rich experiential knowledge about the language pro-
ficiency of their students but are also highly skilled in
paraphrasing to help their students’ comprehension.
Since such knowledge gleaned from individual ex-
periences already has some generality, extracting it
through a survey should be less costly and thus more
comprehensive than investigation based on language
proficiency testing.
3.1.2 Questionnaire
In the questionnaire, each question consists of sev-
eral paraphrases, as shown in Figure 1 (a), where
(A) is a source sentence, and (B) and (C) are para-
phrases of (A). Each respondent was asked to as-
sess the relative readability of the paraphrases given
for each source sentence, as shown in Figure 1 (b).
The respondent judged sentence (A) to be the most
difficult and judged (B) and (C) to be comparable.
A judgment that sentence $s_i$ is easier than sentence
$s_j$ means that $s_i$ is judged likely to be understood
by a larger subset of students than $s_j$. We asked
the respondents to annotate the paraphrases with
format-free comments, giving the reasons for their
judgments, alternative paraphrases, etc., as shown in
Figure 1 (b).
To make our questionnaire efficient for model ac-
quisition, we had to carefully control the variation in
paraphrases. To do that, we first selected around 50
morpho-syntactic features that are considered influ-
ential in sentence readability for deaf people. For
each of those features, we collected several sim-
ple example sentences from various sources (literacy
textbooks, grammar references, etc.). We then man-
ually produced several paraphrases from each of the
collected sentences so as to remove the feature that
characterized the source sentence from each para-
phrase. For example, in Figure 1, the feature char-
acterizing sentence (A) is a non-restrictive relative
clause (i.e., sentence (A) was selected as an example
of this feature). Neither (B) nor (C) has this feature.
We also controlled the lexical variety to minimize
the effect of lexical factors on readability;
specifically, we restricted the vocabulary to a
top-2000 basic word set (NIJL, 1991).
3.1.3 Administration
We administered a preliminary survey targeting
three teachers. Through the survey, we observed that
(a) the teachers largely agreed in their assessments of
relative readability, (b) their format-free comments
indicated that the observed differences in
readability were largely explainable in terms of the
morpho-syntactic features we had prepared, and (c) a
larger-scale survey was needed to obtain a
statistically reliable model. Based on these
observations, we conducted a more comprehensive
survey, in which we prepared 770 questions and sent
questionnaires with a random set of 240 of them to
teachers of Japanese or English literacy at 50 schools
for the deaf. We asked them to evaluate as many as
possible anonymously. We obtained 4080 responses in
total (8.0 responses per question).

Figure 1: Sample question and response
3.2 Readability ranking model
The task of ranking a set of paraphrases can be de-
composed into comparisons between two elements
combinatorially selected from the set. We consider
the problem of judging which of a given pair of para-
phrase sentences is more readable/comprehensible
for deaf students. More specifically, given
paraphrase pair $(s_i, s_j)$, our problem is to classify
it into either left ($s_i$ is easier), right ($s_j$ is
easier), or comparable ($s_i$ and $s_j$ are comparable).
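The decomposition of a ranking task into pairwise comparisons can be sketched as follows. The helper name `paraphrase_pairs` is hypothetical, and the sketch assumes that each question contributes every unordered pair of its sentences.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def paraphrase_pairs(sentences: Sequence[str]) -> List[Tuple[str, str]]:
    """Return every unordered pair (s_i, s_j) with s_i listed before s_j."""
    return list(combinations(sentences, 2))


question = ["A", "B", "C"]  # labels of the sentences in one question
pairs = paraphrase_pairs(question)
print(pairs)
```

A question with n sentences yields n(n-1)/2 pairs, so a source sentence with two paraphrases gives three comparisons.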
Once the problem is formulated this way, we can
use various existing techniques for classifier learn-
ing. So far, we have examined a method of using the
support vector machine (SVM) classification tech-
nique.
A training/testing example is a paraphrase pair
$(s_i, s_j)$ coupled with its quantified class label
$D(s_i, s_j) \in [-1, 1]$. Each sentence $s_i$ is
characterized by a binary feature vector $F_{s_i}$, and
each pair $(s_i, s_j)$ is characterized by a triple of
feature vectors
$\langle F^C_{s_i s_j}, F^L_{s_i s_j}, F^R_{s_i s_j} \rangle$, where
• $F^C_{s_i s_j} = F_{s_i} \wedge F_{s_j}$ (features shared by $s_i$ and $s_j$),
• $F^L_{s_i s_j} = F_{s_i} \wedge \overline{F_{s_j}}$ (features belonging only to $s_i$),
• $F^R_{s_i s_j} = \overline{F_{s_i}} \wedge F_{s_j}$ (features belonging only to $s_j$).
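As a minimal sketch of the pair representation, the three vectors can be computed element-wise from two binary feature vectors. The function name `pair_features` and the list-of-ints encoding are assumptions made for illustration.

```python
from typing import List, Tuple


def pair_features(f_i: List[int], f_j: List[int]) -> Tuple[List[int], List[int], List[int]]:
    """Return (common, left_only, right_only) for two binary feature vectors."""
    if len(f_i) != len(f_j):
        raise ValueError("feature vectors must have the same length")
    common = [a & b for a, b in zip(f_i, f_j)]            # shared by s_i and s_j
    left_only = [a & (1 - b) for a, b in zip(f_i, f_j)]   # only in s_i
    right_only = [(1 - a) & b for a, b in zip(f_i, f_j)]  # only in s_j
    return common, left_only, right_only


common, left_only, right_only = pair_features([1, 1, 0, 0], [1, 0, 1, 0])
print(common, left_only, right_only)
```

One plausible way to feed the triple to a classifier is to concatenate the three vectors; the text does not state how the triple is encoded, so that choice is left open here.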
$D(s_i, s_j)$ represents the difference in readability
between $s_i$ and $s_j$; it is computed in the following
way.
1. Let $T_{s_i s_j}$ be the set of respondents who assessed
   $(s_i, s_j)$.
2. Given the degree of readability respondent $t$
   assigned to $s_i$ ($s_j$), map it to real value
   $dor(t, s) \in [0, 1]$ so that the lowest degree maps to
   0 and the highest degree maps to 1. For example, the
   degree of readability assigned to (A) in Figure 1 (b)
   maps to around 0.1, whereas that assigned to (B)
   maps to around 0.9.
3. $$D(s_i, s_j) = \frac{1}{|T_{s_i s_j}|} \sum_{t \in T_{s_i s_j}} \bigl( dor(t, s_i) - dor(t, s_j) \bigr).$$
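The three steps above can be sketched in a few lines. The names `normalize` and `readability_difference` are hypothetical, and the linear mapping and the 1-to-5 scale in the example are assumptions, since the text only fixes the two endpoints of the mapping.

```python
from typing import List, Tuple


def normalize(degree: float, lowest: float, highest: float) -> float:
    """Map a raw degree linearly onto [0, 1] (lowest -> 0, highest -> 1)."""
    if highest <= lowest:
        raise ValueError("highest must exceed lowest")
    return (degree - lowest) / (highest - lowest)


def readability_difference(ratings: List[Tuple[float, float]]) -> float:
    """Average of dor(t, s_i) - dor(t, s_j) over the respondents t.

    Each tuple holds one respondent's normalized degrees for s_i and s_j.
    """
    if not ratings:
        raise ValueError("at least one respondent is required")
    return sum(dor_i - dor_j for dor_i, dor_j in ratings) / len(ratings)


# Two hypothetical respondents rating the pair ((A), (B)).
d = readability_difference([(0.1, 0.9), (0.2, 0.8)])
print(round(d, 3))
```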
Output score $Sc_M(s_i, s_j) \in [-1, 1]$ for input
$(s_i, s_j)$ was given by the normalized distance
between $(s_i, s_j)$ and the hyperplane.
3.3 Evaluation and discussion
To evaluate the two modeling methods, we con-
ducted a ten-fold cross validation on the set of 4055
paraphrase pairs derived from the 770 questions used
in the survey. To create a feature vector space, we
used 355 morpho-syntactic features. Feature annota-
tion was done semi-automatically with the help of a
morphological analyzer and dependency parser.
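For readers unfamiliar with the protocol, a ten-fold split over the 4055 pairs can be sketched as below. The function `k_fold_indices`, the shuffling and the fixed seed are assumptions; the text does not say how the folds were drawn.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; every index is held out exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for held_out in range(k):
        test = folds[held_out]
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, test


splits = list(k_fold_indices(4055, k=10))
print(len(splits), sorted({len(test) for _, test in splits}))
```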
The task was to classify a given paraphrase pair
into either left, right, or comparable. Model $M$'s
output class for $(s_i, s_j)$ was given by

$$Cls_M(s_i, s_j) = \begin{cases} \mathit{left} & (Sc_M(s_i, s_j) \le -\theta_m) \\ \mathit{right} & (Sc_M(s_i, s_j) \ge \theta_m) \\ \mathit{comparable} & (\text{otherwise}), \end{cases}$$

where $\theta_m \in [-1, 1]$ is a variable threshold used to
balance precision with recall.
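The decision rule can be sketched directly from the thresholds as printed above. The function name `classify` is hypothetical, and the sketch assumes a non-negative threshold so that the left and right ranges cannot overlap.

```python
def classify(score: float, theta: float) -> str:
    """Three-way decision from an output score in [-1, 1]."""
    if score <= -theta:
        return "left"
    if score >= theta:
        return "right"
    return "comparable"


scores = [-0.9, -0.3, 0.1, 0.6]  # hypothetical output scores
loose = [classify(s, 0.2) for s in scores]
strict = [classify(s, 0.5) for s in scores]
print(loose)
print(strict)
```

Raising the threshold turns borderline decisions into comparable, which is how the threshold trades recall for precision.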
We used the 473 paraphrase pairs that satisfied the
following conditions:
• $|D(s_i, s_j)|$ was not less than threshold $\theta_a$
  ($\theta_a = 0.5$). The answer of $(s_i, s_j)$ is given by
  $$Cls_{Ans}(s_i, s_j) = \begin{cases} \mathit{left} & (D(s_i, s_j) \le -\theta_a) \\ \mathit{right} & (D(s_i, s_j) \ge \theta_a). \end{cases}$$
• $(s_i, s_j)$ must have been assessed by more than one
  respondent, i.e., $|T_{s_i s_j}| > 1$.
• Agreement ratio $Agr(s_i, s_j)$ must be sufficiently
  high, i.e., $Agr(s_i, s_j) \ge 0.9$, where
  $Agr(s_i, s_j) = (for(s_i, s_j) - agst(s_i, s_j)) / |T_{s_i s_j}|$,
  and $for(s_i, s_j)$ and $agst(s_i, s_j)$ are the number of
  respondents who agreed and disagreed with
  $Cls_{Ans}(s_i, s_j)$, respectively.
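The three selection conditions can be combined into one filter, sketched below. The names `vote` and `gold_answer` are hypothetical. The sketch also assumes that a respondent agrees with the answer when the sign of that respondent's own difference points the same way, and that ties count as neither agreement nor disagreement; the text does not define agreement at this level of detail.

```python
from typing import List, Optional, Tuple


def vote(dor_i: float, dor_j: float) -> str:
    """Direction of one respondent's own judgment (assumed reading of 'agreed')."""
    diff = dor_i - dor_j
    if diff < 0:
        return "left"
    if diff > 0:
        return "right"
    return "comparable"


def gold_answer(
    ratings: List[Tuple[float, float]],
    theta_a: float = 0.5,
    min_agreement: float = 0.9,
) -> Optional[str]:
    """Return 'left' or 'right' if the pair passes all three filters, else None."""
    n = len(ratings)
    if n <= 1:  # must be assessed by more than one respondent
        return None
    d = sum(a - b for a, b in ratings) / n
    if abs(d) < theta_a:  # |D| must not be less than theta_a
        return None
    answer = "left" if d <= -theta_a else "right"
    opposite = "right" if answer == "left" else "left"
    votes = [vote(a, b) for a, b in ratings]
    agreement = (votes.count(answer) - votes.count(opposite)) / n
    return answer if agreement >= min_agreement else None


print(gold_answer([(0.1, 0.9), (0.2, 0.8), (0.0, 1.0)]))
print(gold_answer([(0.0, 1.0), (0.0, 1.0), (0.6, 0.5)]))
```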
We judged output class $Cls_M(s_i, s_j)$ correct if and
only if $Cls_M(s_i, s_j) = Cls_{Ans}(s_i, s_j)$. The overall
performance was evaluated based on recall $Rc$ and
precision $Pr$:

$$Rc = \frac{|\{(s_i, s_j) \mid Cls_M(s_i, s_j) \text{ is correct}\}|}{|\{(s_i, s_j) \mid Cls_{Ans}(s_i, s_j) \in \{\mathit{left}, \mathit{right}\}\}|}$$

$$Pr = \frac{|\{(s_i, s_j) \mid Cls_M(s_i, s_j) \text{ is correct}\}|}{|\{(s_i, s_j) \mid Cls_M(s_i, s_j) \in \{\mathit{left}, \mathit{right}\}\}|}.$$
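The two measures can be computed as in the following sketch, where `recall_precision` is a hypothetical helper and the label lists are invented for illustration. A comparable output never counts as correct here, because every answer in the evaluation set is either left or right.

```python
from typing import Sequence, Tuple

DECISIVE = ("left", "right")


def recall_precision(predicted: Sequence[str], gold: Sequence[str]) -> Tuple[float, float]:
    """Recall and precision over pairs whose gold answer is left or right."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must be aligned")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g and g in DECISIVE)
    n_gold = sum(1 for g in gold if g in DECISIVE)
    n_pred = sum(1 for p in predicted if p in DECISIVE)
    recall = correct / n_gold if n_gold else 0.0
    precision = correct / n_pred if n_pred else 0.0
    return recall, precision


rc, pr = recall_precision(
    ["left", "right", "comparable", "left"],
    ["left", "right", "right", "right"],
)
print(rc, pr)
```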
The model achieved 95% precision with 89% re-
call. This result confirmed that the data we collected
through the questionnaires were reasonably noiseless
and thus generalizable. Furthermore, both models
exhibited a clear trade-off between recall and preci-
sion, indicating that their output scores can be used
as a confidence measure.
4 Paraphrase representation
We represent paraphrases as transfer patterns be-
tween dependency trees. In this section, we propose
a three-layered formalism for representing transfer
patterns.
4.1 Types of paraphrases of concern
There are various levels of paraphrases as the fol-
lowing examples demonstrate:
(1) a. She burst into tears, and he tried to comfort
her.
b. She cried, and he tried to console her.
(2) a. It was a Honda that John sold to Tom.
b. John sold a Honda to Tom.
c. Tom bought a Honda from John.
(3) a. They got married three years ago.
b. They got married in 2000.
Lexical vs. structural paraphrases Example (1)
includes paraphrases of the single word “comfort”
and the canned phrase “burst into tears”. The sen-
tences in (2), on the other hand, exhibit structural
and thus more general patterns of paraphrasing. Both
types of paraphrases, lexical and structural para-
phrases, are considered useful for many applications
including reading assistance and thus should be
within the scope of our discussion.
Atomic vs. compositional paraphrases The pro-
cess of paraphrasing (2a) into (2c) is compositional
because it can be decomposed into two subpro-
cesses, (2a) to (2b) and (2b) to (2c). In develop-
ing a resource for paraphrasing, we have only to
cover non-compositional (i.e., atomic) paraphrases.
Compositional paraphrases can be handled if an ad-
ditional computational mechanism for combining
atomic paraphrases is devised.
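The idea of combining atomic paraphrases can be sketched with two toy rules for example (2). This is a string-level illustration only: the regular expressions and the names `regex_rule`, `compose`, `remove_cleft` and `sell_to_buy` are assumptions, whereas the rules discussed in this paper operate on dependency trees.

```python
import re
from typing import Callable, Optional, Sequence

Rule = Callable[[str], Optional[str]]


def regex_rule(pattern: str, template: str) -> Rule:
    """Build an atomic rule that rewrites a whole sentence or returns None."""
    compiled = re.compile(pattern)

    def apply(sentence: str) -> Optional[str]:
        match = compiled.fullmatch(sentence)
        return match.expand(template) if match else None

    return apply


def compose(rules: Sequence[Rule]) -> Rule:
    """Chain atomic rules; the composite fails as soon as one rule fails."""

    def apply(sentence: str) -> Optional[str]:
        current: Optional[str] = sentence
        for rule in rules:
            if current is None:
                return None
            current = rule(current)
        return current

    return apply


remove_cleft = regex_rule(r"It was (.+) that (\w+) (\w+) to (\w+)\.", r"\2 \3 \1 to \4.")
sell_to_buy = regex_rule(r"(\w+) sold (.+) to (\w+)\.", r"\3 bought \2 from \1.")
cleft_to_buy = compose([remove_cleft, sell_to_buy])

print(cleft_to_buy("It was a Honda that John sold to Tom."))
```

Only the two atomic rules need to be stored; the paraphrase from (2a) to (2c) falls out of their composition.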
Meaning-preserving vs. reference-preserving
paraphrases It is also useful to distinguish
reference-preserving paraphrases from meaning-
preserving ones. The above example in (3) is of the
reference-preserving type. This type of paraphrasing
requires the computation of reference to objects
outside discourse and thus should be excluded from
our scope for the present purpose.
4.2 Dependency trees (MDSs)
Previous work on transfer-based machine transla-
tion (MT) suggests that the dependency-based repre-
sentation has the advantage of facilitating syntactic
transforming operations (Meyers et al., 1996; Lavoie
et al., 2000). Following this, we adopt dependency
trees as the internal representations of target texts.
We suppose that a dependency tree consists of a set
of nodes each of which corresponds to a lexeme or
compound and a set of edges each of which repre-
sents the dependency relation between its ends. We
call such a dependency tree a morpheme-based de-
pendency structure (MDS). Each node in an MDS is
supposed to be annotated with an open set of typed
features that indicate morpho-syntactic and semantic
information. We also assume a type hierarchy in de-
pendency relations that consists of an open set of de-
pendency classes including dependency, compound,
parallel, appositive and insertion.
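A data structure along the lines described above might look like the following sketch. The class names `Node`, `Edge` and `MDS`, the head-to-dependent edge direction and the hand-built example are all assumptions made for illustration; they do not reproduce the internal representation used in KURA.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Dependency classes named in the text; the set is open, so more may be added.
DEPENDENCY_CLASSES = {"dependency", "compound", "parallel", "appositive", "insertion"}


@dataclass
class Node:
    node_id: str
    features: Dict[str, str] = field(default_factory=dict)  # open set of typed features


@dataclass
class Edge:
    head: str
    dependent: str
    relation: str = "dependency"


@dataclass
class MDS:
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, node_id: str, **features: str) -> None:
        self.nodes[node_id] = Node(node_id, dict(features))

    def add_edge(self, head: str, dependent: str, relation: str = "dependency") -> None:
        if head not in self.nodes or dependent not in self.nodes:
            raise KeyError("both ends of an edge must be existing nodes")
        if relation not in DEPENDENCY_CLASSES:
            raise ValueError(f"unknown dependency class: {relation}")
        self.edges.append(Edge(head, dependent, relation))

    def dependents(self, head: str) -> List[str]:
        return [e.dependent for e in self.edges if e.head == head]

    def find(self, **pattern: str) -> List[str]:
        """Ids of nodes whose features include every key/value in pattern."""
        return [
            n.node_id
            for n in self.nodes.values()
            if all(n.features.get(k) == v for k, v in pattern.items())
        ]


# "hon shika yoma-nai" (reads nothing but books), built by hand for illustration.
mds = MDS()
mds.add_node("n1", pos="noun", lex="hon")
mds.add_node("n2", pos="postp", lex="shika")
mds.add_node("n3", pos="verb", lex="yomu")
mds.add_node("n4", pos="aux_verb", lex="nai")
mds.add_edge("n2", "n1")
mds.add_edge("n3", "n2")
mds.add_edge("n4", "n3")
print(mds.find(pos="postp", lex="shika"), mds.dependents("n3"))
```

Matching a node against a partial feature specification, as `find` does, is the basic operation that a tree-to-tree transfer pattern would build on.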
4.3 Three-layered representation
Previous work on transfer-based MT systems
(Lavoie et al., 2000; Dorna et al., 1998) and
alignment-based transfer knowledge acquisition
(Meyers et al., 1996; Richardson et al., 2001) has
shown that transfer knowledge can be well
represented by declarative structure mapping
(transforming) rules, each of which typically consists
of a pair of source and target partial structures, as in
the middle of Figure 2.
Adopting such a tree-to-tree style of representation,
however, one has to address the issue of the
trade-off between expressibility and
comprehensibility. One may want a formalism of
structural transformation patterns that is powerful
enough to represent a sufficiently broad range of
paraphrase patterns. However, highly expressible
formalisms would make it difficult to create and
maintain rules manually.

Figure 2: Three-layered rule representation
(Diagram not reproduced. Layer labels in the figure:
simplified MDS transfer rule, MDS transfer rule, MDS
processing operators. Arrow labels: rule editing,
translation, compilation.)

Rule text shown in the figure:
  N shika V- nai  ->  V no wa N dake da.
  (someone does not V to nothing but N)   (it is only to N that someone does V)

Code shown in the figure:
  sp_rule(108, negation, RefNode) :-
    match(RefNode, X4=[pos:postp,lex: shika]),
    depend(X3=[pos:verb], empty, X4),
    depend(X1=[pos:aux_verb,lex: nai],
           X2=[pos:aux_verb*], X3),
    depend(X4, empty, X5=[pos:noun]),
    replace(X1, X6=[pos:aux_verb,lex: da]),
    substitute(X5, X12=[pos:noun]),
    move_dtrs(X5, X12),
    substitute(X3, X10=[pos:verb]),
                              :

Tree nodes shown in the figure (ids X1 to X12):
  [pos: postp, lex: shika (except)], [pos: aux_verb, lex: nai (not)],
  [pos: aux_verb*], [pos: verb], [pos: noun] on the source side;
  [pos: aux_verb, lex: da (copula)], [pos: postp, lex: wa (TOP)],
  [pos: noun, lex: no (thing)], [pos: postp, lex: dake (only)],
  and nodes marked (=X5), (=X2), (=X3) on the target side.
To mediate this trade-off, we devised a new layer
of representation to add on the top of the layer of
tree-to-tree pattern representation as illustrated in
Figure 2. At this new layer, we use an extended natu-
ral language to specify transformation patterns. The
language is designed to facilitate the task of hand-
coding transformation rules. For example, to define
the tree-to-tree transformation pattern given in the
middle of Figure 2, a rule editor needs only to spec-
ify its simplified form:
(4) N shika V- nai -> V no wa N dake da.
(Someone does V to nothing but N -> It is only to
N that someone does V)
A rule of this form is then automatically translated
into a fully-specified tree-to-tree transformation rule.
We call a rule of the latter form an MDS rewriting
rule (SR rule), and a rule of the former form a sim-
plified SR rule (SSR rule).
The idea is that most of the specifications of an SR
rule can usually be abbreviated if a means to auto-
matically complement it is provided. We use a parser
and macros to do so; namely, the rule translator com-
plements an SSR rule by macro expansion and pars-
ing to produce the corresponding SR rule specifica-
tions. The advantages of introducing the SSR rule
layer are the following:
• The SSR rule formalism allows a rule writer to
  edit rules with an ordinary text editor, which
  makes the task of rule editing much more efficient
  than providing her/him with a GUI-based complex
  tool for editing SR rules directly.
• The use of the extended natural language also
  has the advantage of improving the readability of
  rules for rule writers, which is particularly
  important in group work.
• To parse SSR rules, one can use the same parser
  as that used to parse input texts. This also
  improves the efficiency of rule development because
  it significantly reduces the burden of maintaining
  the consistency between the POS-tag set used for
  parsing input and that used for rule specifications.
The SSR rule layer shares underlying motiva-
tions with the formalism reported by Hermjakob et
al. (2002). Our formalism is, however, considerably
extended so as to be licensed by the expressibility of
the SR rule representation and to be annotated with
various types of rule applicability conditions includ-
ing constraints on arbitrary features of nodes, struc-
tural constraints, logical specifications such as dis-
junction and negation, closures of dependency rela-
tions, optional constituents, etc.
The two layers for paraphrase representation
are fully implemented on our paraphrasing engine
KURA (Takahashi et al., 2001) coupled with another
layer for processing MDSs (the bottom layer
illustrated in Figure 2). The whole system of KURA
and part of the transfer rules implemented on it
(see Section 5 below) are available at
http://cl.aist-nara.ac.jp/lab/kura/doc/.
5 Post-transfer error detection
What kinds of transfer errors tend to occur in
lexical and structural paraphrasing? To find out, we
conducted a preliminary investigation. This section
reports a summary of the results. See (Fujita and
Inui, 2002) for further details.
We implemented over 28,000 transfer rules for
Japanese paraphrases on the KURA paraphrasing en-
gine based on the rules previously reported in (Sato,
1999; Kondo et al., 1999; Kondo et al., 2001; Iida et
al., 2001) and existing lexical resources such as the-
sauri and case frame dictionaries. The implemented
rules ranged from such lexical paraphrases as those
that replace a word with its synonym to such syn-
tactic/structural paraphrases as those that remove a
cleft construction from a sentence, divide a sentence,
etc. We then fed KURA with a set of 1,220 sentences
randomly sampled from newspaper articles and ob-
tained 630 transferred output sentences.
The following are the tendencies we observed:
• The transfer errors observed in the experiment
  exhibited a wide variety, from morphological
  errors to semantic and discourse-related ones.
• Most types of errors tended to occur regardless
  of the types of transfer. This suggests that if one
  creates an error detection module specialized for
  a particular error type, it works across different
  types of transfer.
• The most frequent error type involved
  inappropriate conjugation forms of verbs. It is,
  however, a matter of morphological generation and
  can be easily resolved.
• Errors in regard to verb valency and selectional
  restriction also tended to be frequent and fatal,
  and thus should be given priority as a research
  topic.
• The next most frequent error type was related to
  the difference of meaning between near synonyms.
  However, this type of error could often be detected
  by a model that could detect errors of verb
  valency and selectional restriction.
Based on these observations, we concluded that
the detection of incorrect verb valency and
verb-complement cooccurrence was one of the most
serious problems and should be given priority as a
research topic. We are now conducting experiments
on empirical methods for detecting this type of
error (Fujita et al., 2003).
6 Conclusion
This paper reported on the present results of our
ongoing research on text simplification for reading
assistance targeting congenitally deaf people. We
raised four interrelated issues that we needed to
address to realize this application and presented our
previous activities focusing on three of them:
readability assessment, paraphrase representation and
post-transfer error detection.
Regarding readability assessment, we proposed a
novel approach in which we conducted questionnaire
surveys to collect readability assessment data and
took a corpus-based empirical method to obtain a
readability ranking model. The results of the sur-
veys show the potential impact of text simplification
on reading assistance. We conducted experiments on
the task of comparing the readability of a given para-
phrase pair and obtained promising results by SVM-
based classifier induction (95% precision with 89%
recall). Our approach should be equally applicable
to other population segments such as aphasic
readers and second-language learners. Our next steps
include the investigation of the drawbacks of the
present bag-of-features modeling approach. We also
need to consider a method to introduce the notion
of user classes (e.g. beginner, intermediate and
advanced). Textual aspects of readability will also
need to be considered, as discussed in (Inui and
Nogami, 2001; Siddharthan, 2003).
Regarding paraphrase representation, we presented
our revision-based lexico-structural paraphrasing
engine. It provides a fully expressible scheme for
representing paraphrases, while preserving the ease
of handcrafting paraphrasing rules by providing an
extended natural language as a means of pattern
editing. We have handcrafted over
a thousand transfer rules that implement a broad
range of lexical and structural paraphrasing.
The problem of error detection is also critical.
When we find an effective solution to it, we will be
ready to integrate the technologies into an applica-
tion system of text simplification and conduct user-
and task-oriented evaluations.
Acknowledgments
The research presented in this paper was partly
funded by PREST, Japan Science and Technology
Corporation. We thank all the teachers at the schools
for the deaf who cooperated in our questionnaire sur-
vey and Toshihiro Agatsuma (Joetsu University of
Education) for his generous and valuable coopera-
tion in the survey. We also thank Yuji Matsumoto
and his colleagues (Nara Advanced Institute of Sci-
ence and Technology) for allowing us to use their
NLP tools ChaSen and CaboCha, Taku Kudo (Nara
Advanced Institute of Science and Technology) for
allowing us to use his SVM tool, and Takaki Makino
and his colleagues (Tokyo University) for allow-
ing us to use LiLFeS, with which we implemented
KURA. We also thank the anonymous reviewers for
their suggestive and encouraging comments.

References
Barzilay, R. and McKeown, K. 2001. Extracting para-
phrases from a parallel corpus. In Proc. of the 39th An-
nual Meeting and the 10th Conference of the European
Chapter of Association for Computational Linguistics
(EACL), pages 50–57.
Barzilay, R. and Lee, L. 2003. Learning to paraphrase: an
unsupervised approach using multiple-sequence align-
ment. In Proc. of HLT-NAACL.
Canning, Y. and Tait, J. 1999. Syntactic simplification of
newspaper text for aphasic readers. In Proc. of the 22nd
Annual International ACM SIGIR Conference (SIGIR).
Carroll, J., Minnen, G., Canning, Y., Devlin, S. and Tait, J.
1998. Practical simplification of English newspaper
text to assist aphasic readers. In Proc. of AAAI-98
Workshop on Integrating Artificial Intelligence and As-
sistive Technology.
Dorna, M., Frank, A., Genabith, J. and Emele, M. 1998.
Syntactic and semantic transfer with F-structures. In
Proc. of COLING-ACL, pages 341–347.
Fujita, A. and Inui, K. 2002. Decomposing linguistic
knowledge for lexical paraphrasing. In Information
Processing Society of Japan SIG Technical Reports,
NL-149, pages 31–38. (in Japanese)
Fujita, A., Inui, K. and Matsumoto, Y. 2003. Automatic
detection of verb valency errors in paraphrasing. In In-
formation Processing Society of Japan SIG Technical
Reports, NL-156. (in Japanese)
Hermjakob, U., Echihabi, A. and Marcu, D. 2002. Nat-
ural language based reformulation resource and Web
exploitation for question answering. In Proc. of the
TREC-2002 Conference.
Iida, R., Tokunaga, Y., Inui, K. and Eto, J. 2001. Explo-
ration of clause-structural and function-expressional
paraphrasing using KURA. In Proc. of the 63rd Annual
Meeting of Information Processing Society of Japan,
pages 5–6. (in Japanese).
Inui, K. and Nogami, M. 2001. A paraphrase-based explo-
ration of cohesiveness criteria. In Proc. of the Eighth
European Workshop on Natural Language Generation,
pages 101–110.
Jacquemin, C. 1999. Syntagmatic and paradigmatic rep-
resentations of term variations. In Proc. of the 37th
Annual Meeting of the Association for Computational
Linguistics (ACL), pages 341–349.
Kondo, K., Sato, S. and Okumura, M. 1999. Paraphras-
ing of “sahen-noun + suru”. Journal of Information
Processing Society of Japan, 40(11):4064–4074. (in
Japanese).
Kondo, K., Sato, S. and Okumura, M. 2001. Para-
phrasing by case alternation. Journal of Informa-
tion Processing Society of Japan, 42(3):465–477. (in
Japanese).
Kurohashi, S. and Sakai, Y. 1999. Semantic analysis of
Japanese noun phrases: a new approach to dictionary-
based understanding. In Proc. of the 37th Annual Meet-
ing of the Association for Computational Linguistics
(ACL), pages 481–488.
Lavoie, B., Kittredge, R., Korelsky, T. and Rambow, O. 2000.
A framework for MT and multilingual NLG systems
based on uniform lexico-structural processing. In Proc.
of ANLP-NAACL.
Lin, D. and Pantel, P. 2001. Discovery of inference rules
for question-answering. Natural Language Engineer-
ing, 7(4):343–360.
McCoy, K. F. and Masterman (Michaud), L. N. 1997. A
Tutor for Teaching English as a Second Language for
Deaf Users of American Sign Language. In Proc. of
ACL/EACL ’97 Workshop on Natural Language Pro-
cessing for Communication Aids.
Meyers, A., Yangarber, R. and Grishman, R. 1996. Align-
ment of shared forests for bilingual corpora. In Proc.
of the 16th International Conference on Computational
Linguistics (COLING), pages 460–465.
Michaud, L. N. and McCoy, K. F. 2001. Error profiling:
toward a model of English acquisition for deaf learn-
ers. In Proc. of the 39th Annual Meeting and the 10th
Conference of the European Chapter of Association for
Computational Linguistics (EACL), pages 386–393.
NIJL, the National Institute for Japanese Language. 1991.
Nihongo Kyôiku-no tame-no Kihon-Goi Chôsa (The
basic lexicon for the education of Japanese). Shuei
Shuppan, Japan. (In Japanese)
Richardson, S., Dolan, W., Menezes, A. and Corston-
Oliver, M. 2001. Overcoming the customization bottle-
neck using example-based MT. In Proc. of the 39th An-
nual Meeting and the 10th Conference of the European
Chapter of Association for Computational Linguistics
(EACL), pages 9–16.
Robin, J. and McKeown, K. 1996. Empirically designing
and evaluating a new revision-based model for sum-
mary generation. Artificial Intelligence, 85(1–2):135–
179.
Sato, S. 1999. Automatic paraphrase of technical pa-
pers’ titles. Journal of Information Processing Society
of Japan, 40(7):2937–2945. (in Japanese).
Shinyama, Y., Sekine, S., Sudo, K. and Grishman,
R. 2002. Automatic paraphrase acquisition from news
articles. In Proc. of HLT, pages 40–46.
Siddharthan, A. 2003. Preserving discourse structure
when simplifying text. In Proc. of European Workshop
on Natural Language Generation, pages 103–110.
Takahashi, T., Iwakura, T., Iida, R., Fujita, A. and Inui, K.
2001. KURA: a transfer-based lexico-structural para-
phrasing engine. In Proc. of the 6th Natural Language
Processing Pacific Rim Symposium (NLPRS) Workshop
on Automatic Paraphrasing: Theories and Applica-
tions, pages 37–46.
Williams, S., Reiter, E. and Osman, L. 2003. Experiments
with discourse-level choices and readability. In Proc. of
European Workshop on Natural Language Generation,
pages 127–134.
