Producing More tleadable Extracts by Revising Them 
Hidetsugu Nanba (~) and Manabu Okumura (f, ~) 
~School of Informatioi1 Science 
Japan Advanced Institute of Science and :t'echnology 
++Precision and intelligence Laboratory 
Tokyo Institute of Technology 
nanbaOj aist. ac. jp, oku©p:. "c2"cech. ac. jp 
Abstract 
In this paper, we first experimentally investigated 
the factors that make extracts hard to read. We did 
this by having human subjects try to revise extracts 
to produce more readable ones. We then classified 
the factors into five, most of which are related to 
cohesion, after which we devised revision rnles for 
each factor, and partially implemented a system that 
revises extracts. 
1 Introduction 
The increasing number of on-line texts available has 
resulted in automatic text summarization becom- 
iI,g a major research topic in the NLP community. 
The main approach is to extract important sentences 
fiom the texts, and the main task in this approach 
is: that of evaluating the importance of sentences 
\[MIT, 1999\]. This producing of extracts - that is, 
sets of extracted important sentences - is thought to 
be easy, and has therefore long been the main way 
that texts are summarized. As Paice pointed out, 
however, computer-produced extracts tend to suffer 
from a 'lack of cohesion' \[Paice, 1990\]. For example, 
the antecedents corresponding to anaphors in an ex- 
tract are not always included in the extract. This 
often makes tile extracts hard to read. 
In the work described in this paper, we there- 
fore developed a method for making extracts easier 
to read by revising them. We first experimentally 
investigated the factors that make extracts hard to 
read. We did this by having human subjects try to 
revise extracts to produce more readable ones. We 
then classified the factors into five, most of which 
are related to cohesion \[Halliday et al., 1976\], after 
which we devised revision rules tbr each factor, and 
partially implemented a system that revises extract- 
s. We then evaluated our system by comparing its 
revisions with those produced by human subjects 
and also by comparing the readability judgments of 
human subjects between the revised and original ex-- 
tracts. 
In tile following sections we briefly review relat- 
ed works, describe our investigation of what make 
extracts hard to read, and explain our system for 
revising extracts to make tl~em more readable. Fi- 
nally, we describe our evaluation of the system and 
discuss the results of that evaluation. 
2 Related Works 
Many investigators have tried to measure the read- 
ability of texts \[Klare, 1963\]. Most of them have e-- 
valuated well-formed texts produced by people, and 
used two measures: percentage of familiar words in 
the texts (word level) and the average length of the 
sentences in the texts (syntactic level). These mea- 
sures, however, do not necessarily reflect the actu- 
al readability of computer-produced extracts. We 
therefore have to take into account other factors that 
might reduce the readability of extracts. 
One of them could be a lack of cohesion. Italli- 
day and ttasan \[ttalliday et al., 1976\] described five 
kinds of cohesion: reference, substitution, ellipsis, 
conjunction, and lexical cohesion. 
Minel \[Minel et al., 1997\] tried to measure the 
readability of extracts in two ways: by counting the 
number of anaphors in an extract that do not have 
antecedents in the extract, and by counting the num- 
ber of sentences which are not included in an extract 
but closely connected to sentences in the extract. 
We therefore regard kinds of cohesion as impor- 
tant in trying to classify tile factors that make ex- 
tracts less readable in the next section. 
One of the notable previous works dealing with 
ways to produce more cohesive extracts is that 
of Paiee \[Paiee, 1990\]. Mathis presented a frame- 
work in which a pair of short sentences are com- 
bined into one to yield a more readable extract 
\[Mathis et al., 197,3\]. We think, however, that none 
of the previous studies have adequately investigated 
the factors making extracts hard to read. 
Some investigators have compared human- 
produced abstracts with the original texts and inves- 
tigated how people revise texts to produce abstracts 
1071 
\[Kawahara, 1989, Jing, 1999\]. Revision is thought to 
be done for (at least) the following three purposes: 
(1) to shorten texts, 
(2) to change the style of texts, 
(3) to make texts more readable. 
Jing \[aing, 1999\] is trying to implement a human 
summarization model that includes two revision op- 
erations: reduction (1) and combination (3). Mani 
\[Mani et al., 1999\] proposed a revision system that 
uses three operations: elimination (1), aggregation 
(1), and smoothing (1, 3). Mani showed that his 
system can make extracts more informative with- 
out degrading their readability. The present work, 
however, is concerned not with improving readabili- 
ty but with improving the informativeness. 
3 Less Readability of Extracts 
To investigate the revision of extracts experimental- 
ly, we had 12 graduate students produce extracts 
of 25 newspaper articles from the NIHON KEIZAI 
SHINBUN, the average length of which was 30 sen- 
tences. We then asked them to revise the extracts 
(six subjects per extract). 
We obtained extracts containing 343 revisions, 
made for any of the three purposes listed in the last 
section. We selected the revisions for readability, 
and classified them into 5 categories, by taking into 
account the categories of cohesion by Halliday and 
Hasan\[Halliday et al., 1976\]. Table 1 shows the sum 
of the investigation. 
Next, we illustrate each category of revisions. In 
the examples, darkened sentences are those that are 
not included in extracts, but are shown for explana- 
tion. The serial number in the original text is also 
shown at the beginning of sentences a 
A) Lack of conjunctive expressions/presence 
of extraneous conjunctive expressions 
The relation between sentences 15 and 16 is ad- 
versative, because there is a conjunctive 'L. b'\[l 
(However)' at the beginning of sentence 16. But 
because sentence 15 is not in the extract, 'L h'L 
(However)' is considered unnecessary and should be 
deleted. Conversely, lack of conjunctive expression- 
s might cause the relation between sentences to be 
difficult to understand. In such a case, a suitable 
conjunctive expression should be added. For these 
tasks, discourse structure analyzer is required. 
We use the following three tags to show revisions. 
< ,,t,~ > F,~ < I,aa >: add a new expression ~. 
< ,~t > B z < In,4 >: delete an expression ~.. 
< ,,I, ~t > t% < /r,r >: replace an expression ~3 with B41 
(The company plans to give women more opportu- 
nity to work by employing fidl-time workers.) 
15. ~KmJUI~TIY, h ~ ~ ,¢J<, y.~,J: -) ~" !cV~.:~¢lnm 
(Since there have been no similar cases before, the 
project that women join is now in a hard situation, 
though the company puts hopes on it.) 
16. <del> L b'L </del> \[~{~/.~" _'_O0-~':12"~e(.~ ~4.s'~t~ 
)\ f { J~ iiriI~\] Z ~,i#l-: t{lJ 6\[i:~.~ ~ if(Ill ~ \]EYb ~~., 5. 
(<del>However,</del> it is making efforts of ref- 
ormation which will be profitable both for the com- 
pany and the female workers.) 
B) Syntactic complexity 
2. (fl:flEf~ij;before revision) 
(It is the first project in telecommunication busi- 
ness, which President Kashio wants to be one of 
the central businesses in the future, and it is also 
the preparation for expanding the business to cel- 
lular phone.) $ 
(/f'f iE ~{;aft er revision) 
P~ rE- f~ !:~ ?.k. ~ I\] 6 -I~ ~m ~':~,#:.~ ~ ~ I,: ~-~ ~ ~ f~ ~, -~- 
(It is the first project in telecommunication busi- 
nesses, which President Kashio wants to be one of 
the central business in the future.) 
6. 
(It is also the preparation for expanding the busi- 
ness to cellular phone.) 
Longer sentences tend to have a syntactically 
complex structure \[Klare, 1963\], and a long com- 
pound sentence should generally be divided into two 
simpler sentences. It has also been claimed, however, 
that short coordinate sentences should be combined 
\[Mathis et al., 1973\]. 
C) Redundant repetition 
00_ b,:~;.~,~l: \]\ ~(rb. 
(The new product 'ECHIGO BEST 100' which 
ECHIGO SEIKA released this April is popular a- 
mong housewives.) 
(<rep The company> ECHIGO SEIKA </rep> 
has been making use of NTT Captain system since 
1987.) 
If subjects of adjacent sentences in an extract are 
the same, as in the above example, readers might 
think they are redundant. In such a ease, repeated 
expressions should be omitted or replaced by pro- 
nouns. In this example, the anaphoric expression 
'\[iiJ ~1: (the eoinpany)' is used instead of the original 
expression. 
1072 
Table 1: Factors of less readability and their revision methods 
~ factors 
-A-- 
B syntactic complexity 
C 
lack of conjunctive expressions/ 
presence of extraneous 
conjunctive expressions 
redundant repetition 
lack of information 
revision methods 
add/delete conjunctive expressions 
combine two sentences; divide a sentence into two 
prononfinalizei onfit expressions; 
add demonstratives 
supplement omitted expressions; 
replace anaphors by antecedents; delete anaphors 
required techniques 
discourse structure 
analysis 
anaphora and 
ellipsis resolution 
add supplementary information information extraction 
lack of adverbial particles; add/delete adverbial particles 
presence of extraneous 
adverbial particles 
D) Lack of information 
~:' :,.-- Y--25. 
(These are the car maker C\[tRYSLER and the com- 
puter maker COMPAC.) 
(We are now in a vicious circle where the layoffs by 
companies discourage consumptions, which in turn 
results in lower sales.) 
9. <del> ~"q~'(". </del> "; 9 4 2. ~--t)' )<~\[,hJL-C 
(<:del>In such a situation,</del> CHRYSLER has 
done well, because its management strategy exactly 
fits the age of low growth.) 
In this example, the referent of '~l~-(" (in such 
a situation)' in sentence 9 is sentence 8, which is 
not in the extract. In such a case, there are two 
ways to revise: to replace the anaphoric expression 
with its antecedent, or to delete the expression. The 
re.vision in the example is the latter one. For the 
task, a method for anaphora and ellipsis resolution 
is required. 
(Masayoshi Son, CEO of Softhank, is now suffering 
from jet lag.) 
3. <:add> '/ 7 I" ';D q'~ <: /add> ~'~:~t:t-~#' ?\[ig~#' 
ROM ~2 f'l{'~ 12 '~' :~ I- 0'sltJi;Yd. 
(CEO Son <add> of Softbank <:/add> is eager to 
sell softwares using CD-ROM, and he think it is a 
big project for his company.) 
In this second example, since 'CEO Son' appears 
without the name of the company in the extrac- 
t, without any background knowledge, we may not 
u:nderstand what company Mr. Son is the CEO 
of. Therefore, the name of the company 'Softbank' 
should be added as the supplementary information. 
The task requires a method for information extrac- 
tion or at least named entity extraction. 
E) lack of adverbial particles/presence of 
extraneous adverbial particles 
'2,6. ~,"\["1~ F - :-. • t/q,~dl-(-J~atil I l~tl,I,jlq~,'l, tl?D~.?-~z~.,.vb 
(It is a good opportunity to promote the mutual un- 
derstanding between Japan and Vietnam that M- 
r. Do MUOI, a chief secretary of Vietnam, visits 
J t~pan.) 
(From a viewpoint of security, Vietnam will be a 
key country in Asia.) 
30..~':}','fib}d 7J~\[(li~C " <del> 4, </del> ~ ,tl-~.~O~.~e2~l~ 
(Japanese government should consider long-term e- 
conomical support<del>, too </del>.) 
In the above example, there is an adverbial parti- 
cle "5 (, too)' and we can find that sentences 29 and 
30 are paratactical. But, because sentence 29 is not 
in the extract, the particle '-L (, too)' is unnecessary 
and should be deleted. 
4 Revision System 
Our system uses the Japanese public-domain an- 
alyzers JUMAN \[Kurohashi et al., 1998\] and KNP 
\[t(urohashi, 1998\] morphologically and syntactically 
analyze an original newspaper article and its extrac- 
t. It then applies revisions rules to the extract re- 
peatedly, with reference to the original text, until no 
rules can revise the extract further. 
4.1 Revision Rules 
Because tile techniques needed for dealing with all 
the categories of revisions dealt with in the previous 
1073 
section were not available, we devised and imple- 
mented revision rules only for factors (A), (C), and 
(D) in Table 1 by using JPerl. 
a) Deletion of conjunctive expressions 
We prepared a list of 52 conjunctive expres- 
sions, and made it a rule to delete each of them 
whenever the extract does not include the sentence 
that expression is related. To identify the sen- 
tence related to the sentence by the conjunction 
\[Mann et al., 1986\], the system performs partial dis- 
course structure analysis taking into account all sen- 
tences within three sentences of the one containing 
the conjunctive expression. 
The implementation of our partial discourse 
structure analyzer was based on Fukumoto's dis- 
course structure analyzer \[Fukumoto, 1990\]. It in- 
fers the relationship between two sentences by refer- 
ring to the conjunctive expressions, topical words, 
and demonstrative words. 
c) Omission of redundant expressions 
If subjects (or topical expressions marked with 
topical postposition 'wa') of adjacent sentences in 
an extract were the same, the repeated expressions 
were considered redundant and were deleted. 
d-l) Deletion of anaphors 
To treat anaphora and ellipsis successfully, we 
would need a mechanism for anaphora and ellipsis 
resolution (finding the antecedents and omitted ex- 
pressions). Because we have no such mechanism, 
we implement a rule with ad hoc heuristics: If an 
anaphor appears at the beginning of a sentence in 
an extract, its antecedent must be in the preceding 
sentence. Therefore, if that sentence was not in the 
extract, the anaphor was deleted. 
d-2) Supplement of omitted subjects 
If a subject in a sentence in an extract is omit- 
ted, the revision rule supplements the subject from 
the nearest preceding sentence whose subject is not 
omitted in the original text. This rule is implement- 
ed by using heuristics similar to the above revision 
rule. 
5 Evaluation of Revision Sys- 
tem 
We evaluated our revision system by comparing its 
revisions with those by human subjects (evaluation 
1), and comparing readability judgments between 
the revised and original extracts (evaluation 2). 
5.1 Evaluation 1: comparing system 
revisions and human revisions 
Because revision is a subjective task, it was not easy 
to prepare an answer set of revisions to which our 
system's revisions could be compared. The revisions 
that more subjects make, however, can be consid- 
ered more reliable and more likely to be necessary. 
When comparing the revisions made by our system 
with those made by human subjects, we therefore 
took into account the degree of agreement among 
subjects. 
For this evaluation, we used 31 newspaper ar- 
ticles (NIHON KEIZAI SHINBUN) and their ex- 
tracts. They were different from the articles used 
for making rules. Fifteen of extracts are taken fronl 
Nomoto's work \[Nomoto et al., 1997\], and the rest 
were made by our group. The average numbers of 
sentences in the original articles and the extracts 
were 25.2 and 5.1. 
Each extract was revised by five subjects who 
had been instructed to revise the extracts to make 
them more readable and had been shown the 5 ex- 
amples in section 3. As a result, we obtained 167 
revisions in total. The results are listed in Table 2. 
Table 2: The number of revisions 
I \]revision methods I total I 
A add(61)/delete(ll) 
conjunctive expressions 72 
B combine two sentences(2) 
divide a sentence into two(6) 8 
C pronominalize(5); omit expressions(3) 
add demonstratives(8) 16 
D supplement omitted expressions(lI) 
replace anaphors by antecedents(10) 
delete anaphors(15) 36 
add supplementary information(26) 26 
E delete adverbial particles(4) 
add adverbial particles(5) 9 
167 
We compared our system's revisions with the an- 
swer set comprising revisions that more than two 
subjects made. And we used recall (R) and preci~ 
sion (P) as measures of the system's performances. 
( Numberofsystem'srevisions ) 
matched to the answer R= 
Number of revisions in the answer 
( Number ofsystem'srevisions ) 
matched to the answer P= 
Number of systemfs revisions 
Evaluation results are listed in Table 3. As in 
Table 3, the coverage of our revision rules is rather 
small (about 1/4) in the whole set of revisions in 
Table 2. It is true that the experiment is rather small 
and can be considered as less reliable. Though it is 
less reliable, some of the implemented rules can cover 
most of the necessary revisions by human subjects. 
However, precision should be improved. 
1074 
Table 3: Comt)arison between tile revisions by hu- 
nlan aud our system 
re visionrules J\] I-( I P I 
a(total:ll) 2/2 2/5 
c(total:3) 0/0 0/0 
d-l(total:15) 4/5 4/7 
d-2(total:ll) II 2/4 2/10 
5.2 Evaluation 2: colnparillg human 
readability judgments of original 
and revised extracts 
In the second evaluation, using the same 31 texts 
as in evaluation 1, we asked five human subject- 
s to rank the following four kinds extracts in the 
order of readability: the original extract (without 
revision)(NON-REV), human-revised ones (REV-1 
and REV-2), and the one revised by our system 
(REV-AUTO). REV-1 and REV-2 were respective- 
ly extracts revised in the cases where more than one 
and more than two subjects agreed to revise. 
We considered a judgment by tile majority (more 
than two subjects) to be reliable. The results are 
listed in Table 4. The column 'split' in Table 4 indi- 
cates the number of cases where no majority could 
agree. The results show that both REV-1 and REV- 
2 extracts were more readable than NON-REV ex- 
tracts and that REV-2 extracts might be better than 
REV-1 extracts, since the number of 'worse' evalua- 
tions was smaller for REV-2 extracts. 
Table 4: Comparison of readability among original 
extracts and revised ones 
\]\] better same \] worse \] split 
REV-2vs. NON \] 15 1: 2 2 
REV-1 vs. NON 22 7 1 
AUTO vs. NON \[\[ 2 13 \[ 12 \] 0 
In comparing REV-AUTO with NON-REV, we 
use 27 texts where the readability does not de- 
grade in REV-2, since the readability cannot im- 
prove with revisions by our system in those texts 
where the readability degrades even with human re- 
visions. Even with those texts, however, in ahnost 
half the cases, the readability of the revised extrac- 
t was worse than that of the original extract. The 
main reason is that the revision system supplement- 
ed incorrect subjects. 
6 Discussion 
Although the results of the evaluation are encour- 
aging, they also show that our system needs to be 
improved. We have to impleinent inore revision rules 
to enlarge the coverage of our system. One of the 
most frequent revisions is to add conjunctions(37%). 
We also need to reform our revision rules into more 
thorough implementation. To improve our system, 
we think it is necessary to develop a robust discourse 
structure analyzer, a robust mechanism for anapho- 
ra and ellipsis resolution, and a robust system of 
extracting named entities. They are under develop- 
lllent now. 
7 Conclusion 
In this paper we described our investigation of the 
factors that make extracts less readable than they 
should be. We had human subjects revise extracts 
to made them more readable, and we classified the 
factors into five categories. We then devised revision 
rules for three of these factors and iinplemented a 
system that uses them to revise extracts. We found 
experimentally that our revision system can improve 
the readability of extracts. 

References 

\[Fukumoto, 1990\] Fukumoto, J. (1990) Context Struc- 
ture Analysis Based on the Writer's Insistence. IPSJ 
SIG Notes, NL-78-15, pp.113-120, in Japanese. 

\[ttalliday et al., 1976\] ttalliday, M.A.K., Hasan,R. 
(1976) Cohesion in English. Longman. 

\[Jing, 1999\] Jing,H. (1999) Sunmmry Generation 
through Intelligent Cutting and Pasting of the Input 
Document. Ph.D. Thesis Proposal, Columbia Univ. 

\[Kawahara, 1989\] Kawahara,H. (1989) Chapter 9, in 
Bunshoukouzou to youyakubun no shosou. Kuroshio- 
shuppan, pp.141-167, in Japanese. 

\[Klare, 1963\] Klare,G.R. (1963} The Measurement of 
Readability. Iowa State University Press. 

\[Kurohashi et al., 1998\] Kurohashi,S., Nagao,M. (1998) 
Japanese Morphological Analysis System JUMAN 
version 3.5. 

\[Kurohashi, 1998\] I(urohashi, S. (1908) Japanese parser 
KNP version 2.0 b6. 

\[Mathis et al., 1973\] 
Mathis,B., I/.ush,J., Young,C. (1973) hnprovement of 
Automatic Abstracts by the Use of Structural Analy~ 
sis. JASIS,24(2),pp.lOl-109. 

\[Mani et al., 1999\] Mani,I., Gates,B., Bloedorn,E. 
(1999) Improving Sununaries by Revising Them. the 
37th Annual Meeting of the ACL, pp.558-565. 

\[MIT, 1999\] Mani,I., Maybury,M.T. (1999) Advances in 
Automatic Text Summarization. MIT Press. 

\[Mann et al., 1986\] Mann,W.C., Thompson,S.A. (1986) 
Rhetorical Structure Theory: Description and Con- 
struction of Text Structure. Proe. of the third Inter- 
national Workshop on Text Generation. 

\[Minel et al., 1997\] Minel,J., Nugier,S., Geralcl,P. (1997) 
How to Appreciate the Quality of Automatic Text 
Summarization? Examples of FAN and MLUCE Pro- 
tocols and their Results on SERAPHIN. Irttelligent 
Sealable Text Summarization, Proe. of a Workshop, 
A CL, pp.25-30. 

\[Nomoto et al., 1997\] Nonmto,T., Matsumoto,Y. (1997) 
The Readability of tIuman Coding and Effects on Au- 
tmnatic Abstracting. IPSJ SIG Notes, NL-120-11, 
pp.71-76, in Japanese. 

\[Paice, 19901 Paice,C.D. (1990) Constructing Literature 
Abstracts by Comlmter: Techniques and Prospects. 
Info. Proe. gJ Manage., 26 (1), pp. 171-186. 
