Evaluation of Importance of Sentences 
based on Connectivity to Title 
Takehiko Yoshimi and Toshiyuki Okunishi 
Takahiro Yamaji and Yoji Fukumochi 
Software Business Development Center, SHARP Corporation 
492 Minosho-cho Yamatokoriyama Nara, Japan 
Abstract 
This paper proposes a method of selecting important
sentences from a text by evaluating the connectivity
between sentences using surface information. We assume
that the title of a text is the most concise statement
of its most essential information, and that the more
closely a sentence relates to an important sentence,
the more important it is. The importance of a sentence
is defined as the connectivity between the sentence and
the title. The connectivity between two sentences is
measured from coreference between a pronoun and a
preceding noun or pronoun, and from lexical cohesion
between lexical items. In an experiment with 80 English
texts averaging 29.0 sentences each, the proposed method
achieved a recall of 78.2% and a precision of 57.7% at a
selection ratio of 25%. Both values surpass those
achieved by conventional methods, which indicates that
our method is more effective at abridging relatively
short texts.
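The idea stated in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names (`content_words`, `connectivity`, `rank_by_title`), the tiny stopword list, the word-overlap score used in place of lexical cohesion, and the rule that a sentence inherits the best product of a preceding unit's importance and its connectivity to that unit. Coreference is not modeled at all, and the paper's actual formulas may differ.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = frozenset({"the", "a", "an", "of", "in", "on", "to", "and", "is", "was"})


def content_words(sentence):
    """Return the set of lowercased word tokens that are not stopwords."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS}


def connectivity(earlier, later):
    """Assumed stand-in for lexical cohesion: share of `later` words found in `earlier`."""
    if not later:
        return 0.0
    return len(earlier & later) / len(later)


def rank_by_title(title, sentences, ratio=0.25):
    """Score sentences by connectivity to the title and select the top `ratio`.

    Assumed propagation rule: the title has importance 1.0, and each sentence
    takes the maximum of importance(i) * connectivity(i, j) over all preceding
    units i (the title or an earlier sentence).
    """
    units = [content_words(title)] + [content_words(s) for s in sentences]
    importance = [1.0]  # index 0 is the title
    for j in range(1, len(units)):
        importance.append(
            max(importance[i] * connectivity(units[i], units[j]) for i in range(j))
        )
    k = max(1, round(ratio * len(sentences)))
    # Rank by descending importance; ties keep text order.
    ranked = sorted(range(len(sentences)), key=lambda j: (-importance[j + 1], j))
    return sorted(ranked[:k]), importance[1:]


if __name__ == "__main__":
    text = [
        "The cabinet held a meeting on the budget.",
        "The weather was sunny.",
        "The budget meeting ended late.",
        "Birds sang.",
    ]
    selected, scores = rank_by_title("Cabinet meeting on budget", text)
    print(selected, scores)
```

Under this rule a sentence that shares no words with the title can still receive a nonzero score through an earlier sentence that does, which mirrors the assumption that closeness to an important sentence confers importance.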
(Luhn, 1958; Edmundson, 1969; [illegible], 1987;
[illegible] and [illegible], 1988; [illegible] et al., 1989;
Salton et al., 1994; Brandow et al., 1995;
[illegible] and [illegible], 1995; [illegible] et al., 1995;
[illegible] et al., 1995; Watanabe, 1996; Zechner, 1996;
[illegible], 1997).
[Text illegible in the source. Surviving citations:
([illegible] et al., 1989; Ono et al., 1994);
(Hoey, 1991; Collier, 1994; [illegible], 1997; [illegible] et al., 1993);
(Halliday and Hasan, 1976).]
2 
2.1 [section title illegible in the source]
[Figure 1: diagram whose nodes are labeled with the item sets
{A,B,C}, {A,D,E}, {B,C,G}, {A,D,H,K}, {A,D,E,F} and {D,H};
caption illegible in the source]
[Equation (1): illegible in the source]
2.2 [section title illegible in the source]
[Text and Equation (2) illegible in the source.]
2.2.1 [section title illegible in the source]
2.2.2 [section title illegible in the source]
[Text illegible in the source. Surviving examples:
"put pressure on" and "put"; "cabinet meeting" and "meeting".]
2.2.3 [section title illegible in the source]
[Text illegible in the source. Surviving citations:
(Edmundson, 1969; [illegible] et al., 1989; Watanabe, 1996).]
2.2.4 [section title illegible in the source]
[Text illegible in the source. Surviving fragments: the term
"theme"; citations (Givon, 1979) and ([illegible], 1985);
the fraction 1/4.]
3 
[Text illegible in the source. Surviving figures: 17.9% and 25%.
Surviving labels: A, B, C.]
[Figure 2: scatter plot; horizontal axis ticks from 20 to 100,
vertical axis ticks from 10 to 20; caption and axis labels
illegible in the source]
Method            Recall   Precision   Selection ratio
Proposed method   78.2%    57.7%       25%
A                 72.3%    52.6%       26%
B                 61.7%    39.5%       29%
C                 61.4%    40.9%       29%
D                 57.5%    42.2%       27%
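For readers who want to reproduce figures of this kind, the following sketch computes recall and precision in their standard senses and shows how many sentences a 25% selection ratio yields for a text of average length. The function names `recall_precision` and `selection_size` and the example index sets are hypothetical; how the reference (correct) sentences were chosen in the experiment is not stated in the surviving text, so this is not a description of the authors' evaluation procedure.

```python
def recall_precision(selected, reference):
    """Standard recall and precision of selected sentence indices against a reference set."""
    selected, reference = set(selected), set(reference)
    hits = len(selected & reference)
    recall = hits / len(reference) if reference else 0.0
    precision = hits / len(selected) if selected else 0.0
    return recall, precision


def selection_size(n_sentences, ratio=0.25):
    """Number of sentences to select at a given ratio (rounding is an assumption)."""
    return max(1, round(n_sentences * ratio))


if __name__ == "__main__":
    # A text of 29 sentences, the average length reported in the abstract.
    print(selection_size(29))
    # Hypothetical selection and reference sets, for illustration only.
    print(recall_precision([0, 2, 5, 7], [0, 2, 3]))
```

Because the number of selected sentences is fixed by the selection ratio while the reference set is typically smaller, precision is bounded above by the ratio of the two set sizes, which is one reason recall can exceed precision at a fixed selection ratio.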
[Text illegible in the source. Surviving examples: "shooting" and
"gunfire"; "announce" and "announcement". Surviving term: base.
Surviving figure: 7%.]
4 [section title illegible in the source]
[Text illegible in the source. Surviving figures: 78.2% and 57.7%.
Surviving citation: (Hearst, 1997).]

References 
R. Brandow, K. Mitze, and L. F. Rau. 1995. Automatic
Condensation of Electronic Publications by Sentence
Selection. Information Processing & Management,
31(5):675-685.
A. Collier. 1994. A System for Automating Concordance
Line Selection. In Proceedings of NeMLaP, pages 95-100.
H. P. Edmundson. 1969. New Methods in Automatic
Extracting. Journal of the ACM, 16(2):264-285.
T. Givon. 1979. From Discourse to Syntax: Grammar as a
Processing Strategy. In T. Givon, editor, Discourse and
Syntax, pages 81-112. Academic Press.
M. A. K. Halliday and R. Hasan. 1976. Cohesion in
English. Longman.
M. A. Hearst. 1997. TextTiling: Segmenting Text into
Multi-paragraph Subtopic Passages. Computational
Linguistics, 23(1):33-64.
M. Hoey. 1991. Patterns of Lexis in Text. Describing
English Language. Oxford University Press.
H. P. Luhn. 1958. The Automatic Creation of Literature
Abstracts. IBM Journal of Research and Development,
2(2):159-165.
K. Ono, K. Sumita, and S. Miike. 1994. Abstract
Generation based on Rhetorical Structure Extraction.
In Proceedings of COLING, pages 344-348.
G. Salton, J. Allan, C. Buckley, and A. Singhal. 1994.
Automatic Analysis, Theme Generation, and Summarization
of Machine-Readable Texts. Science, 264(3):1421-1426.
H. Watanabe. 1996. A Method for Abstracting Newspaper
Articles by Using Surface Clues. In Proceedings of
COLING, pages 974-979.
K. Zechner. 1996. Fast Generation of Abstracts from
General Domain Text Corpora by Extracting Relevant
Sentences. In Proceedings of COLING, pages 986-989.
