A Multiple-Document Summarization System with User
Interaction
Hiroyuki SAKAI
Toyohashi University of Technology
1-1 Hibarigaoka, Tempaku,
Toyohashi 441-8580,
Japan,
sakai@smlab.tutkie.tut.ac.jp
Shigeru MASUYAMA
Toyohashi University of Technology
1-1 Hibarigaoka, Tempaku,
Toyohashi 441-8580,
Japan,
masuyama@tutkie.tut.ac.jp
Abstract
We propose a multiple-document summarization
system with user interaction. Our system
extracts keywords from the set of documents to
be summarized and displays them to the user.
The user selects the keywords that reflect
his/her needs, and the system uses the selected
keywords to control the summary it produces.
To evaluate our method, we participated in TSC3
of the NTCIR4 workshop, letting the system
itself select the 12 keywords it scored highest.
The submitted system attained the best
performance in content evaluation among systems
not using sets of questions. Moreover, we
evaluated the effectiveness of user interaction
in our system. With user interaction, our
system attained both higher coverage and higher
precision than without user interaction.
1 Introduction
Recent rapid progress in computer and
communication technologies has enabled us to
access an enormous amount of machine-readable
information easily. However, this has caused
the information overload problem. To address
this problem, automatic summarization methods
have been studied (Mani and Maybury, 1999). In
particular, the need for multiple-document
summarization has been increasing, and
multiple-document summarization technology has
been studied intensively in recent years
(Mani, 2001).
In this paper, we define multiple-document
summarization as a process for producing a
summary from a relevant document set. Such a
document set may be very large and may con-
tain a number of topics. It is preferable that
a summary produced by a multiple-document
summarization system from the document set
covers all topics contained in the document set.
However, it is difficult to produce a summary
that covers all the topics in the document set
with a small number of characters. For example,
a document set relevant to “releasing AIBO”
contains several topics, e.g., what AIBO is and
how AIBO is sold. Moreover, the sentences
recognized as important differ considerably
from person to person (Nomoto and Matsumoto,
2001). This is because the “summarization
need”, i.e., the topics a person wants to read
about, differs from person to person. Hence, we
propose a multiple-document
summarization system with user interaction that
copes appropriately with each user’s
summarization need. Our system extracts
keywords from a document set to be summarized
and shows them to the user. From these, the
user selects the keywords that reflect his/her
summarization need, and the system controls the
produced summary using the selected keywords.
To this end, we have devised a keyword scoring
method specialized for this purpose. We
emphasize that the scoring of words used to
extract the keywords shown to the user is
crucial to system performance and differs from
the scoring used in conventional automatic
indexing.
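The interaction loop described above (extract keywords, let the user select some, and let the selection steer sentence extraction) can be illustrated with a minimal Python sketch. It is not the scoring method of this paper, which is not specified in this section. The function names extract_keywords and summarize, the frequency-based keyword score, the whitespace tokenization (Japanese text would need morphological analysis instead), and the greedy selection under a character budget are all assumptions made only for illustration.

```python
from collections import Counter


def extract_keywords(documents, top_k=12):
    """Rank words by a placeholder score (hypothetical, not the paper's):
    total frequency multiplied by the number of documents containing the word."""
    term_freq = Counter()
    doc_freq = Counter()
    for doc in documents:
        words = doc.lower().split()  # assumption: whitespace tokenization
        term_freq.update(words)
        doc_freq.update(set(words))
    scores = {w: term_freq[w] * doc_freq[w] for w in term_freq}
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def summarize(sentences, selected_keywords, max_chars):
    """Greedily pick the sentences that contain the most selected keywords,
    staying within a character budget, and return them in original order."""
    selected = set(selected_keywords)

    def score(sentence):
        return len(selected & set(sentence.lower().split()))

    # sorted() is stable, so equally scored sentences keep their original order.
    order = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))
    chosen, used = [], 0
    for i in order:
        if score(sentences[i]) == 0:
            break  # remaining sentences contain no selected keyword
        if used + len(sentences[i]) <= max_chars:
            chosen.append(i)
            used += len(sentences[i])
    return [sentences[i] for i in sorted(chosen)]
```

In this sketch, passing the top-ranked output of extract_keywords directly to summarize corresponds to running without user interaction, while passing a user-chosen subset corresponds to the interactive mode. For the sentences "aibo is a robot dog" and "sony will sell aibo online", selecting the keyword "robot" yields only the first sentence and selecting "sell" yields only the second.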
We participated in TSC3 (Text Summarization
Challenge - 3) of the NTCIR4 workshop (see
footnote 1) and attained the best performance
in content evaluation among systems not using
sets of questions. Note that the system
submitted to TSC3 is a fully automatic
summarization system without user interaction:
in place of the user, the system itself selects
the 12 keywords it scores highest. Moreover, we
evaluated the effectiveness of user interaction
and found that the system with user interaction
attained both higher coverage and higher
precision than the system without it.
2 Features of our multiple-document
summarization system
Our multiple-document summarization system
proposed in this paper is different from previ-
1 http://www.lr.pi.titech.ac.jp/tsc/index-en.html
