Multi-Document Summarization By Sentence Extraction 
Jade Goldstein* Vibhu Mittal† Jaime Carbonell* Mark Kantrowitz†
jade@cs.cmu.edu mittal@jprc.com jgc@cs.cmu.edu mkant@jprc.com
*Language Technologies Institute 
Carnegie Mellon University 
Pittsburgh, PA 15213 
U.S.A. 
†Just Research
4616 Henry Street 
Pittsburgh, PA 15213 
U.S.A. 
Abstract 
This paper discusses a text extraction approach to multi- 
document summarization that builds on single-document 
summarization methods by using additional, available
information about the document set as a whole and the
relationships between the documents. Multi-document
summarization differs from single-document summarization
in that the issues of compression, speed, redundancy and
passage selection are critical in the formation of useful summaries.
Our approach addresses these issues by using domain- 
independent techniques based mainly on fast, statistical 
processing, a metric for reducing redundancy and maxi- 
mizing diversity in the selected passages, and a modular 
framework to allow easy parameterization for different 
genres, corpora characteristics and user requirements. 
1 Introduction 
With the continuing growth of online information, it 
has become increasingly important to provide improved 
mechanisms to find and present textual information ef- 
fectively. Conventional IR systems find and rank docu- 
ments based on maximizing relevance to the user query 
(Salton, 1970; van Rijsbergen, 1979; Buckley, 1985; 
Salton, 1989). Some systems also include sub-document 
relevance assessments and convey this information to the 
user. More recently, single document summarization systems
provide an automated generic abstract or a query-relevant
summary (TIPSTER, 1998a). 1 However, large-scale IR and
summarization have not yet been truly integrated, and the
functionality challenges on a summarization system are
greater in a true IR or topic-detection
context (Yang et al., 1998; Allan et al., 1998).
Consider the situation where the user issues a search 
query, for instance on a news topic, and the retrieval sys- 
tem finds hundreds of closely-ranked documents in re- 
sponse. Many of these documents are likely to repeat 
much the same information, while differing in certain 
1 Most of these were based on statistical techniques applied to various
document entities; examples include (Tait, 1983; Kupiec et al.,
1995; Paice, 1990; Klavans and Shaw, 1995; McKeown et al., 1995;
Shaw, 1995; Aone et al., 1997; Boguraev and Kennedy, 1997; Hovy
and Lin, 1997; Mitra et al., 1997; Teufel and Moens, 1997; Barzilay
and Elhadad, 1997; Carbonell and Goldstein, 1998; Baldwin and Morton,
1998; Radev and McKeown, 1998; Strzalkowski et al., 1998).
parts. Summaries of the individual documents would 
help, but are likely to be very similar to each other, un- 
less the summarization system takes into account other 
summaries that have already been generated. Multi-document
summarization, capable of summarizing either complete
document sets or single documents in the context of
previously summarized ones, is likely to be essential in
such situations. Ideally, multi-document summaries should
contain the key shared relevant information among all the
documents only once, plus other information unique to some
of the individual documents that is directly relevant to
the user's query.
Though many of the same techniques used in single- 
document summarization can also be used in multi- 
document summarization, there are at least four signif- 
icant differences: 
1. The degree of redundancy in information contained 
within a group of topically-related articles is much 
higher than the degree of redundancy within an arti- 
cle, as each article is apt to describe the main point 
as well as necessary shared background. Hence 
anti-redundancy methods are more crucial. 
2. A group of articles may contain a temporal dimen- 
sion, typical in a stream of news reports about an 
unfolding event. Here later information may over- 
ride earlier more tentative or incomplete accounts. 
3. The compression ratio (i.e. the size of the summary 
with respect to the size of the document set) will 
typically be much smaller for collections of dozens 
or hundreds of topically related documents than 
for single document summaries. The SUMMAC 
evaluation (TIPSTER, 1998a) tested 10% compres- 
sion summaries, but in our work summarizing 200- 
document clusters, we find that compression to the 
1% or 0.1% level is required. Summarization be- 
comes significantly more difficult when compres- 
sion demands increase. 
4. The co-reference problem in summarization 
presents even greater challenges for multi- 
document than for single-document summariza- 
tion (Baldwin and Morton, 1998). 
This paper discusses an approach to multi-document 
summarization that builds on previous work in single- 
document summarization by using additional, available 
information about the document set as a whole, the re- 
lationships between the documents, as well as individual 
documents. 
2 Background and Related Work 
Generating an effective summary requires the summa- 
rizer to select, evaluate, order and aggregate items of 
information according to their relevance to a particular 
subject or purpose. These tasks can either be approx- 
imated by IR techniques or done in greater depth with 
fuller natural language processing. Most previous work
in summarization has attempted to deal with the issues by 
focusing more on a related, but simpler, problem. With 
text-span deletion the system attempts to delete "less im- 
portant" spans of text from the original document; the 
text that remains is deemed a summary. Work on auto- 
mated document summarization by text span extraction 
dates back at least to work at IBM in the fifties (Luhn, 
1958). Most of the work in sentence extraction applied 
statistical techniques (frequency analysis, variance anal- 
ysis, etc.) to linguistic units such as tokens, names, 
anaphora, etc. More recently, other approaches have 
investigated the utility of discourse structure (Marcu, 
1997), the combination of information extraction and 
language generation (Klavans and Shaw, 1995; McKeown
et al., 1995), and using machine learning to find
patterns in text (Teufel and Moens, 1997; Barzilay and 
Elhadad, 1997; Strzalkowski et al., 1998). 
Some of these approaches to single document summarization
have been extended to deal with multi-document
summarization (Mani and Bloedorn, 1997; Goldstein and
Carbonell, 1998; TIPSTER, 1998b; Radev and McKeown,
1998; Mani and Bloedorn, 1999; McKeown et al.,
1999; Stein et al., 1999). These include comparing templates
filled in by extracting information - using specialized,
domain specific knowledge sources - from the document,
and then generating natural language summaries
from the templates (Radev and McKeown, 1998), comparing
named-entities - extracted using specialized lists
- between documents and selecting the most relevant
section (TIPSTER, 1998b), finding co-reference chains
in the document set to identify common sections of interest
(TIPSTER, 1998b), or building activation networks
of related lexical items (identity mappings, synonyms,
hypernyms, etc.) to extract text spans from the document
set (Mani and Bloedorn, 1997). Another system (Stein et
al., 1999) creates a multi-document summary from multiple
single document summaries, an approach that can
be sub-optimal in some cases because the process of
generating the final multi-document summary takes as input
the individual summaries and not the complete documents,
particularly if the single-document summaries contain
much overlapping information.
The Columbia University system (McKeown et al., 1999) 
creates a multi-document summary using machine learn- 
ing and statistical techniques to identify similar sections 
and language generation to reformulate the summary.
The focus of our approach is a multi-document system 
that can quickly summarize large clusters of similar doc- 
uments (on the order of thousands) while providing the 
key relevant useful information or pointers to such in- 
formation. Our system (1) primarily uses only domain- 
independent techniques, based mainly on fast, statistical 
processing, (2) explicitly deals with the issue of reducing 
redundancy without eliminating potentially relevant
information, and (3) contains parameterized modules, so that
different genres or corpora characteristics can be taken 
into account easily. 
3 Requirements for Multi-Document 
Summarization 
There are two types of situations in which multi- 
document summarization would be useful: (1) the user 
is faced with a collection of dissimilar documents and
wishes to assess the information landscape contained in 
the collection, or (2) there is a collection of topically- 
related documents, extracted from a larger more diverse 
collection as the result of a query, or a topically-cohesive 
cluster. In the first case, if the collection is large enough, 
it only makes sense to first cluster and categorize the doc- 
uments (Yang et al., 1999), and then sample from, or 
summarize each cohesive cluster. Hence, a "summary" 
would consist of a visualization of the information
landscape, where features could be clusters or summaries 
thereof. In the second case, it is possible to build a syn- 
thetic textual summary containing the main point(s) of 
the topic, augmented with non-redundant background in- 
formation and/or query-relevant elaborations. This is the 
focus of our work reported here, including the necessity 
to eliminate redundancy among the information content 
of multiple related documents. 
Users' information seeking needs and goals vary 
tremendously. When a group of three people created a 
multi-document summarization of 10 articles about the 
Microsoft Trial from a given day, one summary focused 
on the details presented in court, one on an overall gist 
of the day's events, and the third on a high level view of 
the goals and outcome of the trial. Thus, an ideal multi- 
document summarization would be able to address the 
different levels of detail, which is difficult without
natural language understanding. An interface for the
summarization system needs to be able to permit the user to
enter information seeking goals, via a query, a background
interest profile and/or a relevance feedback mechanism. 
Following is a list of requirements for multi-document 
summarization: 
• clustering: The ability to cluster similar documents 
and passages to find related information. 
• coverage: The ability to find and extract the main 
points across documents. 
• anti-redundancy: The ability to minimize redun- 
dancy between passages in the summary. 
• summary cohesion criteria: The ability to combine
text passages in a useful manner for the reader. This
may include: 
- document ordering: All text segments of high- 
est ranking document, then all segments from 
the next highest ranking document, etc. 
- news-story principle (rank ordering): present
the most relevant and diverse information first 
so that the reader gets the maximal information 
content even if they stop reading the summary. 
- topic-cohesion: Group together the passages 
by topic clustering using passage similarity criteria
and present the information by the cluster
centroid passage rank.
- time line ordering: Text passages ordered
based on the occurrence of events in time.
• coherence: Summaries generated should be readable
and relevant to the user.
• context: Include sufficient context so that the summary
is understandable to the reader.
• identification of source inconsistencies: Articles of- 
ten have errors (such as billion reported as million, 
etc.); multi-document summarization must be able 
to recognize and report source inconsistencies. 
• summary updates: A new multi-document summary 
must take into account previous summaries in gen- 
erating new summaries. In such cases, the system 
needs to be able to track and categorize events. 
• effective user interfaces: 
- Attributability: The user needs to be able to 
easily access the source of a given passage. 
This could be the single document summary. 
- Relationship: The user needs to view related 
passages to the text passage shown, which can 
highlight source inconsistencies. 
- Source Selection: The user needs to be able to 
select or eliminate various sources. For example,
the user may want to eliminate information
from some less reliable foreign news reporting 
sources. 
- Context: The user needs to be able to zoom 
in on the context surrounding the chosen pas- 
sages. 
- Redirection: The user should be able to high- 
light certain parts of the synthetic summary 
and give a command to the system indicating 
that these parts are to be weighted heavily and 
that other parts are to be given a lesser weight. 
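The summary cohesion criteria listed above amount to different sort orders over the passages chosen for the summary. The following is a minimal sketch of three of them, not the system's implementation. The `Passage` record and all of its field names are hypothetical, and time-line ordering is approximated here by the publication date of the source document rather than by the actual time of the events described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """Hypothetical record for one selected passage (all field names are assumptions)."""
    doc_id: str      # identifier of the source document
    position: int    # index of the passage within its document
    doc_rank: int    # rank of the source document (1 = highest ranking)
    score: float     # passage score, higher means more relevant and novel
    timestamp: int   # publication date of the source document, e.g. 19880803


def document_ordering(passages):
    # All passages of the highest ranking document first, in document order.
    return sorted(passages, key=lambda p: (p.doc_rank, p.position))


def rank_ordering(passages):
    # News-story principle: the highest scoring passages come first.
    return sorted(passages, key=lambda p: -p.score)


def time_line_ordering(passages):
    # Assumption: event time is approximated by the document's publication date.
    return sorted(passages, key=lambda p: (p.timestamp, p.position))


selected = [
    Passage("A", 2, 1, 0.5, 19900101),
    Passage("B", 0, 2, 0.9, 19880101),
    Passage("A", 0, 1, 0.7, 19900101),
]
for order in (document_ordering, rank_ordering, time_line_ordering):
    print(order.__name__, [(p.doc_id, p.position) for p in order(selected)])
```

Topic-cohesion ordering is left out of the sketch because it additionally requires the passage clusters.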
4 Types of Multi-Document Summarizers 
In the previous section we discussed the requirements 
for a multi-document summarization system. Depend- 
ing on a user's information seeking goals, the user may 
want to create summaries that contain primarily the com- 
mon portions of the documents (their intersection) or an 
overview of the entire cluster of documents (a sampling
of the space that the documents span). A user may also 
want to have a highly readable summary, an overview of 
pointers (sentences or word lists) to further information, 
or a combination of the two. Following is a list of various
methods of creating multi-document summaries by
extraction: 
1. Summary from Common Sections of Documents: 
Find the important relevant parts that the cluster of 
documents have in common (their intersection) and 
use that as a summary. 
2. Summary from Common Sections and Unique Sec- 
tions of Documents: Find the important relevant 
parts that the cluster of documents have in common 
and the relevant parts that are unique and use that as 
a summary. 
3. Centroid Document Summary: Create a single doc- 
ument summary from the centroid document in the 
cluster.
4. Centroid Document plus Outliers Summary: Cre- 
ate a single document summary from the centroid 
document in the cluster and add some representa- 
tion from outlier documents (passages or keyword 
extraction) to provide a fuller coverage of the docu- 
ment set. 2 
5. Latest Document plus Outliers Summary: Create 
a single document summary from the latest time 
stamped document in the cluster (most recent in- 
formation) and add some representation of outlier 
documents to provide a fuller coverage of the docu- 
ment set. 
6. Summary from Common Sections and Unique Sec- 
tions of Documents with Time Weighting Factor: 
Find the important relevant parts that the cluster of 
documents have in common and the relevant parts 
that are unique and weight all the information by 
the time sequence of the documents in which they 
appear and use the result as a summary. This al- 
lows the more recent, often updated information to 
be more likely to be included in the summary. 
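Summarizer types 3 and 4 in the list above rely on a centroid document. One plausible way to locate it, shown here only as a sketch under our own assumptions, is to sum the term-frequency vectors of the cluster and pick the document with the highest cosine similarity to that sum; the lowest-scoring documents are then candidate outliers. The whitespace tokenizer and the function names are illustrative and are not taken from the system described in this paper.

```python
import math
from collections import Counter


def tf_vector(text):
    # Bag-of-words term frequencies; whitespace tokenization is a simplification.
    return Counter(text.lower().split())


def cosine(u, v):
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid_document(docs):
    """Return (index of the document closest to the centroid, all similarities)."""
    vectors = [tf_vector(d) for d in docs]
    centroid = Counter()
    for vec in vectors:
        centroid.update(vec)  # summed counts; cosine ignores the overall scale
    sims = [cosine(vec, centroid) for vec in vectors]
    best = max(range(len(docs)), key=sims.__getitem__)
    return best, sims


docs = [
    "anc fights government",
    "anc fights government sanctions",
    "weather sunny today",
]
best, sims = centroid_document(docs)
print(best, [round(s, 3) for s in sims])
```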
There are also much more complicated types of summary
extracts which involve natural language processing
and/or understanding. These types of summaries include:
(1) differing points of view within the document
collection, (2) updates of information within the doc- 
ument collection, (3) updates of information from the 
document collection with respect to an already provided 
summary, (4) the development of an event or subtopic of 
2 This is similar to the approach of Textwise (TIPSTER, 1998b),
whose multi-document summary consists of the most relevant
paragraph and specialized word lists.
an event (e.g., death tolls) over time, and (5) a compara- 
tive development of an event. 
Naturally, an ideal multi-document summary would 
include a natural language generation component to create
cohesive readable summaries (Radev and McKeown,
1998; McKeown et al., 1999). Our current focus is on 
the extraction of the relevant passages. 
5 System Design 
In the previous sections we discussed the requirements 
and types of multi-document summarization systems. 
This section discusses our current implementation of 
a multi-document summarization system which is de- 
signed to produce summaries that emphasize "relevant 
novelty." Relevant novelty is a metric for minimizing re- 
dundancy and maximizing both relevance and diversity. 
A first approximation to measuring relevant novelty is to 
measure relevance and novelty independently and provide
a linear combination as the metric. We call this linear
combination "marginal relevance", i.e., a text passage
has high marginal relevance if it is both relevant to
the query and useful for a summary, while having mini- 
mal similarity to previously selected passages. Using this 
metric one can maximize marginal relevance in retrieval 
and summarization, hence we label our method "maximal
marginal relevance" (MMR) (Carbonell and Goldstein,
1998).
The Maximal Marginal Relevance Multi-Document 
(MMR-MD) metric is defined in Figure 1. Sim1 and
Sim2 cover some of the properties that we discussed in
Section 3. 3
For Sim1, the first term is the cosine similarity metric
for query and document. The second term computes a
coverage score for the passage by whether the passage
is in one or more clusters and the size of the cluster.
The third term reflects the information content of the
passage by taking into account both statistical and
linguistic features for summary inclusion (such as query
expansion, position of the passage in the document and
presence/absence of named-entities in the passage). The final
term indicates the temporal sequence of the document in
the collection allowing for more recent information to
have higher weights.
For Sim2, the first term uses the cosine similarity metric
to compute the similarity between the passage and
previously selected passages. (This helps the system to 
minimize the possibility of including passages similar to 
ones already selected.) The second term penalizes pas- 
sages that are part of clusters from which other passages 
have already been chosen. The third term penalizes doc- 
uments from which passages have already been selected; 
however, the penalty is inversely proportional to docu- 
ment length, to allow the possibility of longer documents 
3 Sim1 and Sim2 as previously defined in MMR for single-document
summarization contained only the first term of each
equation.
contributing more passages. These latter two terms allow 
for a fuller coverage of the clusters and documents. 
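The cluster and document terms described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. Passages are identified by hypothetical `(doc_id, index)` tuples, clusters are plain Python sets of such identifiers, the cosine, content and temporal terms are passed in as precomputed numbers, and `documents_selected` follows our reading of the text (the count of already selected passages from a document divided by the document's length).

```python
def coverage(pid, clusters, cluster_weights):
    # Sum of w_k * |k| over every cluster k that contains passage pid.
    return sum(w * len(k) for k, w in zip(clusters, cluster_weights) if pid in k)


def clusters_selected(pid, selected, clusters):
    # Number of clusters containing pid that also hold an already selected passage.
    return sum(1 for k in clusters if pid in k and any(s in k for s in selected))


def documents_selected(doc_id, doc_len, selected):
    # Assumption: selected passages from this document, scaled by 1 / document length.
    return sum(1 for d, _ in selected if d == doc_id) / doc_len


def sim1(cos_pq, cov, cont, tseq, w=(1.0, 0.0, 0.0, 0.0)):
    # Relevance side: weighted sum of query cosine, coverage, content and time terms.
    w1, w2, w3, w4 = w
    return w1 * cos_pq + w2 * cov + w3 * cont + w4 * tseq


def sim2(cos_pp, pid, selected, clusters, doc_len, w=(1.0, 0.0, 0.0)):
    # Redundancy side: passage cosine plus the cluster and document penalties.
    wa, wb, wc = w
    return (wa * cos_pp
            + wb * clusters_selected(pid, selected, clusters)
            + wc * documents_selected(pid[0], doc_len, selected))


clusters = [
    {("d1", 0), ("d2", 3)},
    {("d1", 1)},
    {("d1", 0), ("d3", 2), ("d3", 5)},
]
selected = {("d1", 1), ("d2", 3)}
print(coverage(("d1", 0), clusters, [1.0, 1.0, 0.5]))
print(sim2(0.5, ("d1", 0), selected, clusters, doc_len=4, w=(1.0, 1.0, 1.0)))
```

With the default weights, both functions reduce to their cosine terms, which matches the single-document form of MMR.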
Given the above definition, MMR-MD incrementally
computes the standard relevance-ranked list - plus some
additional scoring factors - when the parameter λ = 1, and
computes a maximal diversity ranking among the passages
in the documents when λ = 0. For intermediate values
of λ in the interval [0, 1], a linear combination of both
criteria is optimized. In order to sample the information
space in the general vicinity of the query, small values of
λ can be used; to focus on multiple, potentially overlapping
or reinforcing relevant passages, λ can be set to a
value closer to 1. We found that a particularly effective
search strategy for document retrieval is to start with a
small λ (e.g., λ = .3) in order to understand the information
space in the region of the query, and then to focus
on the most important parts using a reformulated query
(possibly via relevance feedback) and a larger value of
λ (e.g., λ = .7) (Carbonell and Goldstein, 1998).
Our multi-document summarizer works as follows: 
• Segment the documents into passages, and index 
them using inverted indices (as used by the IR 
engine). Passages may be phrases, sentences, n- 
sentence chunks, or paragraphs. 
• Identify the passages relevant to the query using 
cosine similarity with a threshold below which the 
passages are discarded. 
• Apply the MMR-MD metric as defined above. De- 
pending on the desired length of the summary, se- 
lect a number of passages to compute passage re- 
dundancy using the cosine similarity metric and use 
the passage similarity scoring as a method of clus- 
tering passages. Users can select the number of pas- 
sages or the amount of compression. 
• Reassemble the selected passages into a summary 
document using one of the summary-cohesion cri- 
teria (see Section 3). 
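The steps above can be sketched as a greedy selection loop. The sketch is deliberately reduced: it keeps only the cosine terms of Sim1 and Sim2, uses raw term frequencies with whitespace tokenization, and returns passage indices in selection order. The function names, the default λ and the threshold are illustrative assumptions, not the system's actual interface.

```python
import math
from collections import Counter


def vec(text):
    # Raw term-frequency vector; tokenization by whitespace is a simplification.
    return Counter(text.lower().split())


def cosine(u, v):
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(passages, query, k, lam=0.3, threshold=0.0):
    """Greedily pick up to k passage indices by marginal relevance."""
    q = vec(query)
    vecs = [vec(p) for p in passages]
    rel = [cosine(v, q) for v in vecs]
    # R: passages whose query similarity exceeds the relevance threshold.
    candidates = [i for i, r in enumerate(rel) if r > threshold]
    selected = []  # S: passages already in the summary
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the maximum similarity to any selected passage.
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * rel[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


passages = [
    "anc fights the government",
    "anc fights the government",
    "sanctions against the government continue",
    "weather report",
]
print(mmr_select(passages, "anc government", k=2, lam=1.0))  # [0, 1]
print(mmr_select(passages, "anc government", k=2, lam=0.3))  # [0, 2]
```

In this toy input, λ = 1 returns the duplicated passage twice, while λ = 0.3 replaces the duplicate with a less relevant but novel passage.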
The results reported in this paper are based on the use 
of the SMART search engine (Buckley, 1985) to compute 
cosine similarities (with a SMART weighting of lnn for 
both queries and passages), stopwords eliminated from 
the indexed data and stemming turned on. 
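As an illustration of this configuration, the sketch below reads "lnn" in the usual SMART triple notation: logarithmic term frequency (1 + ln tf), no collection-frequency factor, and no normalization inside the weighting itself. That reading is our assumption, and the code does not call SMART. The stopword list is a tiny illustrative sample, and stemming is omitted.

```python
import math
from collections import Counter

# Illustrative stopword sample only; SMART ships its own, much longer list.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in"}


def lnn_weights(text):
    # Logarithmic term frequency, no idf factor, no length normalization.
    counts = Counter(t for t in text.lower().split() if t not in STOPWORDS)
    return {t: 1.0 + math.log(tf) for t, tf in counts.items()}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


query = lnn_weights("the anc and the government")
passage = lnn_weights("the anc is fighting the government and the anc is banned")
print(round(cosine(query, passage), 3))
```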
6 Discussion 
The TIPSTER evaluation corpus provided several sets of 
topical clusters to which we applied MMR-MD summa- 
rization. As an example, consider a set of 200 apartheid- 
related news-wire documents from the Associated Press 
and the Wall Street Journal, spanning the period from 
1988 to 1992. We used the TIPSTER provided topic de- 
scription as the query. These 200 documents were on 
an average 31 sentences in length, with a total of 6115 
sentences. We used the sentence as our summary unit. 
Generating a summary 10 sentences long resulted in a 
\[
\text{MMR-MD} \stackrel{\mathrm{def}}{=} \arg\max_{P_{ij} \in R \setminus S} \Big[ \lambda\, \mathrm{Sim}_1(P_{ij}, Q, C_{ij}, D_i, D) - (1 - \lambda) \max_{P_{nm} \in S} \mathrm{Sim}_2(P_{ij}, P_{nm}, C, S, D_i) \Big]
\]
\[
\mathrm{Sim}_1(P_{ij}, Q, C_{ij}, D_i, D) = w_1 (P_{ij} \cdot Q) + w_2\, \mathrm{coverage}(P_{ij}, C_{ij}) + w_3\, \mathrm{content}(P_{ij}) + w_4\, \mathrm{time\_sequence}(D_i, D)
\]
\[
\mathrm{Sim}_2(P_{ij}, P_{nm}, C, S, D_i) = w_a (P_{ij} \cdot P_{nm}) + w_b\, \mathrm{clusters\_selected}(C_{ij}, S) + w_c\, \mathrm{documents\_selected}(D_i, S)
\]
\[
\mathrm{coverage}(P_{ij}, C) = \sum_{k \in C_{ij}} w_k\, |k|
\]
\[
\mathrm{content}(P_{ij}) = \sum_{W \in P_{ij}} w_{type}(W)
\]
\[
\mathrm{time\_sequence}(D_i, D) = \frac{\mathrm{timestamp}(D_{maxtime}) - \mathrm{timestamp}(D_i)}{\mathrm{timestamp}(D_{maxtime}) - \mathrm{timestamp}(D_{mintime})}
\]
\[
\mathrm{clusters\_selected}(C_{ij}, S) = \Big| C_{ij} \cap \bigcup_{v,w:\, P_{vw} \in S} C_{vw} \Big|
\]
\[
\mathrm{documents\_selected}(D_i, S) = \frac{1}{|D_i|} \sum_{j} \big[ P_{ij} \in S \big]
\]
where
$\mathrm{Sim}_1$ is the similarity metric for relevance ranking
$\mathrm{Sim}_2$ is the anti-redundancy metric
$D$ is a document collection
$P$ is the passages from the documents in that collection (e.g., $P_{ij}$ is passage $j$ from document $D_i$)
$Q$ is a query or user profile
$R = IR(D, P, Q, \theta)$, i.e., the ranked list of passages from documents retrieved by an IR system, given $D$, $P$, $Q$ and a
relevance threshold $\theta$, below which it will not retrieve passages ($\theta$ can be degree of match or number of passages)
$S$ is the subset of passages in $R$ already selected
$R \setminus S$ is the set difference, i.e., the set of as yet unselected passages in $R$
$C$ is the set of passage clusters for the set of documents
$C_{vw}$ is the subset of clusters of $C$ that contains passage $P_{vw}$
$C_v$ is the subset of clusters that contain passages from document $D_v$
$|k|$ is the number of passages in the individual cluster $k$
$|C_{vw} \cap C_{ij}|$ is the number of clusters in the intersection of $C_{vw}$ and $C_{ij}$
$w_i$ are weights for the terms, which can be optimized
$W$ is a word in the passage $P_{ij}$
$type$ is a particular type of word, e.g., city name
$|D_i|$ is the length of document $i$.
Figure 1: Definition of multi-document summarization algorithm - MMR-MD
sentence compression ratio of 0.2% and a character compression
of 0.3%, approximately two orders of magnitude
different from the compression ratios used in single document
summarization. The results of summarizing this
document set with λ set to 1 (effectively query
relevance, but no MMR-MD) and λ set to 0.3 (both query
relevance and MMR-MD anti-redundancy) are shown in
Figures 2 and 3 respectively. The summary in Figure 2
clearly illustrates the need for reducing redundancy and
maximizing novel information.
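The sentence compression figure follows directly from the counts given above (10 summary sentences out of 6115). The short check below is a sketch with hypothetical names. It also compares the result with the 10% ratio used in the SUMMAC evaluation; the character-level figure cannot be recomputed because the character counts are not given.

```python
import math


def compression_ratio(summary_units, source_units):
    # Size of the summary relative to the size of the document set.
    return summary_units / source_units


sentence_ratio = compression_ratio(10, 6115)
print(f"sentence compression: {sentence_ratio:.2%}")      # 0.16%, i.e. about 0.2%

# Distance, in powers of ten, from the 10% compression of single-document tests.
gap = math.log10(0.10 / sentence_ratio)
print(f"orders of magnitude below 10%: {gap:.1f}")        # 1.8
```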
Consider for instance, the summary shown in Figure 2. 
The fact that the ANC is fighting to overthrow the gov- 
1. WSJ910204-0176:1 CAPE TOWN, South Africa - President F.W. de Klerk's proposal to repeal the major pillars
of apartheid drew a generally positive response from black leaders, but African National Congress leader Nelson 
Mandela called on the international community to continue economic sanctions against South Africa until the 
government takes further steps. 
2. AP880803-0082:25 Three Canadian anti-apartheid groups issued a statement urging the government to sever 
diplomatic and economic links with South Africa and aid the African National Congress, the banned group fighting 
the white-dominated government in South Africa. 
3. AP880803-0080:25 Three Canadian anti-apartheid groups issued a statement urging the government to sever 
diplomatic and economic links with South Africa and aid the African National Congress, the banned group fighting 
the white-dominated government in South Africa. 
4. AP880802-0165:23 South Africa says the ANC, the main black group fighting to overthrow South Africa's white 
government, has seven major military bases in Angola, and the Pretoria government wants those bases closed 
down. 
5. AP880212-0060:14 ANGOP quoted the Angolan statement as saying the main causes of conflict in the region
are South Africa's "illegal occupation" of Namibia, South African attacks against its black-ruled neighbors and
its alleged creation of armed groups to carry out "terrorist activities" in those countries, and the denial of political
rights to the black majority in South Africa. 
6. AP880823-0069:17 The ANC is the main guerrilla group fighting to overthrow the South African government 
and end apartheid, the system of racial segregation in which South Africa's black majority has no vote in national 
affairs. 
7. AP880803-0158:26 South Africa says the ANC, the main black group fighting to overthrow South Africa's white- 
led government, has seven major military bases in Angola, and it wants those bases closed down. 
8. AP880613-0126:15 The ANC is fighting to topple the South African government and its policy of apartheid, 
under which the nation's 26 million blacks have no voice in national affairs and the 5 million whites control the 
economy and dominate government. 
9. AP880212-0060:13 The African National Congress is the main rebel movement fighting South Africa's white-led 
government and SWAPO is a black guerrilla group fighting for independence for Namibia, which is administered 
by South Africa. 
10. WSJ870129-0051:1 Secretary of State George Shultz, in a meeting with Oliver Tambo, head of the African
National Congress, voiced concerns about Soviet influence on the black South African group and the ANC's use 
of violence in the struggle against apartheid. 
Figure 2: Sample multi-document summary with λ = 1, news-story-principle ordering (rank order)
ernment is mentioned seven times (sentences #2-#4, #6-#9),
which constitutes 70% of the sentences in the summary.
Furthermore, sentence #3 is an exact duplicate of
sentence #2, and sentence #7 is almost identical to
sentence #4. In contrast, the summary in Figure 3, generated
using MMR-MD with λ set to 0.3, shows significant
improvements in eliminating redundancy. The
fact that the ANC is fighting to overthrow the government
is mentioned only twice (sentences #3, #7), and one
of these sentences has additional information in it. The
new summary retained only three of the sentences from
the earlier summary.
Counting clearly distinct propositions in both cases
yields a 60% greater information content for the MMR-MD
case, though both summaries are equivalent in
length.
When these 200 documents were added to a set of 4 
other topics of 200 documents, yielding a document-set 
with 1000 documents, the query relevant multi-document 
summarization system produced exactly the same
results.
We are currently working on constructing data sets for
experimental evaluations of multi-document summarization.
In order to construct these data sets, we attempted
to categorize users' information seeking goals for
multi-document summarization (see Section 3). As can be seen
in Figure 2, the standard IR technique of using a query to 
extract relevant passages is no longer sufficient for multi- 
document summarization due to redundancy. In addi- 
tion, query relevant extractions cannot capture temporal 
sequencing. The data sets will allow us to measure the 
effects of these, and other features, on multi-document 
summarization quality. 
Specifically, we are constructing sets of 10 documents, 
which either contain a snapshot of an event from multiple
sources or the unfolding of an event over time.
1. WSJ870129-0051 1 Secretary of State George Shultz, in a meeting with Oliver Tambo, head of the African National
Congress, voiced concerns about Soviet influence on the black South African group and the ANC's use of
violence in the struggle against apartheid.
2. WSJ880422-0133 44 (See related story: "ANC: Apartheid's Foes - The Long Struggle: The ANC Is Banned,
But It Is in the Hearts of a Nation's Blacks -- In South Africa, the Group Survives Assassinations, Government
Crackdowns -- The Black, Green and Gold" - WSJ April 22, 1988)
3. AP880803-0158 26 South Africa says the ANC, the main black group fighting to overthrow South Africa's white-
led government, has seven major military bases in Angola, and it wants those bases closed down.
4. AP880919-0052 5 But activist clergymen from South Africa said the pontiff should have spoken out more forcefully
against their white-minority government's policies of apartheid, under which 26 million blacks have no say
in national affairs.
5. AP890821-0092 10 Besides ending the emergency and lifting bans on anti-apartheid groups and individual activists,
the Harare summit's conditions included the removal of all troops from South Africa's black townships,
releasing all political prisoners and ending political trials and executions, and a government commitment to free
political discussion.
6. WSJ900503-0041 11 Pretoria and the ANC remain far apart on their visions for a post-apartheid South Africa:
The ANC wants a simple one-man, one-vote majority rule system, while the government claims that will lead to
black domination and insists on constitutional protection of the rights of minorities, including the whites.
7. WSJ900807-0037 1 JOHANNESBURG, South Africa - The African National Congress suspended its 30-year
armed struggle against the white minority government, clearing the way for the start of negotiations over a new
constitution based on black-white power sharing.
8. WSJ900924-0119 20 The African National Congress, South Africa's main black liberation group, forged its sanctions
strategy as a means of pressuring the government to abandon white-minority rule.
9. WSJ910702-0053 36 At a meeting in South Africa this week, the African National Congress, the major black
group, is expected to take a tough line against the white-run government.
10. WSJ910204-0176 1 CAPE TOWN, South Africa - President F.W. de Klerk's proposal to repeal the major pillars
of apartheid drew a generally positive response from black leaders, but African National Congress leader Nelson
Mandela called on the international community to continue economic sanctions against South Africa until the
government takes further steps.
Figure 3: Sample multi-document summary with λ = 0.3, time-line ordering
From these sets we are performing two types of exper- 
iments. In the first, we are examining how users put 
sentences into pre-defined clusters and how they create 
sentence based multi-document summaries. The result 
will also serve as a gold standard for system generated 
summaries - do our systems pick the same summary sen- 
tences as humans and are they picking sentences from 
the same clusters as humans? The second type of experiment
is designed to determine how users perceive the
output summary quality. In this experiment, users are 
asked to rate the output sentences from the summarizer 
as good, okay or bad. For the okay or bad sentences, 
they are asked to provide a summary sentence from the 
document set that is "better", i.e., that makes a better set 
of sentences to represent the information content of the 
document set. We are comparing our proposed summa- 
rizer #6 in Section 4 to summarizer #1, the common por- 
tions of the document sets with no anti-redundancy and 
summarizer #3, single document summary of a centroid 
document using our single document summarizer (Gold- 
stein et al., 1999). 
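The two gold-standard questions posed above (same sentences, same clusters) could be scored along the following lines. This is a minimal sketch under stated assumptions: the text does not specify a scoring formula, so the use of precision and recall over sentence identifiers, the cluster-level ratio, and the names `sentence_agreement`, `cluster_agreement` and `cluster_of` are all hypothetical.

```python
def sentence_agreement(system, gold):
    """Precision and recall of system-selected sentence ids against human-selected ids."""
    system, gold = set(system), set(gold)
    hits = len(system & gold)
    precision = hits / len(system) if system else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


def cluster_agreement(system, gold, cluster_of):
    """Fraction of system sentences taken from a cluster the human summary also drew on."""
    system = set(system)
    if not system:
        return 0.0
    gold_clusters = {cluster_of[s] for s in gold}
    return sum(cluster_of[s] in gold_clusters for s in system) / len(system)
```

A system can score low on sentence agreement yet high on cluster agreement when it picks a different sentence from the same cluster as the human summarizer, which is why the two measures are kept separate here.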
7 Conclusions and Future Work 
This paper presented a statistical method of generating
extraction-based multi-document summaries. It builds
upon previous work in single-document summarization
and takes into account some of the major differences
between single-document and multi-document
summarization: the need to (i) carefully eliminate
redundant information from multiple documents and
achieve high compression ratios, (ii) take into account
information about document and passage similarities
and weight different passages accordingly, and
(iii) take temporal information into account.
Our approach differs from others in several ways: it
is completely domain-independent, it is based mainly on
fast, statistical processing, it attempts to maximize the
novelty of the information being selected, and different 
genres or corpora characteristics can be taken into ac- 
count easily. Since our system is not based on the use of 
sophisticated natural language understanding or information
extraction techniques, summaries lack co-reference
resolution, passages may be disjoint from one another, 
and in some cases may have false implicature. 
In future work, we will integrate work on multi- 
document summarization with work on clustering to pro- 
vide summaries for clusters produced by topic detection 
and tracking. We also plan to investigate how to gen- 
erate coherent temporally based event summaries. We 
will also investigate how users can effectively use multi- 
document summarization through interactive interfaces 
to browse and explore large document sets. 

References 
James Allan, Jaime Carbonell, George Doddington,
Jonathan Yamron, and Yiming Yang. 1998. Topic de- 
tection and tracking pilot study: Final report. In Pro- 
ceedings of the DARPA Broadcast News Transcription 
and Understanding Workshop. 
Chinatsu Aone, M. E. Okurowski, J. Gorlinsky, and 
B. Larsen. 1997. A scalable summarization sys- 
tem using robust NLP. In Proceedings of the 
ACL'97/EACL'97 Workshop on Intelligent Scalable 
Text Summarization, pages 66-73, Madrid, Spain. 
Breck Baldwin and Thomas S. Morton. 1998. Dy- 
namic coreference-based summarization. In Proceed- 
ings of the Third Conference on Empirical Methods in 
Natural Language Processing (EMNLP-3), Granada, 
Spain, June. 
Regina Barzilay and Michael Elhadad. 1997. Using lex- 
ical chains for text summarization. In Proceedings of 
the ACL'97/EACL'97 Workshop on Intelligent Scal- 
able Text Summarization, pages 10-17, Madrid, Spain. 
Branimir Boguraev and Chris Kennedy. 1997. Salience 
based content characterization of text documents. In 
Proceedings of the ACL'97/EACL'97 Workshop on 
Intelligent Scalable Text Summarization, pages 2-9,
Madrid, Spain. 
Chris Buckley. 1985. Implementation of the SMART in- 
formation retrieval system. Technical Report TR 85- 
686, Cornell University. 
Jaime G. Carbonell and Jade Goldstein. 1998. The 
use of MMR, diversity-based reranking for reordering 
documents and producing summaries. In Proceedings 
of SIGIR-98, Melbourne, Australia, August. 
Jade Goldstein and Jaime Carbonell. 1998. The use 
of MMR and diversity-based reranking in document
reranking and summarization. In Proceedings of the 
14th Twente Workshop on Language Technology in 
Multimedia Information Retrieval, pages 152-166, 
Enschede, the Netherlands, December. 
Jade Goldstein, Mark Kantrowitz, Vibhu O. Mittal, and 
Jaime G. Carbonell. 1999. Summarizing Text Doc-
uments: Sentence Selection and Evaluation Metrics.
In Proceedings of the 22nd International ACM SIGIR
Conference on Research and Development in Informa-
tion Retrieval (SIGIR-99), pages 121-128, Berkeley,
CA. 
Eduard Hovy and Chin-Yew Lin. 1997. Automated text 
summarization in SUMMARIST. In ACL/EACL-97
Workshop on Intelligent Scalable Text Summarization, 
pages 18-24, Madrid, Spain, July. 
Judith L. Klavans and James Shaw. 1995. Lexical se- 
mantics in summarization. In Proceedings of the First 
Annual Workshop of the IFIP Working Group for
NLP and KR, Nantes, France, April. 
Julian M. Kupiec, Jan Pedersen, and Francine Chen. 
1995. A trainable document summarizer. In Proceed-
ings of the 18th Annual Int. ACM/SIGIR Conference
on Research and Development in IR, pages 68-73, 
Seattle, WA, July. 
H. P. Luhn. 1958. Automatic creation of literature ab-
stracts. IBM Journal, pages 159-165.
Inderjeet Mani and Eric Bloedorn. 1997. Multi-
document summarization by graph search and merg-
ing. In Proceedings of AAAI-97, pages 622-628.
AAAI. 
Inderjeet Mani and Eric Bloedorn. 1999. Summarizing
similarities and differences among related documents. 
Information Retrieval, 1:35-67. 
Daniel Marcu. 1997. From discourse structures to text
summaries. In Proceedings of the ACL'97/EACL'97 
Workshop on Intelligent Scalable Text Summarization, 
pages 82-88, Madrid, Spain. 
Kathleen McKeown, Jacques Robin, and Karen Kukich. 
1995. Designing and evaluating a new revision-based 
model for summary generation. Info. Proc. and Man-
agement, 31(5).
Kathleen McKeown, Judith Klavans, Vasileios Hatzivas- 
siloglou, Regina Barzilay, and Eleazar Eskin. 1999. 
Towards Multidocument Summarization by Reformu- 
lation: Progress and Prospects. In Proceedings of 
AAAI-99, pages 453-460, Orlando, FL, July.
Mandar Mitra, Amit Singhal, and Chris Buckley. 1997. 
Automatic text summarization by paragraph extrac- 
tion. In ACL/EACL-97 Workshop on Intelligent Scal- 
able Text Summarization, pages 31-36, Madrid, Spain, 
July. 
Chris D. Paice. 1990. Constructing literature abstracts 
by computer: Techniques and prospects. Info. Proc. 
and Management, 26:171-186. 
Dragomir Radev and Kathy McKeown. 1998. Generat-
ing natural language summaries from multiple online
sources. Computational Linguistics.
Gerald Salton. 1970. Automatic processing of foreign
language documents. Journal of American Society for
Information Sciences, 21:187-194.
Gerald Salton. 1989. Automatic Text Processing: The 
Transformation, Analysis, and Retrieval of Informa- 
tion by Computer. Addison-Wesley. 
James Shaw. 1995. Conciseness through aggregation in 
text generation. In Proceedings of 33rd Association 
for Computational Linguistics, pages 329-331. 
Gees C. Stein, Tomek Strzalkowski, and G. Bowden 
Wise. 1999. Summarizing Multiple Documents Us- 
ing Text Extraction and Interactive Clustering. In Pro- 
ceedings of PacLing-99: The Pacific Rim Conference 
on Computational Linguistics, pages 200-208, Water- 
loo, Canada. 
Tomek Strzalkowski, Jin Wang, and Bowden Wise. 
1998. A robust practical text summarization system. 
In AAAI Intelligent Text Summarization Workshop, 
pages 26-30, Stanford, CA, March. 
J. I. Tait. 1983. Automatic Summarizing of English 
Texts. Ph.D. thesis, University of Cambridge, Cam- 
bridge, UK. 
Simone Teufel and Marc Moens. 1997. Sentence ex- 
traction as a classification task. In ACL/EACL-97 
Workshop on Intelligent Scalable Text Summarization, 
pages 58-65, Madrid, Spain, July. 
TIPSTER. 1998a. Tipster text phase III 18-month work- 
shop notes, May. Fairfax, VA. 
TIPSTER. 1998b. Tipster text phase III 24-month work- 
shop notes, October. Baltimore, MD. 
Charles J. van Rijsbergen. 1979. Information Retrieval. 
Butterworths, London. 
Yiming Yang, Tom Pierce, and Jaime G. Carbonell.
1998. A study on retrospective and on-line event de-
tection. In Proceedings of the 21st Ann Int ACM SI-
GIR Conference on Research and Development in In-
formation Retrieval (SIGIR'98), pages 28-36.
Yiming Yang, Jaime G. Carbonell, Ralf D. Brown,
Tom Pierce, Brian T. Archibald, and Xin Liu. 1999.
Learning approaches for topic detection and tracking
news events. IEEE Intelligent Systems, Special Issue
on Applications of Intelligent Information Retrieval, 
14(4):32-43, July/August. 
