Automated Text Summarization in SUMMARIST 
Eduard Hovy and Chin-Yew Lin
Information Sciences Institute
of the University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292-6695, USA
tel +1-310-822-1511 
fax +1-310-823-6714 
email: {hovy,cyl}@isi.edu
Abstract 
SUMMARIST is an attempt to create a robust
automated text summarization system, based
on the 'equation': summarization = topic
identification + interpretation + generation.
We describe the system's architecture and
provide details of some of its modules.
1 Introduction 
1.1 Summary: Extract or Abstract? 
The task of a Summarizer is to produce a
synopsis of any document (or set of documents)
submitted to it. These synopses may range from a
list of isolated keywords that indicate the major
content of the document(s), through a list of
independent single sentences that express the
major content, all the way up to a coherent, fully
planned and generated paragraph that compresses
the document. The more sophisticated a synopsis,
the more effort it generally takes to produce.
Several existing systems, including some Web
browsers, claim to perform text summarization.
However, even a cursory analysis of their output
shows that their so-called summaries are actually
portions of the text, produced verbatim. While
there is nothing wrong with such extracts, per se, a
truly comprehensive and informative text
summary fuses together various concepts of the
text into a smaller number of concepts, to form
an abstract. We define extracts as consisting
wholly of portions extracted verbatim from the
original (they may be single words or whole
passages) and abstracts as consisting of novel
phrasings describing the content of the original
(which might be paraphrases or fully newly
synthesized text). Generally, producing an
abstract requires stages of topic fusion and text
generation not needed for extracts.
1.2 SUMMARIST
Over the past two years we have been
developing the text summarization system
SUMMARIST. In this paper, we describe its
structure and provide details on the evaluated
results of two of its component modules.
The goal of SUMMARIST is to provide both
extracts and abstracts for arbitrary English (and
later, other-language) input text. SUMMARIST
combines symbolic world knowledge (embodied in
WordNet, dictionaries, and similar resources) with
robust NLP processing (using IR and statistical
techniques) to overcome the problems endemic to
either approach alone. These problems arise
because existing robust NLP methods tend to
operate at the word level, and hence miss concept-
level generalizations, which are provided by
symbolic world knowledge, while on the other
hand symbolic knowledge is too difficult to acquire
in large enough scale to provide coverage and
robustness. For robust summarization, both
aspects are needed.
The heart of abstract formation is the
interpretation process performed to fuse concepts.
This step occurs in the middle of the
summarization procedure: to find the appropriate
set of concepts in an input text, an initial stage of
concept identification and extraction is required;
to produce the summary, a final stage of
generation is needed. Thus SUMMARIST is based
on the following 'equation':
summarization = topic identification +
interpretation + generation
This breakdown is motivated as follows:
1. Identification: Select or filter the input to
determine the most important, central, topics.
For generality we assume that a text can have
many (sub)-topics, and that the topic extraction
process can be parameterized to include more or
fewer of them to produce longer or shorter
summaries.
2. Interpretation: Simply aggregating
together frequently mentioned portions of the
input text does not in itself make an abstract.
What are the central, most important, concepts in
the following story?
    John and Bill wanted money.
    They bought ski-masks and guns
    and stole an old car from a
    neighbor. Wearing their ski-
    masks and waving their guns,
    the two entered the bank, and
    within minutes left the bank
    with several bags of $100 bills.
    They drove away happy,
    throwing away the ski-masks
    and guns in a sidewalk trash can.
    They were never caught.
The popular method of simple word counting
would indicate that the story is about ski-masks
and guns, both of which are mentioned three
times, more than any other word. Clearly,
however, the story is about a robbery, and any
summary of it must mention this fact. Some
process of interpreting the individual words as part
of some encompassing concept is required. One
such process, word clustering, is an essential
technique for topic identification in IR. This
technique would match the words "gun", "mask",
"money", "caught", "stole", etc., against the set
of words that form the so-called signature for the
word "robbery". Other, more sophisticated forms
of word clustering and fusion are possible,
including script matching, deductive reference, and
concept clustering.
3. Generation: Two options exist: either the
output is a verbatim quotation of some portion(s)
of the input, or it must be generated anew. In the
former case, no generator is needed, but the output
is not likely to be high-quality text (although this
might be sufficient for the application).
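The contrast drawn in step 2 can be illustrated with a small Python sketch. It is not part of SUMMARIST: the stop list, the crude plural stripping, and both signatures (including the competing "skiing" head and the indicator "bank") are hypothetical choices made only for this example.

```python
import re
from collections import Counter

STORY = (
    "John and Bill wanted money. They bought ski-masks and guns and stole "
    "an old car from a neighbor. Wearing their ski-masks and waving their "
    "guns, the two entered the bank, and within minutes left the bank with "
    "several bags of $100 bills. They drove away happy, throwing away the "
    "ski-masks and guns in a sidewalk trash can. They were never caught."
)

# Hypothetical stop list, just large enough for this story.
STOP_WORDS = {"a", "an", "and", "from", "in", "of", "the", "their",
              "they", "two", "were", "with", "within"}

# Hypothetical signatures: head concept -> indicator words.
SIGNATURES = {
    "robbery": {"gun", "mask", "money", "caught", "stole", "bank"},
    "skiing": {"ski", "mask", "snow", "slope"},
}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (hyphens allowed)."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def normalize(token):
    """Crude normalization: keep the last hyphen part, drop a plural 's'."""
    word = token.split("-")[-1]
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

# Plain word counting: the two most frequent content words.
word_counts = Counter(tokenize(STORY))
top_words = word_counts.most_common(2)

def signature_score(indicators):
    """Sum the counts of all story words that appear in a signature."""
    return sum(n for word, n in word_counts.items()
               if normalize(word) in indicators)

# Signature matching: the head whose indicators cover the most word mass.
best_head = max(SIGNATURES, key=lambda head: signature_score(SIGNATURES[head]))

print(top_words)   # the most frequent words say nothing about a robbery
print(best_head)   # the best-matching signature head does
```

Word counting alone surfaces the two props of the story, whereas matching against even a tiny hand-made signature recovers the encompassing concept.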
2 The Structure of SUMMARIST 
For each of the three steps of the above
'equation', SUMMARIST uses a mixture of
symbolic world knowledge (from WordNet and
similar resources) and statistical or IR-based
techniques. Each stage employs several different,
complementary, methods (SUMMARIST will
eventually contain several modules in each stage).
To date, we have developed some methods for
each stage of processing, and are busy developing
additional methods and linking them into a single
system. In the next sections we describe one
method from each stage. The overall architecture
is shown in Figure 1.
Figure 1: Architecture of SUMMARIST
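The three-stage organization can be pictured as a simple composition of interchangeable modules. The Python sketch below only illustrates that idea; every function name and the toy modules are hypothetical and do not reflect SUMMARIST's actual code.

```python
from typing import Callable, List

# One module per stage; each stage could hold several such modules.
TopicModule = Callable[[str], List[str]]             # text -> topic words
InterpretModule = Callable[[List[str]], List[str]]   # topics -> fuser concepts
GenerateModule = Callable[[List[str]], str]          # concepts -> summary text

def summarize(text: str, identify: TopicModule,
              interpret: InterpretModule, generate: GenerateModule) -> str:
    """summarization = topic identification + interpretation + generation."""
    topics = identify(text)
    concepts = interpret(topics)
    return generate(concepts)

def title_words(text: str) -> List[str]:
    """Toy identification module: the words of the first line."""
    return text.splitlines()[0].lower().split()

def lookup_fuser(topics: List[str]) -> List[str]:
    """Toy interpretation module: map indicator words to a head concept."""
    fusers = {"wheel": "bicycle", "pedal": "bicycle", "saddle": "bicycle"}
    heads = sorted({fusers[t] for t in topics if t in fusers})
    return heads or topics   # fall back to the raw topics if nothing fuses

def topic_output(concepts: List[str]) -> str:
    """Toy generation module: print the concepts as a list."""
    return ", ".join(concepts)

summary = summarize("Wheel pedal saddle\nMore text follows.",
                    title_words, lookup_fuser, topic_output)
print(summary)
```

Keeping each stage behind a plain function signature is one way to let several complementary methods be swapped in per stage.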
2.1 Topic Identification
Several techniques for topic identification have
been reported in the literature, including methods
based on Position [Luhn 58, Edmundson 69], Cue
Phrases [Baxendale 58], word frequency, and
Discourse Segmentation [Marcu 97].
We describe here just our work on
SUMMARIST's Position module. This method
exploits the fact that in some genres, regularities
of discourse structure and/or methods of
exposition mean that certain sentence positions
tend to carry more topic material than others.
We defined the Optimal Position Policy (OPP) as
a list that indicates in what ordinal positions in the
text high-topic-bearing sentences occur. We
developed a method of automatically training new
OPPs, given a collection of genre-related texts
with keywords. This work, described in [Lin and
Hovy 97a], is the first systematic study and
evaluation of the Position method reported.
For the Ziff-Davis corpus (13,000 newspaper
articles announcing computer products) we have
found that the OPP is
[T1, P2S1, P3S1, P4S1, P1S1, P2S2,
{P3S2, P4S2, P5S1, P1S2}, P6S1, ...]
i.e., the title (T1) is the most likely to bear topics,
followed by the first sentence of paragraph 2, the
first sentence of paragraph 3, etc. In contrast, for
the Wall Street Journal the OPP is
[T1, P1S1, P1S2, ...]
Evaluation. We evaluated the OPP method in
various ways. In one of them, coverage is the
fraction of the (human-supplied) keywords that
are included verbatim in the sentences selected
under the policy. (A random selection policy
would extract sentences with a random distribution
of topics; a good position policy would extract
rich topic-bearing sentences.) We measured the
effectiveness of an OPP by taking cumulatively
more of its sentences: first just the title, then the
title plus P2S1, and so on. [...]
together the multi-word contributions (window
sizes 1 to 5) in the top ten sentence positions
(R10), the columns reach 95% over an extract of
10 sentences (approx. 15% of a typical Ziff-Davis
text), an extremely encouraging result.
Figure 2: Coverage scores for top ten OPP sentence positions, window sizes 1 to 5 (x-axis: OPP positions).
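To make the Position method concrete, the following Python sketch applies a position policy to a toy document and computes the coverage measure defined above. It is an illustration only: the document, the keywords, the flattening of the tied positions, and the substring test for "verbatim" inclusion are all assumptions of this example, not details taken from [Lin and Hovy 97a].

```python
import re

# The Ziff-Davis policy quoted above, flattened; the braces in the
# original mark positions of equal rank, which this sketch ignores.
ZIFF_DAVIS_OPP = ["T1", "P2S1", "P3S1", "P4S1", "P1S1", "P2S2",
                  "P3S2", "P4S2", "P5S1", "P1S2", "P6S1"]

def sentence_at(position, title, paragraphs):
    """Return the sentence at a position such as 'T1' or 'P2S1', or None."""
    if position == "T1":
        return title
    match = re.fullmatch(r"P(\d+)S(\d+)", position)
    if match is None:
        return None
    p, s = int(match.group(1)) - 1, int(match.group(2)) - 1
    if p < len(paragraphs) and s < len(paragraphs[p]):
        return paragraphs[p][s]
    return None

def extract(policy, title, paragraphs, k):
    """Take the sentences at the first k policy positions present in the text."""
    found = (sentence_at(pos, title, paragraphs) for pos in policy)
    return [s for s in found if s is not None][:k]

def coverage(selected, keywords):
    """Fraction of keywords that occur verbatim in the selected sentences."""
    text = " ".join(selected).lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords)

# Hypothetical toy document: a title plus paragraphs of sentences.
title = "Acme ships new laser printer"
paragraphs = [
    ["Acme Corp. made an announcement on Monday.",
     "Analysts were not surprised."],
    ["The laser printer prints 12 pages per minute.", "It costs $999."],
    ["It targets small offices."],
]
keywords = ["laser printer", "pages per minute", "small offices", "Acme"]

# Cumulative coverage as more policy positions are included.
for k in (1, 2, 3):
    chosen = extract(ZIFF_DAVIS_OPP, title, paragraphs, k)
    print(k, coverage(chosen, keywords))
```

Plotting coverage against k for a whole corpus gives a curve of the kind summarized in Figure 2.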
2.2 Topic Interpretation (Concept 
Fusion) 
The second step in the summarization process
is that of concept interpretation. In this step, a
collection of extracted concepts is 'fused' into
one (or more) higher-level unifying
concept(s). Concept fusion can be as simple as
part-whole construction, for example when wheel,
chain, pedal, saddle, light, frame, and handlebars
together fuse to bicycle. Generally, though, it is
more complex, ranging from direct concept/word
clustering as used in IR [Paice 90] to
script-based inference [Schank and Abelson
77].
Fusing topics into one or more characterizing
concepts is the most difficult step of automated
text summarization. Here, too, a variety of
methods can be employed. All of them associate a
set of concepts (the indicators) with a
characteristic generalization (the fuser or head).
The challenge is to develop methods that work
reliably and to construct a large enough collection
of indicator-fuser sets to achieve effective topic
reduction.
SUMMARIST's topic interpretation methods
currently include Concept Wavefront [Lin 95] and
Concept Signature [Lin and Hovy 97b].
2.2.1 Concept Counting and the 
Wavefront
A topic is a particular subject that we write
about or discuss. To identify the topics of texts,
IR researchers make the assumption that the more
a word is used in a text, the more important it is in
that text. But although word frequency counting
operates robustly across different domains without
relying on stereotypical text structure or semantic
models, it cannot handle synonyms,
pronominalization, and other forms of
coreferentiality. Furthermore, word counting
misses conceptual generalizations:
    John bought some vegetables, fruit,
    bread, and milk -> John bought some
    groceries
The word counting method must be extended to
recognize that vegetables, fruit, etc., relate to
groceries. Recognizing this inherent problem,
people started using Artificial Intelligence
techniques [Jacobs 90, Mauldin 91] and statistical
techniques [Salton et al. 94] to incorporate
semantic relations among words. Following this
trend, we have developed a new way to identify
topics by counting concepts instead of words, and
generalizing them using a concept generalization
taxonomy. As an approximation to such a
hierarchy, we employ WordNet [Miller et al. 90]
(though we could have used any machine-readable
thesaurus) for inter-concept relatedness links. In
the limit case, when WordNet does not contain
the words, this technique defaults to word
counting.
As described in [Lin 95], we locate the most
appropriate generalization somewhere in the middle of
the taxonomy by finding concepts on the
interesting wavefront, a set of nodes representing
concepts that each generalize a set of
approximately equally strongly represented
subconcepts (ones that have no obvious dominant
subconcept to specialize to).
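One plausible reading of concept counting and the wavefront search is sketched below in Python. The toy taxonomy stands in for WordNet, and the weight propagation rule, the ratio test, and the parameter names `ratio_threshold` and `min_depth` are assumptions of this sketch; the actual formulation is the one given in [Lin 95].

```python
# Hypothetical toy taxonomy (parent -> children); the paper uses WordNet.
TAXONOMY = {
    "entity": ["groceries", "vehicle"],
    "groceries": ["vegetables", "fruit", "bread", "milk"],
    "vehicle": ["car", "bicycle"],
}

# Hypothetical concept counts collected from some text.
COUNTS = {"vegetables": 2, "fruit": 2, "bread": 1, "milk": 2, "car": 5}

def weight(concept, counts):
    """A concept's weight: its own count plus the weights of its subconcepts."""
    return counts.get(concept, 0) + sum(
        weight(child, counts) for child in TAXONOMY.get(concept, []))

def wavefront(concept, counts, ratio_threshold=0.7, min_depth=1, depth=0):
    """Collect concepts whose subconcepts are about equally represented.

    Descend while one subconcept dominates (its share of the children's
    total weight reaches ratio_threshold) or while the concept is still
    shallower than min_depth, i.e. too general to be useful.
    """
    children = [c for c in TAXONOMY.get(concept, []) if weight(c, counts) > 0]
    if not children:
        return [concept]              # leaf: nothing left to specialize to
    child_weights = [weight(c, counts) for c in children]
    ratio = max(child_weights) / sum(child_weights)
    if depth >= min_depth and ratio < ratio_threshold:
        return [concept]              # no dominant subconcept: stop here
    found = []
    for child in children:
        found.extend(wavefront(child, counts, ratio_threshold,
                               min_depth, depth + 1))
    return found

print(weight("groceries", COUNTS))
print(wavefront("entity", COUNTS))
```

In this toy case the four food concepts are evenly represented, so the search stops at groceries, while vehicle is dominated by a single subconcept and the search specializes to car.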
Evaluation: We selected 26 articles about new
computer products from BusinessWeek (1993-94)
of average 750 words each. For each text we
extracted the eight sentences containing the most
interesting concepts using the wavefront
technique, and compared them to the contents of
a professional's abstracts of these 26 texts from
an online service. We developed several weighting
and scoring variations and tried various ratio and
depth parameter settings for the algorithm. We
also implemented a random sentence selection
algorithm as a baseline comparison.
The average recall (R) and precision (P) values
over the three scoring variations were R=0.32 and
P=0.35, when the system produces extracts of 8
sentences. In comparison, the random selection
method had R=0.18 and P=0.22 in the
same experimental setting. While these R and P
values are not tremendous, they show that
semantic knowledge--even as limited as that in
WordNet--does enable improvements over
traditional IR word-based techniques. However,
the limitations of WordNet are serious drawbacks:
there is no domain-specific knowledge, for
example to relate customer, waiter, cashier, food,
and menu together with restaurant. We thus
developed a second technique of concept
interpretation, using category signatures. We
discuss this next.
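For reference, recall and precision figures of the kind reported above can be computed as in the following Python sketch. The text does not say how an extracted sentence was judged to match the professional abstract, so the sketch simply assumes that a set of relevant sentence indices is available; the indices shown are invented.

```python
def recall_precision(selected, relevant):
    """Set-based recall and precision over sentence indices."""
    overlap = len(set(selected) & set(relevant))
    recall = overlap / len(relevant) if relevant else 0.0
    precision = overlap / len(selected) if selected else 0.0
    return recall, precision

# Hypothetical example: an 8-sentence extract scored against 9 sentences
# assumed to carry the content of the reference abstract.
selected = [0, 1, 2, 3, 5, 8, 13, 21]
relevant = [1, 2, 3, 4, 6, 7, 9, 10, 11]

r, p = recall_precision(selected, relevant)
print(f"R={r:.2f} P={p:.2f}")
```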
2.2.2 Interpretation using Signatures 
Can one automatically find a set of related
words that can collectively be fused into a single
word? To test this idea we developed the Concept
Signature method [Lin and Hovy 97b]. We
defined a signature to be a list of word indicators,
each with relative strength of association, jointly
associated with the signature head.
To construct signatures automatically, we used
a set of 30,000 texts from the Wall Street Journal
(1987). The Journal editors have classified each
text into one of 32 classes--AROspace, BNKing,
ENVironment, TELecommunications, etc. We
counted the occurrences of each content word
(canonicalized morphologically to remove plurals,
etc.), in the texts of a class, relative to the number
of times they occur in the whole corpus (this is
the standard tf.idf method). We then selected the
top-scoring 300 terms for each category and
created a signature with the category name as its
head. The top terms of four example signatures
are shown in Figure 3. It is quite easy to
determine the identity of the signature head just
by inspecting the top few signature indicators.
RANK  ARO         BNK          ENV            TEL
1     contract    bank         epa            at&t
2     air_force   thrift       waste          network
3     aircraft    banking      environmental  fcc
4     navy        loan         water          cbs
5     army        mr           ozone          cable
6     space       deposit      state          bell
7     missile     board        incinerator    long-distance
8     equipment   fslic        agency         telephone
9     mcdonnell   fed          clean          telecomm
10    northrop    institution  landfill       mci
11    nasa        federal      hazardous      mr
12    pentagon    fdic         acid_rain      doctrine
13    defense     volcker      standard       service
14    receive     henkel       federal        news
15    boeing      banker       lake           turner
Figure 3: Portions of the signatures of several concepts
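The signature construction described above can be sketched in a few lines of Python. The scoring formula used here (class term count times the log of the inverse class frequency) is one common tf.idf variant chosen for illustration; the exact weighting used for SUMMARIST is not given in the text, and the three-class corpus is invented.

```python
import math
from collections import Counter

def build_signatures(texts_by_class, top_n=300):
    """Build one signature (ranked list of (term, score)) per class."""
    # Term counts per class, pooled over all texts of that class.
    class_counts = {label: Counter(w for text in texts for w in text)
                    for label, texts in texts_by_class.items()}
    n_classes = len(class_counts)
    # In how many classes does each term occur at all?
    class_freq = Counter(w for counts in class_counts.values() for w in counts)
    signatures = {}
    for label, counts in class_counts.items():
        scored = {w: tf * math.log(n_classes / class_freq[w])
                  for w, tf in counts.items()}
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scored.items(), key=lambda item: (-item[1], item[0]))
        signatures[label] = ranked[:top_n]
    return signatures

# Hypothetical miniature corpus: class label -> list of tokenized texts.
TEXTS = {
    "BNK": [["bank", "loan", "federal", "deposit", "bank"],
            ["bank", "thrift", "loan"]],
    "ENV": [["epa", "waste", "federal", "water"],
            ["waste", "epa", "landfill", "epa"]],
    "TEL": [["at&t", "network", "fcc"],
            ["network", "cable", "at&t", "at&t"]],
}

signatures = build_signatures(TEXTS, top_n=3)
for head, indicators in signatures.items():
    print(head, [term for term, _ in indicators])
```

A term such as "federal", which occurs in more than one class, is pushed down the ranking, which is why the top indicators tend to identify the head so clearly.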
SUMMARIST will use signatures for summary
creation as follows. After the topic identification
module(s) identify a set of words or concepts,
the signature-based concept interpretation module
will identify the most pertinent signatures
subsuming the topic words, and the signatures'
head concepts will then be used as the summarizing
fuser concepts. Matching the identified topic
terms against all signature indicators involves
several problems, including taking into account
the relative frequencies of occurrence,
resolving matches with multiple signatures, and
specifying thresholds of acceptability.
Evaluation. First, however, we had to
evaluate the quality of the signatures formed by
our algorithm. Recognizing the similarity of
signature recognition to document categorization,
we evaluated the effectiveness of each signature by
seeing how well it serves as a selection criterion on
new texts. As data we used a set of 2,204
previously unseen WSJ news articles from 1988.
For each test text, we created a single-text
'document signature' using the same tf.idf measure
as before, and then matched this document
signature against the category signatures. The
closest match provided the class into which the
text was categorized. We tested four different
matching functions, including a simple binary
match (count 1 if a term match occurs, 0
otherwise), curve-fit match (minimize the
difference in occurrence frequency of each term
between document and concept signatures), and
cosine match (minimize the angle between the
signatures in the
hyperspace formed when each signature is viewed
as a vector and each word frequency specifies the
distance along the dimension for that word).
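Two of the matching functions named above, binary match and cosine match, can be sketched as follows in Python. The signatures and weights are invented for the example, and representing a signature as a term-to-weight dictionary is an assumption of this sketch. Maximizing the cosine similarity is equivalent to minimizing the angle between the two vectors.

```python
import math

def binary_match(doc_sig, cat_sig):
    """Count 1 for every term the two signatures share, 0 otherwise."""
    return sum(1 for term in doc_sig if term in cat_sig)

def cosine_match(doc_sig, cat_sig):
    """Cosine of the angle between the signatures viewed as vectors."""
    dot = sum(w * cat_sig.get(term, 0.0) for term, w in doc_sig.items())
    norm = (math.sqrt(sum(w * w for w in doc_sig.values()))
            * math.sqrt(sum(w * w for w in cat_sig.values())))
    return dot / norm if norm else 0.0

def categorize(doc_sig, category_sigs, match=cosine_match):
    """Return the head of the category signature that matches best."""
    return max(category_sigs,
               key=lambda head: match(doc_sig, category_sigs[head]))

# Hypothetical category signatures (term -> weight) and document signature.
CATEGORY_SIGS = {
    "BNK": {"bank": 3.0, "loan": 2.0, "deposit": 1.0},
    "ENV": {"epa": 3.0, "waste": 2.0, "water": 1.0},
}
doc_sig = {"loan": 2.0, "bank": 1.0, "water": 0.5}

print(categorize(doc_sig, CATEGORY_SIGS))                 # cosine match
print(categorize(doc_sig, CATEGORY_SIGS, binary_match))   # binary match
```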
These matching functions all provided
approximately the same results. The values for
Recall and Precision (R=0.756625 and
P=0.69309375) are very encouraging and
compare well with recent IR results [TREC 95].
Extending this work will require the creation of
concept signatures for hundreds, and eventually
thousands, of different topics needed for robust
summarization. We plan to investigate the
effectiveness of a variety of methods for doing
this.
2.3 Summary Generation 
The final step in the summarization process is
to generate the summary, consisting of the fused
concepts, in English. A range of possibilities
occurs here, from simple concept printing to
sophisticated sentence planning and surface-form
realization. Although, as mentioned in Section 1,
simple extract summaries require no generation
stage, eventually SUMMARIST will contain three
generation modules, associated as appropriate with
the various levels for various applications:
1. Topic output: Sometimes no summary is
really needed; a simple list of the summarizing
topics is enough. SUMMARIST will print the fuser
concepts produced by stage 2 of the process,
sorted by decreasing importance.
2. Phrase concatenation: SUMMARIST will
include a rudimentary generator that composes
noun phrase- and clause-sized units into simple
sentences. It will extract the noun phrases and
clauses from the input text, by following links
from the fuser concepts through the words that
support them back into the input text.
3. Full sentence planning and generation:
SUMMARIST will employ the sentence planner
being built at ISI (in collaboration with the
HealthDoc project from the University of
Waterloo) [Hovy and Wanner 96], together with
a sentence generator such as Penman [Penman 88,
Matthiessen and Bateman 91], FUF [Elhadad 92],
or NitroGen [Knight and Hatzivassiloglou 95] to
produce well-formed, fluent, summaries, taking as
input the fuser concepts and their most closely
related concepts as identified by SUMMARIST's
topic identification stage.
3 Conclusion 
As outlined in Section 1, extract summaries
require only the stage of topic identification. By
including modules to perform topic interpretation
and summary generation, SUMMARIST will also be
able to produce abstract summaries. How well it
will do so is a matter for future investigation.
An important aspect to be addressed is the
combination of the outputs of various modules in
each stage. We plan to investigate different
approaches, from a simple combination by votes
to methods for automatically training relative
strengths of contribution.
Automated summarization is simultaneously an
old topic--work on it dates from the 1950's--and
a new topic--it is so difficult that interesting
headway can be made for many years to come.
We are excited about the possibilities offered by
the combination of semantic and statistical
techniques in what is, quite possibly, the most
complex task of all NLP.

References 
[Baxendale 58] Baxendale, P.B. 1958.
Machine-made index for technical
literature - an experiment. IBM Journal
(354-361), October.
[Edmundson 69] Edmundson, H.P. 1969.
New methods in automatic extraction. In ?, (264-285).
[Elhadad 92] Elhadad, M. 1992. Using
Argumentation to Control Lexical
Choice: A Functional Unification-Based
Approach. Ph.D. dissertation, Columbia
University.
[Hovy and Wanner 96] Hovy, E.H. and L.
Wanner. 1996. Managing Sentence
Planning Requirements. In Proceedings
of the Workshop on Planning and
Generation (with ECAI). Budapest,
Hungary.
[Jacobs 90] Jacobs, P.S. and L.F. Rau. 1990.
SCISOR: Extracting information from
on-line news. Communications of the
ACM 33(11), (88-97).
[Knight and Hatzivassiloglou 95] Knight, K.
and V. Hatzivassiloglou. 1995. Two-
level many-paths generation. In
Proceedings of the 33rd ACL
Conference, Boston, MA.
[Lin 95] Lin, C.Y. 1995. Topic
Identification by Concept
Generalization. In Proceedings of the 33rd ACL Conference,
Boston, MA.
[Lin and Hovy 97a] Lin, C.Y. and E.H.
Hovy. 1997a. Identifying Topics by
Position. In Proceedings of the Applied
Natural Language Processing Conference,
Washington, DC.
[Lin and Hovy 97b] Lin, C.Y. and E.H.
Hovy. 1997b. Automatic Text
Categorization: A Concept-Based
Approach. In prep.
[Luhn 58] Luhn, H.P. 1958. The automatic
creation of literature abstracts. IBM
Journal of Research and Development (159-165).
[Marcu 97] Marcu, D. 1997. The Rhetorical
Parsing of Natural Language Texts.
Submitted.
[Matthiessen and Bateman 91] Matthiessen,
C.M.I.M. and J.A. Bateman. 1991. Text Generation and Systemic-Functional
Linguistics. London, England: Pinter.
[Mauldin 91] Mauldin, M.L. 1991.
Conceptual Information Retrieval--A Case Study in Adaptive Partial Parsing.
Kluwer Academic Publishers, Boston, MA.
[McKeown and Radev 95] McKeown, K.R.
and D.R. Radev. 1995. Generating
summaries of multiple news articles. In Proceedings of the 18th International
ACM SIGIR Conference, (74-82), Seattle, WA.
[Miller et al. 90] Miller, G., R. Beckwith, C.
Fellbaum, D. Gross, and K. Miller. 1990.
Five papers on WordNet. CSL Report 43,
Cognitive Science Laboratory, Princeton
University, Princeton, NJ.
[Paice 90] Paice, C.D. 1990. Constructing
literature abstracts by computer:
Techniques and prospects. Information Processing and Management
26(1), (171-186).
[Penman 88] The Penman Primer, User
Guide, and Reference Manual. 1988.
Unpublished documentation, USC
Information Sciences Institute.
[Salton et al. 94] Salton, G., J. Allen, C.
Buckley, and A. Singhal. 1994.
Automatic analysis, theme generation,
and summarization of machine-readable
texts. Science 264, (1421-1426), June.
[Schank and Abelson 77] Schank, R.C. and
R.P. Abelson. 1977. Scripts, Plans,
Goals, and Understanding. Lawrence
Erlbaum Associates, Hillsdale, NJ.
[TREC 95] Harman, D. (ed). 1995.
Proceedings of the TREC Conference.
