A Scalable Summarization System Using Robust NLP 
Chinatsu Aonet, Mary Ellen Okurowski~ t, James Gorlinskyt, Bjornar Larsent 
tSRA InternatlonM 
4300 Fair Lakes Court 
Falrfax, VA 22033 
{aonec, gorhnsk, larsenb}@sra corn 
~Depaxtment of Defense 
9800 Savage Road 
Fort George G Meade, MD 20755-6000 
meokuro@afterhfe ncsc nnl 
Abstract 
We describe a scalable summarization sys- 
tem which takes advantage of robust NLP 
technology such as corpus-based statlsh- 
cal NLP techmques, information extrac- 
tmn and readily available on-hne resources 
The system attempts to compensate for the 
bottlenecks of traditional frequency-based, 
knowledge-based or discourse-based sum- 
manzatlon approaches by uhhzlng features 
derived by these robust techniques Pre- 
hrmnary evaluation results are reported, 
and the multi-dimensional summary viewer 
is described 
1 Introduction 
Summarization research and system development 
can be broadly characterized as frequency-based, 
knowledge-based or discourse-based These cate- 
gories correspond to a continuum of increasing un- 
derstanding of a text and increasing complextty in 
text processing 
Earliest attempts at summarization (Luhn, 1958, 
Edmundson, 1969, Rush, Salvador, and Zamora, 
1971) essentially rehed on lexlcal and locahonal m- 
formation within the text, 1 e, frequency of words 
or key terms, their proxnmty, and locatmn within 
the text More recent adaptations of tlns approach 
have employed an automated method to combine 
these types of feature sets through classification 
techniques (Kupmc, Pedersen, and Chen, 1995) O r 
have drawn upon tradlhonal information retrieval 
indexing methods to incorporate knowledge of a text 
corpus (Brandow, Mltze, and Rau, 1995) To a large 
extent, these types of shallow approaches are igno- 
rant of dommn knowledge and the text macrostruc- 
ture They create summaries by extracting sentences 
from the original document 
Knowledge-based approaches generally depend on 
rich domain knowledge sources to interpret the 
conceptual structure of the text Systems like 
TOPIC (Relmer and Hahn, 1988), SUSY (Fum, 
Gmda, and Tasso, 1985) or SCISORS (Ran, Ja- 
cobs, and Zermk, 1989) parse domaan specific texts 
and create conceptual representahons for the gener- 
ation of text summarms These types of knowledge- 
ba.~d systems apply knowledge of the domain to 
characterize specific conceptual knowledge of a text 
Palce (Pvace and Jones, 1993) provides a good ex- 
ample of the role of thts conceptual mformahon and 
thloff(Rlloff, 1995) gives a method for automahcally 
identifying relevant concepts lughly correlated with 
a category of interest Because these systems create 
a rich conceptual representation, there are multiple 
ways m whlcha text summary may be created For 
example, SUMMONS (McKeown and Radev; 1995) 
generates a text summary from such a template rep- 
resentahon, whle (Maybury, 1995) describes mulh- 
pie methods for selecting events and presenting event 
summaries Knowledge-based approaches are usu- 
ally very knowledge-intensive and domvan-specific 
Discourse-based approaches are grounded m theo- 
rms of text cohesion and coherence and vary conmd- 
erably m how much they push the lmnts of text un- 
derstanding and the complemty as well as automa- 
hon of that processing Spearheaded by the lack 
of cohesion and coherence m extracts produced by 
frequency-based approaches, much of the work typi- 
fying discourse-based approaches focuses on lmgms- 
tic processing of the text to identify the best cohe- 
sive sentence candidates (Palce, 1990, Johnson et al , 
1993) or the best sentence candidates for represent- 
" mg the rhetorical structure of the text (Mnke et al, 
• 1994) Both approaches revolve parsing the text and 
analyzing dlscoarse relations to select sentences for 
extractmn 
Frequency-based approaches (Brandow, Mltze, 
and Rau, 1995) may incorporate heurmhcs to handle 
readabilityrelated issues and knowledge-based ap- 
proaches • systematically perform discourse process- 
mg m analyzing and condeusmg the text, but m 
a broad classificatmn schema It is the discourse- 
based approaches that tend to focus on the text 
macrostructure and surface clues to that structure 
At the far end of the continuum lies work by Sparck 
Jones (Jones, 1993, Jones, 1995) m describing a 
66 
! 
! 
m, 
l 
i 
\[ 
ii 
li 
I 
i 
Ii 
a 
! 
I 
Ii 
il 
l 
1 
manual method for source texi~ representatlon based 
on hngmstlc, dommn and communlcatlve reforma- 
tion From the NLP technology point of wew, din- 
course theory LS the least understood among sub- 
fields of hnguistlcs 
Our work addresses challenges encountered in 
these previous approaches by applying robust and 
proven NLP techmques such as corpus-based sta- 
tmtlcal NLP,. robust mforrmatlon extractlon, and 
readlly-avmlable on-hne NLP resources These tech- 
tuques and resources allow us to create a richer in- 
dexed source of Imgmstlc and domain knowledge 
than other frequency approaches Our approach 
attempts to apprommate text dlscourse structure 
through these multlple layers of mformatlon, oh- 
tinned from automated methods m contrast to 
labor-lntenslve, discourse-based approaches More- 
over, our planned training methodology will also al- 
low us to explmt thin productlve infrastructure m 
ways whlch model human performance whde avoid- 
mg hand-crafting domain-dependent rules of the 
knowledge-based approaches Our ultlmate goal m 
to make our summarlzatlon system scalable and 
portable by learning summarization rules from easily 
extractable text features 
2 System Description 
Our summarization system DlmSum consmts of 
the Summarization Server and the Summarlzatzon 
Chent The Server extracts features (the Feature 
Extractor) from a document using various robust 
NLP techmques, described In Sectzon 2 1, and com- 
bines these features (the Feature Combiner) to base- 
hne multiple combinations of features, as described 
m Section 2 2 Our work m progress to automatt- 
cally tram the Feature Combiner based upon user 
and apphcatlon needs m presented in Section 2 2 2 
The Java-based Chent, which wdl be dmcnssed In 
Section 4, provides a graphical user interface (GUI) 
for the end user to cnstomlze the summamzatlon 
preferences and see multiple views of generated sum- 
Inarles 
2.1 Extracting Stlmmarization Features 
In this section, we describe how we apply robust 
NLP technology to extract summarization features 
Our goal IS to add more mtelhgence to frequency- 
based approaches, to acqmre domain knowledge In 
a more automated fashion, and to apprommate text 
structure by recogmzing sources of dmcourse cohe- 
sion and coherence 
2.1.1 Going Beyond a Word 
Frequency-based summarization systems typically 
use a single word stnng as a umt for counting fre- 
quencies Whde such a method IS very robust, it 
ignores the semantic content of words and their po- 
tential membership m multi-word phrases For ex- 
ample, zt does not dmtmgumh between "bill" m "Bdl 
Table 1 Collocations with "chlps" 
{potato tortdla corn chocolate b~gle} chips 
{computer pentmm Intel macroprocessor memory} chips 
{wood oak plastlc} cchlps 
bsrgmmng clups 
blue clups 
mr chips 
Clmton" and "bill" in "reform bill" This may intro- 
duce noise m frequency counting as the same strmgs 
are treated umformly no matter how the context 
may have dmamblguated the sense or regardless of 
membership in multl-word phrases For DlrnSum, 
we use term frequency based on tf*Idf (Salton and 
McGdl, 1983, Brandow, Mitze, and Rau, 1995) to 
derive ssgnature words as one of the summarization 
features If single words were the sole basra of count- 
mg for our summarization application, nome would 
be introduced both m term frequency and reverse 
document frequency 
However, recent advances in statmtlcal NLP and 
information extraction make it possible to utilize fea- 
tures which go beyond the single word level Our 
approach is to extract multi-word phrases automat- 
lcally with high accuracy and use them as the ba- 
sic unit in the summarization process, including fre- 
quency calculation 
Ftrst, just as word association methods have 
proven effective m lemcal analysis, e g (Church and 
Hanks, 1990), we are exploring whether frequently 
occurring Collocatlonal reformation can improve 
on simple word-based approaches We have pre- 
processed about 800 MB of LA tlmes/Wastnngton 
Post newspaper articles nsmg a POS tagger (Bnll, 
1993) and derived two-word noun collocations using 
mutual information The. result included, for exam- 
ple, varlons "chips" phrases as shown m Table 1 
The word "ch~ps" occurred 1143 times m this cor- 
pus, and the table shows that thin word m semanti- 
cally very amblguons In word associatmns, It can 
refer to food, computer components, abstract con- 
cepts, etc By incorporating these conocatlons, we 
can dlsamblguate dtfferent meamngs of "chips" 
Secondly, as the recent Message Understanding 
Conference (MUC-6) showed (Adv, 1995), the accu- 
racy and robustness of name extraction has reached 
a mature level, equahng the level of human perfor- 
mance m accuracy (lind-90%) and exceeding human 
speed by many thousands of times We employed 
SRA's NameTag TM (Krupka, 1995) to tag the afore- 
mentioned corpus with names of people, entztIes, and 
places, and derived a baseline database for tffIdf 
calculation In the database, we not only treated 
multi-word names (e g, "Ball Clinton") as single to- 
kens but also dmamblguated the semantic types of 
names so that, for instance, the company "Ford" 
67 
ts treated separately from President "Ford" Our 
approach is thus different from (Kupiec, Pedersen, 
and Chen, !995) where only capitalization reforma- 
tion was used to identify and group various types of 
proper names 
2.1.2 Acquiring Knowledge of the Domain 
In knowledge-based summarization approaches, 
the biggest challenge ts to acquire enough domain 
knowledge to create conceptual representations for a 
text Though summarization from conceptual repre- 
sentation has many advantages (as discussed m Sec- 
tion 1), extracting such representations constrains a 
system to domain dependency and is too knowledge- 
intensive for our approach 
Instead, we took an automatic and robust ap- 
proach where we acqmre some domain knowledge 
from a large corpus and incorporate that knowledge 
as summarization features m the system We incor- 
porated corpus knowledge m three ways, that is, by 
using a large corpus baseline to calculate'ldf values 
for selecting signature words, by denying colloca- 
tions statistically from a large corpus, and by creat- 
ing a word association index derived from a large cor- 
pus (Jmg and Croft, 1994) With thin method, the 
system can automatically adapt to each dmtmct do- 
main, hke newspapers vs legal documents without 
manually developing domain knowledge Domain 
knowledge is embraced in szgnature words, which 
indicate key concepts of a given-document, in col- 
Iocat:on phrases, which provide richer key concepts 
than single-word key concepts (e g "appropriations 
bill," "ommbus bill," "brady ball," "reconciliation 
bill," "crime bill," "stopgap bdl,", etc ), and in their 
assoczated words, which are clusters of dommn re- 
lated terms (e g, "Bayer" and "aspirin," "Columbia 
Raver" and "gorge," "Dead Sea" and "scrolls") 
2.i.3 Recognizing sources of Discourse 
Cohesion and Coherence 
Past research (Pmce, 1990) has described the neg- 
ative impact on abstract quality of fathng to per- 
form some type of discourse processing Since din- 
course knowledge (e g, effective anaphora resolution 
and text segmentation) cannot currently be auto- 
matlcally acquired easily wlth high accuracy and ro- 
bustness, heuristic techniques are often employed in 
summarization systems to suppress sentences with 
interdependent cohesive markers 
However, there are several shallower but robust 
methods we can employ now to acquire some dis- 
course knowledge Namely, we exploit the dmcourse 
features of lexlcal items within a text by using name 
aliases, synonyms, and morphological variants 
Within a document, subsequent references to full 
names are often aliases Thus, linking name aliases 
provides some indication as to which sentences are 
interrelated, as shown below 
The Institutional Revolutionary 
Parry, or PRI, capped sis landmark as- 
sembly to reform ,tself w,th a .Nourish of 
pomp and prom,ses Among the mea- 
sures coming out of the assembly's fiercest 
pubhc debate, zn which party members rose 
up agaznst the,r leadership Saturday nlght, 
are new requsrements for future PRI pres- 
,denttal cand, dates, quahficatwns that net- 
ther ~eddlo nor any of Mezzco's prevzous 
four pres,dents would have met 
The NameTag name extractxon tool discussed m 
the previous section performs hnkmg of name aliases 
within a document such as "Albnght" to "Madeleme 
Albnght," "U S" to "Umted States," and "IBM" for 
"International Business Machine" We used tlus tool 
to link full names and. their aliases so that term fre- 
quency can be more accurately reflected, x e, "IBM" 
and "International Business Machine" are counted 
as two occurrences of the same term 
Another overt clue for chscourse cohesion and co- 
herence is synonymous words When a theme of 
an article m developed throughout the text, synony- 
mous words often appear as variants m the text In 
the example below, forinstance, "pictures" and ~m- 
ages" are used interchangeably 
A new medzcal imaging technzque may 
someday be able to detect lung cancer and 
diseases of the bra:n earher than conven- 
twnal methods, according to doctors at the 
State Un:vers,ty of New York, Stony Brook, 
and Princeton Un:verszty If doctors 
want to take pictures of the lungs, he 
noted, they have to use X-ray machines, ez- 
pos:ng thezr pat:ents to doses of radtatzon 
:n the process The new technlque uses 
an anesthetfc, tenon gas, instead of water 
to create images of the body 
Although synonym sets have not proven ef- 
fective in reformation retrieval for query expan- 
sion (Vorhees, 1994), we are using WordNet (Mallet 
et al, 1990) to link synonymous words m an arti- 
cle In the IR task, a query term is expanded with 
Its synonyms without dlsambxguatmg the senses of 
the term Thus, semantically irrelevant query terms 
are added, and the system typically retrieves more 
irrelevant documents, decreasing the precision Our 
summarization approach, in contrast, attempts to 
exploit WordNet synonym sets of only signature 
terms m a szngle document Our hypothesis m that 
if a synonym of a signature term extsts m the article, 
the term has been dlsamblgnated by the context of 
the article and the "correct" synonym, not a syn- 
onym of the term in a different sense, m likely to 
co-occur in the same document 
In addition, morphological analysts allows us to 
link morphological variants of the same word within 
a document Morphological variants are often used 
to refer to the same concept throughout a document, 
68 
! 
! 
i 
I 
providing discourse clues In the above example, 
"lma~ng" and "Images" are morphologically linked 
Like synonyms, morphology or stemming has not 
proven to be useful for "lmprowng information re- 
trieval (Salton and Lesk, 1968, Harman,~1991) 
However, the recent work by (Church, 1995) showed 
that effectiveness of morphology, or correlations 
among morphological variants within a document, 
vanes from word to word A word hke "hostage" 
has a large correlation with its variant Uhostages" 
while a word like "await" does not According to his 
experiments, good keywords like "hostage" and its 
variants are likely to be repeated more than chance 
within a document and highly correlate with variant 
forms Tins implies that important signature words 
we use for summarization are likely to appear In a 
single document multiple times using their variant 
forms 
2.2 Combining Sl~rnrnarlzatlon Features 
The DlmSum summarizer exploits our flexible def- 
inition of a signature word and sources of domain 
and duscourse knowledge m the texts through 
• the creation of multiple basehne databases cor- 
responding to multiple definitions of signature 
words 
• the application of the discourse features in 
multiple-term frequency calculation methods 
Different baseline databases can affect the inverse 
document frequency (ldf). values We have cre- 
ated multiple baseline databases based upon mul- 
tiple deflmtions of the signature words Signa- 
ture words are flexibly defined as collections of fea- 
tures Presently, we derive databases cousustmg of 
(a) terms alone, (b) terms plus multi-word names, 
(c) stemmed terms plus muti-words names, and 
(d) terms plus multi-word names and collocations 
The duscourse features, 1 e, synonyms, morphologi- 
cal variants or name ahases, for s~gnature words, on 
the other hand, can affect the term frequency (tf) 
values Using these discourse features boosts the 
term frequency score within a text .when they are 
. treated as var!ants of signature words Having mul- 
tiple baseline databases available makes it easy to 
test the contribution of each feature or combination 
of features 
2.2.1 The Feature Combiner: Current 
In order to select sentences for a summary, each 
sentence in the document us scored using different 
combinations of signature word features and dis- 
course features Currently, every token m a docu- 
ment us assigned a score based on its tf*ldf value 
The token score us used, in turn, to calculate the 
score of each sentence in the document More specif- 
ically, the score of a sentence is calculated as the av- 
erage of the scores of the tokens contained m that 
sentence with the exception that certain types of 
69 
tokens can be ehmmated from the sentence as dis- 
cussed That m, the DlmSum system can Ignore any 
combination of name types (1 e, person, place, en- 
tity) from a ~ven document for sconng (cf Section 3 
for more details) 
After every sentence is assigned a score, the top n 
tnghest scoring sentences are chosen as a summary 
of the content of the document Currently, the Dun- 
Sum system chooses the number of sentences equal 
to a power k (between zero and one) of the total 
number of sentences Thus, the system can vary the 
length of a summary accordmg to ~ For instance, 
if 0 5 is chosen as the power, and the document con- 
sists of 100 sentences, the output summary would 
contain 10 sentences Thus scheme has an advantage 
over choosing a given percentage of document size 
as it yields more information for longer documents 
while keeping summary size manageable We use 
the results of thus method as the baseline summary 
performance (; e, without any training), and report 
them m Section 3 
2.2.2 The Feature Combiner. Future 
As our goal is to make our summarization system 
trainable to different user and application needs, we 
are currently workmg on learning the best feature 
combination method from a tralmng corpus auto- 
matically For training and evaluating our summa~ 
nzatlon system, we had a user create extract sum- 
maries by selecting relevant sentences m articles In 
order to compare with the results of a trainable sum- 
manzer reported by (Kuplec, Pedersen, and Chen, 
1995), we first use Bayes' rules to learn the best scor- 
ing method Then, we will use an inductive learning 
algorithm such as the decusion tree algorithm (Qum- 
lan, 1993) to learn summarization rules which can 
deal with feature dependencies across sentences 
3 Evaluation 
Much research has been devoted to assessing cor- 
respondence between human and machine abstracts 
because of the complexity of analyzing "ahoutness" 
as illustrated in (Hahn, 1990) As a result, most of 
the prehmmary evaluations of summarizatlon sys- 
terns have been developer-based A common aF- 
proach IS to compare correspondence between auto- 
matlc performance and human performance (Rath, 
Restock, and Savage, 1961, Edmundson, 1969, Ku- 
plec, Pedersen, and Chen, 1995) or summary accep~ 
ability (Brandow, Mltze, and Ran, 1995) Others 
have been task-based, comparing abstract and full 
text on~nals m terms of the browsing and search 
time (Mnke et al, 1994, Sumlta, Ono, and Ml- 
lke, 1993) or recall and precision m-document re- 
trieval (Brandow, Mltze, and Ran, 1995) 
Our evaluation methodology us two-pronged 
First, we evaluate the system by scoring for cor- 
respondence with human generated extracts (Seco 
tlon 3 1) Second, m our future work we are col- 
laborating with the Umverslty of Massachusetts to 
evaluate retrieval effectiveness for system-generated 
and human-generated summaries (Section 3 2) 
3.1 Developer-based Evaluation 
The DlmSum development envtronment software in- 
corporates automatic sconng software to calculate 
system recall and precision for any user's training or 
test data ThLs allows us to evaluate system perfor- 
mance for any user and for variatl0us m summary 
preferences 
We performed an informal experiment in which 6 
users created summary extract versions of the same 
set of 15 texts These versions varied considerably 
among users, winch supports our view that a sum- 
marlzation system should he trained for user pref- 
erence Then, we ran the DlmSum system over 
these 15 texts using multiple feature combinations 
(l e, combmatlous among names, synonyms, and 
morphologtcal variants), and scored against the six 
versions of summary extracts Though correspon- 
dence between the DlmSum summaries and user 
suminarles was low (ranging between 14% and 31% 
F-measures), clearly some feature sets were more ef- 
fective for some users than for others For example, 
the best feature c0mbmatlon for the best-case corre- 
spondence between the user and DlmSum (l e, 31% 
case) was the combination of name, synonym and 
morphological mforinatlon On the other hand, the 
best combination for the worst-case correspondence 
between the user and DlmSum (l e, 14% case) was 
the combination of name and synonym reformation 
Some summary extracts, however, were not affected 
by different combinations of features 
The second step was to obtain a "bottom-hne" 
score for a singl e user We ran the DlmSum system 
over a set of 86 texts using multiple feature combi- 
natlous The features were combined by taking an 
average Of tf*ldf, tf or ldf scores of each token m 
a-sentence No training was performed We vaned 
the length of summaries (by changing/~ from 0 5 to 
1 0), use of different types Of names (l e, person, 
place, and entity), use of abases, and use of syn- 
onyms for different parts of speech (l e, adjective, 
adverb, noun, and verb) 
Table 2 shows the top three F-measure scores (1- 
3), and the score for using the simplest baseline (4) 
For the best summary (1), place names were used 
winle person and entity ~ames were recognized but 
removed for sentence sconng Synonyms were also 
used The /c value was set to 0 65 (about 20-25% 
of a document as a summary) Use of aliases and 
synonyms chdn't make much difference m the scores 
(2-3) However, they all scored shghtly ingher than 
• the summary which &dn't use any of these features, 
i e, a summary which didn't use names, synonyms, 
or aliases (4) 
It Is interesting that using name tagging in a re- 
70 
verse way, 1 e , recogmzmg and then deleting person 
names from • sentence scoring, made a slgmficantly 
positive effect on summarization The best summary 
score with the person name used m sentence scor- 
ing was 38 6% (5) The.reason why person names 
made negative contnbutlous to•the summary seems 
' to be because personal names were often mentioned 
as passing references (e g, names of spokespeople) 
m the corpus, but they had ingh ldf values 
Finally, m every feature combination, taking tf*ldf 
scores of each word outperformed the ldf-based cal- 
culation, and the latter m turn outperformed the 
if-based score calculation 
These results further motivate us to apply auto- 
mated learning to combine summarization features 
The fact that humans vary m summarization sug- 
gests that recall/preclmon evaluation.is not mean- 
mgful unless a summarization system Is trainable 
to a particular summary style Our current work 
is to identify through training what feature com- 
binations produce an optimal summary for a given 
user We anticipate that the summary performance 
will improve with tralmng as DhmSum learns an- 
tomatlcally how or whether these, different mgna-. 
• ture word definitions are contnbutmg to the sum- 
mary The current design does not incorporate 
para~aph/sentence location reformation or genre- 
specific indicator phrases We are explonng if these 
features can be indirectly subsumed by the derived 
features we have already identified 
Also, the cursory look at the summaries of Dim- 
Sum shows that the system-generated summary may 
be prowdmg the same reformation as the summary 
provided by the user, but the sentences were chosen 
differently ThE happens because the same reforma- 
tion can be conveyed by dnTerent sentences within 
the same document This motivates us to conduct a 
more task-oriented summarization evaluation, winch 
Is discussed below 
3.2 Task-based Evaluation 
As a more task-oriented evaluation, we. are col- 
laborating with the Umverslty of Massachusetts to 
evaluate retrieval etfectiveness for DlmSum system- 
generated .and human-generated extracts for topics 
from TREC-5 (Text REtrieval Conference-5) We 
have selected 30 topics, five assessed as difficult, five 
assessed as easy (Harman, 1996), and the remaining 
15 randomly The top 50 documents judged rele- 
vant by the INQUERY system m TRECC-5 for each 
topic have been identified For each document, two 
extract versions are being manually created One 
extract m based on the topic description, wtule the 
second L9 generated independent of the topic descrip- 
tion In addition, the DlmSum system will automat- 
lcally.generate two versions (query dependent and 
generic) for each of the texts With the TREC-5 full 
text results as a baseline, multiple lteratlous of the 
INQUEI~Y system wall test retrieval performance on 
! 
Ii 
1! 
II 
11 
! 
! 
\[nt.AV , .:. .... 
4 stony I~.,, ~. 
2 I~n~ton unwers~t~ 
1 IS~ ¢o~nmunlty coilllge 
I star* umv~'s~ of ~w yak 
1 food and dr~p aclmmsmmon 
-~ Persoo 
- 8 sJbert, m~ll 
7 IIl~'t 
1 albor~, m~tChoII 
1 balamcro dlhp 
-\[1) p~i 
1 now york 
I~wa~t Vm* 
c~0¢0~ l~2S M28 </D0011~ 
¢S'L~.YID cut ,u \]m,,~. sd.,~m--u~ x~ml .c~TC~.YID> 
• cPORMA'T> ~ ~1 ~T> 
cl~.AD~b - a1744 ~-?.~ 0491 </}I~A.OS~, 
cPREAMBL~> 
~-~ - a1744 
clffLDI1~> ~ ~m McMurme ~'LINIb. 
cCPY~RIGHT> 
:c) ~ Nem~ 
#~.W~JOHT> 
¢Iff.ADI.LNE> 
.~1~. DOCtOa ~XZnm M~l t0 VkwL~ Maw (~ls &~L 
:TEXt'> 
q~ 
:~. md dbmr, ses c~ ire b.Mu ea~4~. ,~l ccmm,~,m~ ~etb~ch~ 
qm, 
,w~mmlmsl~ ,m'mal~bmm~s m~ w~vm .,mwat~ ~ms b 
haz:s~8~m'now~r ~ klznm'~,'~*'-m"m:dthelIp &MD 
nm~ted~I u~al m oaeMs~c,z~l~ iLw4 ram4 d ws~ 
~ ~tSe'b~cli~ 9ec~ me su s\]prr.~ds O~'o~lpm~,~ the 
m~ve c~b w~r.h ~ m:b m a type ~ ~ calh~Di~Is ~bas~e 
potmwd w ~h~t~= m ~ Um ~.m md I=,p m~ m=~ thin 
Figure 1 Name Mode Summary 
t,m,d ~ ~ b ~ o~ VIw 
• -~I\] m~word 
+ 8 albert, m~tchell 
4 s~ony brook 
G ~lnon 
.... d" 7r~7~ .......... 
5 irnlgms 
I p~JrlS 
4 lungs 
7 Mchn~w~ 
14 extstlng 
12 rm~gIs 
2 a~iw~tim~ 
2 semis 
2 ~rtm:e~on umvlrs~ 
I hyplrDolaflzsS 
5 Conventional 
1 ~km~'O dshp 
3 
1 nassau commumW ~o)hlge 
3 dlseesIS 
3 mIls 
| h~l(~l 
I mcmUrlrll 
¢DOG~ 
¢Docm~ ~.s raze ,c,~OCI0~ • 
¢ST0\]tYZD cam-a\]p~.sd-qu-,~c, ~'Q0a <~?~YID3. 
cFORMAT> ~ &DL </FO~.MAT> 
cS\[.UO> bc- - </SLUG> 
¢IW, ADD.> - Lt744 07-25 04~J. </HEADI\[~:P 
r~AMBL)~. 
m-* -a17~ 
&tin, (ndy){A1r'/q,t Ncws $~r.d~ms) A.QL 
~/P'REAMDLB> 
caYLINE:, By~ ' " -~</8YL~b 
ccPYRZOlt'r) 
~YJUOFrr> 
:~ADLINIb. 
&lYJt D~'s1,1~.~m i,j tDYmv,.~ ,d=~J~,..&QL 
Aumv--,'x-'~ ~¢.Jq t . . ..i m~,m,~,Fbs~ ~.ct~.m I 
\[ 'fhe~ed=u,, eAsat~mtmo~- e~,.e~l &t~l ~mapeh~ 
~mem~mmn ~ ~du~uuda.~b~7 Butmmmswhm: 
~t s m mthat o.,~m-~hm had al0t ~'uuMo mlmx~ 
@ 
Tbs~.m~v:d,.,.. cu~a,m.,rb'.i ...... '~.~.," p~ ~,,add~ 
ma.~as .~ ~ddm~ Decme~meSm spreztstbn~Mmutth~ 
Mood s~.am sod t~atb~ ~ corer.me'me ~ arias d ~be hod7 su, rJ~, ~ 
• ~ = ~w'mcSz'enchmamm,W~c~.d~o~xthasthe 
~ • wrlmgr~U~S 
Figure 2 Keyword Mode Summary 
- ? 
71 
Table 2 Summary scores for different feature combinations 
Feature Combination tf*ldf ! ldf I tf 
term+place+synonym 41 5 32 3 20 9 (1) 
term+place+entity, - " 41 2 33 9 21 2 (2) 
term.termWplace+ahas+syn°nym 409399 323242104 210 /~l . 
term-l-person+place+entlty+ahas+synonym 386 321 225 (5) 
the human and machine generated extracts to com- 
pare retrieval effectiveness 
4 Multi-dimensional Summary 
Views . 
The DlmSum Summarization Clientprovldes a sum- 
mary of a document in multiple dlmeuslons through 
a graphical user interface (GUI) to smt dflferent 
users' needs In contrast to a static view of a doc- 
ument, the system brings the contributing hngnm- 
tie and other resources to the desktop and the user 
chooses the view he wants As shown m Flgnre 1, 
the GUI is divided mto the Lint Box on the left and 
the Text Viewer on the right 
When a user asks for a summary of a text, ex- 
tracted summary sentences are hlghhghted m the 
Text Viewer The user can dynarmeally control a 
percentage of sentences to highlight for a summary 
In addition, the Client can automatically color-code 
top keywords m different colors for different types 
(1 e, person, entity, place and other) for quack and 
easy browsing 
In the Lint Box, the user can explore two different 
summary views of a text First, the user can choose 
the "Name Mode," and all the names of people, en- 
tities, and places which were recognized by the name 
extraction tool are sorted and displayed m the List 
Box (el Figure 1) The user can also select a subset 
of name types (e g, only person and entity, but not 
place) to d~play Aliases of a name are indented and 
hsted under their full names 
In the "Keyword Mode," the top keywords, or sig- 
nature words, (including names) axe dmplayed in the 
Last Box Analogous to the name aliases, for each 
keyword its synonyms and morphological variants, if 
exast, are indented and hsted below it (cf Figure 2) 
The user can choose the score threshold or percent- 
age to vary the number of keywords for display 
In both modes, the names and signature words 
in the List Box can be sorted alphabetically, by 
frequency, or by the tf*ldf score Choking on a 
term in the Lint Box also causes the first occurrence 
of the term to be hlghhghted in the Text Viewer 
From there, the user can use the FIRST, PREVIOUS, 
NEXT, or LAST button at the bottom of the GUI to 
track the other occurrences of the term, including its 
ahases, synonyms, and morphological variants This 
provides the user with a way to track themes of the 
text lnteractively 
5 Summary 
The DlmSum summarization system leverages off. 
of the works of (Kuplec, Pedersen, and Chen, 
1995) and (Brandow, Mltze, .and Ran, 1995), 
and advances summarmatlon technology by apply- 
nag corpus-based statistical NLP teehmques, robust 
information extraction, and readily avaalable on-hne 
resources Our prehxmnary experiments with com- 
bining different summarization features have been 
reported, and our current effort to learn to com- 
bine these features to produce the best summaries 
has been described The features derived by these 
robust NLP techmques were also utihzed m present- 
mg multiple summary.vtews to the user m a novel 
way 

References 
Advanced Research Projects Agency 1995 Proceed- 
:rigs of S:zth Message Understanding Conference 
(MUC-6) Morgan Kanfmann Pubhshers 
Brandow, Ron, Karl Mltze, and Lisa Ran 1995 
Automatic condensation of electromc pubhcatlous 
by sentence selection Information Processing and 
• Management, 31, forthcoming 
.Bull, Eric 1993 A Comps-based Approach to Lan- 
guage Learning Ph D thesm, Umverslty of Penn- 
sylvania 
Church, Kenneth and Patrick Hanks 1990 Word 
• Aesoclatlon Norrns, Mutual Information, and Lex- 
icography Computational Lmgmstscs, 16(1) 
Church, Kenneth W 1995 One term or two 9 In 
Proceedings of the 17th Annual International SI- 
GIR Conference on Research and Development In 
Informatzon Retrzeral, pages 310-318 
Edmundson, H P 1969 New methods m automatic 
abstracting Journal of the ACM, 16(2) 264-228 
Fum, Dando, Glovanm Gmda, and Carlo Tasso 
1985 Evalutatmg Importance A step towards 
text surnmarlzatlon In I3CAI85, pages 840-844- 
IJCAi, AAAI 
Hahn, Udo 1990 Topic parsing Accounting for 
text macro structures m full-text analysm In for- 
mat:on Processing and Management, 26(1)135- 
170 
Harman, Donna 1991 How effective is suttixang ~ 
Journal of the Amerlcan Sot:cry for Informatwn 
Sc:ence, 42(1) 7-15 
Harman, Donna 1996 Overview of the fifth text 
retrieval conference (tree-5) In TREC-5 Confer- 
ence Proceedings 
Jmg, Y and B Croft 1994 An Assoc:atwn The- 
saurns for Informatzon Retrseval Umass Techm- 
cal Report 94-I7 Center for Intelligent Informa- 
tion Retrieval, University of Massachusetts 
Johnson, F C, C D Prate, W J Black, and A P 
Neal 1993. The apphcahon of hngumhc process- 
nag to automatic abstract generation Journal of 
Docnmentatwn and Tezt Management, 1(3) 215- 
241 
Jones; Karen Sparck 1993 What mtght be in 
a summary? In Knorz, Krause, and Womser- 
Hacker, edttors, Informatwn Retrieval 'g3, pages 
9-26 
Jones, Karen Sparck 1995 Dmcourse modeling for 
automahc surnmarms In E Hajlcova, M Cer- 
venka, O Leskn, and P Sgall, editors, Prague ~ : 
gmsttc Circle Papers, volume 1, pages 201-227 
Krupka, George 1995 SRA Descnphon of the 
SRA System as Used for MUC-6 In Proceed- 
rags of $:z'th Message Understanding Conference 
(MUC-~) 
Kuplec, Juhem, Jan Pedersen, and Francme Chert 
1995 A trmnable document summarizer In Pro- 
cee&ngs of the 18th Annual Internatwnai SIGIR 
Conference on Research and Development :n In- 
formatwn Retrzeval, pages 68-73 
Luhn, H P 1958 The automatic creation of htera- 
ture abstracts In IBM J Research Development, 
volume 2, pages 159-165 
Maybury, Mark T 1995 Automated even sum- 
marxzatlon techmques In B Endres-Nlggemeyer, 
J Hobbs, and Karen Sparck Jones, editors, 
Summarizing Text for Intelhgent Commun:cat,on, 
pages 101-149 
McKeown, Kathleen and Dragomar Radev 1995 
Generating summaries of mulhple news articles 
In Proceedings of the 18th Annual Internat:onal 
SIGIR Conference on Research and Development 
:n In format:on, pages 74-78 
Make, Seljl, Etsuo Itho, Kenj10no, and Kazuo 
Surmta 1994 A full text retrieval system with 
a dynannc abstract generation function In Pro- 
ceed:ngs of 17th Annual Internatwnal ACM S1- 
GIR Conference on Research and Development :n 
Informatwn Retrieval, pages 152-161 
Miller, George, Richard Beekwith, Chnshane Fell- 
" baum, Derek Gross, and Katherine Miller 1990 
Five papers on WordNet Technical Report CSL 
Report 43, Cogmhve Science Laboratory, Prince- 
ton Umverslty . 
Pmce, C 1990 Constructing hterature abstracts by 
computer Techmques and prospects Informa- 
::on Processing and Management , 26(1) 171-186 
Prate, C and P A Jones 1993 The ldenhficahon of 
important concepts m lughly structured techmeal 
papers In Proceedings of the 16th Annual Inter- 
natwnal A CM SIGIR Conference of Research and 
Development m Inforraatzon Retrieval, pages 69- 
78 
Qumlan, J Ross 1993 C4 5 Programs for Ma- 
chine Learmng Morgan Kaufmann Publmhers 
Rath, G J, A Restock, and T R Savage 1961 
The formation of abstracts by the selechon of sen- 
tenees Amer:can Doeumentatwn, 12(2) 139-143 
Rau, Lma F, Paul S Jacobs, and Un Zermk 1989 
Information extraction and text surnmanzatlon 
umng hngmstlc knowledge acqmslhon Informa- 
:son Processing and Management, 25(4) 419-428 
Relmer, Ulrich and Udo Hahn 1988 Text conden- 
sahon as knowledge base ahstractmn In IEEE 
• Conference on AI Apphcatwns, pages 338-344 
Rxlotf, Ellen 1995 A corpus-based approach 
to dommn-speclfic text surnmanzatlon In 
B Endres-Nlggemeyer, J Hobbs, and K Sparek 
Jones, e&tors, Summarizing TeE/or lntelhgent 
Commumcatwn, pages 69-84 
Rush, J E, R Salvador, and A Zamora 1971 
Automatic abstracting and mdemng Produchon 
of m&eahve abstracts by. application of contex- 
tual inference and syntactic criteria Journal of 
the American Sot:cry for Informat:on Sc:ence, 22(4) 
260-274 
Salton, G and M McGdl, editors 1983 An 
Introductzon to Modern Informatwn Retrieval 
McGraw-Hall 
Salton, Gerald and Mark Leak 1968 Computer 
evaluation of indexing and text processing Jour- 
nal of the ACM, 15(1) 8-36 
Sumxta, Kazuo, Kenjl Ono, and Seljt Make 1993 
Document structure extraction for mterachve 
document retrieval systems In Procee&ngs Of 
SIGDOC'93, pages 1301-310 
Vorhees, E 1994 Query expansion using lexacal- 
semamc relations In Proceedings of the 17th 
Annual Internatwn.al ACM-SIGIR Conference on 
Research and Development of Informatwn Re- 
trzeval, pages 61-69 
