Multi-Topic Multi-Document Summarization 
UTIYAMA Masao 
Communications Research Laboratory 
588-2, Iwaoka, Nishi-ku, Kobe, 
Hyogo 651-2492, Japan 
mutiyama@erl.go.jp 
HASIDA K6iti 
Electrotechnical Laboratory 
1-1-4, Umezono, Tukuba, 
Ibaraki 305-8568, Japan 
hasida@etl.go.jp 
Abstract 
Summarization of multiple documents featur- 
ing multiple topics is discussed. The exam- 
ple trea.ted here consists of fifty articles about 
the Peru hostage incident tbr \])ecember 1996 
through April 1997. They include a. lot of top- 
ics such as opening, negotiation, ending, and 
so on. The method proposed in this paper is 
based on spreading activation over documents 
syntactically and semantically annotated with 
GI)A (Global l)ocument Annotation) tags. The 
method extracts important documents aald im- 
portant parts therein, and creates a network 
consisting of important entities and relations 
among them. It also identifies cross-document 
coreferences to replace expressions with more 
concrete ones. The method is essentially multi~ 
lingua\] due to the -independence of the 
GDA tagset. This tagset can provide a stan- 
dard fornm.t tbr the study on the transfbrmation 
and/or generation stage of summarization pro- 
cess, among other natural  processing 
tasks. 
1 Introduction 
A large ('.vent consists of a, number of smaller 
events. These component events are usually 
related trot such relations may not be strong 
enough to define larger topics. For example, a 
war may consist of opening, battles, negotia- 
tions, and so on. These relatively independent 
events are considered to be topics by themselves 
and would accordingly be reported in multiple 
news re'titles. 
Summarization of such a large event, or mul- 
tiple documents about multiple topics, is the 
concern of this paper. Summarization of multi- 
ple documents containing nmltiple topics is an 
unexplored research issue. Some previous stud- 
ies on summarization (McKeown and Radev, 
1995; Barzilay et al., 1999; Mani and Bloedorn, 
1999) deal with multiple docmnents about a sin- 
gle topic, but not about multiple topics 1. 
In order to smnmarize lnultiple docmne, nts 
with multiple topics, one needs a general, 
semantics-oriented method for evaluating im- 
portance. Summarization of a single document 
may largely exploit the doculnent structure. As 
an extreme example, the first paragraph of a 
newspaper article often serves as a smmnary of 
the entire article. On the other hand, summa.- 
rization of multiple, documents in general must 
be more based on their semantic structures, be- 
cause the, re is no overall consistent document 
structure across them. 
Selection of multiple important topics (not 
keywords) tbr nmltiple-topic summarization has 
not; yet been really addressed in the previ- 
ous literatm:e. The present paper proposes a 
method, based on spreading a.ctivation, for ex- 
tracting important topics and important docu- 
ments. Another method proposed which is use- 
fifl for grasping the overview of nlultiple docu- 
ments is visualization of important entities men- 
tioned and relationships among them. Visu- 
alization of relationships among keywords has 
been studied in the context of information re- 
trieval (Niwa et al., 1997; Sanderson and Croft, 
\] 999), but to the authors' knowledge the present 
study is the first to address such visualization in 
the context of summm'ization. Of conrse a. con- 
cise summary of the entire set of multiple docu- 
lnents can be obtained by recovering sentences 
from important entities and their relationships 
~s demonstrated in section 3.3. 
The present study assumes documents anno- 
tated with GDA (Global Document Annota- 
1Maybury (1999) discusses smnmarlzation of multiple 
topics, but in his study the smmnaries are made ffonl an 
event database lint not fl'om documents. 
892 
tion) Lags (Itasida, 1997; Nagao and llasida, 
1!)98). Since the GI)A tagset is designed to be 
inclel)endent of any particular natural , 
the proposed method is essent, ially multilingual. 
Another merit of using annotate, d documents is 
that we ca.n separate the a,nalysis phase from 
the whole process of summarization so that we 
ca,n locus on the latter, generation t)hase of sum- 
ma.rization process. Annotated documents can 
also be useflfl for a common input format for 
the study of summarization, among other nat- 
ural  processing tasks. 
2 The GDA Tagset 
GI)A is a project to make on-line documents 
ntachinc-ullderstanda.ble on the basis of a lin- 
guistic ta.gset, while developing and si)read- 
ing technologies of content-based presentation, 
retrieval, question-answering, smnma.rization, 
translation, among othe, rs, with much higher 
quality than before. GI)A thus proposes an 
integrated global plattbrm for e,h',ctronic con- 
tent authoring, t)resental;ion, a,nd reuse. The 
GI)A tagset 2 is an XM1, (eXtensible Markup 
l,anguage) insta,nce which allows ma.chines to 
automatically infex the semantic and pra.gma.tic 
structures uncle, flying the raw (locuments. 
Under the current sta.te of the art, GI)A- 
tagging is senfiautomatic and calls for manual 
correction by human mmotators; othe, rwise an- 
notation would ma,ke no sense. "l~h( ,, cost in- 
volved here pays, because annota,ted documents 
are generic information contents from which to 
rend(',r diverse types of 1)resenta.tions, poi;en- 
tially involving summariza.tion, narra,tion, visu- 
alization, translation, information retriewfl, in- 
formation extra.ction, and so forth. The present 
p~,per concerns summarization only, trot the 
merit of GI)A-tagging is not a,t all restricted to 
smmnarization, and that is why it is considered 
reasonable to assume Gl)A-tagged input here. 
2.1. Syntactic structure 
An example of a. Ol)A-tagged sentence is shown 
in Figure 1. <su> means sentential unit. <np>, 
<v>, and <adp> stand for noun t)hrase, verb, 
and adnominal or adverbial phrase. 
<su> and the tags whose name end with 'p' 
(such as <adp> and <vp>) a,re called phrnsal 
ta.qs. In a sentence, an (;lement (a text Sl)an 
2http ://www. etl. go. j p/etl/nl/gDk/tagset, html 
<su> 
<np>Time</np> 
<v>flies</v> 
<adp> 
like 
<np>an arrow</np> 
</adp> 
</su> 
Figure 1: A Gl)A-tagged sentence. 
from a begin tag to the corresponding end tag) 
is usually a syntactic constituent. The elements 
enclosed in phrasal tags are called ph,~asal ele- 
ments, which cannot be the head of larger ele- 
ments. So in Figure 1. 'flies' is specified to be 
the hea.d of the <su> element and qike' the head 
of the <adp> element. 
2.2 Coreferences and Anaphora 
Each element ma.y have an identifier as the va.lue 
for l;he id attrit)ute,. Corefe, rences, including 
identity ana.t)hora , are annotated by the eq at- 
tribute, as follows: 
<np id="j0">John</np> beats 
<adp eq="j0">his</adp> dog. 
When the shared sc, nm.ntic content is not the 
rctb, renl; lint the typ(', (kind, se, t, etc.) of the 
retb, rents, the eq.ab attribute is used like the 
following: 
You bought a <np id="cl">car</np>. 
3 bought <np eq. ab="cl">one</np>, 
too. 
A zero anaphora is encoded as follows: 
Tom visited <np id="ml">Mary</np>. 
He had <v iob="ml">brought</v> a 
present. 
iob="ml" means that the indirect object of 
brough, t is elemenl~ whose id value is ml, that 
is, Mary. 
Other relations, such as sub and sup, can also 
be encoded, sub represents subset, t)art, or ele- 
ment. An example follows: 
She has <np id="bl">many 
books</np>. 
<namep sub =''b i "> c c AI i ce ~ s 
893 
Adventures in Wonderland' '</namep> 
is her favorite. 
sup is the inverse of sub, i.e., ineluder of any 
sort, which is superset a.s to subset, whole as to 
part, or set as to element. 
Syntactic structures and corefc, rences are es- 
sential for the summarization method described 
in section 3. l?urther details such as semantics, 
coordination, scoping, illocutionary act, and so 
on, are omitted here. 
3 Multi-Document Summarization 
3.1 Spreading activation 
A set of GDA-tagged documents is regarded as 
a network in which nodes roughly correspond to 
GI)A elements and links represent the syntac- 
tic and semantic relations among them. This 
network is the tree of GI)A elements plus cross- 
reference (via eq, eq.ab, sub, sup, and so on) 
links among them. Cross-reference \]inks nlay 
encompass different documents. Figure 2 shows 
a schematic, graphical representation of the net- 
work. 
document 
suIMivision or 
paragrat:h " " " 
sentence " " " 
-- s~,t~tact ic link 
suhser~te~ltial - • • 
el£hel llts -- --- cross te eronce lJllk 
Figure 2: Multi-document network. 
Spreading activation is carried out in this 
network to assess the importance of the ele- 
ments. Spreading activation has been applied 
to summarization of single GDA-tagged docu- 
ments (Hasida et al., 1987; Naga.o and Hasida, 
1998). The main conjecture of the present study 
is that the merit of spreading activation in that 
it evaluates importances of semantic entities 
is greater in summarization of multiple docu- 
ments with multiple topics, because smnmariza- 
tion techniques using docnment structures do 
not; apply here, as mentioned em:lier. 
To fit the semantic interpretation, activations 
spread under the condition that coreferent ele- 
ments should have the same activation vahm. 
The algorithm a is shown in Figure 3. Iiere the 
external input c(i) to node i represents a pri- 
ori importance of i, which is set on an empir- 
ical basis; for instance, an entity 4 referred to 
in the title of an article tend to be hnportant, 
and thus c(i) should be relatively large for the 
corresponding node i. Tile weight w(i, j) of an- 
other kind of link Dora node i to node j may 
also be set empMcally, but it is fixed to a uni- 
tbrm value in tile present work. Let E(i) be tile 
equivalence class of node i, that is the set of 
nodes which are coreferent with i (linked with i 
via eq relationships). Condition 
E E\] ,,,(k,0)_< 1 
~eU(i) .iCE(i) 
should be satisfied in order for the spreading ac- 
tivation to converge. This condition is satisfied 
if we treat each equivalence class of nodes as a 
virtual node while setting the weights of other 
types of links to be 1/5, where D is the maximum 
degree of equivalence classes: 
O=max ~ ~ @'i i 
~eE(,i) .iCE(.i) 
where 5&~ is \] if there is a link between node k 
and node j, otherwise it is 0. 
The score score(i) of node i is calculated by 
summing the activation wdues of all the nodes 
m~dcr node i in the syntactic tree strueture: 
.~.'o,'~'.(i) =a(i)+ ~ .~co,'4j ) (l) 
jCch(i) 
where a(i) is the activation vahle of node i and 
oh(i) is the set of child nodes of node i. oh(i) is 
empty if node i is a leaf node, or a word. This 
score is regarded as the importance of node i. 
3.2 Extraction of important documents 
and sentences 
Extraction of ilnportant documents is simple 
once the scores of the nodes in the network are 
obtained. Sorting the document nodes accord- 
ing to their scores and extracting higher-rm~ked 
ones is sufticient for the purpose. 
aAnother spreading activation algorithm is discussed 
by Mani aim Bloedorn (1999). The comparison is a fu- 
ture work. 
awe use tim terms 'entity'~ 'node', aim 'clenmnt' in- 
terclmngeably. 
894 
Variables: 
N: munlier of nodes. 
D: nmxinnnn oul;-degree of equivalence classes, 
c(i): external input I;o node i. 
w(i, j): weight of the link from node i to node j: 
0 if not eonncel;ed, 
1 if connected via eq, 
1/D ot;herwise. 
a(i): actival;ion value of node i. The initial value 
is O. a(i) is the sum of all a.(j,i). 
a(±, j): acl;ivaLion value of l, he link from node i 
1;o node j. The inil;ia\] value is 0. 
Algoril;hm: 
repeat { 
for(i=O; i<N; i++){ 
av = c(i) ; 
for(j=O; j<N; j++){ 
a(j,i) = w(j,i)*(a(j) - a(i,j)) 
av += a(j,i) 
} 
a(i) = av; 
} 
} until convergence. 
Figure 3: Spreading activation a.lgorithm. 
Simila.r procedure is used to extract impel 
Laid; sentences from mi importa.nl; docmnent. 
Extra.cLod sentences aJ'e pruned according to 
their syntactic structures. Ana.phoric expres- 
sions such a.s h.e, or she are substituted \])y their 
a.ld;ecedents if neeessa.ry. 
An experiment ha,s been conducted to test the 
effeetiwmess of the proposed aJgorithm. The ex- 
a.mple set contains fifty Japanese articles about 
the Peru hostage incident which continued over 
four months fl'om I)ecember 1996 to April \] 997. 
They include a lot of topics such a.s opening, 
negotiation, settlement, mid so on. The GI)A- 
tagging of these articles has involved automatic 
morphological ana.lysis by JUMAN (Kurohashi 
mid Na.ga.o, 1998), automatic syntactic ana.lysis 
by KNP (Kurohashi, 1998), and ma.nual anno- 
tation encompassing morphology, syntax, coref- 
erence, and anaphora.. The types of a.naphora, 
identified here are ma.inly pla.in coreference and 
zero ana.phora. Cross-document coreferenees 
among entities ha.ve been a.utomatieaJly identi- 
fled by exa,et string mt~tching. 5 They oonta,ined 
erl'OlTS but those el'l;ors %7e17o llOt corl'eCl;(',d for 
the experiment. Cross-document coreferenees 
found were 'l'eru'(49), 'Japa.n'(39), 'Peru Pres- 
ident' (15), 'members of Tupac Amaru'(9), ... 
and so on, where the mmfi~ors indicate the num- 
bers of documents which contain these expres- 
sions. 
The externa.1 inputs to nodes have been de- 
fined a.ccording to the corresponding nodes: 
c(i) = 10 if node i's antecedent domina.tes sen- 
fences (e.g., a. node eoreferring with a. pa.ra.- 
graph). This sets a, preference for nodes which 
summa.rize preceding sentences, c(i) = 5 if node 
i is in the title of an article, beemlse a. title is 
usually importa.nt. Otherwise c(i) = 1. These 
crude t)a,raJneter va.hles have been set by the au- 
thors on the basis of the investigation of sum- 
ma.riza.tions of va.rious documents. 
Two importa,nt topics, the opening (first a.l;- 
tack by %lpac Ama.ru) a.nd the settlement (a.t- 
tack by the Peruvia.u government comma.n- 
(los), have been extracted fl'om the four highest 
rmiked articles, even though temporal informa.- 
tion has not been incort)or~tted in the aJgorithm. 
Tile opening aa'ticle, is the first a.rticle of the 
sample document set. However, the settlement 
a ri, ich; is the sixth \]a.st one. So mere extra.ction 
of the last m'tic, le would miss the settlement. 
The 25% sunuim.ries of the two a.rticles made 
by extra.cting a.nd priming sentences are shown 
below together with their English trmisla,tions: 
H$, 4)b-09~{,~D~AS#tlc a 61~ 
Armed guerrillas broke inl;o a party al; 
Japanese ambassador's re, sidenco. Gunshots. 
200 hold in hostage. Peru. 
Many people from Japmlese and Peruvian 
sides were held in hostage. The arnled group 
consisl;s of about twenl;y people, several of 
which broke into the anlbassador's resideime. 
IL is reported thai; there are interinitl;enl; 
shool;ings now. 
a.nd 
'~\¥e are planning to incorporate recmlt results (Bagga 
and Baldwin, 1998) to iden|;i\[y cross-docunmnt corefei'- 
elites. 
895 
H $:L~.~!.~II,~i}R!:\] r-{q:c0 f\]~:~-'O- :d>-::: o U K ,, 7 
{~!l!j~m l:q:igff~J:,l: ~ +i7~-:-, '~. ~ ~ m m P_ ~ l~lS/~f,j d\]-.<, 
.\]al)an(~sc, ~ui\]l)assa, dol"~; I;(~,si(t(HI(',C \[)o:3,:\](~,~;si()ll 
incidonl, in lknm. All lio:d,ag(;,~ r(~lea.<;e(I. Aim 
al; r(!cov01'ill~ lii,<~ pow0,r I)a:ds I>r(,,sidcid; l,'li- 
.i \]IlIOFio 
lq'cMdenl, Fujinlori demon.%ral;cd hhnscl\[ as 
II HI;l'Ollg polil;ician t)y r(;solvinl,; i}lt; ,lapau(~(! 
nlill);/s,<~adol'~s i'c.<dd(:ilCC, t)os,~es:don in(:id(;nL 
lie cill;(;ie(l l;h(; resi(h:n(',e <di;e. 'l'hi,~ vi,<;il; Ix) 
I;h(; re,<-;i(Icnc(~ ill\]t)l'(::;sc(\] l;\]l;ll; \]:(; wa:.; \]eadhq{ 
l;he ol)(;ral;ion hin~s(!ll'. Why did he choo,<~c i,o 
I'CHI_)I'\[ l;(J }il;lllS? \'V(~ C;lll S~t\[)' Lhal, \]io n, hnc, d 
;ql; I'O, OOVel"llll' ~ his polil,ical I(:a(l(:r:di\]il hy ro- 
,~;()\]Villg~ |,hlOllp;h lili\]ilAily \])O\V(W l;h(; l'(?:;id(',llC(~ 
inci(\]clil;, which is al, i, he rool; of l;h(' po\]il;icn\] 
crisi.<~. 
3.3 \[E:u.t;i ty-r(:la/;ion graph 
'l'h(; ,q(:ore, s(:oY<.;(i,j) of a r(;la,i;ioii I)e,l;we(:u i, wo 
cnl;ii;ies i a,lid j, i:4 de, tined by: 
j) 
:: I,":('i)l..(',.)i IS':(;)l<-.,(.7) 
\ . 
t / J >,.~ ,S'( S';( i) )f~,S'( l,;(j) ) 
wt,(3re ,';(l';(i)) is l;I,(, <;el; of s(,,,lx.,,c,; ,,odc,<~ whici, 
(lomiiia,lx; olie, o1: l;h(; l,odes hi lO(7) mid I S',:(":)I 
,..,,1,o,. or :,, z,:(.s.), z,:(.,:): ,.,(?:): 
and score(s) ha.v,p b('on d,c:iined in ,S(;ci;hm 3. I. 
I ,:,,ii (,r <zd >." idJ':, wiiicll is 
~1, 111(;;/,SI11"(; Ot: l, OFill iint)orl, a,nce wid(;ly ll,qe(\] \]11 
in forma.l;ion r(;tl:ieva,l. 
If s(:o'#'e(i;j) i,~; sntti(-i(;ntly la,l:,~V; , IJl(m 
.<;'(so(i)) n ,s'(s,;(:/)) ,:,,,,mi.i,,e 
1)olJi l;h(; elli;il;i(;s) caal consisil;uC(; a. cross- 
t./O(:lliilOlli: gUlrllila,l'y C.Ol\](:(;rilill~ i aai(l j6 
An cillJ/;y-rdatioll gra,ph (E-I{ gra,1)h) is mad(; 
of Llie r(;l~tl;ions highly ra,nked in t(;rnis of l;h(; 
score defined hi (2). Figur(; ,\] sliow,~ the E-I{ 
gra,1)h ilia,d(; o\[ l;he, top (;\](w011 lela,t, iOllS oxl;ra,cLe(l 
frolii l;h(; a.rlJiclo,~; a,bout Peru lio,sl;a,l~(; \]ncid(;lll;. 
'\]'h(; llllrlll)(;l'S llO,q,l" 1;h0 lille,'-; l:(;\])l'O,,qOlll, |;lie, l'a,ll\](S 
of the r(',la.i;ions. 
6COl:eferoliC(; chains are used 1;o slinllilarizo single doc-- 
liilleiil;s 1)3; Azzalli el; a\]. (\] 99.()). 
\])C£'U ho:;t:<-tLre \]n<J c-{(!Ilt: 
tt \] )(1 i:'11V:: {lll 
P£-~Z U {ll\[/b\[l~;*<scido\]" # k', 
()1):3 (xr. V(XIT~ 
t;'u:i ±:::or\] 'J'tll)F'tf: Alhct£'ll 
?.I) I of l~(;ru no~-;l,a.g(} iii(:i(l(,,ili,. 
Tli{; l;ot)-la, iiked l:cla,i;ioH wa,,~; i;h(; one t)(;. 
I;w(,,e.ll /)(:?'7t a,n(\] +\](I/f)(I,?I,(',,';(D (hTll,\[J(l,,£,q(l,(\[,Ol",q 7'(.~,q7 
U(;u(:('. 'l'}wee f;(',III;I:HICO, S (;KIXa,(:t;(X\] fl'()lll i;lie ei~,;hl; 
,{;01ii;(;lic(',~; which co~il;a,hl(;d I)ofiti of t;he (;nl;il;:i(;,<~ 
were as follow:< Y 'l'}le, y wet(; iis/;ed ill chl°()i/o 
\]o<9~ica.\] ()v(leY which wa.s i(le\]~l;iN(;d I)y i;il(; (la.i;c 
i l~l'oi:lna, IJoi~ iti die a,li,icl(;,~. 
1. ,-.t)> -;so., ~ ¢),~@tl._ x: t'._, <'-:. i:Tfill 9 "..dTl.:<h~, ! l.'b:A: 
I::.~j~!{",,!:;~ a~. 1 i .'i~. ,<'.;I.,: :~;)l,i,i!,.il~f~:,~DV\!('.{::: 
,-'-t g a~, 2: ,, 
A,::coMin~,, i;o rv, porl.~ from Ik~ru, on l;h(; :17t;h Lh(! 
,J;+lp~lll(;~q(} HIIII)}'I~g~I{\]OF~; I (}+qi(l(~l\](;I.~ ill I ,\]liHi! {;1l(~ C~II)i 
lal, w~L.'; al.l;acked I)y o~ nmn,od r,~J'otlp~<;, allo/,,e,.lly 
hd'l,i::;I; {~ll{~l-ril\]alf;, ,~iil(\] i117i113 ~ p(X)l)\](; \[FOlll })ol;ll 
,Japali():;(,, ~lll(\] l~(H'll\'iail ~ddc,~ WClC held ill ho:d;H~'>c. 
2. ,'{)b, -7) t I ~£k,f~ltll'd~2Vl.:l, 7< :7t7~-4:" U :.7 i,: <t: ~s 
~f'Pli:.f'l= e.,~a:j~i~i: is H. ,'~;l,--}~;~:;i::#li ,A'.,'~>)'7:'<7 
~(d~JT~;~iT~'ig-<,< ~. P-i:A:. ~i4Jfi"~'o)JJ~ii4gI~s,~l,i.h)l< 
(3Oll('.(!l'lliill~" |;hi; ilosl,;il,;l; incidm~l; ill; I;lm 
Japaneso aii3bassadol°.~; i-e<~;ltl.c;nc{; c, atls(~.d \]~y 
armed F;lmrloilla, mii, he \]8/;h Lhe f';OV(Wliiil(!lll 
reqil0slJcd t;h0 I)crllvian ?jOV(}I'III|I(}\]Ii; l;O IIH~;III'(~ l:\]m 
,,mfel;.y of l;he hoed,agcs~ ~llld ,%(~,11\[ Mr. tI()IUUTI 
Taka,hil~o, (;oordinat~or, I)ivi:don cd Middle alld 
fToill;h Am(wica..-- 
LI~D, 5 t-". ~. l,_.;"~ ~ \]?7,?5 9. 
llre.<ddenl; l,'njiniori'.<; p,.)liliical aul:horil;y will ICCOV(W 
1)ccause lle mmceedod in IJic operal,ion to I)reak inlo 
l;\]le ,J:4pallO,~;o, atllba>,;Hntlor~.~; residence in Peru (ill 
the 22rid. 
Thcs(' ,scnt(;uce,% (;xlxa,clx;d \['rom dillS;rent a.i:- 
t;icles, ha,',;e be(;u I)a.ra,plwa.sed (m i;ho basis o\[ 
7'F\]IOSO ~Olll;(}llC,(;,'-; \vo, ro soloc\[,o(\] IIHHlll,~llly t;o dOll\]Oll- 
si;ral;e I;he possibility of cross docunmld; sulmnarizat;ion 
Imsed on o:)relcrence. 
896 
(:(,l~(;\[urellu(;,';. ~iin(:(: (,ll(,, il;ull(,, ()i i;h(.' Kuerilla 
,,;foul) i,~; i~()l; i(l(nil,ifie(I ill i,h(', I)(,,~,;illiiillj,; ()I: Hi(' 
ili(:i(h;ni,: i;h(; (;xt)r(;,~;,~;h)l) ~/'~t)}~QT"-" U "~ J'i/? 1");{ b,~O 
l(;l'i;i',;i; Ku(;rrilhu;) i~', u:;(',(I i)~ i,ll(,, \[in;i; :;(,.illX,ll(:(; 
(;h(;r('.. 'l'lii:; ('.:~l)r(;~;:;i()n hal; 1)('('.~ i('.l)ia(:('.(l \vi(,ll 
:Z :1~;~2 " U .; (i" 'J/\' ~/° 'Y':,'ql~): (1( ,i < ;(, ~,;uu,ritlu;: 
('l',,~,:,,(: A,,.:,. ,)) I):y ,t,,;i,u,; <:r,>:;;; d(,,:.,,,(,,,,i ,:(),..r 
()F(;IICO;;. ~\]'\]1(', (;(luiva,hum(; ()f l,ll(; lit:;(, ',;(;lli,Q, ll(:(', 
a,.d i;h(; tit,% houri !>lu-a,,'~(; uf 1;1~(' ~;(,,(:<;~l(I ,u('.~)(;(;~(:(;. 
'.¢.)b- u) I \[ >'i~ .'t~'7~6\i~i'( ''j'~zi: 1,/,:.i,;t{;;J,(i-f; + U ~., t,. • i' J %1. +, k¢-~l 7lt I 
"-" 7,:) ;U, \]++ ,:}: ~,.)_A.!..(: ,'I~I - ((,Iw \]i(),%a,v,(: hi(:i(huil, (:;t,~!,~;(:(! l).y 
;I./Ju(;(i ?;u(;rr\]lhu; u,i, I;1~(, ,\]H,\[;,il,ll(;,<q( ~, ;tud;iu'>,,-;mh)r:,<; 
r,:~:li(h':li(:(, iu I'('ru): ',.\, l , r( ,. 1,r(il)(~l'l.v ~1('(,(~(:1,(~(I ;I.il<l 
',',, /,<; V('l)lU.u('(i I)y ;l, ilOi;\[i(;i" (~X})i'(%>'-;i()li I)(~.<;I.tlH( ', i,1~('. 
(;(\[tiiv;t,/(;ii(:(~ ()1' (;V(;lli,:; ;i(q-()'H,<; 1)():;',',il)15 , difl'~u:eiil. 
(\]()('.lilil(;iit,'; (i\t(:l/.('.c)w~) (fl, .i.\].~ l{){)!J: l gu.i.,.il;i 5 ('.i, 
M., I,()!)9) ti;l,<; I)('(;ll ;i.l,<;() (h't,(~<:l,(~(} l).y (:(>itlli);ti+ilt~'; 
I)t(~di(:ub.'. ;I,l~,~lllli(;ll(, i;l,)il(;l, lii(;,<; ()\]' I'(;I(;V;I,ili, :;(:u 
ix;n(:(;:;. })~l.,l,(' (;~':l)r(',,~;,<;Joti:; :;ll(:\[i ~i,<; ~17' It' ((,li(; 
I 71,\]1) \]ilw(' t)(!(;il ;I,Ill';lil(',lil;(;(! iil,:(' 'i !)!)(i ffl t7 ) j 
I'( I \[' (\])('.(:. 1'(, J{){)(i). 'I'll(; l;(',,'-;lii\[,\]li~'; t);I,";,<;;I.t>~C,% 
.~/,1(; l)('.h)w (uli(h;i\]ill(,,:; ili(lit';l,l,JiW; i);ir;i.i)liru.<;(~,<0~ 
l,()p;(;l,h(;r wil,\]i I, he\]r I+;ul,;li,<',li i,i'a,ii,~;hl, l,i()im (I)(>ht 
l'a(:(; \]ii(Ih'.;t.1,\]li~,j i)u.r;~,l)}ll'm;('..;): 
I. ~ )b - - D' C, (j))714 }}t t..L ~5 ('t. i'i 7fit :J ," ~li l. 
;b d,> I\[ \]~ :)< {'1! $} J~l/ D I 1!)gu 41-12 )i l',ll, 
)il, 421/ (/)l,i,iiql~ffi~-~~f2.D~,'('{I, >" ~',>)~/~ ., 
h(:c<)r(iiu K 1,() m'~w:; I\]:om \])(!lu, on I)(~,c(~nlt)(n. :1 '(, 
;I 99(i tim ,}ul)ml(!.~(! unll>u~,uador':~ r(!sid(!H(x! iu I,inm, 
l;he (:ul)il,al, wa~ al,l,m:ked 1)y lef'l;i:;(; Ku(n'ri\]lu:s 
('.l'ulmC Amarn), and mmly l)u()t)l(! lr()lH 1)ol,h 
,hG)mmse mM Ih,ruvian M(le,'~ wet(,, lm\](1 iN h():;tage. 
(JOll(;(!rllillg l;ho. hostage iu('ident, tlm l.;m:(!rn- 
ili(;nL r(?qll(?;.;1,cd (;\]1(~ l"ci;ilViall ~t'~()VCI'IIIII(HI|~ 1,() ~lY;NIII'C 
l;he hosl;ages ~ ,~;al'cty (m \])(;c(;mhor (;he, \] St;h, an(t 
sent Mr. \]IOIHU'I'I Tal<ahiko, co()r(linal,or, Divi- 
sion of Middle, and Sou(;h America, Ministry o1 ln- 
t(;rnal;i(mal Aft'air.% I;o \]hn:u on (,hat, nigh(,. 
/)r(;side, nl, I?u.iil:nori's l)olil;i(:al authori(,y will l'()cov(}l" 
bocause he succe,(xlod in IJ~(; Ol)(',ral ion (;o IJr(~ak into 
L\]I(~ ,}ltI)allCS(: ~UIII);ISS~I(IOI':bi l'o,si(l(~n(;(~ in l)(n'u (ill 
April 22~ 1997. 
d .\] F, vuhutt;io.u 
!'Z*aluu,i,i()l~ ()\[ lIlll\]\[~i-.(\]()(;lllll(;lli, :;(Immm'iz;~.l,\]():~) 
(:~ll \]>; ti)~" i';~r~" J \]; ~'('; ~'ll(~}" ( :( )~; l; 1;hm~ I,li;((, oF ',;iu~',lu 
(h)(:tl,i)('}H, :;mtlt)tu.vizu.(,i()\]i. 'l~(;:;t:l)(,(t',; J'()r evu.\]im.- 
(,i{))l ()\[' illll\]\[i (\[()CilHI(;II\[; N/IlIlllI;)~i'iZD,\[,i(.)I/ }I;I,V(; I1()\[; 
I)('.ml dev(;hg)('xt 3'(%. S() (;l~(; t)r(;,u(',ul, (wuhm, l,i()u i:; 
\[i~il,(,(t i,(" ih(~ :;m)G)le ,uu(; ()1' ,~l:l'(;i(:\](;S lu('J!l,\]()il(;(:l 
;d )()v(', /)ul, (,h(; ()\])(,aine(l I (':~ul(,;; sn?;K(',% p;cu(;ra\] 
,g)l)li(:al)ilil,y (ff t, ll(, i)\]()/)(),~(~(l m(;l~hod mid :;ul) 
l)()cl;:; t,h(; c()~Li('(:i,ttr(, i;im,I, ,~;l):r('adi:uK :ml, ival,h)l! 
i:; (;\[l'e(:l;iv(' i'()r )uuli;i-d()(:ulH(;ul, mull;i/;()l)i(: :;mll- 
,i ~m'i zu,l,i( )u. 
/\:; di:;cu;~,",e(l ill I,Iw, i)re\,-i(m,~; ,~;eul, i(m, l,he IW() 
I'():;e(i iI~('l;ll()(I (:'ul (;xl,vm:(, i)ui)()rl,;uli, a,v(,i(;}(~;, 
I,Ilui h>. llle ()l)(UdiG! i al!(I :;el,L\]('!li('~ll, ar(,i('.l(~;, 
t'r()i,i lit(,y url,i(:lu,~; ;I,\])()/11; \]'(',I'll h(),%a,Ku i}miduul,. 
..'\\]:;(). :~.~ I';-I1. ~,?ul)h (:()ll,~;i:;I,ilt~,; ()\[ itu/)()rl,a.lli, r(',lu 
I,i()~ u; mu( )\] i?~ iu )1 )o\]t,a.\]l 1; (',\]J l,i i,i(,,% \] '(:?"u, ,J(rp(um:>;(: 
(VIl/,l)(t,$;,(;(I,(/O'l','; : 'I'CS;?I(/(YIIC(:~ '\["II,'\])(LC A'IH,(L'I"II~ ;/,li(\[ ,%0 
(hi. \]m:; \])('X'n :;U(:(:(',~;:d~U\]\].y (:O1L;d;I;ll(:l,e(t ()\]1 \[;}li,~; 
l)u:d:;. 'f'h(', u,1)(w('.-~w, ul;i(u~(;d :~('.(,1~()(1 al:;() u,~;u,;; 
(:r(),';:; (\]()(:u~(;n(, (:()r(fi'(',r(,,ll(:(,,u For r(,l)\]m',ilG,; ('X- 
I)I'(};;V;i()II,H Wi\[,ll II\]()I'(~ C()II(I\]'(',IiC ()ll(;~;. 
/\}1 (I1(;;;(' u)(; ;,,)'<:lliv(;d (~:~.~('n/;\]u.lly l)y u,~i}w; ili- 
\]'()rlll;ll,h)ll ill (,h(', (II)A I,a,J';g;ilIJ'£ ()lily, I)lli, lit)l, 
(h;(l iu (,('ull)\]u,(,(;,~ li)r int'()Him, t,i()li (;xi:ra(:l,h)u. 
'1'11(,, 1))'()1)(),,;(,(l u}(;i,h(>(l i,~; h(;n(:e (,Xl)('x:l;('xt (,(> (lu 
(u'(;a,(,(~ UN Ul)l)r()l)ri;d,(; I'2,,1{, gra,1)h whuil a l)l)lie(| 
1;() a,u()i,h(;r ,~('1, ol! do(:mu(,n(,,~ ~t)()(ll, mul(,ipl(; t,o\])-- 
4°2, qYanMi)rma I;i()n 
t\]'\[l(; \])I;()CC,HH ()V f;llllllllgtl'iZ~l,\[;i()ll c~l,l} })(; (1(;(:O111 
l)O,~;ud in{,o i;lu'(;(', ,~;l;~g(',~; (Sl)a,rc\]~ Jones, .I 999): 
\]. SO/IIC(; |,(;X(, ill,\[,(:1'\])l"(),\[,(tt'~O'\]l, l;O ~OllF(;(; \[;(;X(; 
r(' 4)r(',,~(ul LM;ion, 
2. s()/ll'C()r(,l)r(;,s(;nl,;~l;ion L?'a,'u,'tfo,r'm,(U, ion go 
,~llllllll~W.y ~OX1; \];('~\])l'(;S(~lli;~L(;iOll,, ~l,ll(| 
3. ,smmua.ry (;0x(, 9c',,c'mtion from ,smnmary 
r(; 1)r(;s('ntati()n. 
(~l)A-tagg(;d do(:mn(~ults arc rega, rded as sour(:u 
(;('xl; rct)reseni;ations. The m(;l;hod (lescril)(~'d 
at)or(, \['ocuse:~ on the trm~sforma, l;ion sl~g(;. \]b; 
mull, i-lingua.lii,y (:om('s from t, he mull;i-linguMii;y 
oJ' i;h(' ,%a,g(.',. 
897 
5 Conclusion 
Summarization of multiple documents about 
nmltiple topics has been discussed in this pa- 
pet'. The method proposed here uses spread- 
ing activation over documents syntactically and 
semanticMly annotated with GDA tags. It is 
capable of: 
• extraction of the opening and settlement 
articles from fifty articles about a hostage 
incident, 
• creation of an entity-relation graph of im- 
portant relations among important entities, 
• extraction and pruning of important sen- 
tences, gnd 
• substitution of expressions with more con- 
crete ones using cross-document corefer- 
ences. 
The inethod is essentially multilingual because 
it is based on GDA tags gild the GDA tagset 
is designed to address nmltilingual coverage. 
Since this tagset can en,bed various linguistic 
intbrination into documents, it could be a stan- 
dard tbrmat for the study of the transformation 
and/or generdtion stage of doculnent summa- 
rization, among other natural  process- 
ing tasks. 

References 

Saliha Azzam, Kevin Ihlmphreys, and Robert 
Gaizauskas. 1999. Using coretbrence chains 
for text sunnnarization. In A CL'99 Work- 
shop on Cor@;rcncc and Its Applications, 
pages 77 84. 

Amit Bagga and Breck Baldwin. 1998. Entity- 
based cross-document coreferencing using the 
vector space model. In COLING-A UL'98, 
pages 79 85. 

Regina Barzilay, Kathleen Ft. McKeown, and 
Michael Elhadad. 1999. information fltsion 
in the context of multi-document summariza- 
tion. In A CL'99, pages 550-557. 

K6iti Hasida, Syun Ishizaki, and Hitoshi Isa- 
hara. 1987. A connectionist approach to the 
generation of abstracts. In Gerard Keinpen, 
editor, Natural Langauge Generation: New 
Results in Artificial Intelligence, Psychology, 
and Linguistics, pages 149-156. Martinus Ni- 
jhoff. 

Kgiti Hasida. 1997. Global l)ocument Annota- 
tion. Ill NLPRS'9Z pages 505-508. 

Sadao Kurohashi and Makoto Nagao. 1998. 
Japanese morphological analysis sysl, eln JU- 
MAN manual. 

Sadao Kurohashi. 1998. Japanese syntactic 
analysis system KNP manual. 

Inderjeet Mani and Eric Bloedorn. 1999. Sum- 
marizing similarities and differences anlong 
related documents. Ill hlderjeet Mani and 
Mark T. Maybury, editors, ADVANCES IN 
A UTOMATIC TEXT SUMMARIZATION, 
chapter 23, pages 357 379. The MIT Press. 

Mark T. Maybury. 1999. Generating sum- 
maries from event data. \]n Indel:jeet IVIani 
and Mark T. Maybury, editors, ADVANCES 
IN A UTOMA2TC TEXT SUMMARIZA- 
TION, chapter 17, pages 265-281. The MIT 
Press. 

Kathleen McKeown and Dragolnir R. Radev. 
1995. Generating summaries of imfltiple news 
articles. In SIGIR'95, pages 74-82. 

Kathleen R. McKeown, Judith L. Klavans, 
Vasileios Itatzivassiloglou, Regina Barzilay, 
and Elezar Eskin. 1999. Towards multi- 
docuinent summarization by retbrmulation: 
Progress and prospects. In AAAI-99, pages 
453460. 

Katashi Nagao and KSiti IIasida. 1998. Au- 
tomatic Text Summarization B~Lsed on the 
Global Docmnent Annotation. In COLING- 
ACL'98, pages 917 921. 

Yoshiki Niwa, Shingo Nishiokg, Makoto 
Iwayama, Akihiko Takano, and Yosihiko 
Nitta. 1997. Topic graph generation for 
query naviagation: Use of fl:equency classes 
tbr topic extraction. In NLPRS'9Z pages 
95 100. 

Mark Sanderson and Bruce Croft. 1999. De- 
riving concept hierarchies from text. In SL 
GIR'99, pages 206 213. 

Karen Sparck Jones. 1999. Automatic summa- 
rizing: factors and directions. In Inderjeet 
Mani and Mark T. Maybury, editors, AD- 
VANCES IN A UTOMATIC TEXT SUMMA- 
RIZATION, chapter 1, pages 1-12. The MIT 
Press. 
