Toward the "At-a-glance" Summary: 
Phrase-representation Summarization Method 
Yoshihiro UEDA, Mamiko OKA, Takahiro KOYAMA and Tadanobu MIYAUCHI
Industry Solutions Company, Fuji Xerox Co., Ltd.
430 Sakai, Nakai-machi, Kanagawa 259-0157, JAPAN 
{Ueda.Yoshihiro, Oka.Mamiko, Koyama.Takahiro, Miyauchi.Tadanobu}@fujixerox.co.jp
Abstract 
We have developed a summarization method that creates a summary suitable for the process of sifting information retrieval results. Unlike conventional methods that extract important sentences, this method constructs short phrases to reduce the burden of reading long sentences. We have developed a prototype summarization system for Japanese. In a rather large-scale task-based experiment, the summaries this system creates proved effective for sifting IR results. This summarization method is also applicable to other languages such as English.
Introduction 
Summaries are used to select relevant information from information retrieval results. The goal of summarization for such "indicative" use is to support fast and accurate judgement.
Most automatic summarization systems adopt the "sentence selection" method, which gives a score to every sentence on the basis of its characteristics, such as word frequency and the position in which it appears, and selects sentences with high scores.
The sentences collected in such a way tend to be so long and complex that the reader must reconstruct the structure while reading them. Reading such sentences is burdensome.
Our aim is to reduce this burden by providing an "at-a-glance" summary.
Phrase-representation summarization is a method to create the "at-a-glance" summary for the Japanese language. Here we present the concept, the algorithm, and an evaluation of the efficacy of the summaries produced by a prototype based on this method. Extension to English is also discussed.
1 The Concept 
Examples of an "at-a-glance" summary are the headlines of news articles. A headline provides information for judging whether the article is to be read or not and, in this sense, it is truly "indicative." Its characteristics are:
• Brevity (short in length)
• Simplicity (fewer embedded sentences)
We use "phrases" to represent the simplicity characteristic¹ and set our goal to create phrase-represented summaries, which give the reader an outline of the document and avoid reading stress by enumerating short phrases containing the important words and the concepts composed from these words.
The method we adopted to achieve this goal is to construct such phrases from the relations between words rather than extracting important sentences from the original document.
2 Summarization Method 
2.1 Outline of the Algorithm 
Here we give a short description of the outline of this method using the example shown in Fig. 1.²

¹ The word "phrase" used here is not meant in the linguistic sense but as an expression for "short" and "simple." In Japanese, there is no rigid distinction between "phrase" and "clause."
² In this paper, Japanese words are represented in English as much as possible. The words left in Japanese are shown in italics, such as "ga" (a particle for AGENT), "jidai" ("era"), etc. Each relation name is constructed from a Japanese particle and its function (shown as a case name or an equivalent English preposition).

[Figure 1 (diagram):
(a) Original text (English translation): "At the Green Fair held on 24th, a venture company PICORP announced it licenses its environment protection technology to AMICO, the U.S. top company. PICORP's CEO Ken Ono said that..."
(b) Analysis graph, annotated with steps (1) analysis of relations, (2) selection of core relation and (3) addition of relations. Nodes shown include Green Fair, venture, PICORP, environment protection technology, license and announced. Arc labels shown include "nioite"-AT, "ha"-THEME, "no"-OF, "wo"-OBJ and "ni"-DAT.
(c) Obtained phrase, produced by step (4) generation: "PICORP licenses environment protection technology to AMICO"]
Fig. 1: Outline of phrase-representation summarization

The method consists of the following four major steps:
(1) Syntactic analysis to extract the relations between words
(2) Selection of the core relation
(3) Addition of relations necessary for the unity of the phrase's meaning
(4) Generation of the surface phrase from the constructed graph
First, the sentences in the given document are analyzed to produce directed acyclic graphs (DAGs) constructed from relation units, each of which consists of two nodes (words) and an arc (the relation between the words). A node can be not only a single word but also a word sequence (noun group).
Then an important relation is selected as the "core" relation. In Fig. 1, the arc connecting the two shaded nodes is selected as the "core."
The core relation alone carries insufficient information to convey the content of the original document. Additional arcs (represented by double lines) are attached to narrow down the information the phrase supplies.
The following short phrase can be generated from the selected nodes and arcs in the graph:
    PICORP licenses (its) environment protection technology to AMICO.³
Phrase-representation summarization enumerates such short phrases to give the reader enough information to grasp the outline of a document. The algorithm is explained in the next section.
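The relation units described above can be pictured with a small data structure. The following is only an illustrative sketch in Python: the class name `Relation`, the helper `attached_to`, and in particular the arc endpoints in `graph` are assumptions made for the example, since the paper gives the graph only as a figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One relation unit: two nodes (words or noun groups) and a labelled arc."""
    dependent: str  # modifying node
    label: str      # relation name: Japanese particle plus its function
    head: str       # modified node


# Hypothetical encoding of part of the Fig. 1 graph (arc endpoints are assumed).
graph = [
    Relation("Green Fair", "nioite-AT", "announced"),
    Relation("venture", "no-OF", "PICORP"),
    Relation("PICORP", "ha-THEME", "license"),
    Relation("environment protection technology", "wo-OBJ", "license"),
    Relation("AMICO", "ni-DAT", "license"),
]


def attached_to(core, relations):
    """Relations other than the core that share the core's head node."""
    return [r for r in relations if r.head == core.head and r != core]


core = graph[2]                   # step (2): assume this arc scored highest
extra = attached_to(core, graph)  # step (3): arcs added around the core
phrase_nodes = [core.dependent] + [r.dependent for r in extra] + [core.head]
print(phrase_nodes)
```

Here the nodes collected in `phrase_nodes` are the ones that appear in the obtained phrase of Fig. 1; ordering and surface generation are left to step (4).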
2.2 Further description of each step 
The steps shown in the previous section constitute a cycle that produces a single phrase. The cycle is repeated until the generated phrases satisfy a predefined condition (e.g. the length of the summary). After each cycle, the scores of the words used in it are reduced by a predefined cut-down ratio to avoid frequent use of the same words in the summary.
The basic algorithm is shown in Fig. 2.

³ This short sentence can be expressed as a phrase in the linguistic sense in English: PICORP's licensing (its) environment protection technology to AMICO.

Relation Analysis
Syntactic analysis is applied to each sentence in the document to produce a DAG of the relations between words. We use a simple parser based on pattern matching (Miyauchi, et al. 1995), one of whose rules always judges each case to be dependent on its nearest verb. Some of the resulting misanalyses are hidden by "ambiguity packing" in the "additional relation attachment" step.
Relation Scoring
An importance score is provided for each relation unit (two nodes and an arc connecting them).
First, every word is scored by its importance. This score is calculated based on the tf*IDF value (Salton, 1989)⁴.
Then, the relation score is calculated as follows:
$$\mathrm{Score} = S_{rel} \times (W_1 S_1 + W_2 S_2)$$
Here, $S_1$ and $S_2$ are the scores of the two words connected by the relation. The score of a word sequence is calculated by decreasing the sum of the scores of its constituent words according to the length of the word sequence.
$W_1$ and $W_2$ are the weights given to each word. Currently, all words are treated equally ($W_1 = W_2 = 1$).
$S_{rel}$ is the importance factor of the relation. Relations that play central roles in the meaning, such as verb cases, are given high scores, and peripheral relations, such as "AND" relations, are scored low. The relation scores for modifier-modified relations such as adverbs are set to 0 to avoid selecting them as core relations.
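As a rough illustration of the scoring just described, the sketch below computes word and relation scores in Python. The exact tf*IDF variant and the way a word-sequence score is decreased with length are not specified in the text, so the logarithmic IDF in `tf_idf` and the `length_penalty` factor in `sequence_score` are assumptions; all function names are hypothetical.

```python
import math


def tf_idf(term_freq, doc_freq, n_docs):
    """Word importance as tf * log(N / df); the log form is an assumed variant."""
    return term_freq * math.log(n_docs / doc_freq)


def sequence_score(word_scores, length_penalty=0.8):
    """Score of a word sequence (noun group).

    The sum of the constituent scores is decreased with the sequence length;
    the geometric penalty used here is an assumption.
    """
    return sum(word_scores) * length_penalty ** (len(word_scores) - 1)


def relation_score(s_rel, s1, s2, w1=1.0, w2=1.0):
    """Score = S_rel * (W1 * S1 + W2 * S2), with W1 = W2 = 1 by default."""
    return s_rel * (w1 * s1 + w2 * s2)


# A verb-case relation with a high importance factor outranks an adverbial
# relation, whose factor is set to 0 so it can never become the core.
print(relation_score(2.0, 3.0, 4.0))  # 14.0
print(relation_score(0.0, 3.0, 4.0))  # 0.0
```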
Core relation selection
The relation unit with the highest score among all relations is selected as the "core relation."

⁴ IDF is calculated from 1 million WWW documents gathered by a Web search engine.

[Figure 2 (flowchart): boxes include Document (input), Relation Analysis, Relation Scoring, Core relation selection, Generation of surface phrases and Summary (output).]
Fig. 2: Basic flow of the algorithm

Additional relation attachment
The information that the core relation carries is usually insufficient. Additional relations are attached to make the information the phrase supplies more specific and to give the reader sufficient information to infer the content of the original document. The following are some of the relations to be attached.
• Mandatory cases
Relations that correspond to mandatory cases are attached to verbs. Mandatory case lists are defined for individual verbs, except for those that share the common mandatory case list, which includes "ga"-AGENT, "wo"-OBJ and "ni"-DATIVE. "Ha"-THEME, "mo"-ALSO, and null-marker relations are also treated as mandatory, because they can appear in place of the mandatory relations.
Ex.) AMICO "ga"-AGENT release
  → AMICO "ga"-AGENT
     PDA "wo"-OBJ release
  (AMICO releases PDA.)
• Noun modified by a verb
In Japanese, the "verb - noun" structure represents an embedded sentence, and the noun usually fills some gap in the embedded sentence. If the verb in the core relation (noun - verb) is also part of such a verb - noun relation, the modified noun is assumed to carry important information as well, even if it does not fill a mandatory case (though the case is not analyzed in the current algorithm). Thus the verb - noun relation is attached to the core.
Ex.) PDA "wo"-OBJ release
  → PDA "wo"-OBJ release
     0-THAT⁵ AMICO
  (AMICO that releases PDA)
     PDA "wo"-OBJ release
  → PDA "wo"-OBJ release
     0-THAT plan
  (a plan to release PDA)
• Ambiguity packing
The analysis trees often contain errors because the pattern-based parser doesn't resolve ambiguities. For example, the structure
  V 0-THAT N1 "no"-OF N2 (Ving N1's N2)
is ambiguous in Japanese (V can modify either N1 or N2, but the parser always analyzes N1 as modified). If the V-N1 relation is selected as the core, the N1-N2 relation is always attached to the core to include the possible V-N2 relation.
• Modifiers of generic nouns
The concepts conveyed by generic nouns such as "mono" (thing), "koto" ("that" of a that-clause), "baai" (case) and "jidai" (era) are not very specific, so these nouns usually need accompanying modifiers to be informative. Here such modifiers are attached to make them informative.
Ex.) era "ni"-TIME emerge
  → confusion "no"-OF era
     "ni"-TIME emerge
  (emerged in the era of confusion)
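Two of the attachment rules above (mandatory cases and modifiers of generic nouns) can be sketched as a simple filter over the relation graph. This is a minimal Python illustration, not the prototype's implementation: relations are modelled as `(dependent, label, head)` tuples, and the names `MANDATORY_LABELS`, `GENERIC_NOUNS` and `attach` are hypothetical. Per-verb case lists, verb-noun relations and ambiguity packing are omitted.

```python
# Common mandatory case list plus the markers that can replace those cases.
MANDATORY_LABELS = {"ga-AGENT", "wo-OBJ", "ni-DATIVE", "ha-THEME", "mo-ALSO", "null"}
# Generic nouns named in the text.
GENERIC_NOUNS = {"mono", "koto", "baai", "jidai"}


def attach(core, graph):
    """Return the core relation plus the relations attached by two rules."""
    selected = [core]
    core_dependent, _, core_head = core
    for relation in graph:
        if relation in selected:
            continue
        dependent, label, head = relation
        if head == core_head and label in MANDATORY_LABELS:
            # Rule 1: mandatory cases of the verb in the core relation.
            selected.append(relation)
        elif head == core_dependent and core_dependent in GENERIC_NOUNS:
            # Rule 2: modifiers of a generic noun in the core relation.
            selected.append(relation)
    return selected


graph_a = [
    ("AMICO", "ga-AGENT", "release"),
    ("PDA", "wo-OBJ", "release"),
    ("quickly", "ADV", "release"),  # hypothetical adverb, not attached
]
graph_b = [
    ("jidai", "ni-TIME", "emerge"),
    ("confusion", "no-OF", "jidai"),
]
print(attach(graph_a[0], graph_a))
print(attach(graph_b[0], graph_b))
```

The first call reproduces the "AMICO releases PDA" example, and the second the "emerged in the era of confusion" example.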
Termination condition
This step judges whether the summaries created so far are sufficient. Currently the termination condition is defined by either the number of produced phrases or the total summary length.
Re-scoring of relations
If the condition is not fulfilled, the steps from the selection of the core relation onward must be repeated to create another phrase. Before selecting a new core, the scores of the words used in this cycle are reduced to increase the possibility for other words to be used in the next phrase. Score reduction is achieved by multiplying the scores of the words used by the predefined cut-down ratio R (0 < R < 1). Relation scores are re-calculated using the new word scores.
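The cycle of core selection, termination check and re-scoring can be sketched as a short loop. The Python below is a simplified illustration under stated assumptions: only the two words of the core relation are cut down (the method reduces all words used in the cycle, including attached ones), the termination condition is a phrase count, and the names `select_cores`, `relation_score`, `max_phrases` and `cut_down` are hypothetical.

```python
def relation_score(relation, word_scores, s_rel):
    """S_rel * (S1 + S2) with both word weights fixed to 1."""
    dependent, label, head = relation
    return s_rel.get(label, 1.0) * (word_scores[dependent] + word_scores[head])


def select_cores(relations, word_scores, s_rel, max_phrases=2, cut_down=0.5):
    """Pick one core relation per cycle, cutting down used word scores."""
    scores = dict(word_scores)  # work on a copy
    cores = []
    # Termination condition: number of produced phrases.
    while relations and len(cores) < max_phrases:
        core = max(relations, key=lambda r: relation_score(r, scores, s_rel))
        cores.append(core)
        # Re-scoring: multiply the scores of the used words by R (0 < R < 1).
        for word in (core[0], core[2]):
            scores[word] *= cut_down
    return cores


relations = [
    ("PICORP", "ha-THEME", "license"),
    ("technology", "wo-OBJ", "license"),
    ("AMICO", "ni-DAT", "license"),
]
word_scores = {"PICORP": 5.0, "license": 4.0, "technology": 3.0, "AMICO": 2.0}
print(select_cores(relations, word_scores, s_rel={}))
```

With these made-up scores, the first cycle selects the PICORP relation (score 9.0); after the cut-down, the technology relation (5.0) overtakes it (4.5) in the second cycle.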
Generation of surface phrases
The process so far produces DAGs, each of which consists of one core relation and several attached relations. In Japanese, the surface phrases can be easily obtained by connecting the surface strings of the nodes in their original order. See Chapter 5 for the generation method for English.
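For Japanese, the generation step amounts to sorting the selected nodes by their position in the source sentence and concatenating their surface strings. A minimal Python sketch follows; the function name `surface_phrase`, the `(position, surface)` node representation and the romanized placeholder strings are all assumptions for illustration.

```python
def surface_phrase(selected_nodes, separator=""):
    """Join the surface strings of the selected nodes in original order.

    selected_nodes: iterable of (position, surface) pairs, where position is
    the node's index in the source sentence. Japanese is written without
    spaces, so the default separator is empty.
    """
    ordered = sorted(set(selected_nodes))
    return separator.join(surface for _, surface in ordered)


# Hypothetical romanized nodes, given out of order and with one duplicate.
nodes = [(6, "license"), (0, "PICORP-ha"), (4, "gijutsu-wo"), (0, "PICORP-ha")]
print(surface_phrase(nodes, separator=" "))  # PICORP-ha gijutsu-wo license
```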
3 The Prototype 
We developed a prototype of the summarization system based on this algorithm. The development language is Java, and the system runs on Windows 95/98/NT and Solaris 2.6⁶.
The time consumed by the summarization process is proportional to the text length; it takes about 700 msec to generate a summary for an A4-sized document (2000 Japanese characters) on a PC with a Celeron processor (500 MHz). Over 95% of the time is consumed in the relation analysis step.
4 Evaluation 
We have conducted an experiment to evaluate the system. This section is a short summary of the experiment reported in (Oka and Ueda, 2000).
The aim of a phrase-represented summary is to enable fast and accurate sifting of IR results. To evaluate whether the aim was achieved, we adopted a task-based evaluation (Jing, et al. 1998, Mani, et al. 1998). One of the problems of such experiments using human subjects as assessors is inaccuracy caused by the diversity of assessment.
To reduce the diversity, first we assigned 10 subjects (experiment participants) to each summary sample. The number of subjects was just 1 or 2 in the previous task-based experiments. Second, we gave the subjects detailed instructions, including the situation that led them to search the WWW.

⁵ "0" shows that there are no particles or any other words connecting two words. Japanese doesn't require anything like relative pronouns.
⁶ Java and Solaris are the trademarks of Sun Microsystems. Windows and Celeron are the trademarks of Microsoft and Intel, respectively.

4.1 Experiment Method
The outline of the evaluation is as follows:
(1) Assume an information need and make a query for the information need.
(2) Prepare simulated WWW search results with different types of summaries: (A) first 80 characters, (B) important sentence selection (Zechner, 1996), (C) phrase-represented summary, (D) keyword enumeration. The documents in the simulated search result set are selected so that the set includes an appropriate number of relevant documents and irrelevant documents.
(3) Have subjects judge from the summaries the relevance between the search results and the given information need. The judgement is expressed in four levels (from higher to lower: L3, L2, L1, and L0, where L0 is judged to be irrelevant).
(4) Compare the relevance with the one that we assumed.
The documents the user judges to be relevant compose a subset of the IR results, and this subset should be more relevant to the information need than the IR results themselves. Because we have introduced three relevance levels, we can assume three kinds of subsets: L3 only, L3+L2, and L3+L2+L1. The subset composed only of the documents with an L3 judgement should have a high precision score, and the subset including L1 documents should get a high recall score.
4.2 Result 
Because recall and precision are in a trade-off relation, here we show the result using the f-measure, the balanced score of the two indexes.
$$\text{f-measure} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
The f-measure averages of the experiment results of three different tasks are shown in Fig. 3. It shows that the phrase-represented summaries (C) are more suitable for sifting search results than any other summaries in all cases.
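To make the evaluation measure concrete, the following Python sketch computes precision, recall and f-measure for the three subsets (L3 only, L3+L2, L3+L2+L1) from one subject's judgements. The function names, the document identifiers and the judgement data are made up for illustration and are not taken from the experiment.

```python
def f_measure(precision, recall):
    """Balanced score: 2 * precision * recall / (precision + recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(judgements, relevant, min_level):
    """Score the subset of documents judged at min_level or higher.

    judgements: document id -> judged level (0 for L0 up to 3 for L3).
    relevant: set of document ids assumed to be relevant.
    """
    selected = {doc for doc, level in judgements.items() if level >= min_level}
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall, f_measure(precision, recall)


judgements = {"d1": 3, "d2": 2, "d3": 1, "d4": 0}  # hypothetical subject
relevant = {"d1", "d2"}                             # hypothetical ground truth
for name, level in (("L3 only", 3), ("L3+L2", 2), ("L3+L2+L1", 1)):
    print(name, evaluate(judgements, relevant, level))
```

In this toy data the L3-only subset has perfect precision but low recall, while the L3+L2+L1 subset has perfect recall but lower precision, which mirrors the trade-off described above.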
4.3 Discussion 
The result can be explained using the number of summaries that contain clues to the information need. Summaries consisting of short units (phrases (C) and keywords (D)) are gathered from a wide range of the original text and accordingly have many chances to include the clues. The actual average numbers of summaries that contain the clues are 2.0, 4.3 and 4.7 for (B) sentences, (C) phrases and (D) keywords, respectively. Although (D) keywords include more clues than any other samples, they don't get a good f-score. The reason is considered to be the lack of information about the relations among keywords.

[Figure 3 (bar chart): f-measure (vertical axis, 0 to 0.8) of summary types A, B, C and D, with (C) labelled "phrase-represented summary", for the subsets L3 only, L2 + L3, and L1 + L2 + L3.]
Fig. 3: Experiment result

5 Applicability to Other Languages 
Although this algorithm was first developed for the Japanese language, the concept of phrase-representation summarization is also applicable to other languages. Here we show the direction toward its extension to English.
English has a clear concept of "phrase," and simply connected words do not produce well-formed phrases. This requires semantic analysis and generation from the semantic structure.
We will consider the following example again.
Ex.) A venture company PICORP announced to license their environment protection technology to AMICO, a U.S. top company.
If "PICORP" and "license" must be included in the summary and "announce" is not so important, "PICORP license(s)" is the core of the desired phrase. Generating it requires subject resolution of "license" and thus semantic-level analysis is required. Moreover, predicate-argument structures are preferable to syntactic trees because the subject and the object are represented at the same level. Unification grammar frameworks such as LFG (Kaplan and Bresnan, 1982) and HPSG (Pollard and Sag, 1994) fulfill these requirements. Fig. 4 shows part of the analysis result represented in LFG.

[Figure 4 (feature structures and generated phrases):
Analysis result:
  PRED   'announce (↑ SUBJ) (↑ VCOMP)'
  SUBJ   [1] [PRED 'PICORP']
  VCOMP  PRED 'license (↑ SUBJ) (↑ OBJ) (↑ TO OBJ)'
         SUBJ [1]
         OBJ  [PRED 'environment protection technology']
         TO   [PP TO, OBJ [PRED 'AMICO']]
Core feature structure:
  SUBJ [PRED 'PICORP']
  OBJ  [PRED 'environment protection technology']
  TO   [PP TO, OBJ [PRED 'AMICO']]
Generated phrases:
  PICORP licenses environment protection technology to AMICO.
  PICORP's licensing of environment protection technology to AMICO.
  PICORP to license environment protection technology to AMICO (headline style)]
Fig. 4: Analysis and generation of summary

A score is calculated for each feature structure, and the core feature structure is selected by its score instead of selecting a core relation and attaching mandatory relations. In the core feature structure, index [1] is replaced by the SUBJ of the top feature structure.
Generating phrases from the feature structure requires templates⁷. Several patterns can be selected to generate phrases:
V-ing (gerund) form
  ARG1's PRED-ing ARG2 to ARG3
noun form
  ARG1's noun(PRED) of ARG2 to ARG3
to-infinitive form
  For ARG1 to PRED ARG2 to ARG3
In this case, the noun form "PICORP's license of the protection technology to AMICO" is avoided because the noun "license" lacks the meaning of "action" or "event." Other rules specific to headlines, such as "to-infinitive represents future," can also be introduced.
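The three patterns above can be pictured as string templates filled from the arguments of the core feature structure. The Python sketch below is only an illustration: the `TEMPLATES` table, the slot names and the `generate` function are hypothetical, and inflected forms such as "licensing" are supplied by the caller rather than derived morphologically.

```python
# One template per pattern; slot names are illustrative assumptions.
TEMPLATES = {
    "gerund": "{arg1}'s {pred_ing} {arg2} to {arg3}",
    "noun": "{arg1}'s {pred_noun} of {arg2} to {arg3}",
    "to_infinitive": "For {arg1} to {pred} {arg2} to {arg3}",
}


def generate(form, **slots):
    """Fill the template for the given form; unused slots are ignored."""
    return TEMPLATES[form].format(**slots)


slots = {
    "arg1": "PICORP",
    "arg2": "environment protection technology",
    "arg3": "AMICO",
    "pred": "license",
    "pred_ing": "licensing",
    "pred_noun": "license",
}
print(generate("gerund", **slots))
print(generate("to_infinitive", **slots))
```

A rule such as avoiding the noun form for "license" could be expressed by filtering the candidate forms before calling `generate`; that selection logic is not shown here.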

⁷ Generation of articles is left to be considered.

6 Related Work
Most summarization studies (including Zechner, 1996) are based on important sentence selection and seek better selection methods. We have pointed out that summaries made by this method tend to be burdensome to read, and have proposed phrase-representation summarization as an alternative. The following studies bear some relation to our study.
The summarization method by Boguraev and Kennedy (1997) adopts "phrasal expressions" rather than sentences or paragraphs. However, it begins to create a phrase not from a core relation but from a core word (in their words, "topic stamp") and produces multiple phrases containing the same core word; it is therefore not suitable for summaries for sifting IR results. In addition, because it does not consider the roles and importance of the attached arcs when enriching the core, less important words are often attached to the core. They aimed at supporting fast reading rather than sifting IR results.
Some studies are similar to ours in that they make sentences short. Wakao, et al. (1998) and Mikami, et al. (1998) aim to create closed captions from an announcer's manuscript by paraphrasing and removing modifiers. This method doesn't remove the "trunk" of the analysis tree, and the summaries cannot be made as short as in phrase-representation.
Nagao, et al. (1998) also proposed a method to create summaries based on the relations between words. They utilize GDA (Global Document Annotation), a tag set that the document author inserts into the document and that contains linguistic information such as sentence structures and reference information. Although this method is similar to ours in some points, the summary consists of sentences and thus does not have "at-a-glance" capability. Above all, the expectation that every document is tagged linguistically will not be fulfilled until special editors with automatic linguistic tagging become popular.
Conclusion 
We introduced the concept of the "at-a-glance" summary and showed an algorithm of phrase-representation summarization as a realization of the concept. An experiment shows that the summaries are effective for sifting IR results. We will continue to fine-tune the prototype for further efficacy.
Acknowledgement 
We would like to thank our laboratory members, who gave us valuable suggestions and participated in the experiment.

References 

Boguraev, B. and Kennedy, C. (1997): "Salience-based Content Characterisation of Text Documents," Proc. Intelligent Scalable Text Summarization, pp. 2-9.

Jing, H., Barzilay, R., McKeown, K. and Elhadad, M. 
(1998): "Summarization Evaluation Methods: 
Experiments and Analysis." In Intelligent Text 
Summarization. pp. 51-59. AAAI Press. 

Kaplan, R. M. and Bresnan, J. (1982): "Lexical-Functional Grammar: A Formal System for Grammatical Representation," in Bresnan, J. (ed.) The Mental Representation of Grammatical Relations, MIT Press.

Mani, I., House, D., Klein, G., Hirschman, L., Obrst, L., Firmin, T., Chrzanowski, M., and Sundheim, B. (1998): "The TIPSTER SUMMAC Text Summarization Evaluation." Technical Report MTR 98W0000138, MITRE Technical Report.

Mikami, M., Yamazaki, K., Masuyama, S. and Nakagawa, S. (1998): "Summarization of News Sentences for Closed Caption Generation," Proc. Workshop Program The 4th Annual Meeting of The Association for Natural Language Processing, pp. 14-21 (in Japanese).

Miyauchi, T., Oka, M. and Ueda, Y. (1995): "Key-relation technology for text retrieval." Proc. the SDAIR '95, pp. 469-483.

Nagao, K. and Hasida, K. (1998): "Automatic Text Summarization Based on the Global Document Annotation," Proc. COLING-98, pp. 917-921.

Oka, M. and Ueda, Y. (2000): "Evaluation of Phrase-representation Summarization based on Information Retrieval Task," Proc. ANLP/NAACL 2000 Workshop on Automatic Summarization, pp. 59-68.

Pollard, C. and Sag, I. A. (1994): Head-Driven Phrase Structure Grammar, The University of Chicago Press.

Salton, G. (1989): Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, Addison-Wesley.

Wakao, T., Ehara, T. and Shirai, K. (1998): "Automatic Summarization for Closed Caption for TV News," Proc. Workshop Program The 4th Annual Meeting of The Association for Natural Language Processing, pp. 7-13 (in Japanese).

Zechner, K. (1996): "Fast Generation of Abstracts from General Domain Text Corpora by Extracting Relevant Sentences." Proc. COLING-96, pp. 986-989.
