<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0707">
  <Title>I I I I I</Title>
  <Section position="4" start_page="42" end_page="43" type="metho">
    <SectionTitle>
4 Results and Discussion
</SectionTitle>
    <Paragraph position="0"> The following scenario was assumed for evaluation of the automatic summaries:
 * A user walks up to the system and presents an article for summarization.
 * In the first case, the system asks another human to do the summarization and presents it to the user. The user compares this summary to his/her own notion of an ideal summary.
 * In the second case, the system automatically generates a summary and returns it to the user. The user again compares this summary to his/her own notion of an ideal summary.
 * The user satisfaction in the above two cases is measured by the &quot;degree of overlap&quot; between the summary presented by the system and the user's notion of an ideal summary.
If the user's satisfaction is about the same in the above two cases, then our automatic summarization schemes are summarizing as well as a human would summarize by paragraph extraction. For each automatic summarization algorithm, four quantities were computed:
1. Optimistic evaluation. Since the two manual extracts for an article are different, the amount of overlap between an automatic and a manual extract depends on which manual extract is selected for comparison. The optimistic evaluation for an algorithm is done by selecting the manual extract with which the automatic extract has a higher overlap, and measuring this overlap. This is the same as using the human whose notion of an ideal extract is closer to the automatic extract as our user.
2. Pessimistic evaluation. Analogously, a pessimistic evaluation is done by selecting the manual extract with which the automatic extract has a lower overlap. This is the same as using the human whose notion of an ideal extract is more dissimilar to the automatic extract as our user. This, in some sense, is the worst case scenario.
Footnote 3: Different users could count paragraphs differently. Thus, for a few articles, the lengths of the two manually generated summaries were different. In such cases, the automatic procedures took the average of these two lengths as the target length for the extract.</Paragraph>
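The optimistic and pessimistic evaluations above can be sketched as follows. This is only an illustration under stated assumptions: the text does not give a formula for the "degree of overlap", so the sketch assumes it is the fraction of a manual extract's paragraphs that also appear in the automatic extract. The function names and the paragraph numbers are hypothetical.

```python
from typing import Iterable, Set, Tuple


def overlap(automatic: Iterable[int], manual: Iterable[int]) -> float:
    """Fraction of the manual extract's paragraphs found in the automatic extract.

    ASSUMPTION: the paper does not spell out the overlap formula; this ratio
    is one plausible reading of "degree of overlap".
    """
    auto_set: Set[int] = set(automatic)
    manual_set: Set[int] = set(manual)
    if not manual_set:
        return 0.0
    return len(auto_set.intersection(manual_set)) / len(manual_set)


def optimistic_pessimistic(
    automatic: Iterable[int],
    manual_a: Iterable[int],
    manual_b: Iterable[int],
) -> Tuple[float, float]:
    """Return (optimistic, pessimistic) overlap for one article.

    Optimistic: compare against the manual extract that agrees more.
    Pessimistic: compare against the manual extract that agrees less.
    """
    auto_list = list(automatic)
    scores = (overlap(auto_list, manual_a), overlap(auto_list, manual_b))
    return max(scores), min(scores)


# Hypothetical paragraph numbers, for illustration only.
automatic = [3, 7, 12, 20]
manual_a = [3, 7, 9, 15]
manual_b = [3, 12, 20, 22]
best, worst = optimistic_pessimistic(automatic, manual_a, manual_b)
print(best, worst)  # prints: 0.75 0.5
```

Averaging these two values over all articles would give per-algorithm figures of the kind reported in Table 5, again assuming this reading of overlap.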
  </Section>
  <Section position="5" start_page="43" end_page="45" type="metho">
    <SectionTitle>
3 Intersection. For each article, an intersection of
</SectionTitle>
    <Paragraph position="0"> the two manually generated summaries is computed. The fact that the paragraphs in this intersection were deemed important by both the readers suggests that they may, in fact, be the most important paragraphs in the article. We compute the percentage of these paragraphs that is included in the automatic extract.
4. Union. We also calculate the percentage of automatically selected paragraphs that is selected by at least one of the two users. This is, in some sense, a precision measure, since it provides us with a sense of how often an automatically selected paragraph is potentially important.
In our experimentation, we observed that many subjects tend to select paragraph 3 in the summaries. This is because this paragraph is the first content paragraph in an article and tends to be a dictionary-style definition for the article. For example, for article 15930 (Monopoly), this paragraph reads: Monopoly, economic situation in which there is only a single seller or producer of a commodity or a service. For a monopoly to be effective, there must be no practical substitutes for the product or service sold, and no serious threat of the entry of a competitor into the market. This enables the seller to control the price. Such dictionary-style definitions are generally liked by readers and thus are usually included in a summary by our subjects. In general, in written texts, the first content paragraph tends to be an introductory paragraph and is a good starting paragraph for summarization. For the encyclopedia articles, we use this information and we always include paragraph 3 in the bushy and the depth-first summaries. This paragraph might be missed by the segmented bushy paths but is recaptured by the augmented segmented bushy paths. In case such collection-specific information is not available, we use the first paragraph with a reasonable number of links to the rest of the paragraphs as the introductory paragraph (Salton and Singhal, 1995).
Table 5 shows the overlap for the two manual extracts, and the different evaluation measures averaged over all fifty articles, for the bushy, depth-first, segmented bushy, and augmented segmented bushy extracts. In addition to using these four methods, extracts were also generated for the articles by selecting the required number of paragraphs at random. To eliminate any advantage that the bushy, depth-first, and augmented segmented bushy extracts might have due to the presence of the introductory paragraph, paragraph 3 is always included in the random paths. The evaluation results for these random extracts are also shown in the table. Random selection of paragraphs serves as the weakest possible baseline. If an algorithm does not perform noticeably better than a random extract, then it is certainly doing a poor job of summarization. Also, Brandow, Mitze, and Rau found in (Brandow et al., 1995) that simply selecting the first few sentences (the lead sentences) produced the most acceptable summaries. To test their findings in our environment, we also selected the first 20% of the paragraphs of an article and used them as yet another automatic summary.</Paragraph>
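The intersection and union measures, together with the random and lead baselines described above, might be sketched as below. This is a hedged illustration, not the authors' implementation: the function names, the 1-based paragraph numbering, the rounding-up rule for the 20% lead extract, and the example data are all assumptions introduced here.

```python
import random
from typing import Iterable, List, Optional


def intersection_coverage(automatic: Iterable[int], manual_a: Iterable[int],
                          manual_b: Iterable[int]) -> float:
    """Percentage of paragraphs chosen by BOTH readers that the automatic extract includes."""
    both = set(manual_a).intersection(manual_b)
    if not both:
        return 0.0
    return 100.0 * len(both.intersection(automatic)) / len(both)


def union_precision(automatic: Iterable[int], manual_a: Iterable[int],
                    manual_b: Iterable[int]) -> float:
    """Percentage of automatically selected paragraphs chosen by AT LEAST ONE reader."""
    auto_set = set(automatic)
    if not auto_set:
        return 0.0
    either = set(manual_a).union(manual_b)
    return 100.0 * len(auto_set.intersection(either)) / len(auto_set)


def lead_extract(num_paragraphs: int, percent: int = 20) -> List[int]:
    """Lead baseline: the first `percent` of the paragraphs, numbered from 1.

    ASSUMPTION: the count is rounded up; the paper does not state a rounding rule.
    """
    count = max(1, (num_paragraphs * percent + 99) // 100)
    return list(range(1, count + 1))


def random_extract(num_paragraphs: int, target_length: int,
                   always_include: int = 3,
                   seed: Optional[int] = None) -> List[int]:
    """Random baseline that always keeps the introductory paragraph (paragraph 3).

    Assumes target_length is between 1 and num_paragraphs, and that
    always_include is a valid paragraph number.
    """
    rng = random.Random(seed)
    others = [p for p in range(1, num_paragraphs + 1) if p != always_include]
    chosen = rng.sample(others, target_length - 1)
    return sorted([always_include] + chosen)


# Hypothetical paragraph numbers, for illustration only.
automatic = [3, 7, 12, 20]
manual_a = [3, 7, 9, 15]
manual_b = [3, 9, 20, 22]
print(intersection_coverage(automatic, manual_a, manual_b))  # prints: 50.0
print(union_precision(automatic, manual_a, manual_b))        # prints: 75.0
print(lead_extract(50))  # first 10 paragraphs of a 50-paragraph article
```

Scoring the output of `random_extract` and `lead_extract` with the same measures gives the baseline comparison the text describes: any extraction scheme should do noticeably better than the random extract.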
    <Section position="1" start_page="43" end_page="45" type="sub_section">
      <SectionTitle>
Manual Extracts
</SectionTitle>
      <Paragraph position="0"> The most unexpected result of our experiment was the low level of agreement between the two human subjects. The overlap between the two manual extracts is only 46% on average, i.e., an extract generated by one person is likely to cover 46% of the information that is regarded as most important by another person. This ratio suggests that two humans disagree on more than half the paragraphs that they consider to be critical. In addition, as indicated above, the first paragraph of these encyclopedia articles is a general introduction to the article and is often selected by both subjects; in 50% of the cases in which the intersection between the two users' extracts is a single paragraph, this paragraph is the first one. This increases the chances of overlap between the two manual extracts. If we exclude this special paragraph from the article, the overlap figures for two humans will be even worse. The lack of consensus between users on which paragraphs are important can be explained as follows. On a first reading, users earmarked certain paragraphs as important. Some of these paragraphs were then eliminated, in order to reduce the extract to the stipulated size.</Paragraph>
      <Paragraph position="1"> Often, the choice between which paragraphs to keep and which to exclude was a difficult one, and in such situations, some arbitrariness is bound to creep in. This fact casts some shadows on the utility of automatic text summarization by text extraction. It is possible that the user satisfaction might be higher in reality when the true user does not read the portion of an article not presented to him/her by the summarization system and does not get an opportunity to form his/her own ideal view of an extract.
Table 5 indicates that global bushy paths and augmented segmented bushy paths produce the best extracts among the four paths considered in this study. 55% of the paragraphs selected by the process were considered important by at least one user. Optimistically speaking, a global bushy or an augmented segmented bushy path may be expected to agree approximately 46% with a user. This number is on par with the agreement between two humans (45.81%). This result is reassuring in terms of the method's viability for generating good extracts, since the scheme performs as well as a human. About 47% of the paragraphs deemed important by both users are included in the bushy extract for an article. This figure is somewhat disheartening. We expected a better coverage of these vital paragraphs by our extracts. A further study of these paragraphs might reveal some properties that users look for in a paragraph to decide its importance. It might then be possible to automate this selection process. We also identified the articles for which the intersection of the two user summaries is a single paragraph. For 78% of these articles, this paragraph was included in the bushy path.
Segmented bushy paths perform worse than expected. This is because the first paragraph of an article is very often selected by users, and segmented bushy paths occasionally omit this paragraph. This results in a decrease in the overlap between automatic and manual extracts. In contrast, the other paths are guaranteed to include the first paragraph, and perform better. But, in general, the performance of segmented bushy paths was satisfactory (45.48% overlap with the user in the optimistic method). Similarly, the performance of the depth-first path was also satisfactory. All paths achieved the minimum requirement of performing significantly better than a random extract.
But more interestingly, we observe that extracts produced by selecting the first few paragraphs of the articles also performed comparably to the best paragraph extraction scheme. Admittedly, our evaluation methodology lacks the evaluation of the readability aspect of a summary, which was one of the main motivations of moving from a sentence-based extraction strategy to paragraph-based extraction. In all likelihood, the lead summary will outperform all other automatic summaries in terms of readability. We believe this because automatic summaries are a forced concatenation of paragraphs distributed all across a document, whereas a lead summary is a nicely coherent sequence of paragraphs, as written by the author. Overall, the lead summaries are comparable to the best summarization strategy and could be more readable than all other summaries. This truth is rather discouraging for the feasibility of automatic summarization by text extraction but agrees with the observations in (Brandow et al., 1995). News reports, used in (Brandow et al., 1995), frequently contain a leading paragraph that summarizes the story contained in the rest of the report. Likewise, in the encyclopedia articles used in this study, the first paragraphs usually define the topic, and provide a general outline about it. To sum up:
 * The good news is that, interpreted in light of the fact that the overlap between the two manual extracts is, on average, 46%, and given the enormous reduction in the amount of resources required (footnote 4), our results indicate that automatic methods for extraction compare very favorably with manual extraction.
 * But the bad news is that a summary formed by extracting the initial paragraphs of an article is as good as the best automatic summary and might just be more readable from a user's perspective. This brings into question the overall utility of automatic text summarization by text (sentence or paragraph) extraction.
It is possible that the articles used in this study (encyclopedia articles) and in (Brandow et al., 1995) (news articles) have a structure that yields very good summaries simply by extracting the initial part of an article. It will be interesting to see if observations from this study and from (Brandow et al., 1995) carry over to other domains that are less like encyclopedia or news text (for example, legal documents or U.S. patents). In our studies with text summarization (by text extraction), we have always felt a very strong need for a good evaluation test-bed. Lack of good objective evaluation techniques for text summarization has always been the biggest problem in all our work, and has consistently discouraged more experimentation and exploration of interesting research possibilities (like the one mentioned above regarding articles from other domains).
Footnote 4: The system took about 15 minutes to generate 3 summaries for each of 50 articles. A human would require about 10 minutes to produce a summary for a typical article from this set.</Paragraph>
    </Section>
  </Section>
</Paper>