<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1110">
  <Title>Automatic summarization of search engine hit lists</Title>
  <Section position="8" start_page="105" end_page="107" type="evalu">
    <SectionTitle>
6 Experimental results
</SectionTitle>
    <Paragraph position="0"> Our system was evaluated using the task-based extrinsic measure suggested in (Mani et al. 1999).</Paragraph>
    <Paragraph position="1"> The experiment was set up as follows: three sets of documents on different topics were selected prior to the experiment. The topics and their corresponding document information are shown in Table 1.</Paragraph>
    <Paragraph position="6"> As Table 1 shows, the articles in topic set S1 are longer than those in S2 and S3. The articles in S3 are the shortest, averaging 32k each.</Paragraph>
    <Paragraph position="7"> The number of documents in each topic set also differs. The variation in document length and in the number of documents per topic set helps test the robustness of our summarization algorithms.</Paragraph>
    <Paragraph position="8"> We used SNS to generate both 10% and 20% summaries for each topic. A sample 10% summary for topic S2 is shown in Figure 8. Four users were selected to evaluate these summarization results. Each user was asked to read through the set of full articles for each topic first, followed by its corresponding 10% and 20% summaries. After the four users finished each set, they were asked to assign a readability score (1-10) to each summary. The higher the readability score, the more readable and comprehensible the summary.</Paragraph>
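The scoring procedure above can be sketched as follows. This is an illustrative sketch only: the paper does not report individual user scores, so the per-user values below are hypothetical placeholders chosen to reproduce the reported S3 averages (7.75 at 10%, 8.75 at 20%).

```python
from statistics import mean

# Hypothetical per-user readability scores (1-10 scale, four users) for the
# S3 summaries; placeholders consistent with the reported averages, not the
# paper's actual raw data.
s3_scores = {"10%": [8, 8, 8, 7], "20%": [9, 9, 9, 8]}

# Average across the four users, per compression ratio.
averages = {ratio: mean(vals) for ratio, vals in s3_scores.items()}
print(averages)  # {'10%': 7.75, '20%': 8.75}
```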
    <Paragraph position="9"> The time of reading both full articles and summaries was tracked and recorded.</Paragraph>
    <Paragraph position="10"> The average readability score for the 10% and 20% summaries of topic S1 is 8 and 8, respectively. For topic S3, the average readability score for 10% and 20% summaries is 7.75 and 8.75, respectively.</Paragraph>
    <Paragraph position="11"> Similarly, for S2 the average readability score for 10% and 20% summaries is 8 and 8.5, respectively. The differences in the average readability scores also suggest that (a) our summarizer favors longer documents over shorter documents; (b) 20% summaries are generally preferred over 10% summaries. The difference in readability score between the 10% and 20% summaries is bigger in S3 (diff = 1.0) than in S1 (diff = 0). These findings raise interesting questions for future research. As can be seen from Table 3, the 20% summary achieves a better overall readability score than the 10% summary. The speedup of the 10% summary over the full articles is 6.87; that is, with the reading material reduced by 90%, the speedup in reading is only 687%. This suggests that the 10% summaries may be somewhat difficult to read, possibly because of the simple sentence boundary detection algorithm we used.</Paragraph>
    <Paragraph position="12"> The feedback from users in the evaluation seems to confirm the above explanation. As more sentences were included in the 20% summaries, the speedup in reading (4.22) almost approached the optimal speedup ratio (5.0).</Paragraph>
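The speedup arithmetic in the two paragraphs above can be made explicit with a small sketch. Only the compression rates (10%, 20%) and the reported speedups (6.87, 4.22) come from the paper; the helper function is an assumed formalization of "optimal speedup" as the inverse of the compression rate (reading time scaling linearly with text length).

```python
def optimal_speedup(compression: float) -> float:
    """Ideal reading speedup for a summary at the given compression rate,
    assuming reading time scales linearly with text length (e.g. a 20%
    summary would ideally be read 1/0.2 = 5x faster)."""
    return 1.0 / compression

# Reported speedups from the evaluation, paired with their compression rates.
for compression, observed in [(0.10, 6.87), (0.20, 4.22)]:
    ideal = optimal_speedup(compression)
    print(f"{compression:.0%} summary: observed {observed}x vs ideal "
          f"{ideal:.1f}x ({observed / ideal:.0%} of ideal)")
```

The gap is larger for the 10% summaries (6.87x of an ideal 10x) than for the 20% summaries (4.22x of an ideal 5x), matching the authors' observation that the shorter summaries are harder to read.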
  </Section>
</Paper>