<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3805">
  <Title>A Study of Two Graph Algorithms in Topic-driven Summarization</Title>
  <Section position="7" start_page="29" end_page="31" type="evalu">
    <SectionTitle>
5 Experiments and results
</SectionTitle>
    <Paragraph position="0"> We produce a summary for each topic and each experimental con guration. We take the most highly ranked (complete) sentences for which the total number of words does not exceed the 250-word limit. Next, we gather SCU data for each sentence in each summary from the SCU information les. For a speci c experimental con guration topic representation, graph algorithm we produce summaries for the 20 documents with the weight factor values 0, 1, 2, ..., 15, 20, 50, 100. Each experimental conguration generates 19 sets of average results, one  (GM) and path search (PS) with different topic representations null per weight factor. For one weight factor, we generate summaries for the 20 topics, and then average their SCU statistics, including SCU weight, number of unique SCUs and total number of SCUs. In the results which follow we present average SCU weight per summary. The number of unique SCUs and the number of SCUs closely follow the presented graphs. The overlap of SCUs (number of SCUs / number of unique SCUs) reaches a maximum of 1.09. There was no explicit redundancy elimination, mostly because the SCU overlap was so low.</Paragraph>
    <Paragraph position="1"> We compare the performance of the two algorithms, GM and PS, on the two topic representations with all open-class words and only with nouns and verbs. Figure 1 shows the performance of the methods in terms of average SCU weights per summary for each weight factor considered 1.</Paragraph>
    <Paragraph position="2"> The results allow us to make several observations.</Paragraph>
    <Paragraph position="3"> Keyword-only match performs worse that either GM or PS. The points corresponding to keyword (node) match only are the points for which the weight factor is 0. In this case the dependency pairs match and paths found in the graph do not contribute to the overall score.</Paragraph>
    <Paragraph position="4"> Both graph algorithms achieve better performance for only the nouns and verbs from the  tor, so we include only the non- at part of the graph. topic than for all open-class words. If, however, the topic requests entities or events with speci c properties, described by adjectives or adverbs, using only nouns and verbs may produce worse results.</Paragraph>
    <Paragraph position="5"> GM performs better than PS for both types of topic descriptions. In other words, looking at the same words that appear in the topic, connected in the same way, leads to better results than nding pairs of words that are somehow connected.</Paragraph>
    <Paragraph position="6"> Higher performance for higher weight factors further supports the point that looking for word connections, instead of isolated words, helps nd sentences with information content more related to the topic.</Paragraph>
    <Paragraph position="7"> For the following set of experiments, we use the topics with the word list containing only nouns and verbs. We want to compare graph matching and path search further. One issue that comes to mind is whether a combination of the two methods will perform better than each of them individually. Figure 2 plots the average of SCU weights per summary.</Paragraph>
    <Paragraph position="8">  We observe that the combination of graph matching and path search gives better results than either method alone. The sentence score combines the number of edges matched and the number of connections found with equal weight factors for the edge match and path score. This raises the question whether different weights for the edge match and path would lead to better scores. Figure 3 plots the results produced using the score computation formula S = SN +</Paragraph>
    <Paragraph position="10"> where both WeightFactorE and WeightFactorP are integers from 0 to 30.</Paragraph>
    <Paragraph position="11"> The lowest scores are for the weight factors 0, when sentence score depends only on the keyword score. There is an increase in average SCU weights  with different weight factors towards higher values of weight factors. A transparent view of the 3D graph shows that graph match has higher peaks toward higher weight factors than path search, and higher also than the situation when path search and graph match have equal weights.</Paragraph>
    <Paragraph position="12"> The only sentences in the given documents tagged with SCU information are those which appeared in the summaries generated by the competing teams in 2005. Our results are therefore actually a lower bound more of the sentences selected may include relevant information. A manual analysis of the summaries generated using only keyword counts showed that, for these summaries, the sentences not containing SCUs were not informative. We cannot check this for all the summaries generated in these experiments, because the number is very large, above 1000. An average summary had 8.24 sentences, with 3.19 sentences containing SCUs. We cannot say much about the sentences that do not contain SCUs. This may raise doubts about our results. Support for the fact that the results re ect a real increase in performance comes from the weights of the SCUs added: the average SCU weight increases from 2.5 when keywords are used to 2.75 for path search algorithm, and 2.91 for graph match and the combination of path search and graph match. This shows that by increasing the weight of graph edges and paths in the scoring of a sentence, the algorithm can pick more and better SCUs, SCUs which more people see as relevant to the topic. It would be certainly interesting to have a way of assessing the SCU-less sentences in the summary. We leave that for future work, and possibly future developments in SCU annotation. null</Paragraph>
  </Section>
class="xml-element"></Paper>