<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1094"> <Title>Using Syntactic Information to Extract Relevant Terms for Multi-Document Summarization</Title> <Section position="7" start_page="0" end_page="0" type="evalu"> <SectionTitle> 5.2 Results </SectionTitle> <Paragraph position="0"> TFSYNTAX measure, which performs better than the others for TT topics. Note that the TFSYNTAX measure considers only 10% of the vocabulary, namely the words immediately preceding verbs in the texts.</Paragraph> <Paragraph position="1"> To check whether this result is consistent across topics (and not merely an effect of averaging), we compared recall for term lists of size 50 on individual topics. We chose 50 because it is large enough to reach good coverage while still permitting additional filtering in an interactive summarization process, such as the iNeast terminological clustering described by Leuski et al. (2003).</Paragraph> <Paragraph position="2"> Figure 5 shows these results by topic. TFSYNTAX performs consistently better for all topics except one of the IE topics, where the maximum likelihood measure is slightly better.</Paragraph> <Paragraph position="3"> Apart from the fact that TFSYNTAX performs better than all other methods, it is worth noting that sophisticated weighting mechanisms, such as Okapi and the likelihood ratio, do not perform better than a simple frequency count (TF).</Paragraph> </Section> </Paper>