<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2604">
  <Title>Basque Country ccpzejaa@si.ehu.es Iñaki Alegria UPV-EHU Basque Country acpalloi@si.ehu.es Olatz Arregi UPV-EHU Basque Country acparuro@si.ehu.es</Title>
  <Section position="6" start_page="30" end_page="30" type="evalu">
    <SectionTitle>
5 Experimental Results
</SectionTitle>
    <Paragraph position="0"> Table 3 shows the microaveraged F1 scores obtained in our experiment. As could be expected, a simple stemming process slightly improves the results: the best result for each of the three category subsets has been obtained for the stemmed corpus, even though the gain is low (less than 0.6).</Paragraph>
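    As a reminder of what a microaveraged F1 score measures, the following is a minimal sketch, assuming per-category counts of true positives, false positives and false negatives are available. The function name micro_f1, the category labels and the toy counts are hypothetical illustrations and are not taken from the experiment or from Table 3.

```python
from typing import Dict, Tuple


def micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Microaveraged F1 from per-category (tp, fp, fn) counts.

    Counts are pooled over all categories before the score is computed,
    so categories with many documents weigh more than small ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    denom = 2 * tp + fp + fn
    # F1 = 2PR / (P + R), which simplifies to 2TP / (2TP + FP + FN).
    return 0.0 if denom == 0 else 2 * tp / denom


# Hypothetical toy counts (not data from the paper): category -> (tp, fp, fn)
toy_counts = {
    "cat_a": (90, 10, 5),
    "cat_b": (40, 5, 10),
    "cat_c": (10, 5, 5),
}
print(micro_f1(toy_counts))  # 0.875
```

    Because the counts are pooled before averaging, a subset dominated by a few large categories can score higher than one with many sparse categories, even with the same classifier.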
    <Paragraph position="1"> The evaluation for the Top-10 category subset gives the best results, reaching up to 93.57%. This is the expected behavior, as the number of categories to be evaluated is small and the number of documents in each category is high. For this subset the best result has been obtained for 100 dimensions, although the results vary little among 100, 300 and 500 dimensions. With higher dimensions the results become poorer.</Paragraph>
    <Paragraph position="2"> For the R(90) and R(115) subsets, the best results are 87.27% and 87.01%, respectively.</Paragraph>
    <Paragraph position="3"> Given that the difficulty of these subsets is quite similar, their behavior is also analogous. As we can see in the table, most of the best results for these subsets have been obtained by reducing the dimension of the space to 500.</Paragraph>
  </Section>
</Paper>