<?xml version="1.0" standalone="yes"?> <Paper uid="W05-0403"> <Title>Temporal Feature Modification for Retrospective Categorization</Title> <Section position="5" start_page="19" end_page="20" type="evalu"> <SectionTitle> 4 Results </SectionTitle> <Paragraph position="0"> Table 4 shows the parameter combinations, chosen by ten-fold cross-validation, that exhibited the greatest increase in categorization performance for each corpus.</Paragraph> <Paragraph position="1"> Using these parameters, Figure 1 shows the improvement in accuracy on the test sets for different percentages of terms modified. The average accuracies (across all parameter combinations) when no terms are modified are less than stellar, ranging from 26.70% (SIGCHI) to 37.50% (SIGPLAN), due to the difficulty of the task (2022 similar categories; each document can belong to only one). Our aim here, however, is simply to show improvement. A baseline of 0.0 in the plot indicates accuracy without any temporal modifications.</Paragraph> <Paragraph position="2"> Figure 2 shows the accuracy on an absolute scale when TFM is applied to the full text SIGIR corpus. Performance increased from the atemporal baseline of 28.85% correct to a maximum of 38.46% (a gain of 9.61 percentage points) when only 1.11% of the terms were modified.</Paragraph> <Paragraph position="3"> The ModifyLists for each category and year averaged slightly fewer than two terms each.</Paragraph> <Paragraph position="4"> In most cases, the technique performs best when making relatively few modifications: the left side of each figure shows a rapid performance increase, followed by a gradual decline as more terms are modified. After the one-time computation of odds ratios in the training set for each category/year, TFM is very fast and requires negligible extra storage space. This is important when computing time is at a premium and enormous corpora such as the ACM full text collection are used. 
It is also useful for quickly testing potential enhancements to the process, some of which are discussed in Section 6.</Paragraph> <Paragraph position="5"> The results indicate that L in PreModList(C,t,L) need not exceed single digits, and that performance levels off as the number of terms modified increases. As more terms are modified, more infrequent terms are judged to have been produced by perturbed generators, which makes their true distributions difficult to compute (for the years in which they are not modified) because too few examples are available.</Paragraph> <Section position="1" start_page="20" end_page="20" type="sub_section"> <SectionTitle> 4.1 General description of results </SectionTitle> <Paragraph position="0"> A quantitative average of all results, using all parameter combinations, is not very meaningful, so we provide a qualitative description of the results not shown in Table 4 and Figures 1 and 2. Of the 96 different parameter combinations tested on four different corpora, 83.33% resulted in overall increases in performance. The greatest increase was a 40.82% improvement over baseline (atemporal) accuracy, while the greatest decrease lowered performance by only 8.31%.</Paragraph> </Section> </Section> </Paper>