<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1026"> <Title>Manipulating Large Corpora for Text Classification</Title> <Section position="5" start_page="0" end_page="0" type="concl"> <SectionTitle> 5 Conclusions </SectionTitle> <Paragraph position="0"> We have reported an approach to text classification which manipulates large corpora using NB and SVMs. Our main conclusions are: • Our method outperforms the baselines: its micro-averaged F1 score was 0.704, compared with 0.519 for NB and 0.285 for SVMs.</Paragraph> <Paragraph position="1"> • As shown in previous research, hierarchical structure is effective for classification: using it led to as much as a 10.8% reduction in error rates with our method, and up to 1.3% with NB.</Paragraph> <Paragraph position="2"> • There is no significant difference between the F1 scores of our method and the a0a7a1a166a3a5a0a7a6a9a8a26a10a12a0 method with hierarchical structure. However, our method was computationally more efficient than the a0a7a1a166a3a5a0a7a6a9a8a26a10a12a0 method in the experiment. Future work includes (i) extracting features which discriminate between categories within the same top-level category, (ii) investigating other machine learning techniques to further improve the efficiency of the data-manipulation approach, and (iii) evaluating the data-manipulation approach using automatically generated hierarchies (Sanderson and Croft, 1999).</Paragraph> </Section> </Paper>