<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1105"> <Title>Poisson Naive Bayes for Text Classification with Feature Weighting</Title> <Section position="15" start_page="2" end_page="2" type="concl"> <SectionTitle> 5 Conclusion and Future Work </SectionTitle> <Paragraph position="0"> In this paper, we propose a Poisson naive Bayes text classification model with feature weighting. The model uses normalized and smoothed term frequencies for each document, and the Poisson parameters are calculated as a weighted average of these frequencies over all training documents. Experimental results show that the proposed model is useful for building probabilistic text classification systems and incurs no extra cost compared with the traditional naive Bayes or unigram language model classifiers.</Paragraph> <Paragraph position="1"> Further improvement is achieved by feature weighting. In our experiments, three measures are used to weight each term feature: the chi-square statistic, information gain, and the newly introduced probability ratio. The results show that feature weighting considerably improves performance for classes with a small number of training documents, but not for classes with sufficient training documents. The probability ratio also performs well, especially for classes with a large number of training documents, where the other feature weighting methods perform unsatisfactorily.</Paragraph> <Paragraph position="2"> In future work, we will try to develop automatic methods for selecting appropriate feature weighting measures and for determining the interpolation parameters for different classes. Furthermore, we will explore applications of our approach to other tasks such as adaptive filtering and relevance feedback.</Paragraph> </Section> </Paper>