<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1106"> <Title>Text Classification in Asian Languages without Word Segmentation</Title> <Section position="9" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Conclusion </SectionTitle> <Paragraph position="0"> We have presented a simple language-model-based approach to Chinese and Japanese text classification that requires no word segmentation. Compared with three standard text classifiers, the language modeling approach consistently achieves better classification accuracy while avoiding both word segmentation and feature selection. Although straightforward, the language modeling approach appears to give state-of-the-art results for Chinese and Japanese text classification.</Paragraph> <Paragraph position="1"> It has been found that word segmentation in Chinese text retrieval is tricky and that the relationship between word segmentation and retrieval performance is not monotonic (Peng et al., 2002). However, since text classification and text retrieval are two different tasks, it is not clear whether the same relationship holds in the text classification context. We are currently investigating this issue, and interesting findings have already been observed.</Paragraph> </Section> </Paper>