<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1701"> <Title>Unsupervised Training for Overlapping Ambiguity Resolution in Chinese Word Segmentation</Title> <Section position="6" start_page="2" end_page="2" type="concl"> <SectionTitle> 5 Conclusion and Future work </SectionTitle> <Paragraph position="0"> Our contributions are two-fold. First, we propose an approach to resolving overlapping ambiguities in Chinese word segmentation that is based on an ensemble of adapted naive Bayesian classifiers. Second, we present an unsupervised training method for constructing these Bayesian classifiers from an unlabeled training corpus. Because no labeled training data is required, the approach can be adapted to a wide variety of applications. We evaluate it on a manually annotated test set, and the results show that our approach outperforms a lexicalized rule-based system. Future work includes investigating how to construct more powerful classifiers for further improvements. One promising direction is to combine our approach with that of Sun (1997), using a core set of manually labeled context-free OASs to improve accuracy.</Paragraph> </Section> </Paper>