File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/w06-0131_abstr.xml
Size: 942 bytes
Last Modified: 2025-10-06 13:45:18
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0131"> <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics POC-NLW Template for Chinese Word Segmentation</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper presents a language tagging template named POC-NLW (position of a character within an n-length word). Based on this template, a two-stage statistical model for Chinese word segmentation is constructed. In this method, basic word segmentation is performed with an n-gram language model, and a Hidden Markov tagger based on the POC-NLW template identifies out-of-vocabulary (OOV) words. The system participated in the MSRA_Close and UPUC_Close word segmentation tracks at SIGHAN Bakeoff 2006. The results returned by the bakeoff are reported here.</Paragraph> </Section> </Paper>