File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/05/i05-3025_abstr.xml
Size: 899 bytes
Last Modified: 2025-10-06 13:44:21
<?xml version="1.0" standalone="yes"?> <Paper uid="I05-3025"> <Title>A Maximum Entropy Approach to Chinese Word Segmentation</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> We participated in the Second International Chinese Word Segmentation Bakeoff. Specifically, we evaluated our Chinese word segmenter in the open track, on all four corpora, namely Academia Sinica (AS), City University of Hong Kong (CITYU), Microsoft Research (MSR), and Peking University (PKU). Based on a maximum entropy approach, our word segmenter achieved the highest F measure for AS, CITYU, and PKU, and the second highest for MSR. We found that the use of an external dictionary and additional training corpora of different segmentation standards helped to further improve segmentation accuracy.</Paragraph> </Section> </Paper>