<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-3026">
  <Title>Description of the HKU Chinese Word Segmentation System for Sighan Bakeoff 2005</Title>
  <Section position="6" start_page="166" end_page="166" type="concl">
    <SectionTitle>
5 Conclusions
</SectionTitle>
    <Paragraph position="0"> This paper presents a two-stage statistical word segmentation system for Chinese. We participated in all test tracks of the second Sighan bakeoff. The scored results show that our system achieves an overall F-measure of 0.940-0.967 across the different corpora. This indicates that the proposed system is effective for most ambiguous segmentations and unknown words in Chinese text. For future work, we hope to improve our system by incorporating pattern rules to handle complicated ambiguous fragments and non-standard words in Chinese text.</Paragraph>
  </Section>
</Paper>