<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-3019">
  <Title>Unigram Language Model for Chinese Word Segmentation</Title>
  <Section position="6" start_page="140" end_page="140" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> We presented a word segmentation system that uses a unigram language model to select the most probable segmentation among all possible candidates for an input text. The system is augmented with proper name recognizers, numeric expression recognizers, and post-processing modules to extract new words. Overall, the recognizers and the post-processing modules substantially improved the baseline performance.</Paragraph>
    <Paragraph position="1"> The larger training data set used in the PKU open task also significantly increased the performance of our PKU open run. The additional user dictionary is another major contributor to our better performance in the open tasks than in the closed tasks.</Paragraph>
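    <Paragraph position="2"> The unigram selection step summarized in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: it assumes a candidate segmentation is scored by the product of its unigram word probabilities and that the best candidate is found by dynamic programming. The function name segment, the parameters max_word_len and unknown_prob, and the toy probabilities in the usage note are hypothetical. The proper name recognizers, numeric expression recognizers, and post-processing modules are not modeled.

```python
import math


def segment(text, unigram_probs, max_word_len=4, unknown_prob=1e-8):
    """Return the most probable segmentation of text under a unigram model.

    unigram_probs maps a word to its probability. unknown_prob is a
    hypothetical fallback for single characters missing from the model,
    so that every input has at least one candidate segmentation.
    """
    n = len(text)
    # best[i] holds (log probability of the best segmentation of text[:i],
    # start index of the last word in that segmentation).
    best = [(0.0, 0)] + [(-math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            word = text[start:end]
            if word in unigram_probs:
                prob = unigram_probs[word]
            elif len(word) == 1:
                prob = unknown_prob
            else:
                continue  # unseen multi-character strings are not candidates
            score = best[start][0] + math.log(prob)
            if score > best[end][0]:
                best[end] = (score, start)
    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    words.reverse()
    return words
```

With hypothetical probabilities such as 北京 = 0.02, 大学生 = 0.01, 北京大学 = 0.001 and 生 = 0.005, the input 北京大学生 is segmented as 北京 / 大学生, because 0.02 x 0.01 exceeds the 0.001 x 0.005 obtained for 北京大学 / 生.</Paragraph>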
  </Section>
</Paper>