<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1722"> <Title>Chinese Word Segmentation at Peking University</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> On behalf of the Institute of Computational Linguistics, Peking University, we would like to thank ACL-SIGHAN for sponsoring the First International Chinese Word Segmentation Bakeoff, which provides us with an opportunity to present our achievements of the past decade.</Paragraph> <Paragraph position="1"> We are well aware that settling on a scientific and appropriate method of evaluation is very difficult, perhaps even more difficult than word segmentation itself. We also recognize that each step in Chinese information processing requires great effort, and that a satisfactory result in word segmentation, though critical, does not necessarily guarantee good results in the subsequent steps.</Paragraph> <Paragraph position="2"> From the test results of this evaluation, we are very gratified to see that we have done a good job both as a test corpus provider and as a participant. According to the rules, we did not test on the corpus we provided, but it is quite encouraging that our corpus tops the list of test corpora chosen by other participants. Section 2 and Section 3 describe our work in the Bakeoff as the test corpus provider and as a participant, respectively.</Paragraph> </Section> </Paper>