
<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-3023">
  <Title>Perceptron Learning for Chinese Word Segmentation</Title>
  <Section position="7" start_page="155" end_page="156" type="evalu">
    <SectionTitle>
6 Results on Test Data
</SectionTitle>
    <Paragraph position="0"> Table5presents ourofficialresults ontest corpora for both close and open tests. First, comparing with the results in Table 4, the results on test set are significantly different from the result using 4-fold cross validation ontraining setfor allthe four corpora. The test result was better than the results on training set for the msr corpus but was worse for other three corpora, especially for the pku corpora. We suspected that this may be caused by difference between training and test data, which needs further investigation.</Paragraph>
    <Paragraph position="1">  95% confidence interval on the 4-fold cross-validation on the training sets of four corpora and the computation time (in hour) for each experiment. &amp;quot;English&amp;quot; means only collapsing English texts and &amp;quot;E &amp; N&amp;quot; means collapsing both English texts and Arabic numbers.</Paragraph>
    <Paragraph position="2"> close test English E &amp; N  Secondly, the test results for close and open tests are close to each other on other three corpora except the pku corpora, for which the result for open test is clearly better than that for close test. This was mainly because of different encoding of Arabic number in training and test sets of the pku corpus. Since Arabic number was encoded in three bytes in training set but was encoded in one byte in test set for the pku corpora, for close test the trained model for Arabic number was not applicable to the Arabic numbers in test set. However, for open test, as we replaced Arabic number with one symbol in both training and test sets, the different encoding of Arabic number in training and test sets could not cause any problem at all, which led to better result. On the other hand, ourpre-processing withrespect totheEnglish text and Arabic numbers seemed have slightly effect on the F-measure for other three corpora.</Paragraph>
    <Paragraph position="3"> Finally, comparing with the results of closed test from other participants, our F1 figures were no more than 0.008 lower than the best ones on the as, cityu and msrcorpora, but was0.023 lower than the best one on the pku corpus.</Paragraph>
  </Section>
class="xml-element"></Paper>