<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1001">
  <Title>Capitalizing Machine Translation</Title>
  <Section position="10" start_page="6" end_page="6" type="evalu">
    <SectionTitle>
6.3 Results
</SectionTitle>
    <Paragraph position="0"> The performance comparisons between our CRF-based capitalizer and the two LM-based baselines are shown in Table 3 and Table 4. Table 3 shows the BLEU scores, and Table 4 shows the precision.</Paragraph>
    <Paragraph position="1"> The BLEU upper bounds indicate the ceilings that a perfect capitalizer could reach; they are computed by ignoring case information in both the capitalizer outputs and the references. Obviously, the precision upper bounds for all language pairs are 100%.</Paragraph>
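The case-ignoring ceiling described above can be sketched as follows. This is a hypothetical illustration, not the paper's scoring script: token-level precision stands in for BLEU, and the example sentence is invented. The ceiling is obtained by lowercasing both hypothesis and reference before scoring, so only word identity, not case, is compared.

```python
def precision(hyp_tokens, ref_tokens):
    """Fraction of hypothesis tokens matching the reference position-wise."""
    if not hyp_tokens:
        return 0.0
    matches = sum(h == r for h, r in zip(hyp_tokens, ref_tokens))
    return matches / len(hyp_tokens)

# Invented example: same words, different capitalization choices.
hyp = "the White house Press Office".split()
ref = "The White House press office".split()

cased = precision(hyp, ref)                    # penalizes every case error
ceiling = precision([t.lower() for t in hyp],
                    [t.lower() for t in ref])  # case ignored: the "upper bound"
```

Here `cased` is 0.2 (only "White" matches exactly), while `ceiling` is 1.0, mirroring how a perfect capitalizer could at best recover the case-insensitive score.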
    <Paragraph position="2"> The precision and end-to-end BLEU comparisons show that, for European language pairs, the CRF-based bilingual capitalization model significantly outperforms the strong LM-based baseline. We obtained an improvement of more than one BLEU point on MT between English and French, a 34% relative reduction in capitalization error rate for the French-to-English language pair, and a 42% relative error rate reduction for the English-to-French language pair. These results show that source language information provides significant help in capitalizing machine translation outputs. The results also show that when the source language does not mark case, as in Chinese, the bilingual model performs on par with a monolingual one.</Paragraph>
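The relative error-rate reductions quoted above follow a standard computation: the error rate is one minus precision, and the reduction is measured relative to the baseline's error. A minimal sketch, using illustrative precision values rather than the paper's exact figures:

```python
def relative_err_reduction(prec_baseline, prec_new):
    """Relative reduction in error rate, where error rate = 1 - precision."""
    err_base = 1.0 - prec_baseline
    err_new = 1.0 - prec_new
    return (err_base - err_new) / err_base

# Illustrative numbers (not from the paper): baseline precision 95.0%
# (error 5.0%) improving to 96.7% (error 3.3%) gives a ~34% relative reduction.
r = relative_err_reduction(0.95, 0.967)
```

Note that a modest-looking absolute precision gain can correspond to a large relative error-rate reduction when the baseline error is already small.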
    <Paragraph position="3"> The BLEU difference between the CRF-based capitalizer and the trigram one was larger than the precision difference. This indicates that, thanks to its bilingual features, the CRF-based capitalizer copes much better with the non-grammatical text generated by an MT system.</Paragraph>
    <Section position="1" start_page="6" end_page="6" type="sub_section">
      <SectionTitle>
6.4 Effect of Training Corpus Size
</SectionTitle>
      <Paragraph position="0"> The experiments above were carried out on large data sets. We also conducted experiments to examine the effect of the training corpus size on capitalization precision. Figure 4 shows the effects. The experiment was performed on the English-to-French corpus. The bilingual capitalizer performed significantly better when the training corpus was small (e.g., under 8 million words). This pattern is common in many domains: as the training corpus grows, the difference between the two capitalizers shrinks.</Paragraph>
    </Section>
  </Section>
</Paper>