<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2221">
  <Title>Modeling with Structures in Statistical Machine Translation</Title>
  <Section position="7" start_page="1360" end_page="1362" type="evalu">
    <SectionTitle>
5 Evaluation and Discussion
</SectionTitle>
    <Paragraph position="0"> We used the Janus English/German scheduling corpus (Suhm et al., 1995) to train our phrase-based alignment model. Around 30,000 parallel sentences (400,000 words altogether for both languages) were used for training. The same data were used to train Simplified Model 2 and Model 3. (Table residue: example word classes such as [I he she itself], [have propose remember hate ...], [eleventh thirteenth ...], [after before around], [one two three ...].) A correct translation gets one credit, an okay translation gets 1/2 credit, and an incorrect one gets 0 credit. Since the IBM Model 3 decoder is too slow, its performance was not measured on the entire test set.</Paragraph>
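The 1 / 1/2 / 0 credit scheme above amounts to averaging per-sentence credits over the test set. A minimal sketch in Python; the function name and judgment labels are hypothetical, not from the paper:

```python
# Hypothetical encoding of the paper's three-way credit scheme:
# a correct translation earns 1, an okay one 1/2, an incorrect one 0.
CREDITS = {"correct": 1.0, "okay": 0.5, "incorrect": 0.0}

def translation_accuracy(judgments):
    """Average credit over a list of human judgments."""
    return sum(CREDITS[j] for j in judgments) / len(judgments)

print(translation_accuracy(["correct", "okay", "incorrect", "correct"]))  # 0.625
```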
    <Paragraph position="1"> The probability mass is more scattered in the structure-based model, reflecting the fact that English and German have different phrase orders. On the other hand, the word-based model tends to align a target word with source words at similar positions, which resulted in many incorrect alignments and hence spread the word translation probability t over many unrelated target words, as shown in the next subsection.</Paragraph>
    <Section position="1" start_page="1361" end_page="1361" type="sub_section">
      <SectionTitle>
5.3 Model Complexity
</SectionTitle>
      <Paragraph position="0"> language model. A preprocessor split German compound nouns. Words that occurred only once were treated as unknown words. This resulted in a lexicon of 1372 English and 2202 German words. The English/German lexicons were classified into 250 classes in each language, and 560 English phrases were constructed over these classes with the grammar inference algorithm described earlier.</Paragraph>
      <Paragraph position="1"> We limited the maximum sentence length to 20 words/15 phrases and the maximum fertility for non-null words to 3.</Paragraph>
    </Section>
    <Section position="2" start_page="1361" end_page="1361" type="sub_section">
      <SectionTitle>
5.1 Translation Accuracy
</SectionTitle>
      <Paragraph position="0"> Table 1 shows the end-to-end translation performance. The structure-based model achieved an error reduction of around 12.5% over the word-based alignment models.</Paragraph>
    </Section>
    <Section position="3" start_page="1361" end_page="1362" type="sub_section">
      <SectionTitle>
5.2 Word Order and Phrase Alignment
</SectionTitle>
      <Paragraph position="0"> Table 2 shows the alignment distribution for the first German word/phrase in Simplified Model 2 and in the structure-based model. The structure-based model has 3,081,617 free parameters, an increase of about 2% over the 3,022,373 free parameters of Simplified Model 2.</Paragraph>
      <Paragraph position="1"> This small increase does not cause over-fitting, as the performance on the test data suggests.</Paragraph>
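The quoted 2% figure follows directly from the two parameter counts; a quick arithmetic check (not code from the paper):

```python
# Free-parameter counts reported in the text.
word_based = 3_022_373       # Simplified Model 2
structure_based = 3_081_617  # structure-based model

# Relative increase of the structure-based model over the word-based one.
increase = (structure_based - word_based) / word_based
print(f"{increase:.1%}")  # 2.0%
```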
      <Paragraph position="2"> On the other hand, the structure-based model is more accurate. This can be illustrated with the translation probability distribution of the English word 'I'. Table 3 shows the possible translations of 'I' with probability greater than 0.01. It is clear that the structure-based model "focuses" better on the correct translations. It is interesting to note that the German translations in Simplified Model 2 often appear at the beginning of a sentence, the position where 'I' often appears in English sentences. It is the biased word-based alignments that pull unrelated words together and increase the translation uncertainty.</Paragraph>
      <Paragraph position="3"> We define the average translation entropy as \(\sum_{i=0}^{m} P(e_i) \sum_{j=1}^{n} -t(g_j \mid e_i) \log t(g_j \mid e_i)\).</Paragraph>
      <Paragraph position="4"> (Table 2: alignment distributions of the first German word/phrase in Simplified Model 2 and in the structure-based model. The second distribution reflects the higher possibility of phrase reordering in translation.)</Paragraph>
      <Paragraph position="5"> (Table 3: the translation distribution is more uncertain in the word-based alignment model because the biased alignment distribution forced associations between unrelated English/German words.)</Paragraph>
      <Paragraph position="6"> (Here m and n are the English and German lexicon sizes.) This is a direct measurement of word translation uncertainty. The average translation entropy is 3.01 bits per source word in Simplified Model 2, 2.68 in Model 3, and 2.50 in the structure-based model. Therefore, information-theoretically, the complexity of the word-based alignment models is higher than that of the structure-based model.</Paragraph>
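The average translation entropy above can be computed in a few lines. The sketch below uses a toy translation table with made-up probabilities (all numbers are illustrative, not from the paper) and base-2 logarithms to match the bits-per-word figures:

```python
import math

# Toy translation tables t(g | e): one target-word distribution per source word.
# Probabilities are invented for illustration only.
t = {
    "I":    {"ich": 0.7, "mir": 0.2, "mich": 0.1},
    "have": {"habe": 0.9, "haben": 0.1},
}
# Source-word unigram probabilities P(e), also invented.
p = {"I": 0.5, "have": 0.5}

def avg_translation_entropy(t, p):
    """Sum over e of P(e) * sum over g of -t(g|e) * log2 t(g|e), in bits."""
    total = 0.0
    for e, dist in t.items():
        h = -sum(q * math.log2(q) for q in dist.values() if q > 0)
        total += p[e] * h
    return total

print(round(avg_translation_entropy(t, p), 3))
```

A sharper distribution (like the 0.9/0.1 row) contributes less entropy than a flatter one, which is exactly why the structure-based model's better-focused translation tables yield a lower bits-per-word figure.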
    </Section>
  </Section>
</Paper>