
<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-0707">
  <Title>Incorporating Position Information into a Maximum Entropy/Minimum Divergence Translation Model</Title>
  <Section position="4" start_page="39" end_page="40" type="evalu">
    <SectionTitle>
3 Results
</SectionTitle>
    <Paragraph position="0"> I tested the models on the Canadian Hansard corpus, with English as the source language and French as the target language. After sentence alignment using the method described in (Simard et al., 1992), the corpus was split into disjoint segments as shown in table 1.</Paragraph>
    <Paragraph position="1"> To evaluate performance, I used perplexity: 5Defined in the next section l~erformance for the TransType application described in the introduction, and it has also been used in the evaluation of full-fledged SMT systems (A1-Onaizan et al., 1999). To ensure a fair comparison, all models used the same target vocabulary. For all MEMD models, I used 20,000 word-pair features selected using the method described in (Foster, 2000); this is suboptimal but gives reasonably good performance and facilitates experimentation.</Paragraph>
    <Paragraph position="2"> Figures 1 and 2 show, respectively, the path taken by the MEMD2B partition search, and the validation corpus perplexities of each model tested during the search. As shown in figure 1, the search consisted of 6 iterations. Since on all previous iterations no increase in position partitions beyond the initial value of 10 was selected, on the 5th iteration I tried decreasing the number of position partitions to 5. This model was not selected either, so on the final step only the number of word-pair partitions was augmented, yielding an optimal combination of 10 position partitions and 4000 word-pair partitions.</Paragraph>
    <Paragraph position="3"> Table 2 gives the final results for all models. The IBM models tested here incorporate a reduced set of 1M word-pair parameters, selected using the method described in (Foster, 2000), which gives slightly better test-corpus performance than the unrestricted set of all 35M word pairs which cooccur within aligned sentence pairs in the training corpus.</Paragraph>
    <Paragraph position="4"> The basic MEMD1 model (without position parameters) attains about 30% lower perplexity than the model 2 baseline, and MEMD2B with an optimal-sized set of position parameters achieves in a further drop of over 10%. Interestingly, the difference between IBM1 and  model word-pair position perplexity improvement parameters parameters over baseline</Paragraph>
  </Section>
class="xml-element"></Paper>