<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3108">
  <Title>Discriminative Reordering Models for Statistical Machine Translation</Title>
  <Section position="7" start_page="60" end_page="61" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have presented a novel discriminative reordering model for statistical machine translation. This model is trained on the word-aligned bilingual corpus using the maximum entropy principle. Several types of features have been used: * based on the source and target sentence</Paragraph>
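The maximum-entropy training described above can be sketched as follows. This is a hypothetical, minimal illustration, not the authors' implementation: a maximum-entropy classifier is realized as multinomial logistic regression over binary lexical features, and the feature names, orientation labels, and toy alignment events are all invented for the example.

```python
# Minimal sketch of a maximum-entropy orientation classifier, assuming
# binary source/target word features. Data and feature names are invented.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Toy word-aligned training events: lexical features paired with an
# orientation class. Real training would extract these from the
# word-aligned bilingual corpus.
events = [
    ({"src_word=uno", "tgt_word=one"}, "right"),
    ({"src_word=dia", "tgt_word=day"}, "left"),
    ({"src_word=verde", "tgt_word=green"}, "left"),
    ({"src_word=tiempo", "tgt_word=time"}, "right"),
]
X = [{f: 1 for f in feats} for feats, _ in events]
y = [label for _, label in events]

vec = DictVectorizer()
# Multinomial logistic regression is the maximum-entropy model.
clf = LogisticRegression(max_iter=1000)
clf.fit(vec.fit_transform(X), y)

# Predict the orientation class for a new alignment link.
pred = clf.predict(vec.transform([{"src_word=verde": 1, "tgt_word=green": 1}]))
print(pred[0])
```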
    <Section position="1" start_page="61" end_page="61" type="sub_section">
      <SectionTitle>
System Translation
</SectionTitle>
      <Paragraph position="0"> Distance-based: I would like to check out time one day before.</Paragraph>
      <Paragraph position="1"> Max-Ent based: I would like to check out one day before the time.</Paragraph>
      <Paragraph position="2"> Reference: I would like to check out one day earlier.</Paragraph>
      <Paragraph position="3"> Distance-based: I hate pepper green.</Paragraph>
      <Paragraph position="4"> Max-Ent based: I hate the green pepper.</Paragraph>
      <Paragraph position="5"> Reference: I hate green peppers.</Paragraph>
      <Paragraph position="6"> Distance-based: Is there a subway map where?
Max-Ent based: Where is the subway route map?
Reference: Where do they have a subway map?
We have evaluated the performance of the reordering model on a held-out word-aligned corpus. We have shown that the model predicts the orientation very well, e.g. for Arabic-English the classification error rate is only 2.1%.</Paragraph>
      <Paragraph position="7"> We presented improved translation results for three language pairs on the BTEC task and for the large data track of the Chinese-English NIST task. In none of the cases did additional features hurt the classification performance on the held-out test corpus. This is strong evidence that the maximum entropy framework is suitable for this task.</Paragraph>
      <Paragraph position="8"> Another advantage of our approach is the generalization capability via the use of word classes or part-of-speech information. Furthermore, additional features can be easily integrated into the maximum entropy framework.</Paragraph>
      <Paragraph position="9"> So far, the word classes were not used for the translation experiments. As the word classes help for the classification task, we might expect further improvements of the translation results. Using part-of-speech information instead of (or in addition to) the automatically computed word classes might also be beneficial. More fine-tuning of the reordering model toward translation quality might also result in improvements. As already mentioned in Section 4.3, a richer feature set could be helpful.</Paragraph>
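The generalization via word classes mentioned above can be illustrated with a short sketch. The class mapping and feature names below are invented for illustration; real word classes would be induced automatically or taken from a part-of-speech tagger.

```python
# Hypothetical sketch: back off from word identities to word classes so
# that unseen words still fire informative features. Class IDs are invented.
word2class = {"green": "C17", "pepper": "C42", "peppers": "C42", "subway": "C8"}

def features(src_word, tgt_word):
    """Combine lexical features with word-class back-off features."""
    feats = {f"src_word={src_word}": 1, f"tgt_word={tgt_word}": 1}
    # Class-level features generalize across words sharing a class.
    if tgt_word in word2class:
        feats[f"tgt_class={word2class[tgt_word]}"] = 1
    return feats

# "peppers" may be unseen as a word feature in training, but it shares
# class C42 with "pepper", so the class-level feature still fires.
print(features("pimientos", "peppers"))
```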
    </Section>
  </Section>
</Paper>