File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/06/e06-1005_intro.xml

Size: 3,577 bytes

Last Modified: 2025-10-06 14:03:19

<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1005">
  <Title>Computing Consensus Translation from Multiple Machine Translation Systems Using Enhanced Hypotheses Alignment</Title>
  <Section position="2" start_page="0" end_page="33" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In this work we describe a novel technique for computing a consensus translation from the outputs of multiple machine translation systems.</Paragraph>
    <Paragraph position="1"> Combining outputs from different systems was shown to be quite successful in automatic speech recognition (ASR). Voting schemes like the ROVER approach of (Fiscus, 1997) use edit distance alignment and time information to create confusion networks from the output of several ASR systems.</Paragraph>
    <Paragraph position="2"> Some research on multi-engine machine translation has also been performed in recent years. The most straightforward approaches simply select, for each sentence, one of the provided hypotheses. The selection is made based on the scores of translation, language, and other models (Nomoto, 2004; Paul et al., 2005). Other approaches combine lattices or N-best lists from several different MT systems (Frederking and Nirenburg, 1994). To be successful, such approaches require compatible lattices and comparable scores of the (word) hypotheses in the lattices. However, the scores of most statistical machine translation (SMT) systems are not normalized and therefore not directly comparable. For some other MT systems (e.g. knowledge-based systems), the lattices and/or scores of hypotheses may not be even available.</Paragraph>
    <Paragraph position="3"> (Bangalore et al., 2001) used the edit distance alignment extended to multiple sequences to construct a confusion network from several translation hypotheses. This algorithm produces monotone alignments only (i.e. allows insertion, deletion, and substitution of words); it is not able to align translation hypotheses with significantly different word order. (Jayaraman and Lavie, 2005) try to overcome this problem. They introduce a method that allows non-monotone alignments of words in different translation hypotheses for the same sentence. However, this approach uses many  heuristicsandisbasedonthealignmentthatisperformed to calculate a specific MT error measure; the performance improvements are reported only in terms of this measure.</Paragraph>
    <Paragraph position="4">  Here, we propose an alignment procedure that explicitly models reordering of words in the hypotheses. In contrast to existing approaches, the context of the whole document rather than a single sentence is considered in this iterative, unsupervised procedure, yielding a more reliable alignment. null Based on the alignment, we construct a confusion network from the (possibly reordered) translation hypotheses, similarly to the approach of (Bangalore et al., 2001). Using global system probabilities and other statistical models, the voting procedure selects the best consensus hypothesis from the confusion network. This consensus translation may be different from the original translations.</Paragraph>
    <Paragraph position="5"> This paper is organized as follows. In Section 2, we will describe the computation of consensus translations with our approach. In particular, we will present details of the enhanced alignment and reordering procedure. A large set of experimental results on several machine translation tasks is presented in Section 3, which is followed by a summary. null</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML