<?xml version="1.0" standalone="yes"?>
<Paper uid="P01-1050">
  <Title>Towards a Unified Approach to Memory- and Statistical-Based Machine Translation</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Over the last decade, much progress has been made in the fields of example-based (EBMT) and statistical machine translation (SMT). EBMT systems work by modifying existing, human produced translation instances, which are stored in a translation memory (TMEM). Many methods have been proposed for storing translation pairs in a TMEM, finding translation examples that are relevant for translating unseen sentences, and modifying and integrating translation fragments to produce correct outputs. Sato (1992), for example, stores complete parse trees in the TMEM and selects and generates new translations by performing similarity matchings on these trees.</Paragraph>
    <Paragraph position="1"> Veale and Way (1997) store complete sentences; new translations are generated by modifying the TMEM translation that is most similar to the input sentence. Others store phrases; new translations are produced by optimally partitioning the input into phrases that match examples from the TMEM (Maruyana and Watanabe, 1992), or by finding all partial matches and then choosing the best possible translation using a multi-engine translation system (Brown, 1999).</Paragraph>
    <Paragraph position="2"> With a few exceptions (Wu and Wong, 1998), most SMT systems are couched in the noisy channel framework (see Figure 1). In this framework, the source language, let-s say English, is assumed to be generated by a noisy probabilistic source.1 Most of the current statistical MT systems treat this source as a sequence of words (Brown et al., 1993). (Alternative approaches exist, in which the source is taken to be, for example, a sequence of aligned templates/phrases (Wang, 1998; Och et al., 1999) or a syntactic tree (Yamada and Knight, 2001).) In the noisy-channel framework, a mono-lingual corpus is used to derive a statistical language model that assigns a probability to a sequence of words or phrases, thus enabling one to distinguish between sequences of words that are grammatically correct and sequences that are not.</Paragraph>
    <Paragraph position="3"> A sentence-aligned parallel corpus is then used in order to build a probabilistic translation model  and target languages according to the jargon specific to the noisy-channel framework. In this framework, the source language is the language into which the machine translation system translates.</Paragraph>
    <Paragraph position="4">  that explains how the source can be turned into the target and that assigns a probability to every way in which a source e can be mapped into a target f. Once the parameters of the language and translation models are estimated using traditional maximum likelihood and EM techniques (Dempster et al., 1977), one can take as input any string in the target language f, and find the source e of highest probability that could have generated the target, a process called decoding (see Figure 1).</Paragraph>
    <Paragraph position="5"> It is clear that EBMT and SMT systems have different strengths and weaknesses. If a sentence to be translated or a very similar one can be found in the TMEM, an EBMT system has a good chance of producing a good translation. However, if the sentence to be translated has no close matches in the TMEM, then an EBMT system is less likely to succeed. In contrast, an SMT system may be able to produce perfect translations even when the sentence given as input does not resemble any sentence from the training corpus.</Paragraph>
    <Paragraph position="6"> However, such a system may be unable to generate translations that use idioms and phrases that reflect long-distance dependencies and contexts, which are usually not captured by current translation models.</Paragraph>
    <Paragraph position="7"> This paper advances the state-of-the-art in two respects. First, we show how one can use an existing statistical translation model (Brown et al., 1993) in order to automatically derive a statistical TMEM. Second, we adapt a decoding algorithm so that it can exploit information specific both to the statistical TMEM and the translation model.</Paragraph>
    <Paragraph position="8"> Our experiments show that the automatically derived translation memory can be used within the statistical framework to often find translations of higher probability than those found using solely the statistical model. The translations produced using both the translation memory and the statistical model are significantly better than translations produced by two commercial systems.</Paragraph>
  </Section>
</Paper>