<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-4003">
  <Title>Example-based Rescoring of Statistical Machine Translation Output</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Translation Example Retrieval
</SectionTitle>
    <Paragraph position="0"> Translation examples consist of pairs of pre-translated sentences, either by humans (high quality) or automatically using MT systems (reduced quality). A collection of translation examples can be used directly to obtain a translation of a given input sentence. The similarity of the input to the source part of the translation examples enables us to identify translation candidates that might be close to the actual translation.</Paragraph>
    <Paragraph position="1"> A common approach to measure the distance between sequences of words is the edit distance criteria (Wagner, 1974). The distance is defined as the sum of the costs of insertion (INS), deletion (DEL), and substitution (SUB) operations required to map one word sequence into the other. The edit distance can be calculated by a standard dynamic programming technique.</Paragraph>
    <Paragraph position="3"> An extension of the edit-distance-based retrieval method is presented in (Watanabe and Sumita, 2003). It incorporates the tf idf criteria as seen in the information retrieval framework by treating each translation example as a document. For each word of the input, its term frequency tfi;j is combined with its document frequency dfi into a single weight wi;j, which is used to select the most relevant ones out of N documents (= example targets).</Paragraph>
    <Paragraph position="4"> Another possibility for obtaining translation examples is simply to utilize available (off-the-shelf) MT systems by pairing the input sentence with the obtained MT output. However, the quality of those translation examples might be much lower than manually created translations.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Statistical Decoding
</SectionTitle>
    <Paragraph position="0"> (Germann et al., 2001) presents a greedy approach to search for the translation that is most likely according to previously learned statitistical models. An extension of this approach that can take advantage of translation examples provided for a given input sentence is proposed in (Watanabe and Sumita, 2003). Instead of decoding and generating an output string word-by-word as is done in the basic concept, this greedy approach slightly modifies the target part of the translation examples so that the pair becomes the actual translation.</Paragraph>
    <Paragraph position="1"> The advantage of the example-based approach is that the search for a good translation starts from the retrieved translation example, not a guessed translation resulting in fewer search errors. However, since it uses the same greedy search algorithm as the basic method, search errors cannot be avoided completely. Furthermore, the parameter estimation problem still remains.</Paragraph>
    <Paragraph position="2"> The experiment discussed in Section 5.1 indeed shows a large degradation in the system performance when the greedy decoder is applied to already perfect translations, indicating that the decoder may modify translations wrongly based on its statistical models (IBM model 4).</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Example-based Rescoring
</SectionTitle>
    <Paragraph position="0"> Therefore we have to validate the quality of translation candidates selected by the decoder and judge whether problems in the SMT models or search errors resulted in an inaccurate translation or not.</Paragraph>
    <Paragraph position="1"> Our approach extends the example-based concept of (Watanabe and Sumita, 2003). It compares the decoder output with the seed sentence, i.e., the target part of the translation example that forms the input of the decoder.</Paragraph>
    <Paragraph position="2"> Given a translation example whose source part is quite similar to the input, we can assume that the fewer the modifications that are necessary to alter the corresponding example target to the translation candidate during decoding, the less likely it is that there will be a problem in the statistical models.</Paragraph>
    <Paragraph position="3"> The decision on translation quality is based on the edit distance criteria, as introduced in Section 2. For each translation candidate, we measure the edit distance between the word sequence of the decoder output and the seed sentence. The proposed method rescores the translation candidates of the SMT decoder by combining the statistical probabilities of the translation and language models with the example-based translation quality hypothesis and selects the translation candidate with the highest revised score as the translation output.</Paragraph>
    <Paragraph position="4"> The rescoring function rescore has to be designed in such a way that almost unaltered translation candidates with good translation and language model scores are preferred over those with the highest statistical scores that required lots of modifications to the seed sentence.</Paragraph>
    <Paragraph position="5"> For the experiments described below we defined two different rescoring functions. First, the edit distance of the seed sentence sd and the decoder output d is used as a weight to decrease the statistical scores. The larger the edit distance score, the smaller the revised score of the respective translation candidate. The scaling factor scale depends on the utilized corpus and can be optimized on a development set reserved for parameter tuning.</Paragraph>
    <Paragraph position="7"> (2) The second rescoring function assigns a probability to each decoder output that combines the exponential of the sum of log probabilities of TM and LM and the scaled negative ED scores of all translation candidates TC as follows.</Paragraph>
    <Paragraph position="9"/>
  </Section>
class="xml-element"></Paper>