<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3225">
  <Title>Adaptive Language and Translation Models for Interactive Machine Translation</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Cache-based language models were introduced by Kuhn and de Mori (1990) for the dynamic adaptation of speech language models. These models, inspired by the memory caches on modern computer architectures, are motivated by the principle of locality which states that a program tends to repeatedly use memory cells that are physically close.</Paragraph>
    <Paragraph position="1"> Similarly, when speaking or writing, humans tend to use the same words and phrase constructs from paragraph to paragraph and from sentence to sentence. This leads us to believe that, when processing a document, the part of a document that is already processed (e.g. for speech recognition, translation or text prediction) gives us very useful information for future processing in the same document or in other related documents.</Paragraph>
    <Paragraph position="2"> A cache-based language model is a language model to which is added a smaller model trained only on the history of the document being processed. The history is usually the last N words or sentences seen in the document.</Paragraph>
    <Paragraph position="3"> Kuhn and de Mori (1990) obtained a drop in perplexity of nearly 68% when adding an unigram POS (part-of-speech) cache on a 3g-gram model. Martin and al. (1997) obtained a drop of nearly 21% when adding a bigram cache to a trigram model. Clarkson and Robertson (1997) also obtained similar results with an exponentially decaying unigram cache.</Paragraph>
    <Paragraph position="4"> The major problem with these theoretical results is that they assume the correctness of the material entering the cache. In practice, this assumption does not always hold, and so a cache can sometimes do more harm than good.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 Interactive translation context
</SectionTitle>
      <Paragraph position="0"> Over the last few years, an interactive machine translation (IMT) system (Foster et al., 2002) has been developed which, as the translator is typing, suggests word and phrase completions that the user can accept or ignore. The system uses a translation engine to propose the words or phrases which it judges the most probable to be immediately typed.</Paragraph>
      <Paragraph position="1"> This engine includes a translation model (TM) and a language model (LM) used jointly to produce proposals that are appropriate translations of source words and plausible completions of the current text in the target language. The translator remains in control of the translation because what is typed by the user is taken as a constraint to which the model must continually adapt its completions. Experiments have shown that the use of this system can save about 50% of the keystrokes needed for entering a translation. As the translation and language models are built only once, before the user starts to work with the system, the translator is often forced to repeatedly correct similar suggestions from the system.</Paragraph>
      <Paragraph position="2"> The interactive nature of this setup made us believe that it is a good prospect for dynamic adaptive modeling. If the dynamic nature of the system can be disadvantageous for static language and translation models, it is an incomparable advantage for a cache based approach because human correction intervenes before words go in the cache. As the translator is using the system to correctly enter his translation progressively, we can expect the theoretical results presented in the literature to be obtainable in practice in the IMT context.</Paragraph>
      <Paragraph position="3"> The first advantage of dynamic adaptation would be to help the translation engine make better predictions, but it has a further psychological advantage: as the translator works and potentially corrects the proposals of the engine, the user would feel that the software is learning from its errors.</Paragraph>
      <Paragraph position="4"> The next section describes the models currently embedded within our IMT prototype. Section 3 describes the cache-based adaptation we performed on the target language model. In section 4, we present the different types of adaptations we performed on the translation model. Section 5 then puts the results in the context of our IMT application. Section 6 discusses the implications of our experiments and suggests some improvements that could be made to the system.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>