<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1163">
  <Title>Machine Translation by Interaction between Paraphraser and Transfer</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Humans generally have language capability, mostly for their mother tongue and to a lesser extent for foreign languages. This leads us to make the most of our mother tongue, even when translating. That is, when we translate our language into a foreign one that is unfamiliar to us, we may try to paraphrase the source input into easier expressions that we are able to translate.</Paragraph>
    <Paragraph position="1"> In contrast, no machine translation (MT) model proposed so far gives the source language module priority over the bilingual module. Existing MT models are either ones in which the bilingual processor takes the initiative over the source language analyzer (the conventional analyze-transfer-generate model) or models that integrate the analyzer and transfer, such as example-based or statistical models. Although some MT models have a paraphraser (also called a 'pre-editor'), such as that of Shirai et al. (1993), paraphrasing in these models is performed to prepare the input for the subsequent bilingual process. In other words, the paraphraser operates as a sub-module for successful transfer.</Paragraph>
    <Paragraph position="2"> We have proposed a new MT model that is more similar to the human translation process than other MT systems (Yamamoto et al., 2001). This model, called the Sandglass model, is designed so that the system can generate a translation through source language paraphrasing, even if the system does not have sufficient bilingual knowledge. In this sense, our model design can be considered a non-professional translator's model.</Paragraph>
    <Paragraph position="3"> From the engineering point of view, our model has an advantage in language portability: it is easy to construct an MT system for a new language, since our model relies on source language processing and thus reduces its dependence on bilingual knowledge. Moreover, the better the source language paraphraser we build, the easier it becomes to implement MT for other languages.</Paragraph>
    <Paragraph position="4"> Another advantage is task portability. All of the paraphrasing knowledge, except for lexical paraphrasing knowledge, is independent of the task, so we do not need to adapt most of the paraphrasing knowledge to the required task. It is also significant that this model's paraphraser can be employed not only for MT but also for most natural language processing (NLP) applications. This is possible because both the input and the output of a paraphraser are in the same natural language.</Paragraph>
    <Paragraph position="5"> We have been building the Sandglass MT system for the Japanese-Chinese and Chinese-Japanese language pairs (Yamamoto et al., 2001; Zhang and Yamamoto, 2002). We have already constructed a prototype for Japanese-Chinese. In this paper, we report the core concepts of this prototype and discuss issues concerning both the underlying principle and our implementation.</Paragraph>
  </Section>
</Paper>