
<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1604">
  <Title>English-Japanese Example-Based Machine Translation Using Abstract Linguistic Representations</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This presentation describes an example-based English-Japanese machine translation system in which an abstract linguistic representation layer is used to extract and store bilingual translation knowledge, transfer patterns between languages, and generate output strings.</Paragraph>
    <Paragraph position="1"> Abstraction permits structural neutralizations that facilitate learning of translation examples across languages with radically different surface structure characteristics, and allows MT development to proceed within a largely language-independent NLP architecture. Comparative evaluation indicates that after training in a domain the English-Japanese system is statistically indistinguishable from a non-customized commercially available MT system in the same domain.</Paragraph>
    <Paragraph position="2"> Introduction In the wake of the pioneering work of Nagao (1984), Brown et al. (1990) and Sato and Nagao (1990), Machine Translation (MT) research has increasingly focused on the issue of how to acquire translation knowledge from aligned parallel texts. While much of this research effort has focused on acquisition of correspondences between individual lexical items or between unstructured strings of words, closer attention has begun to be paid to the learning of structured phrasal units: Yamamoto and Matsumoto (2000), for example, describe a method for automatically extracting correspondences between dependency relations in Japanese and English. Similarly, Imamura (2001a, 2001b) seeks to match corresponding Japanese and English phrases containing information about hierarchical structures, including partially completed parses.</Paragraph>
    <Paragraph position="3"> Yamamoto and Matsumoto (2000) explicitly assume that dependency relations between words will generally be preserved across languages. However, when languages are as different as Japanese and English with respect to their syntactic and informational structures, grammatical or dependency relations may not always be preserved: the English sentence &amp;quot;the network failed&amp;quot; has quite a different grammatical structure from its Japanese translation equivalent netutowakuniZhang Hai ga Fa Sheng sita 'a defect arose in the network.' One issue for example-based MT, then, is to capture systematic divergences through generic learning applicable to multiple language pairs. In this presentation we describe the MSR-MT English-Japanese system, an example-based MT system that learns structured phrase-sized translation units. Unlike the systems discussed in Yamamoto and Matsumoto (2000) and Imamura (2001a, 2001b), MSR-MT places the locus of translation knowledge acquisition at a greater level of abstraction than surface relations, pushing it into a semantically-motivated layer called LOGICAL FORM (LF) (Heidorn 2000; Campbell &amp; Suzuki 2002a, 2002b). Abstraction has the effect of neutralizing (or at least minimizing) differences in word order and syntactic structure, so that mappings between structural relations associated with lexical items can readily be acquired within a general MT architecture.</Paragraph>
    <Paragraph position="4"> In Section 1 below, we present an overview of the characteristics of the system, with special reference to English-Japanese MT. Section 2 discusses a class of structures learned through phrase alignment, Section 3 presents the results of comparative evaluation, and Section 4 some factors that contributed to the evaluation results. Section 5 addresses directions for future work.</Paragraph>
  </Section>
class="xml-element"></Paper>