<?xml version="1.0" standalone="yes"?> <Paper uid="W90-0123"> <Title>Relational-Grammar-Based Generation in the JETS Japanese-English Machine Translation System</Title> <Section position="2" start_page="0" end_page="174" type="intro"> <SectionTitle> 1- Introduction </SectionTitle> <Paragraph position="0"> This paper discusses relational-grammar-based generation in the context of JETS, a Japanese-English machine translation (MT) system that is being developed at IBM Research, Tokyo Research Laboratory.</Paragraph> <Paragraph position="1"> To put our work in perspective, we first explain the motivation for basing JETS on relational grammar (RG) and then sketch the processing flow in translation. With this background, we (i) describe and illustrate certain aspects of the rule-writing language, GEAR, in which the GENIE English generator has been written; (ii) comment on key aspects of the generator shell, GEN-SHELL, in which GENIE has been developed; and (iii) discuss the design and functioning of the GENIE English generator.</Paragraph> <Paragraph position="2"> With few exceptions, such as the work being done at CMU (cf. KBMT-89 (1989), Nirenburg (1987), and Nirenburg et al. (1988)), in the SEMSYN project at the University of Stuttgart (Rosner (1986)), and the joint work between the ISI Penman project and the University of Saarbrücken (Bateman et al. (1989)), generation within the area of machine translation has received very little attention. Typically, MT systems have no independently functioning, linguistically justified generation grammar. 
In the case of transfer systems, much of the target language grammar is typically built into the transfer component, resulting in a non-modular, rigid, and linguistically inadequate system.</Paragraph> <Paragraph position="3"> It is the norm in MT systems for the linguistic complexities inherent in robust generation to be simply ignored, contributing to the inadequacy of MT systems.</Paragraph> <Paragraph position="4"> In contrast, we have sought to shift more of the processing burden from transfer onto generation, allowing our system to incorporate a variety of results coming from theoretical linguistics. GENIE is an application- and language-independent generator embodying a robust, linguistically justified RG grammar of English. Moreover, GENIE incorporates a syntax planner that applies a set of planning rules determining which rules in the execution grammar should be applied. As long recognized in work on text generators, the incorporation of a syntax planner introduces the kind of flexibility required for robust generation.</Paragraph> <Paragraph position="5"> JETS is a so-called limited transfer system, i.e., a system in which structural transfer is kept to a minimum. The key RG notion in our work is that of canonical (relational) structure (CS), an abstract level of syntactic structure representing the basic predicate-argument structure of clauses in terms of a universal set of primitive (grammatical) relations such as subject, direct object, indirect object, and chômeur. 1 Given the basic assumption that one is developing a limited transfer system, implying deep analyses of both the source and target languages which converge on structurally similar internal representations for translation equivalents in a wide range of cases, it is critical to select a linguistic framework which supports the required analyses, enabling one to conceptualize the linguistic processing in a uniform manner. 
As discussed in Johnson (1988b), with respect to MT, RG is a logical choice of linguistic framework since CSs provide a natural syntactic bridge between languages as diverse in structure as Japanese and English. This is so for two reasons: (1) within one language, the CSs of paraphrases are typically the same or highly similar and (2) translation equivalents often have structurally similar if not isomorphic CSs.</Paragraph> <Paragraph position="6"> One of the key advantages of RG comes from its explicit representation of grammatical relations like subject and direct object, which are argued to be universal. In contrast, structure-based frameworks such as transformational-generative grammar (TG) at best only implicitly represent grammatical relations such as subject and direct object in terms of linear precedence and dominance, which are language-particular. If one considers the task of transfer, for instance, it is clear that representing basic clause structure in terms of explicitly marked, order-independent relations rather than in terms of language-dependent structural relations reduces the amount of structure changing to be done in the transfer component. This is especially true for languages like Japanese and English, which differ greatly in superficial structural properties (not to mention the fact that Japanese has very free word order, which arguably makes it even less suited to structure-based frameworks). 2- Processing Flow in JETS and GENIE As in all transfer systems, linguistic processing in JETS can be divided into three phases: analysis (which consists of lexical analysis and parsing), transfer, and generation. The output of analysis is a Japanese CS, which represents the basic predicate-argument structure of the Japanese sentence. 2 Transfer produces an English CS, which is often, but not always, isomorphic to the Japanese CS. 
The English CS is passed to the GENIE generator, whose task is to generate a grammatically correct and stylistically appropriate English sentence given a well-formed CS.</Paragraph> <Paragraph position="7"> To illustrate, consider the following Japanese sentence and two of the possible English translations: 1. karera wa Tookyoo e itta rashii they top Tokyo to went seem 2. They seem to have gone to Tokyo.</Paragraph> <Paragraph position="8"> 3. It seems that they went to Tokyo.</Paragraph> <Paragraph position="9"> In translating (1), analysis maps the input string into the Japanese CS shown at the left in Figure 1 on the next page. Transfer then maps the Japanese CS into the English CS shown at the right in Figure 1.</Paragraph> <Paragraph position="10"> 1 For theoretical background on RG, see the many articles listed in the bibliographic reference Dubinsky and Rosen (1987). Note that the following abbreviations are used in glosses of Japanese examples: top (topic), nm (nominalize), and pp (postposition).</Paragraph> </Section> </Paper>