<?xml version="1.0" standalone="yes"?>
<Paper uid="C80-1063">
  <Title>A MACHINE TRANSLATION SYSTEM FROM JAPANESE INTO ENGLISH -- ANOTHER PERSPECTIVE OF MT SYSTEMS --</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
A MACHINE TRANSLATION SYSTEM FROM JAPANESE INTO ENGLISH
-- ANOTHER PERSPECTIVE OF MT SYSTEMS --
</SectionTitle>
    <Paragraph position="0"> M. Nagao, J. Tsujii, K. Mitamura, H. Hirakawa, M. Kume</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Department of Electrical Engineering
Kyoto University
</SectionTitle>
      <Paragraph position="0"> Sakyo, Kyoto, 606, JAPAN. Summary: A machine translation system from Japanese into English is described. The system aims at the translation of computer manuals and basically follows the transfer approach. The design principles of the system are discussed in detail, together with its overall construction. In particular, the effectiveness of lexicon-based procedures, i.e. lexicon-based analysis, transfer, and synthesis, is emphasized. Most linguistic phenomena are treated by lexical descriptions and lexical rules instead of by general syntactic rules. Because Japanese and English belong to quite different language families, many more structural transfers are necessary than in MT systems among European languages. Special care has been taken in designing the transfer component. Some translation results are also given to illustrate the current abilities of the system.</Paragraph>
      <Paragraph position="1"> 1. Introduction
This paper is the first progress report on a machine translation system from Japanese into English being developed at Kyoto University.</Paragraph>
      <Paragraph position="2"> The project currently aims at the translation of computer manuals, in which the vocabulary is rather limited and less ambiguous than in other subject fields. However, the system is a good example of MT systems whose SL's and TL's belong to quite different language families; in such systems many interesting problems arise that remain concealed when the two languages are closely related. In this paper we discuss some of the design principles without referring to detailed linguistic phenomena.</Paragraph>
      <Paragraph position="3"> The system has been implemented in LISP on a FACOM M-200. The only exception is the morphological analysis of Japanese, which is done by a PL/I program.</Paragraph>
      <Paragraph position="4"> The system basically follows the 'transfer approach' advocated by several other groups such as TAUM, GETA, etc. (1) The overall system consists of three major components: Japanese analysis, transfer, and English synthesis, as shown in Fig. 1. The system is based on several guiding principles. Among these, the following distinguish our system from other MT systems.</Paragraph>
      <Paragraph position="5"> 1. It is highly lexicon-driven. Every component, including the analysis, transfer and synthesis components, is highly dependent on the lexical descriptions of individual words. In other words, most linguistic phenomena are treated by lexical descriptions and lexical rules, instead of general syntactic rules such as the 'structure dependent rules' of Chomskian grammar. We completely agree with J. Bresnan, an MIT linguist, who claimed as follows: (2) 'Finally, I assume that it is easier for us to look something up than it is to compute it. It does in fact appear that our lexical capacity -- the long-term capability to remember lexical information -- is very large.'
2. The approach becomes closer to the interlingual approach. Because Japanese structures can be adequately captured by dependency structures based on case notions, we adopted this structure as the intermediate representation for Japanese. On the other hand, the structures from which the synthesis of English starts are ordinary phrase structures. It is well known that dependency structures require semantically deeper analyses than ordinary phrase structures.</Paragraph>
      <Paragraph position="6"> Therefore, our approach becomes closer to the interlingual approach, and is even indistinguishable from it in some cases. In particular, because the two languages have quite different systems for expressing tenses, modals, aspects, etc., these expressions are analyzed to much deeper levels, that is, almost to the interlingual level.</Paragraph>
      <Paragraph position="7"> Considering the fact that the two languages belong to quite different language families, our approach seems to be inevitable.</Paragraph>
    </Section>
  </Section>
  <Section position="2" start_page="0" end_page="414" type="metho">
    <SectionTitle>
3. Stereotyped or semi-stereotyped expressions
</SectionTitle>
    <Paragraph position="0"> found in computer manuals are effectively utilized. Stereotyped expressions here mean not only idioms in the usual sense, but also certain stylistic prototypes which are often found in manuals. Special care has been taken to utilize them effectively in our system.</Paragraph>
  </Section>
  <Section position="3" start_page="414" end_page="414" type="metho">
    <SectionTitle>
2. Japanese Sentence Analysis
</SectionTitle>
    <Paragraph position="0"> The analysis proceeds as follows:
(1) morphological analysis;
(2) segmentation of an input sentence into a set of simple sentence fragments (each fragment contains only one predicative term such as a verb, predicative adjective, copula, etc.);
(3) recognition of relationships among sentence fragments;
(4) noun phrase analysis and simple sentence analysis, which are performed in an intermixed fashion.
Because Japanese is a typical agglutinative language, many useful sorts of information can be obtained by morphological analysis. It is undoubtedly true both for Japanese and for European languages, typically English, that morphological and syntactic analyses should work co-operatively. However, the co-operation should be organized in different ways. Generally speaking, English morphological analysis needs much help from syntactic analysis. English homographs can rarely be resolved by intra-word processing. Therefore, morphological analysis alone produces highly ambiguous results in English, and syntactic and even semantic information is required to resolve them. In contrast, Japanese morphological analysis offers much help to syntactic analysis. This implies that Japanese morphological analysis can be done in a phase separate from syntactic and other succeeding processing.</Paragraph>
    <Paragraph position="1"> Because Japanese morphological analysis is closely related to both the writing system and detailed word inflection rules of Japanese, we shall omit the discussion of this phase, only noting that certain composite expressions are treated in our system as single morphemes. Some examples are shown in Fig. 2. A detailed discussion about this phase can be found in (5).</Paragraph>
    <Paragraph position="3"> [Fig. 2, first example; the Japanese expression is not recoverable from this text] Conventional segmentation: auxiliary verb / conjunctive / verb / auxiliary verb for negation / postposition / ... for negation. [our system] The whole expression is treated as a single morpheme (post-verbial suffix, see 2-2) which expresses the modality 'OBLIGATORY'.</Paragraph>
    <Paragraph position="5"> [Fig. 2, second example; the Japanese expression is not recoverable from this text] Conventional segmentation: case suffix / verb (to use) / suffix for sentence-conjunction. [our system] The whole expression is treated as a single morpheme (case suffix) for INSTRument.</Paragraph>
    <Paragraph position="6"> Fig. 2 Examples of Composite Morphemes
2-1. Lexicon Based Analysis Procedure for Japanese
In order to discuss the other analysis steps, we have to mention certain syntactic aspects of Japanese. Among those, it should be noted that case relationships between noun phrases and verbs are usually marked by case suffixes attached to noun phrases. An example is shown in Fig. 3.</Paragraph>
    <Paragraph position="8"> [Fig. 3 example; the Japanese sentence is not recoverable from this text] Gloss: user (ga) program (de) data (wo) to modify. Meaning: (The) user modifies (the) data by (a) program. Note: が (ga), で (de), and を (wo) are the case suffixes. In Japanese, noun phrases which bear some grammatical relationship with a verb always precede the verb in a surface sentence.</Paragraph>
    <Paragraph position="9"> Fig. 3 Case Suffixes in Japanese
The suffix が (ga) usually marks the AGENTIVE case, を (wo) the OBJECTIVE case, and で (de) the INSTRUMENTAL case.</Paragraph>
    <Paragraph position="10"> However, this direct correspondence between surface case suffixes and deep cases may not be preserved in actual sentences. In other words, case suffixes indicate only surface grammatical relationships between noun phrases and a verb, and these grammatical relationships may not coincide with deep semantic cases. We should distinguish them carefully, as C. Fillmore did for English. He tried to set up general rules to relate deep cases to surface grammatical relationships in English. Unfortunately, his model is based on generating sentences and gives us no clue as to how to parse them. Moreover, we observed that, at least in Japanese, the correspondence between surface and deep structure is more or less specific to individual verbs. The same phenomena have been observed in English by J. Bresnan and other linguists. (2) They have treated these phenomena by setting up 'lexical interpretation rules' which are specific to individual verbs and which translate surface grammatical structures into deep semantic ones. From a computational viewpoint, this framework leads us to lexicon-based analysis procedures. Instead of general syntactic rules, we describe specific surface-deep mappings for individual verbs in the analysis dictionary, as shown in Fig. 4.</Paragraph>
    <Paragraph position="11"> One of the main purposes of establishing transformation rules was to relate surface structures to deep ones. In our framework, most of this task is done by the surface-deep mappings described in the dictionary. Therefore, simple pattern matching is sufficient to analyze sentence fragments that contain only one verb.</Paragraph>
    <Paragraph position="12"> However, there still remain certain sets of transformations which seem not to be well captured by the surface-deep mappings of individual verbs. We also treat them as lexical rules. We will discuss this point in the next section.</Paragraph>
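    <Paragraph> The lexicon-based matching described above can be illustrated with a minimal sketch in Python (the original system was written in LISP). The dictionary layout and the names MODIFY_ENTRY and match_fragment are assumptions made for illustration; only the suffix-to-case correspondence for 'to modify' is taken from Fig. 4. Matching ignores the order of the noun phrases, which is how scrambling can be absorbed by pattern matching.

```python
# Hypothetical encoding of the Fig. 4 entry for the verb "to modify".
# Only the suffix-to-case correspondence comes from the paper.
MODIFY_ENTRY = {
    "predicate": "MODIFY",
    # surface case suffix mapped to deep case
    "surface_to_deep": {"ga": "AGENT", "de": "INST", "wo": "OBJ"},
}


def match_fragment(entry, noun_phrases):
    """Map (noun, suffix) pairs onto deep cases, ignoring their order.

    Returns a deep structure as a dict, or None when a suffix is not
    licensed by the entry or the same deep case would be filled twice.
    """
    deep = {"predicate": entry["predicate"]}
    for noun, suffix in noun_phrases:
        case = entry["surface_to_deep"].get(suffix)
        if case is None or case in deep:
            return None
        deep[case] = noun
    return deep


# The noun phrases may arrive in any order.
result = match_fragment(
    MODIFY_ENTRY, [("data", "wo"), ("user", "ga"), ("program", "de")]
)
print(result)
```

The semantic restrictions that the actual implementation attaches to each slot are omitted here.</Paragraph>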
    <Paragraph position="13"> [Fig. 4 contents] Entry: (to modify). Surface pattern: 1 が (ga), 2 で (de), 3 を (wo). Deep structure: (* MODIFY (AGENT (1)) (INST (2)) (OBJ (3))). In the actual implementation, sets of semantic restrictions are described here.</Paragraph>
    <Paragraph position="14"> Fig. 4 A Surface-Deep Mapping
2-2. Transformations as Lexical Rules
Transformations treated by our system can be classified into the following categories. (Notice that we use the term 'transformations' here in a broader sense than in traditional TG. Also notice that, though 'scrambling' operations, which are applied after the transformation cycles in traditional TG's, are very conspicuous in Japanese, we do not consider them as transformations here, because they can be embodied in pattern matching operations, i.e., pattern matching that ignores the order of elements.)
1. Transformations dependent on a set of specified case elements (Fig. 5, Ex. 1): These correspond to Fillmore's examples, 'John broke the window with a hammer,' 'A hammer broke the window,' and 'The window broke.'
2. Transformations caused by adverbial suffixes (Fig. 5, Ex. 2): As shown in Ex. 2, a case suffix can be replaced by an adverbial suffix. Careful investigation reveals that a certain class of case suffixes can be replaced by an adverbial suffix without any trace (TSP1, TSP2 in Ex. 2), while another class cannot be replaced but can only be followed by an adverbial suffix (TSP3 in Ex. 2). In fact, a relative ordering of case suffixes exists, and the higher case suffixes in the ordering can easily be replaced with an adverbial suffix without any surface trace. Moreover, this relative ordering depends on individual verbs, according to how intimate a relationship the concept expressed by each noun phrase bears to the action expressed by the verb. We may be able to capture this intimacy hierarchy by setting up several different levels of connection between noun phrases and verbs, as Chomsky does in his X-bar theory. (3) However, from a computational viewpoint, especially that of recognition, it is convenient to mark in each surface pattern what ordering exists and which case suffixes can be replaced by which adverbial suffixes.</Paragraph>
    <Paragraph position="15">  3. Transformations caused by post-verbial expressions (Fig. 5, Ex. 3): Post-verbial expressions also cause surface pattern transformations. These expressions specify the tenses, aspects, modals, and voices of sentences. We now have about 50 such post-verbial expressions. Some of them are shown in Table 1, in which * indicates that the expression causes transformations. Notice that, though two of the post-verbial expressions both give the modality 'POSSIBLE' to sentences, only one of them changes the surface patterns. Also notice that active-passive transformations in Japanese are included in this category.</Paragraph>
    <Paragraph position="17"> program (de) data (ga) to modify [(The) data is modified by (the) program.] * The post-verbial expression changes the aspectual feature of 'modify' from 'ACTION' into 'STATE'.</Paragraph>
    <Paragraph position="18"> Note: Though the same case elements appear in this Japanese sentence as in the TSP of Ex. 1, a passive construction should be chosen in this case, because English passives also change the aspectual feature of the verb.</Paragraph>
    <Paragraph position="19"> SSP: Same as Ex. 1. program (de) data (ga) modify [(The) data is modified by the program.] * The post-verbial expression changes the voice of the sentence from 'ACTIVE' into 'PASSIVE'.</Paragraph>
    <Paragraph position="21"> Note: The case suffix 'が (ga)' in the SSP shows that 'he' has the direct grammatical relationship 'SUBJECT' with the complement 'be right'. On the other hand, 'を (wo)' in the TSP shows that 'he' has a grammatical relationship with the main verb 'believe', but should be semantically interpreted in relation to the verbal complement 'be right'. This interpretation rule is described in the verb dictionary entry for 'to believe'.</Paragraph>
    <Paragraph position="22"> Ex. 5 (Relative Clause)
SSP: Same as Ex. 1.
TSP1: user (ga) modify data [(The) data which (the) user modifies]
TSP2: data (wo) modify user [(The) user who modifies (the) data]
?TSP3: user (ga) data (wo) modify program [(The) program by which (the) user modifies (the) data]
TSP4: user (no) modify data [(The) data which (the) user modifies]
Note: TSP4 expresses the same as TSP1. However, the case suffix 'が (ga)' is changed into 'の (no)'. This phenomenon is observed only in relativized constructions.
Fig. 5 Transformed Patterns</Paragraph>
    <Paragraph position="24"> 4. Transformations caused by verbal complements (Fig. 5, Ex. 4): A certain class of Japanese verbs requires verbal complements, like the English verbs 'promise', 'expect', 'believe', 'want', etc. As shown in Ex. 4, certain noun phrases which bear grammatical relationships to such verbs should be semantically interpreted in relation to the verbs in the verbal complements. In the standard theory of TG, these phenomena were also treated by general transformation rules such as raising transformations.</Paragraph>
    <Paragraph position="25"> 5. Transformations in relative clauses (Fig. 5, Ex. 5): Relativization in English is a typical construction which can be adequately explained by structure dependent transformations such as wh-movement rules. However, a relativized construction in Japanese causes not only noun phrase movement but also other surface transformations, as shown in TSP4 of Ex. 5. Moreover, the noun phrases which can be moved are those followed by particular case suffixes in the surface patterns. That is, which noun phrases can be moved depends on the case suffixes in the surface patterns and, therefore, on individual verbs.</Paragraph>
  </Section>
  <Section position="4" start_page="414" end_page="414" type="metho">
    <SectionTitle>
6. General Transformations : Clefted
</SectionTitle>
    <Paragraph position="0"> constructions, for example, also appear in Japanese.</Paragraph>
    <Paragraph position="1"> Because the transformations above are more or less dependent on the individual verbs which govern the transformed structures, we treat them by lexical rules, i.e., we assume that the transformations of surface patterns have been done beforehand, and that the transformed patterns are also stored in the individual verb entries in the analysis dictionary.</Paragraph>
    <Paragraph position="2"> In conventional approaches, there is a set of general transformational rules which are inversely applied in turn to input sentences in order to obtain appropriate 'deep' structures. It is well known that this inverse application of rules results in a combinatorial proliferation of possible structures, partly because such rules are not really general and are applicable only to specific classes of verbs (consider, for example, 'promise him to go' and 'want him to go').</Paragraph>
    <Paragraph position="3"> Our approach is to avoid such inverse applications of general rules. We regard most transformation rules as word specific, and assume that pre-applied, already transformed patterns are stored in the individual verb entries of the dictionary. A schematic view of our analysis procedure is shown in Fig. 6. During the analysis, the system only selects appropriate surface patterns (transformed or not) from the dictionary and matches them against the input sentences. One may object that such a configuration requires a large memory space for the dictionary. However, the dictionary size can be reduced by using macro expressions, if verbs can be classified and it can be decided which transformations are applicable to which verb classes. These macro expressions are expanded when the dictionary entries containing them are retrieved. When a specific verb behaves quite differently from the others, both its surface patterns and its transformed patterns can be specified directly in the dictionary without using macros. Our approach is thus as follows: first, we assume that every verb is specific and exceptional, i.e., it has its own usages and transformed usages; then, if we find classes of verbs which behave in the same way, we generalize over them by using macros.</Paragraph>
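    <Paragraph> The macro mechanism can be sketched as follows. This is only an illustration under stated assumptions: the class name TRANSITIVE_ACTION, the verb 'exist', the tuple encoding of surface patterns, and the passive function are all hypothetical, and the passive pattern (de, ga) merely mirrors the passive example in Fig. 5. Macros are expanded when an entry is retrieved, while an exceptional verb lists its transformed patterns directly.

```python
def passive(pattern):
    """Illustrative transformation: drop the 'ga' element (the agent)
    and let the former 'wo' element surface with 'ga'."""
    return tuple("ga" if s == "wo" else s for s in pattern if s != "ga")


# Hypothetical verb classes and the transformations they license.
MACROS = {
    "TRANSITIVE_ACTION": [passive],
}

# Hypothetical analysis dictionary. "modify" relies on a macro;
# "exist" is treated as exceptional and lists its patterns directly.
DICTIONARY = {
    "modify": {"pattern": ("ga", "de", "wo"), "macro": "TRANSITIVE_ACTION"},
    "exist": {"pattern": ("ni", "ga"), "transformed": []},
}


def retrieve(verb):
    """Return the base pattern followed by its transformed patterns,
    expanding the macro at retrieval time when the entry has one."""
    entry = DICTIONARY[verb]
    base = entry["pattern"]
    if "macro" in entry:
        transformed = [apply(base) for apply in MACROS[entry["macro"]]]
    else:
        transformed = list(entry["transformed"])
    return [base] + transformed


print(retrieve("modify"))
print(retrieve("exist"))
```

Expanding at retrieval time keeps the stored dictionary small while the matcher still sees only fully spelled-out patterns.</Paragraph>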
    <Paragraph position="4">  In the current version of our system, transformations 1, 2, 3, 4, and 5 can be analyzed. That is, dictionary descriptions for them are prepared. (However, because our system is an experimental prototype, the dictionary contains only about 80 verbs.)</Paragraph>
    <Paragraph position="5"> The information for 1, 2 and 4 is directly coded in the surface patterns. Various transformed patterns for 1 and 4 are stored in the dictionary. As for 2, information as to which case suffixes can be replaced by adverbial suffixes is indicated in each surface pattern. As for 3 and 5, each transformed pattern is accompanied by markers that indicate when the pattern should be used (see 2-3).</Paragraph>
    <Paragraph position="6"> 2-3. Selection of Surface Patterns
As described at the beginning of this chapter, the analysis proceeds in the following sequence: morphological analysis, segmentation of a sentence, recognition of relationships among sentence fragments, and finally, simple sentence and noun phrase analyses. The analysis of simple sentences, the last step, is done by pattern matching. In this section, we will discuss how to select appropriate (transformed) surface patterns.</Paragraph>
    <Paragraph position="7"> At the second step of the analysis, the segmentation step, the input sentence is divided into several sentence fragments so that each of them contains only one predicative term. At the same time, the post-verbial suffixes which follow the predicative terms are processed, and the appropriate markers of tenses, aspects, modals, and voices are selected. Moreover, if the suffixes are ones which cause transformations, the appropriate surface patterns are selected. This selection process is performed in a way similar to Rieger's word expert parser (6) (Fig. 7).</Paragraph>
    <Paragraph position="8"> [Fig. 7: lexicon entries for post-verbial suffixes transfer the voice marker 'PASSIVE', or the aspect marker 'STATIC', together with a selector for the surface-deep mappings.] The third step is to recognize the global structure of the input sentence. Relative clauses, clefted sentences, conjunctions of sentences, etc. are recognized at this step by utilizing the inflection information of each predicative term in the sentence. Generally speaking, several global structures are produced for an input sentence; Fig. 8 shows such an example. [Fig. 8: GPT1 and GPT2, two global structures for the same input.] In GPT1, the first relative clause is embedded in the second. In GPT2, on the other hand, both relative clauses are embedded in the main sentence. Fig. 8 GPT's Which Correspond to the Same</Paragraph>
    <Section position="1" start_page="414" end_page="414" type="sub_section">
      <SectionTitle>
Inflection Pattern
</SectionTitle>
      <Paragraph position="0"> The global structure is represented by a tree called GPT (Global Plan Tree), which guides the succeeding analyses. That is, a node of a GPT indicates what kind of transformed patterns should be used to analyze the corresponding fragment, and in what order.</Paragraph>
      <Paragraph position="1"> A certain class of transformations can be applied whenever certain syntactic constructions are found. They do not depend on individual verbs. In relativized constructions, for example, the case suffix 'が' (ga) can optionally be replaced with the suffix 'の' (no) (Fig. 5, TSP4 in Ex. 5). This rule is not dependent on individual verbs and, moreover, not dependent on deep cases. The rule is considered 'structure dependent'. Because a GPT explicitly indicates by RC nodes where relativized constructions appear, the analysis program transforms the patterns in the dictionary into appropriate forms when it analyzes fragments governed by an RC node; that is, if a pattern in the dictionary contains the suffix 'が' (ga), the program automatically generates the transformed patterns. Such structure dependent rules are also found in sentence conjunctions and are similar to the gapping rules in English. (Sentence conjunctions cannot be analyzed by the current system for other reasons. We are now designing the procedures for sentence conjunctions.)</Paragraph>
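      <Paragraph> The structure dependent rule for relativized constructions can be sketched as below. The function name patterns_under_rc and the tuple encoding of surface patterns are assumptions for illustration; the only fact taken from the text is that under an RC node a pattern containing 'ga' also admits a variant with 'no'.

```python
def patterns_under_rc(patterns):
    """For fragments governed by an RC node, keep every dictionary
    pattern and add a variant with 'ga' replaced by 'no' whenever the
    pattern contains 'ga'. The rule ignores verbs and deep cases."""
    result = []
    for pattern in patterns:
        result.append(pattern)
        if "ga" in pattern:
            variant = tuple("no" if s == "ga" else s for s in pattern)
            if variant not in result:
                result.append(variant)
    return result


print(patterns_under_rc([("ga", "wo"), ("de", "wo")]))
```

Because the variants are generated on demand, they need not be stored in any verb entry.</Paragraph>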
      <Paragraph position="2"> Because of space considerations we have completely omitted the discussion of noun phrase analysis, the semantic aspects of the processing, and the analysis of tenses, modals, aspects and some other troublesome expressions such as adverbial modifiers in Japanese. Detailed discussions are found in (5).</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="414" end_page="414" type="metho">
    <SectionTitle>
3. Transfer Step
</SectionTitle>
    <Paragraph position="0"> Like the analysis procedure, the transfer is guided by a lexicon, in this case by the bi-lingual dictionary. We will first describe the two structures which the transfer phase bridges, i.e. the intermediate structures for Japanese and English.</Paragraph>
    <Paragraph position="1"> 3-1. Japanese Intermediate Structures -- JIS
The Japanese intermediate structures produced by the analysis component are basically dependency structures of input sentences, based on case notions. As in a usual dependency structure, each node is labelled not by a category symbol like NP, VP, PP, etc., but by a word. The word attached to a node is an intermediate word which has a unique entry in the bi-lingual dictionary.</Paragraph>
    <Paragraph position="2"> It may happen that a single Japanese surface word corresponds to multiple entries in the bi-lingual dictionary. In these cases, the disambiguation among them is to be done during the analysis phase. However, it may also happen that, during the transfer phase, a single intermediate word should be mapped into several different English words.</Paragraph>
    <Paragraph position="3"> Though we stated that each node in a JIS is labelled only by an intermediate word that corresponds to a surface Japanese word, there are some exceptions. In order to remedy the computational defects of dependency structures, we introduce other kinds of nodes which correspond not directly to surface words but to certain syntactic constructions in Japanese (we call such nodes 'relation descriptors'). In this sense, our JIS is a mixed form of dependency structures and phrase structures. In principle, our intermediate structures are organized in such a way that a governing node can always determine how to arrange the transferred sub-structures of its dependents. As will be described in 3-3, a JIS is evaluated recursively, and the corresponding English intermediate structure is built bottom-up (see Fig. 9).</Paragraph>
    <Paragraph position="4"> [Fig. 9 annotation] The transfer procedure for this node arranges the transfer results of the lower level into single EIS's, and returns them to the higher level.</Paragraph>
    <Paragraph position="6"> Fig. 9 General View of the JIS-EIS Transfer
In a dependency structure, a noun phrase modified by a relative clause is usually represented by a structure like Fig. 10-(1). However, this structure expresses only implicitly the relationship between the head noun and the modifying clause (* indicates the head noun).</Paragraph>
    <Paragraph position="7">  Note: Actually, the node label REL-CON-1 has a unique entry in the bi-lingual dictionary, which contains the 'transfer procedure' that is responsible for transferring Japanese relative constructions of type 1 into the corresponding English ones.</Paragraph>
    <Paragraph position="8"> Fig. 10 Comparison of an Ordinary Dependency Structure and JIS
Tree traversing rules would be necessary to recognize that an embedded relative clause exists. Moreover, it is always difficult to determine when to invoke such structure recognition rules, and how to transfer such syntactic structures in the source language into their correspondents in the target language. In our JIS, such a syntactic construction is explicitly marked by a node such as REL-CON-1 in Fig. 10-(2). (Relative clauses in Japanese are subclassified into four different types, according to the relationship between the modified noun and the role which it plays in the modifying clause.</Paragraph>
    <Paragraph position="9"> Only three of these have direct corresponding relative clause constructions in English).</Paragraph>
    <Paragraph position="10"> Table 2 shows examples of node labels used in JIS.</Paragraph>
    <Paragraph position="11"> [Table 2: node labels and their roles; the table body is not recoverable from this text.]
Another comment is necessary on case representation. Many researchers agree that cases are useful in describing linguistic structures, especially the semantics of sentences. However, no two researchers agree as to what the complete set of cases is. Our approach is very pragmatic and highly oriented to machine translation. We do not have a 'complete' set of cases in any sense. We always have only a tentative set. If we observe something wrong, we are ready to revise the current set of cases. Moreover, the definition of each case is highly dependent on individual verbs. As discussed in (4), we divide the cases into two types (this classification is also dependent on individual verbs). One is the type of cases which are intrinsic to the verb. For the intrinsic cases, the mappings from Japanese surface structures to JIS relations are specified in the analysis dictionary, and the mappings from JIS relations to EIS structures are described in the bi-lingual dictionary (see Fig. 11). To put it another way, Japanese surface structures that express these cases are mapped into the corresponding English structures by the lexical rules in the two dictionaries. There are no general rules which refer to general case notions.</Paragraph>
    <Paragraph position="12">  The other type of cases, called the extrinsic type, is treated differently. For cases of this type, general rules are prepared to transfer them.</Paragraph>
    <Paragraph position="13"> These rules are formulated independently of individual verbs and show how to express the deep cases in English. Therefore, in contrast to the intrinsic cases, the cases of this type are explicitly expressed by nodes in JIS's (see Fig. 12). These case labels have their own entries in the bi-lingual dictionary, in which rules for selecting appropriate prepositions are described. [Fig. 12 annotation] This part is not contained in the verb dictionary.</Paragraph>
    <Paragraph position="15"> This selection rule is specified in the dictionary for the extrinsic case INST.</Paragraph>
    <Paragraph position="16"> The rule generally selects an appropriate English preposition, depending on the noun fitted in 3.</Paragraph>
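    <Paragraph> A minimal sketch of such a preposition-selection rule for an extrinsic case is given below. The actual rule for INST is not reproduced in this text, so the semantic marker TOOL, the rule table EXTRINSIC_RULES, and the choice between 'with' and 'by' are assumptions made purely for illustration.

```python
# Hypothetical rule table: for each extrinsic case, an ordered list of
# (condition on the noun, English preposition). The first matching
# condition wins. The markers and prepositions are illustrative only.
EXTRINSIC_RULES = {
    "INST": [
        (lambda noun: "TOOL" in noun["markers"], "with"),
        (lambda noun: True, "by"),
    ],
}


def select_preposition(case, noun):
    """Choose a preposition for an extrinsic case from the noun alone,
    independently of the governing verb."""
    for condition, preposition in EXTRINSIC_RULES[case]:
        if condition(noun):
            return preposition
    raise ValueError("no rule applies for case " + case)


hammer = {"word": "hammer", "markers": {"TOOL"}}
program = {"word": "program", "markers": {"SOFTWARE"}}
print(select_preposition("INST", hammer))
print(select_preposition("INST", program))
```

The point of the design is that the rule lives in the entry for the case label, not in any verb entry.</Paragraph>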
    <Paragraph position="17"> Fig. 12 JIS-EIS Mapping for Extrinsic Cases
3-2. English Intermediate Structure -- EIS
The EIS's are similar to conventional phrase structures. The main difference is that each node in the tree is characterized not only by a category symbol like S, NP, VP, etc., but also by a set of attribute-value pairs. The EIS plays almost the same role as the 'starting phrase structure' in Chomsky's framework. Successive transformations are applied cyclically to this structure during the English synthesis. However, the transformation component in our system includes a set of rules which are not 'structure dependent' and are, therefore, not considered 'transformations' in TG's sense. For example, in Chomsky's current framework passivized constructions are not generated through transformations but are considered base-generated. In our system, however, they must be treated during the English synthesis phase, whether they are structure dependent or not. The main purpose of transformations in the English synthesis is to generate adequate English surface structures from 'Japanese-generated' structures, instead of from 'base-generated' ones. The passivization transformation, for example, is indispensable in our system, because it is common in Japanese to state sentences in the active voice without any agent. In order to support such transformations, information other than syntactic categories and structures is necessary. It is expressed in EIS's as a set of attribute-value pairs attached to a node.</Paragraph>
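    <Paragraph> The role of the attribute-value pairs can be illustrated with a small sketch. The class EISNode, the attribute names 'agent' and 'voice', and the function choose_voice are hypothetical; the sketch only illustrates the stated motivation that an agentless active sentence from the Japanese side needs a passive in English.

```python
from dataclasses import dataclass, field


@dataclass
class EISNode:
    """A phrase-structure node carrying attribute-value pairs."""
    category: str                                    # e.g. "S", "NP", "VP"
    attributes: dict = field(default_factory=dict)   # attribute-value pairs
    children: list = field(default_factory=list)


def choose_voice(sentence):
    """Mark a clause PASSIVE when the Japanese side supplied no agent;
    otherwise keep any voice already set, defaulting to ACTIVE."""
    if sentence.attributes.get("agent") is None:
        sentence.attributes["voice"] = "PASSIVE"
    else:
        sentence.attributes.setdefault("voice", "ACTIVE")
    return sentence


agentless = choose_voice(EISNode("S", {"verb": "modify", "object": "data"}))
print(agentless.attributes["voice"])
```

A synthesis rule can then test the voice attribute instead of inspecting the tree shape.</Paragraph>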
    <Paragraph position="18"> 3-3. The Transfer Procedure
The general algorithm for the transfer phase changes a given JIS into the corresponding EIS by 'evaluating' the nodes in the JIS recursively. Each JIS node is labelled by an intermediate word of Japanese which has a unique entry in the bi-lingual dictionary. The description in the dictionary contains a set of transfer procedures which show how to transfer the JIS substructures whose root is the entry word. Each transfer procedure may be accompanied by a set of preconditions, if necessary. These preconditions are expressed by user-defined LISP functions which examine the surrounding JIS to determine whether the transfer procedure is appropriate or not. Some built-in LISP functions are provided to facilitate the encoding of these preconditions. If a JIS word has several English equivalents (i.e. it is polysemous with respect to English), these preconditions are used to choose an appropriate one. Though deep semantic checking should be performed in the precondition part in more advanced systems, this part is currently used to examine certain syntactic environments or simple semantic markers.</Paragraph>
    <Paragraph position="19"> A transfer procedure usually works as follows: (1) A transfer procedure defined for a governing word (verb, relation-descriptor, etc.) invokes the main program in order to transfer the JIS substructures governed by the current node.</Paragraph>
    <Paragraph position="20"> (2) When these substructure transfers are completed, the transfer procedure attached to the governing node arranges the substructures (in EIS) into single structures and returns them to the higher level. Because transfer procedures at the lower level generally return several possible EIS structures, the procedure at the higher level selects the feasible combinations and, if several are feasible, returns them in parallel.</Paragraph>
    <Paragraph position="21"> (3) A transfer procedure for a dependent word (typically a noun) does not invoke the main program, but only chooses the appropriate English equivalents, so the recursive process terminates. Notice that the whole process is highly lexicon driven. Because the main program only checks the preconditions and invokes transfer procedures defined in the dictionary, we can easily change and augment the transfer step by adding new descriptions to the dictionary. Several standard transfer procedures are provided, as shown in Table 3. Because these standard procedures are parameterized, most Japanese intermediate words can be defined by supplying them with appropriate parameters. Fig. 13 shows an example of a verb dictionary entry which uses the standard procedure VBi (specified in PNAME). VBi transfers an input JIS to the EIS as shown in Fig. 13. Moreover, whenever we recognize that a certain intermediate word requires special treatment, we can tailor a transfer procedure that applies only to that word, and put it in the dictionary. This gives us a flexible framework for dealing with exceptional words that cannot be managed by general procedures.</Paragraph>
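    <Paragraph> Steps (1) to (3) can be sketched as a small lexicon-driven recursion. This is an illustration under stated assumptions, not the system's code: the romanized entries, the dictionary layout, the tuple encoding of EIS fragments and the names transfer, transfer_verb and transfer_noun are all hypothetical, and the sketch returns every combination without the feasibility check that the real procedures perform.

```python
from itertools import product

# Hypothetical dictionary: each intermediate word names its transfer
# procedure and lists its English equivalents.
DICTIONARY = {
    "YOMU": {"proc": "verb", "english": ["read"]},
    "KEISANKI": {"proc": "noun", "english": ["computer", "calculator"]},
    "DEETA": {"proc": "noun", "english": ["data"]},
}


def transfer(jis):
    """Main program: look up the node's word and invoke its procedure."""
    entry = DICTIONARY[jis["word"]]
    return PROCEDURES[entry["proc"]](jis, entry)


def transfer_noun(jis, entry):
    # Step (3): a dependent word only chooses equivalents; recursion stops.
    return [("NP", word) for word in entry["english"]]


def transfer_verb(jis, entry):
    # Step (1): call the main program on each governed substructure.
    cases = sorted(jis["cases"])
    alternatives = [transfer(jis["cases"][case]) for case in cases]
    # Step (2): arrange the returned alternatives into single structures
    # and return all of them in parallel.
    results = []
    for verb in entry["english"]:
        for combo in product(*alternatives):
            results.append(("S", verb, dict(zip(cases, combo))))
    return results


PROCEDURES = {"verb": transfer_verb, "noun": transfer_noun}

jis = {
    "word": "YOMU",
    "cases": {"AGENT": {"word": "KEISANKI"}, "OBJECT": {"word": "DEETA"}},
}
results = transfer(jis)
```

    Because the main program does nothing but dispatch on the dictionary entry, adding a word or a tailored procedure means adding a dictionary entry, which is the flexibility claimed above.</Paragraph>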
    <Paragraph position="22">  We take an example to illustrate this point.</Paragraph>
    <Paragraph position="23"> The Japanese compound word '日本一' roughly means 'the best in Japan', and consists of two words, 日本 (Japan) and 一 (the first or one). Because the word behaves syntactically as a noun, the analysis procedure treats it as an ordinary noun. Like ordinary nouns in Japanese, it can be used as a noun modifier.</Paragraph>
    <Paragraph position="24"> [Example phrases (glosses only; the Japanese text is not recoverable): the same noun as above, followed by a single noun which means 'runner', gives 'the best runner in Japan'.] The above two phrases are represented simply in JIS's, as shown in Fig. 14. However, these phrases must be paraphrased in English. A special procedure is tailored and put in the lexicon for words of this kind, such as 日本一 (the best in Japan) and 世界一 (the best in the world). [Fig. 14 annotation: The modified noun (noun phrase) will be inserted here.]</Paragraph>
    <Paragraph position="25"> Fig. 14 Structural Transfer for the Noun 日本一 (the Best in Japan) The procedure works as follows:  1. It checks whether the modified noun (or noun phrase) contains an adjective or not.</Paragraph>
    <Paragraph position="26"> 2. If it does, the procedure attaches the superlative indicator to the adjective.</Paragraph>
    <Paragraph position="27"> 3. If it does not, the procedure supplies the noun with the default adjective 'good', marked with the superlative indicator.</Paragraph>
    <Paragraph position="28"> 4. It embeds the modified noun (or noun phrase) in the parameterized EIS structure as shown in Fig. 14-(3).</Paragraph>
    <Paragraph position="29"> Notice that both the superlative transformation and the 'the' attachment to the superlative adjective will be done at the last step of the English synthesis phase.</Paragraph>
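    <Paragraph> The four steps of this tailored procedure might be sketched as follows. The dictionary layout of the noun phrase, the SUPERLATIVE key, the prepositional frame with 'in' and the function name transfer_best_in are assumptions for illustration; the actual frame is the one given in Fig. 14-(3).

```python
def transfer_best_in(modified_np, region="Japan"):
    """Sketch of the tailored procedure; the data layout is an assumption.

    modified_np is a dict with a "noun" and an optional "adjective".
    """
    np = dict(modified_np)  # leave the caller's structure untouched
    # Steps 1 to 3: use the adjective if present, else the default 'good'.
    adjective = np.get("adjective") or "good"
    # Only the indicator is attached here. The surface form ('best', 'the')
    # is left to the last step of English synthesis.
    np["adjective"] = {"word": adjective, "SUPERLATIVE": True}
    # Step 4: embed the noun phrase in the parameterized frame.
    return {
        "cat": "NP",
        "head": np,
        "modifier": {"cat": "PP", "prep": "in", "noun": region},
    }


runner = transfer_best_in({"noun": "runner", "adjective": None})
fast_runner = transfer_best_in({"noun": "runner", "adjective": "fast"})
```

    With region set to "the world" the same function would cover the parallel word meaning 'the best in the world', which is why the frame is treated as parameterized.</Paragraph>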
  </Section>
  <Section position="6" start_page="414" end_page="414" type="metho">
    <SectionTitle>
4. English Synthesis
</SectionTitle>
    <Paragraph position="0"> Because an EIS is generated directly from the corresponding JIS, it preserves many characteristics of Japanese syntax. In this sense, it is 'Japanese-generated' but not 'base-generated'.</Paragraph>
    <Paragraph position="1"> We must transform this structure to obtain a correct English syntactic structure. Japanese 'wh'-questions, for example, are stated in forms similar to the corresponding declaratives, except that wh-words are marked by special prefix words.</Paragraph>
    <Paragraph position="2"> The wh-movement rule is undoubtedly necessary to produce correct English sentences. Moreover, though passivization is not considered a transformation from the Lexicalist point of view, it is indispensable in our system. Therefore, much information other than structural matching is necessary to determine whether a transformation rule is applicable or not.</Paragraph>
    <Paragraph position="3"> 4-1. The Generation Dictionary At the first step of generation, the system retrieves the lexical description of each word in the EIS from the generation dictionary. The generation dictionary contains information such as that shown in Table 4. It contains not only the trivial indicators necessary for morphological synthesis, but also some other indicators which are examined during the transformation process.</Paragraph>
    <Paragraph position="4"> Table 4 Markers in the Synthesis Dictionary
      marker               meaning
      UN-PASSIVE           Verbs which cannot be used in the passive
      STATE                Verbs whose aspectual feature is 'STATE'
      'UNC (truncated)     (meaning not recoverable)
      (marker illegible)   Words that begin with vowels
      (marker illegible)   Words whose last characters are 'ses', etc.
      (marker illegible)   Words which have irregular inflection forms
    4-2. Transformation Rule A transformation rule is represented in our system by a 9-tuple as shown in Fig. 15. A transformation rule is essentially a tree-to-tree mapping expressed by MP -&gt; RP. Each rule is specified as either OB or OP. OB means that the rule is obligatory: if the rule is applicable, it must be applied. If a rule is marked as OP (optional), it may or may not be applied. At present, when an applicable optional rule is encountered, two alternative structures with equal feasibility are generated. To select</Paragraph>
  </Section>
  <Section position="7" start_page="414" end_page="422" type="metho">
    <SectionTitle>
(NAME COM TYPE MP BPL RP PL IAL INAL)
</SectionTitle>
    <Paragraph position="0"> NAME : The name of the rule.</Paragraph>
    <Paragraph position="1"> COM : Comment. This has no actual effect; it is only for later reference and debugging.</Paragraph>
    <Paragraph position="2"> TYPE : This indicates whether the rule is obligatory (OB) or optional (OP).</Paragraph>
    <Paragraph position="3"> MP : Matching Pattern which shows the tree schema on which the rule is to be applied.</Paragraph>
    <Paragraph position="4"> BPL : Procedural descriptions for checking the applicability of the rule.</Paragraph>
    <Paragraph position="5"> RP : Resultant pattern which shows the transformed tree structure.</Paragraph>
    <Paragraph position="6"> IAL : If-applied list. This list contains the names of the rules that are to be applied if this rule is successfully applied.</Paragraph>
    <Paragraph position="7"> INAL : If-not-applied list. This list contains the names of the rules which are to be applied if this rule fails.</Paragraph>
    <Paragraph position="9"> PL : ... which are applied to the transformed structure after the rule application succeeds.</Paragraph>
    <Paragraph position="10"> Fig. 15 Format of a Transformation Rule
    the most appropriate one would require certain stylistic considerations, which are beyond our current scope.</Paragraph>
    <Paragraph position="11"> The applicability of a rule is checked not only by pattern-matching but also by user-defined checking procedures specified in BPL. Because an MP contains several variables, and the pattern-matching between the MP and the current tree structure binds these variables to the appropriate substructures, the user-defined procedures can investigate the relationships between substructures in arbitrary ways, including attribute checking, by utilizing this variable binding.</Paragraph>
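    <Paragraph> The interplay of pattern matching, variable binding and BPL checks can be sketched as below. Only the nine field names come from Fig. 15. The nested-tuple tree encoding, the '?x' variable notation, the helper functions and the sample passive rule (including the EN label and the UNSPECIFIED subject) are assumptions made for this illustration, and the PL, IAL and INAL fields are left empty.

```python
from collections import namedtuple

# The nine fields follow Fig. 15; everything else is an assumption.
Rule = namedtuple("Rule", "NAME COM TYPE MP BPL RP PL IAL INAL")


def match(pattern, tree, bindings):
    """Match a nested-tuple pattern; strings starting with '?' are variables."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        bindings[pattern] = tree
        return True
    if isinstance(pattern, tuple) and isinstance(tree, tuple):
        return len(pattern) == len(tree) and all(
            match(p, t, bindings) for p, t in zip(pattern, tree)
        )
    return pattern == tree


def build(pattern, bindings):
    """Instantiate the resultant pattern with the bound substructures."""
    if isinstance(pattern, tuple):
        return tuple(build(p, bindings) for p in pattern)
    return bindings.get(pattern, pattern)


def apply_rule(rule, tree):
    """Return the transformed tree, or None if the rule is not applicable."""
    bindings = {}
    if match(rule.MP, tree, bindings) and all(check(bindings) for check in rule.BPL):
        return build(rule.RP, bindings)
    return None


def subject_is_unspecified(bindings):
    # A BPL-style check that inspects a bound substructure.
    return bindings["?subj"] == ("NP", "UNSPECIFIED")


passive = Rule(
    "PASSIVE", "agentless active to passive", "OB",
    ("S", "?subj", ("VP", "?verb", "?obj")),
    [subject_is_unspecified],
    ("S", "?obj", ("VP", "be", ("EN", "?verb"))),
    [], [], [],
)

active = ("S", ("NP", "UNSPECIFIED"), ("VP", "execute", ("NP", "the command")))
passivized = apply_rule(passive, active)
```

    The check receives the whole binding table, so it can relate any of the matched substructures to one another rather than testing each in isolation.</Paragraph>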
    <Paragraph position="12"> The whole algorithm works cyclically from bottom to top, as in usual transformations. According to the rule map illustrated in Fig. 16, transformation rules are applied to every cyclic node (VP, NP, S) at the lowest level of a tree, then at one level higher, and so on.</Paragraph>
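    <Paragraph> A bottom-up cycle of this kind can be approximated by a post-order traversal, as in the sketch below. The tuple encoding of trees and the names run_cycle and trace are assumptions; the sketch applies every rule in list order at each cyclic node and does not model the rule map of Fig. 16.

```python
CYCLIC = {"S", "NP", "VP"}


def run_cycle(tree, rules):
    """Apply rules bottom-up: children are transformed before their parent."""
    if not isinstance(tree, tuple):
        return tree  # a word, nothing to transform
    label = tree[0]
    tree = (label,) + tuple(run_cycle(child, rules) for child in tree[1:])
    if label in CYCLIC:
        for rule in rules:  # each rule maps a tree to a new tree or to None
            result = rule(tree)
            if result is not None:
                tree = result
    return tree


visited = []


def trace(tree):
    """A rule that changes nothing and only records where it was tried."""
    visited.append(tree[0])
    return None


sentence = ("S", ("NP", "it"), ("VP", "runs", ("NP", "jobs")))
unchanged = run_cycle(sentence, [trace])
# visited is now ["NP", "NP", "VP", "S"]
```

    The recorded order shows that the embedded NP and VP nodes are processed before the S node that dominates them.</Paragraph>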
    <Paragraph position="13"> [Fig. 16: rule map. The flowchart is not recoverable; surviving box labels begin 'Check whether SUBJ is ...' and 'Check whether the S is in a ...'.] The system currently has about 200 rules which are selected from (~). After the major transformation cycle is finished, English morphological synthesis begins; it traverses the resultant tree structures to generate the appropriate morphological variants. No special comment is necessary for this phase.</Paragraph>
  </Section>
</Paper>