<?xml version="1.0" standalone="yes"?> <Paper uid="C82-1044"> <Title>AN ENGLISH-JAPANESE MACHINE TRANSLATION SYSTEM BASED ON FORMAL SEMANTICS OF NATURAL LANGUAGE</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> INTERMEDIATE REPRESENTATION </SectionTitle> <Paragraph position="0"> The intermediate representations of this model are EFR (English-oriented Formal Representation) and CPS (Conceptual Phrase Structure).</Paragraph> <Paragraph position="1"> EFR is a logical language based on Cresswell's lambda categorial language (Cresswell (1973)), which can be considered a notationally simplified version of Montague Grammar (Montague (1974), Dowty (1981)). From an engineering point of view, EFR can be regarded as an artificial language in which each expression is unambiguous. Consequently, there may be cases in which more than one EFR expression can be associated with a given sentence. In such cases, ambiguities are resolved by inference, by knowledge, or with human assistance.</Paragraph> <Paragraph position="2"> CPS is an extended phrase structure in that (1) CPS is a more general element that includes syntactic knowledge about the concept, so (2) CPS is implemented as a frame, and (3) CPS is not only a data structure that is operated on but also a function that can operate on other CPS's.</Paragraph> <Paragraph position="3"> A CPS formula is a functional notation (lambda formula) for a sequence of operations on CPS's. A CPS formula evaluates to a CPS or a functional value. The evaluation process is defined by a (pure) LISP-like interpreter.</Paragraph> </Section> <Section position="4" start_page="0" end_page="277" type="metho"> <SectionTitle> SOURCE LANGUAGE ANALYSIS </SectionTitle> <Paragraph position="0"> English sentence analysis is done using two-layered rules: pattern-directed 278 T. NISHIDA and S. DOSHITA augmented context-free rules (AUGCF rules) and production-type procedural rules. An AUGCF rule is a descriptive rule. 
The context-free rule is extended in several respects: (1) attached functions for checking syntactic details and semantic acceptability, and (2) direct notation of gaps in relative clauses or interrogative sentences. An AUGCF rule describes which EFR formula is associated with a given syntactic pattern and under what conditions the pattern is acceptable. Some examples look like:</Paragraph> <Paragraph position="2"> Although many syntactic phenomena can be easily formalized with AUGCF rules, the computer cannot efficiently analyze input sentences with them alone. One reason is that the computer must examine which rules are applicable in a given situation and determine which one is plausible. Such processing makes the computer very slow and inefficient. Another reason is that some kinds of heuristic knowledge, sometimes referred to as knowledge on control (Davis (1980)), cannot be effectively incorporated into the AUGCF rules. The knowledge on control provides heuristics on when and how to use each rule.</Paragraph> <Paragraph position="3"> The condition -> action formalism (production rule formalism) is considered suitable for writing this level of knowledge.</Paragraph> <Paragraph position="4"> Our second-level rules are obtained by attaching control information to each AUGCF rule and transforming the rule format. The types of procedural rules are: E-rule, U-rule, B-rule, and L-rule.</Paragraph> <Paragraph position="5"> - E-rule (expansion rule) is invoked when a goal is expected. An E-rule specifies the subgoal decomposition of the given goal.</Paragraph> <Paragraph position="6"> - U-rule (up-ped rule) is invoked when a parse tree node is generated. This rule specifies additional goals, and if all of them succeed, a new node is constructed. 
This rule is used mainly for left-recursive AUGCF rules.</Paragraph> <Paragraph position="7"> - B-rule (Bottom-up rule) is referred to by a bottom-up parser incorporated in the rule interpreter.</Paragraph> <Paragraph position="8"> - L-rule (Lexicon rule) is embedded in a dictionary and invoked when a key word is encountered in the given text.</Paragraph> <Paragraph position="9"> The rules R1 and R2 are rewritten into procedural type rules as follows:</Paragraph> <Paragraph position="11"> if it succeeds then apply R2. Here R1', for example, says that: given a goal S, expand it into subgoals NP and VP; if both of them succeed, reduce them into an S node; at that time, a function subjvp checks subject-verb agreement; +10 is the score for S; *sem1(*sem2) is a pattern of the EFR expression for the S node, where *sem1 denotes the EFR expression for its first son (NP), etc. If some anomaly is detected by those functional attachments, the application of the rule is rejected (functional augmentation of CF rule).</Paragraph> <Paragraph position="12"> The notion of a frame is employed in order to implement feature semantics. A frame is an extended property list in which syntactic and semantic features are described. By passing such features and checking consistency among them, (mainly semantic) constraints are implemented.</Paragraph> <Paragraph position="13"> In practice, the knowledge incorporated in a system can never be total and complete, so a human being should help the computer analyze input sentences. The human help is limited to resolving ambiguities. In order to make the human diagnosis efficient, some diagnostic facilities are implemented.</Paragraph> <Paragraph position="14"> It is also important to construct and manage dictionaries. 
A dictionary manager is implemented to make human modification of the dictionary flexible through pattern-directed dictionary editing commands.</Paragraph> </Section> <Section position="5" start_page="277" end_page="277" type="metho"> <SectionTitle> INTERPRETATION OF EFR AND TARGET LANGUAGE GENERATION </SectionTitle> <Paragraph position="0"> The interpretation of an EFR expression can be defined at the conceptual level.</Paragraph> <Paragraph position="1"> For example, consider the EFR expression a(λy[a*(communication)(λx[(((*ap(for))(x))(facility))(y)])]), which corresponds to the noun phrase &quot;a facility for communication&quot;. A detailed description of its conceptual interpretation in our conceptual model (Nishida (1980)) is given below.</Paragraph> <Paragraph position="2"> (1) The conceptual interpretation of a(λy[ ... ]) associates a conceptual element &quot;something&quot; (individual concept) with the variable y. (2) The conceptual interpretation of a*(communication)(λx[ ... ]) associates a conceptual element &quot;(a) communication&quot; with the variable x. (3) (*ap(for))(x) is interpreted as an adjective concept &quot;for the sake of x&quot;, which becomes &quot;for the sake of (a) communication&quot; from (2). (4) The adjective concept obtained in (3) is applied as a function to the interpretation of &quot;facility&quot; (i.e., a noun concept &quot;facility&quot;). Thus we obtain a complex noun concept &quot;facility for the sake of (a) communication&quot; for ((*ap(for))(x))(facility).</Paragraph> <Paragraph position="3"> (5) The application of a noun concept p to an individual concept q yields a sentence concept: &quot;q is a p.&quot; This interpretation rule is used for the fragment: (((*ap(for))(x))(facility))(y). 
The result is a sentence concept: &quot;something (y) is a facility for the sake of (a) communication.&quot; (6) Finally, the interpretation of the given EFR expression results in a noun phrase concept: &quot;something y such that y is a facility for the sake of (a) communication.&quot; This noun phrase concept is a higher order concept which gives a name to an individual: &quot;a facility for the sake of (a) communication.&quot; This higher order concept will be reduced if it is applied to a one-place predicate (roughly speaking, a property like &quot;being constructed&quot;, &quot;being an x such that the paper is concerned with x&quot;, etc.).</Paragraph> <Paragraph position="4"> The above process of interpretation is stepwise and includes neither &quot;gap&quot; nor &quot;skip&quot;. This property is crucially important in constructing large and complex systems, including machine translation systems. This process can be simulated in the &quot;linguistic&quot; domain; our idea of target language generation is this: - each conceptual element is accompanied by a target language phrase structure which gives the name of the concept.</Paragraph> <Paragraph position="5"> - each semantic interpretation of a complex structure is accompanied by a syntactic operation that creates a new phrase structure from those of the function part and argument part conceptual elements.</Paragraph> <Paragraph position="6"> Two types of Japanese phrase structure manipulating rules can be associated with functional application: - embedding one phrase into another phrase as a modification part (generate Fig. 1. Outline of a sample generation from an EFR expression.</Paragraph> <Paragraph position="7"> Thus, a functional application corresponds to a primitive syntactic operation of the Japanese language.</Paragraph> <Paragraph position="8"> CPS is defined to be a structure which conveys not only conceptual information on a concept but also syntactic information about the concept. 
All this information is structured as a frame. The descendant slot of a CPS is either a terminal value (lexicon frame) or a list of CPS's. Thus CPS's can be linked into a tree structure. A CPS corresponding to the noun phrase &quot;the typewriter&quot; looks like: [NP [DET 'the' with Q=DEFINITE] [NOUN 'typewriter' with CLASS=PHYSOBJ] with NBR=SGL].</Paragraph> <Paragraph position="9"> A CPS works both as data and as a function; sometimes it is applied to other CPS's to yield another CPS or a functional value, and sometimes it is a data structure operated on by some operation. Thus CPS is a higher order object. The semantics can be modeled with the notion of a categorial grammar. A CPS of an adjective concept, for example, maps a CPS of a noun concept into another (compound) CPS of a modified noun. This principle can be written as: ADJ=NOUN/NOUN. On the other hand, the adjective CPS can be modified by an adverbial CPS. Thus ADV=ADJ/ADJ.</Paragraph> <Paragraph position="10"> A CPS formula specifies a sequence of operations on given CPS's. A CPS formula involves CPS's as data. Other elements of a CPS formula are: variable (with</Paragraph> </Section> <Section position="6" start_page="277" end_page="277" type="metho"> <SectionTitle> AN ENGLISH-JAPANESE MACHINE TRANSLATION SYSTEM 281 </SectionTitle> <Paragraph position="0"> coercion specification), lambda expression, functional application formula, transformational rules, conditional expression, and composition function. The evaluation process of a CPS formula is defined as a function like a LISP interpreter.</Paragraph> <Paragraph position="1"> Fig. 1 illustrates an outline of the target language generation process for the phrase &quot;a facility for communication&quot;. (The CPS formula is omitted there.) In practice, our system involves one step called the REFORM step after the CPS evaluation process. 
This step is needed mainly because (1) some direct output is not readable: the content can be understood without ambiguity, but it is very redundant or not commonly used, or, much worse, (2) the output is semantically wrong. Such cases arise where the EFR expression extracted from the source language is not well defined for the language expression in question. This occurs when the system designer has a misconception or fails to correctly capture the phenomenon. In principle, the second case is obviously bad, but no theory has ever succeeded in modelling all phenomena in natural language. So in practice, the second case is unavoidable.</Paragraph> <Paragraph position="2"> The REFORM process uses heuristic rules to 'reform' such CPS structures into reasonable ones. Pattern-directed transformation rules are used. Those rules are applied until no rule is applicable to the given CPS structure.</Paragraph> </Section> <Section position="9" start_page="277" end_page="277" type="metho"> <SectionTitle> ACKNOWLEDGEMENT </SectionTitle> <Paragraph position="0"> This research was partially supported by Grant-in-Aid for Scientific Research.</Paragraph> <Paragraph position="1"> The authors want to thank Mr. Kiyoshi Agusa and Mr. Shigeo Sugimoto for their assistance in editing and printing this material.</Paragraph> </Section> </Paper>