File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/82/c82-1043_metho.xml
Size: 16,083 bytes
Last Modified: 2025-10-06 14:11:29
<?xml version="1.0" standalone="yes"?> <Paper uid="C82-1043"> <Title>PRED-PTRANS AG-LIVING OBJ-PHYS.OBJ GO-PHYS-LOC u PHYS-OBJ PRED-POSS-TRANS AS-HUM OBJ-OBJ RECIP-HUM PRED-MTRANS AG-HUM OBJ-MENT-OBJ RECIP-HUM PRED-ASSOC.ACT AG-HUM OBJ-HUM PARTIC-HUM Table 2 A Part of the Item &quot; i~,, in the Word Dictionary</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> JAPANESE-ENGLISH TRANSLATION THROUGH INTERNAL EXPRESSIONS </SectionTitle> <Paragraph position="0"> This paper describes an approach to Japanese-English translation through internal expressions which are similar to those used in our recent approach to English-Japanese translation [2]. Attention is focused on construction of the internal expressions of Japanese sentences based on case structures of predicates, and on conversion of the Japanese internal expressions to English ones so as to generate good English sentences in conventional use. Finally, in connection with translation, extraction of specified translated information from Japanese patent claim sentences is described briefly.</Paragraph> <Paragraph position="1"> 1. CASE STRUCTURES AND PARSING In every Japanese sentence a predicate such as the main verb is located at the end of the sentence and plays the part of the governor of the preceding dependants as follows: t1 j1 t2 j2 ... tn jn p. In the above, ti (i=1,2,...,n) denotes a term depending on the predicate p, and ji denotes a postposition or a case suffix such as &quot;Kakujoshi&quot; and other auxiliary words.</Paragraph> </Section> <Section position="2" start_page="0" end_page="271" type="metho"> <SectionTitle> </SectionTitle> <Paragraph position="0"> Japanese is an agglutinative language. A postposition ji of a term ti clarifies the syntactic role of the term ti in a sentence governed by the main verb.</Paragraph> <Paragraph position="1"> 
Conversely, a set of these postpositions or syntactic case labels determines the syntactic function of the main verb in a sentence and is called a verb pattern.</Paragraph> <Paragraph position="2"> The verb pattern plays an important role in identifying the syntactic structures of sentences in Japanese, like Hornby's verb patterns in English. From a semantic viewpoint, Japanese predicates are classified into about 20 main categories, and case structures are defined based on the verb categories and the verb patterns. The case labels and the categorical names used here are chosen to be the same as those of the English case structures in the previous paper [2].</Paragraph> <Paragraph position="3"> Table 1 shows case frames constructed on the verb pattern VP9 of a verb which takes the syntactic cases SUBJect, OBJect and DATIVE, accompanied by the representative postpositions &quot;ga&quot;, &quot;wo&quot; and &quot;ni&quot; respectively. Table 2 shows an instance of the description of a verb in the word dictionary constructed on the case frame basis. The first and second columns show the names of the verb patterns and the semantic main categories taken by the verb respectively. They give the token of the case frame of the verb. The third and fourth columns show the equivalents of the verb and the names of their Hornby's verb patterns. 
If there are several equivalents in a row of categories, some key tokens, such as pairs of a case label and a subcategory taken by their dependants, are added so that a better equivalent can be chosen.</Paragraph> <Paragraph position="4"> The next example shows some Japanese sentences which have the main verb illustrated in Table 2, their internal expressions and the corresponding English sentences, where the postpositions in the Japanese sentences are enclosed in parentheses, and the symbol &quot; * &quot; in the internal expressions denotes the term which is in front of the frame including the symbol.</Paragraph> <Paragraph position="5"> cars -(no) number -(ga) rapidly is increasing (2) (PRED-ATTR-TRANS: increase, TENSE: present, ASPECT: progressive, OBJ-NUMBer: number(NUM: *, OBJ-PHYS.OBJ: cars), MANN: rapidly) (3) The number of cars is rapidly increasing.</Paragraph> <Paragraph position="6"> (b) VP8 (1) this dictionary -(no) vocabulary -(wa) fifty thousand words -(ni) was extended (2) (PRED-ATTR-TRANS: extend, TENSE: past, VOICE: passive, OBJ-MENT-OBJ: vocabulary(OBJ: *, LOC: this dictionary), GO-QUANT: words(UNIT: *, NUM: fifty thousand)) (3) The vocabulary in this dictionary was extended to fifty thousand words.</Paragraph> <Paragraph position="7"> Parsing is performed by using the case frames. First, all the words in a sentence are retrieved from the dictionary. Then all the possible parsings under the respective frame conditions are carried out in parallel from the left of the sentence. The case labels of dependants are determined by referring to the syntactic and semantic categories of the dependants and the case frame of the governor. The internal expressions sometimes lack the agent case or the object case corresponding to the original Japanese sentence. When the internal form has a predicate affecting some objects and lacks the agent in the active voice, it is expanded into an English sentence in the passive voice. 
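The voice selection just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that an internal expression is held as a Python dictionary keyed by case labels; the function name choose_voice and this layout are hypothetical and are not the system's actual representation.

```python
def choose_voice(expr):
    """Return a copy of expr with VOICE chosen for English generation.

    Assumed model: expr maps case labels (PRED, AG, OBJ, ...) to terms.
    An active-voice predicate that affects an object but has no agent
    is marked passive, as described in the text.
    """
    result = dict(expr)
    active = result.get("VOICE", "active") == "active"
    has_object = result.get("OBJ") is not None
    lacks_agent = result.get("AG") is None
    if active and has_object and lacks_agent:
        result["VOICE"] = "passive"
    else:
        result.setdefault("VOICE", "active")
    return result


# An agentless "form" expression of the kind that appears in Example 6.
agentless = {"PRED": "form", "VOICE": "active", "AG": None,
             "OBJ": "metal electrode", "LOC": "semiconductor layer"}
print(choose_voice(agentless)["VOICE"])
```

The input dictionary is left untouched and a copy is returned, so the Japanese-side expression remains available for any later conversion step.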
When the main predicate belongs to a state category such as existence or attribute and lacks the object case, a pronoun appropriate to the context, such as &quot;it&quot;, &quot;we&quot; or &quot;they&quot;, is substituted for the thematized object case as the general environment or the general experiencer. Japanese nouns are not obliged to take articles, which give a measure of definiteness to the object indicated by a noun. Since English nouns take determiners such as articles obligatorily in many cases, the articles must be restored from the context of the Japanese sentences in the construction of their internal expressions. The restoration is carried out in many cases by anaphoric analysis and by reference to information about the conventional use of articles described for every noun phrase in the word dictionary.</Paragraph> </Section> <Section position="3" start_page="271" end_page="271" type="metho"> <SectionTitle> 2. CASE CONVERSION </SectionTitle> <Paragraph position="0"> The case structures of Japanese and English are partly different from each other at the level of the internal expressions currently used. Though constituents of case structures such as categorical names and case labels are chosen to be the same in both languages, different expressions arise from the logically possible combinations of a governor and its dependants for the same event or the same action, and the preferable combination depends on the individual language.</Paragraph> <Paragraph position="1"> On the other hand, it is desirable that a thematic term be kept unchanged through translation. 
Syntactically, the thematic term takes the front part of a sentence in both English and Japanese, and furthermore takes the subject case in English.</Paragraph> <Paragraph position="2"> Hence, if the internal expression obtained from a Japanese sentence does not satisfy these conditions, some conversions are tried to yield a more suitable English internal expression. Some of them are shown in the following.</Paragraph> <Section position="1" start_page="271" end_page="271" type="sub_section"> <SectionTitle> 2.1 Conversion of Existence and Attribute Expressions </SectionTitle> <Paragraph position="0"> As is well known, Japanese is a BE language while English is a HAVE language. In Japanese, the possessive expression is almost entirely confined to the case where a human holds something in hand; the other possessive expressions of English are generally rendered as existence and attribute expressions.</Paragraph> <Paragraph position="1"> Example 2 Let us consider the following sentences: (a) He has a daughter. (b) Copper has high electric conductivity. They are usually expressed in Japanese as follows: (a) (1) he -(ni wa) a -(no) daughter -(ga) is (2) (PRED-EXIST: is, OBJ-HUM: a daughter, LOC-HUM: he) (3) With him is a daughter.</Paragraph> <Paragraph position="2"> (b) (1) copper -(wa) electric conductivity -(ga) is high (2) (PRED-ATTR: is high, OBJ-PHYS.QUANT: electric conductivity, LOC-PHYS.OBJ: as for copper) (3) As for copper electric conductivity is high.</Paragraph> <Paragraph position="3"> In each illustration above, (1), (2) and (3) denote a Japanese sentence, the internal expression with terms replaced by English equivalents, and a literal translation obtained by rewriting the internal expression, respectively. The part with a double underline denotes a thematic part in the source sentence. As seen from the above, the literal translation does not preserve the thematic term of the original Japanese sentence within the standard English sentential form. 
A translation that keeps the thematic term unchanged requires replacement of the main predicate by a HAVE verb and the accompanying case conversion, as shown in Table 3. Rules 1 and 2 are conversion rules of Japanese EXISTence and ATTRibute expressions. The application of the above rules to the internal expressions in Example 2 yields the following expressions.</Paragraph> <Paragraph position="4"> (a) (PRED-POSS: has, POSSESSOR-HUM: he, OBJ-HUM: a daughter) (b) (PRED-POSS: has, POSSESSOR-PHYS.OBJ: copper, OBJ-PHYS.QUANT: electric conductivity(PRED-ATTR: be high, OBJ: *)) These internal expressions can be rewritten as the sentences (a) and (b) given at the head of Example 2.</Paragraph> </Section> <Section position="2" start_page="271" end_page="271" type="sub_section"> <SectionTitle> 2.2 Conversion of State-Oriented Expressions </SectionTitle> <Paragraph position="0"> Japanese is a state-oriented language and often uses a type of description in which thing A changes to thing B owing to thing C in an event. If C is a non-living object, C is interpreted as a cause or an instrument rather than an agent from the standpoint of translation, and usually does not take the subject case even if C is emphasized.</Paragraph> <Paragraph position="1"> On the other hand, English is an action-oriented language and often uses expressions in which thing C makes thing A change to B even if the category of C is a non-living object.</Paragraph> <Paragraph position="2"> The following shows some conversion rules between the above expressions. Rule 3: (PRED: t0, OBJ: t2, MEANS ∪ CAUSE ∪ INSTR: t1) → (PRED: t0', AG: t1, OBJ: t2) Rule 4: (PRED: t0, MODAL: capable, AG: t2, [OBJ: t3], MEANS ∪ CAUSE ∪ INSTR: t1) → (PRED-ENABLE: tc, AG: t1, OBJ: (PRED: t0, AG: t2, [OBJ: t3])), where the left sides of the rules are Japanese case structures and the right sides are those of English, categorical names are omitted except where particularly required, and the contents enclosed in brackets denote optional items. t0' is a transitive verb (such as one of pattern VP6A) corresponding to t0 in Rule 3, where t0 is an intransitive verb or a transitive verb in the passive voice.</Paragraph> <Paragraph position="4"> this signal -(niyori) machine -(ga) reliably start (2) (PRED-PTRANS: start, OBJ-PHYS.OBJ: machine, INSTR-THINGS: this signal, MANN: reliably) (3) By this signal the machine starts reliably. (4) (PRED-PTRANS: start, VOICE: active, AG-THINGS: this signal, OBJ-PHYS.OBJ: machine, MANN: reliably) (5) This signal starts the machine reliably. 3. SOME TRANSLATION RESULTS Along the lines described in the preceding sections, some experiments on translation have been carried out in an interactive mode. The average translation time, excluding word retrieval and interaction, is about 0.35 seconds per Japanese word. Some experimental results are shown in this section, where the names of categories in the internal expressions are omitted for simplicity. Comparing two signals gives a measure of force exerted by fluid, and then an electronic circuit converts this measured value to a scale of a flow rate. 
In the above translation, Rule 3 was applied to the underlined parts in paragraph (B) to yield the corresponding underlined parts in paragraph (C), in order to keep the thematic or emphatic terms unchanged.</Paragraph> <Paragraph position="5"> Example 6 (A) Input Japanese patent claim sentences (B) The internal expression semiconductor device(DET:indef, NUM:SINgular, OBJ:*)(PRED:comprise, OBJ:*, SO: { metal electrode(DET:indef, NUM:SIN, OBJ:*)(PRED:be high, OBJ:work function(DET:indef, NUM:SIN, OBJ:*), LOC:*), selenium layer(DET:indef, NUM:SIN, OBJ:*, LOC:metal electrode(DET:def, NUM:SIN, OBJ:*)), semiconductor layer(DET:indef, NUM:SIN, OBJ:*, CHAR:crystal)(PRED:exist, OBJ:*, LOC:selenium layer(DET:def, NUM:SIN, OBJ:*))(PRED:conform to, OBJ:lattice constant(DET:indef, NUM:SIN, OBJ:*), PARTIC:selenium(DET:indef, NUM:UnCountable, OBJ:*), LOC:*), metal electrode(DET:indef, NUM:SIN, OBJ:*)(PRED:form, VOICE:active, AG:_, OBJ:*, LOC:semiconductor layer(DET:def, NUM:SIN, OBJ:*)) }) (C) The output English sentences A semiconductor device comprising a metal electrode having a high work function, a selenium layer on the metal electrode, a crystal semiconductor layer on the selenium layer having a lattice constant which conforms to selenium, and a metal electrode formed on the semiconductor layer.</Paragraph> <Paragraph position="6"> In parsing, the system asked whether the underlined part (a) in paragraph (A) depended on the part (b) or (c) and obtained the answer from the user. For construction of a better English internal expression, Rule 2 was applied to the attribute expressions of the underlined parts in paragraph (B) to yield the corresponding possessive expressions in paragraph (C). 
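The attribute-to-possessive conversion can be sketched in the same spirit. Table 3 is not reproduced in this text, so the shape of the rule below is only inferred from the input and output of Example 2 (b); the dictionary layout, the CATEGORY key and the function name attr_to_poss are assumptions made for illustration.

```python
def attr_to_poss(expr):
    """Rewrite an attribute expression with a LOC theme as a HAVE expression.

    Inferred pattern (an assumption, not the published rule):
      (PRED-ATTR: p, OBJ: x, LOC: y)
        becomes (PRED-POSS: have, POSSESSOR: y, OBJ: x with attribute p)
    """
    if expr.get("CATEGORY") != "ATTR" or "LOC" not in expr:
        return expr
    # The attribute stays attached to the possessed object as a modifier.
    possessed = {"TERM": expr["OBJ"], "ATTR": expr["PRED"]}
    return {"CATEGORY": "POSS", "PRED": "have",
            "POSSESSOR": expr["LOC"], "OBJ": possessed}


electrode = {"CATEGORY": "ATTR", "PRED": "be high",
             "OBJ": "work function", "LOC": "metal electrode"}
print(attr_to_poss(electrode))
```

With the metal electrode as possessor, a generator could then realise the result as "a metal electrode having a high work function", keeping the thematic term in front.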
Furthermore, the active voice expression of the underlined part in paragraph (B) which lacks the AGent case was converted into the passive voice expression shown in the underlined part in paragraph (C).</Paragraph> <Paragraph position="7"> 4. EXTRACTION OF SPECIFIED TRANSLATED INFORMATION By using a method similar to the above case conversion, specified structural information written in English can be extracted from Japanese texts in parallel with their translation. Each specification table for information extraction used here consists of several case frames, and each case frame has a standard case structure corresponding to a simple sentence or a phrase. Table 4 is an example of the specification tables used for patent claim sentences. [276 F. NISHIDA and S. TAKAMATSU]</Paragraph> <Paragraph position="8"> (PRED-ACT:_, OBJ-D: rut., AG:_, INSTR:_, SO:_, GO:_, LOC:_, MANN:_, MEANS:_) (PRED-ATTR:_, OBJ:_, LOC: rut., COMPAR:_, PARTIC:_, DEGR:_) (OBJ: ti, LOC-on: tj) The internal expression obtained by parsing is standardized according to the normal form of the frame determined by the category of the predicate. The standardization consists of case structure conversions, such as clausal to phrasal structure conversion by removal of a kind of copula predicate, and also case-set conversions such as (OBJ,USED) versus (INSTR,OBJ).</Paragraph> <Paragraph position="9"> Table 5 shows the information extracted from the internal expression (B) in Example 6 by the specification table shown in Table 4. The extracted information is stored in a relational database and used for relational retrieval and other purposes. Table 5 The Information Extracted from (B) in Example 6</Paragraph> </Section> </Section> <Section position="4" start_page="271" end_page="271" type="metho"> <SectionTitle> COMPOSITION OBJ </SectionTitle> <Paragraph position="0"> Japanese and English are fairly different from each other. 
However, if the object field of processing is confined to some technical fields, it is expected from the above considerations that semi-automatic multilingual translation and extraction of specified structural information are realizable, though various refinement problems remain, such as restoration of articles.</Paragraph> </Section> </Paper>