<?xml version="1.0" standalone="yes"?> <Paper uid="C82-1039"> <Title>AN ENGLISH JAPANESE MACHINE TRANSLATION SYSTEM OF THE TITLES OF SCIENTIFIC AND ENGINEERING PAPERS Makoto Nagao, Jun-ichi Tsujii (Kyoto University)</Title> <Section position="2" start_page="245" end_page="245" type="metho"> <SectionTitle> 2. SPECIAL CHARACTERISTICS OF TITLE SENTENCES </SectionTitle> <Paragraph position="0"> From the point of view of translation, title sentences of scientific and engineering papers in English have the following properties.</Paragraph> <Paragraph position="1"> (1) Nouns in the titles are usually technical terms specific to a particular field. The translation of these words into Japanese is almost unique. This avoids the difficult problem of selecting the proper translation from several candidates, which we encounter with ordinary words.</Paragraph> <Paragraph position="2"> (2) Many colloquial expressions exist in the titles. These are regarded as idioms, and their internal structures are not analyzed. The whole expressions are stored in a dictionary with their Japanese translations.</Paragraph> <Paragraph position="3"> (3) A simple noun phrase in English can be translated into Japanese by replacing each word with its Japanese equivalent, without any change of word order.</Paragraph> <Paragraph position="4"> (4) Many of the technical terms in science and engineering are compound words. They are treated as such in a dictionary. When the translation of a simple noun phrase according to (3) is not acceptable, the phrase is stored in a dictionary as a compound word with its translation. Therefore the dictionary look-up is done by the longest match principle.</Paragraph> <Paragraph position="5"> (5) A word order change in the translation occurs only where verbs and prepositions are used.
This word order change can be made at the level of skeleton patterns.</Paragraph> </Section> <Section position="3" start_page="245" end_page="245" type="metho"> <SectionTitle> 3. DICTIONARY LOOK-UP </SectionTitle> <Paragraph position="0"> The block diagram of our title translation system is shown in Fig. 1. The first step is the dictionary look-up of words and idioms. We collected many specific expressions as idioms, such as &quot;time varying (mechanism)&quot;, &quot;based on ...&quot;, and so on. A &quot;verb-ing&quot; form can be a noun, an adjective, or a present participle, but there are many verb-ing forms whose grammatical function is almost unique: accounting, bonding, engineering and so on as nouns, superconducting as an adjective, and using and determining as verbs which demand objects or complements. The dictionary holds this information.</Paragraph> </Section> <Section position="4" start_page="245" end_page="245" type="metho"> <SectionTitle> 4. CONJUNCTIVE PHRASE </SectionTitle> <Paragraph position="0"> The second step is the parsing of conjunctive phrases formed with &quot;and&quot; and &quot;or&quot;. As is well known, conjunctive phrases of forms such as A and B of C, adjective + noun + and + noun, and so on are ambiguous. It is very difficult to determine the scope of conjunctive phrases, and to get the correct parse, without detailed semantic analysis. The present program simply groups the nearest two terms which have the same part of speech, such as: adj. + and + adj.
---> adj.</Paragraph> <Paragraph position="1"> verb + and + verb ---> verb; verb-ing(-ed) + and + verb-ing(-ed) ---> verb-ing(-ed); noun + and + noun ---> noun. [Fig. 1. Flow of Title Translation. Step 1: dictionary look-up and idiom finding. Step 2: parsing of conjunctive phrases (program). Step 3: parsing of simple noun phrases by transition network. Step 4: handling of special structures and semantic disambiguation (program). Step 5: skeleton pattern matching and word order change for Japanese. Step 6: synthesis of Japanese title sentence.] Special consideration is given to the following specific conjunctive phrase:</Paragraph> <Paragraph position="2"> prep. + noun + and + prep. + noun ---> (prep. + noun) + and + (prep. + noun) ---> prep. + noun. Conjunctive structures such as (noun + prep. + noun) + and + (noun + prep. + noun) and (adj. + noun) + and + noun cannot be analyzed correctly.</Paragraph> </Section> <Section position="5" start_page="245" end_page="245" type="metho"> <SectionTitle> 5. SIMPLE NOUN PHRASE </SectionTitle> <Paragraph position="0"> The next step is the parsing of a simple noun phrase, which may include some other parts of speech. A simple noun phrase is recognized by the finite automaton model shown in Fig. 2. Recognition starts from the initial state, and the state transitions are made as the words are input in sequence.</Paragraph> <Paragraph position="1"> When the automaton reaches the final state, the end of a simple noun phrase has been recognized. Within the scope of a simple noun phrase, the word order of the corresponding Japanese is the same as that of the English.</Paragraph> <Paragraph position="2"> 248 M.
NAGAO et al.</Paragraph> <Paragraph position="3"> [Fig. 2. Finite automaton for the recognition of simple noun phrases. Transition labels: (det, pn, adj); (n, sig, pl, pn, num); (ing, n, pl, pn, num); (n, sig, pl, pn, ing, num).]</Paragraph> <Paragraph position="5"/> </Section> <Section position="6" start_page="245" end_page="466" type="metho"> <SectionTitle> 6. SPECIAL WORD SEQUENCE </SectionTitle> <Paragraph position="0"> There are some particular word sequences which must be treated separately. Typical ones are as follows.</Paragraph> <Paragraph position="1"> (a) n1 + of + n2 : This word sequence is regarded as a noun after parsing. It is translated into the Japanese word order n2 + の + n1. (b) prep. + n (at the beginning of a title) : An example is &quot;On pattern recognition&quot;. In this case a special treatment, &quot;prep. + n ---> n&quot;, is applied. This means that the preposition is treated as an accessory to the noun phrase (n) which follows it, and the structure of this noun phrase is the main object of the analysis. The noun phrase is translated first, and at the final stage the translation of the preposition is attached to the end of the translated noun phrase.</Paragraph> <Paragraph position="2"> (c) verb-ed + prep. : This structure is parsed simply as a prep. modified by verb-ed. The Japanese translation is &quot;prep. + verb-ed + passive particle&quot;. An example is: a paper presented to a conference ---> a paper to a conference (presented), with the Japanese word order (conference) (to) (present) (passive) (paper). (d) verb-ing + prep. (at the beginning of a title) : An example is &quot;concerning to ...&quot;. Syntactically, &quot;verb-ing&quot; in this case plays the same role as a noun, so it is replaced by a noun.</Paragraph> <Paragraph position="3"> (e) noun + verb-ing + prep. : The grammatical role of verb-ing in this case is very difficult to determine.
In title sentences verb-ing is frequently used as a noun (gerund), so the interpretation noun + verb-ing ---> noun + noun ---> noun is adopted.</Paragraph> <Paragraph position="4"> 7. SEMANTIC DISAMBIGUATION After the parsing of the above particular structures, there still remain some more difficult structures which require semantic treatment. A typical one is &quot;verb-ing + noun&quot;. Verb-ing can be either a modifier of the noun or a present participle which takes the noun as its object. An example is &quot;measuring temperature&quot;, in which the noun is the object of the verb.</Paragraph> <Paragraph position="6"> Therefore a check must be made between the verb and the noun which follows it as to whether the noun can be a subject or an object of the verb.</Paragraph> <Paragraph position="7"> For this purpose five semantic elements are introduced. These are shown in Table 1 with some nouns classified by these elements. The same semantic information is used to denote what kind of nouns can be a subject or an object of a verb. For example, the subject nouns of the verb &quot;measure&quot; have the semantic categories of tool and theory, and the object nouns of the verb have the semantic categories of physical object and aspect. By checking these semantic relations, the syntactic structure and the translation word order are determined.</Paragraph> <Paragraph position="9"> aspect : velocity, temperature, resistance, etc.</Paragraph> <Paragraph position="10"> physical object : metal, water, oil, waveguide, etc.</Paragraph> <Paragraph position="11"> theory : principle, technique, approach, etc.</Paragraph> <Paragraph position="12"> unit : cm, degree, etc.</Paragraph> <Paragraph position="13"> Such semantic checking is performed in the following syntactic structures.
(1) n + verb-ing : if the semantic check does not work, verb-ing is regarded as a gerund modified by the noun.</Paragraph> <Paragraph position="14"> (2) verb-ing + n : if the noun phrase (n) has an article, it is an object of the verb. If the semantic check does not work, n is regarded as an object.</Paragraph> <Paragraph position="15"> (3) n1 + verb-ing + n2 : Semantic checks are made between the verb and n1, and between the verb and n2. If the semantic check does not work, the interpretation is that n1 is an object of the verb, and verb-ing modifies n2.</Paragraph> <Paragraph position="16"> (4) prep. + verb-ing + prep. : verb-ing is understood as a gerund.</Paragraph> <Paragraph position="17"> [Table 2. English skeleton patterns, Japanese word order, and frequency in INSPEC translation.]</Paragraph> </Section> <Section position="7" start_page="466" end_page="466" type="metho"> <SectionTitle> 8. SKELETON PATTERN </SectionTitle> <Paragraph position="0"> The parsing process thus far produces a skeleton pattern for each title sentence.</Paragraph> <Paragraph position="1"> For example: # An Automated General Purpose Test System for Solid State Oscillators. (Skeleton) System for Oscillators (n + prep. + n) # A Laser Doppler Technique for Measuring Flow Velocities in High Current Arc Discharge.</Paragraph> <Paragraph position="2"> (Skeleton) Technique for Measuring Velocities in Discharge. (n + prep. + verb-ing + n + prep. + n) The skeleton patterns obtained from ten thousand title sentences are astonishingly few. These are shown in Table 2, with the frequency of occurrence of each pattern for about a thousand title sentences of physics and mathematics in the INSPEC database. The Japanese word order is also given for each skeleton pattern. The present program assigns a single translation to each preposition, as shown in Table 3.
There are of course several cases where different Japanese expressions should be adopted for a preposition depending on the context. This is an important problem to be solved in the future.</Paragraph> </Section> <Section position="8" start_page="466" end_page="466" type="metho"> <SectionTitle> 9. TEST RESULT </SectionTitle> <Paragraph position="0"> A test result of the title translation from the INSPEC database is shown in Table 4.</Paragraph> <Paragraph position="1"> The average time necessary for the translation of a title is 0.1 second. After the translation of 1000 titles, the dictionary was updated with the new words which appeared in the input data and which were absent from the dictionary. Then the same 1000 titles were translated again, and the rejections were checked. Then the next 1000 titles were handled in the same way, and so on.</Paragraph> <Paragraph position="2"> With 3000 titles from INSPEC, only 42 titles (1.4%) were rejected. Many of the rejected titles had structures which the system cannot accept, such as normal sentential structures and question forms. The system can only accept noun phrases without any embedded sentential structures introduced by relative pronouns.</Paragraph> <Paragraph position="3"> Among the translated titles, about 5% were wrong or incomprehensible. Many of these errors came from the wrong parsing of conjunctive phrases. Some examples of the translation are shown in Table 5. For some other databases in English the correct translation rate was about 80%. This rate depends heavily on the dictionary contents.</Paragraph> <Paragraph position="4"> 10. CONCLUSION The translation system is now being used on a trial basis at the Tukuba Research Information Processing System (RIPS) of the Agency of Industrial Science and Technology. The titles, keywords, and some other journal information of the INSPEC database are translated into Japanese, and a new database in Japanese is created.
Retrieval from this Japanese version of the INSPEC database can be done in Japanese, using Chinese characters and Kana letters.</Paragraph> <Paragraph position="5"> The system seems to be practically usable, and the program is being transferred to a few other database centers for their use in converting English databases into Japanese databases.</Paragraph> </Section> <Section position="9" start_page="466" end_page="466" type="metho"> <SectionTitle>
THERMOHYDRAULIC ANALYSIS OF GAS-COOLED ROD ASSEMBLIES IN NUCLEAR REACTORS
BEHAVIOR OF DRAG DISC TURBINE TRANSDUCERS IN STEADY-STATE TWO-PHASE FLOW
VOID FRACTION CORRELATION OF TWO-PHASE FLOW OF LIQUID METALS IN TUBES
COMPARISON OF THE ORDER OF APPROXIMATION IN SEVERAL SPATIAL DIFFERENCE SCHEMES FOR THE DISCRETE-ORDINATES TRANSPORT EQUATION IN ONE-DIMENSIONAL PLANE GEOMETRY
GENERALIZED QUASI-STATIC METHOD FOR NUCLEAR REACTOR SPACE-TIME KINETICS
SEMICLASSICAL CONVERGENT CALCULATIONS FOR THE ELECTRON-IMPACT BROADENING AND SHIFT OF SOME LINES OF NEUTRAL HELIUM IN A HOT PLASMA
TRANSITION PROBABILITIES AND THEIR ACCURACY
THEORY OF RESONANCE-RADIATION PRESSURE
EXCHANGED MOMENTUM BETWEEN A SURFACE WAVE AND ATOMS
</SectionTitle> <Paragraph position="0"/> </Section> </Paper>