<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1007"> <Title>A BIDIRECTIONAL, TRANSFER-DRIVEN MACHINE TRANSLATION SYSTEM FOR SPOKEN DIALOGUES</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> A BIDIRECTIONAL, TRANSFER-DRIVEN MACHINE TRANSLATION SYSTEM FOR SPOKEN DIALOGUES </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> This paper presents a brief overview of the bidirectional (Japanese and English) Transfer-Driven Machine Translation system currently being developed at ATR. The aim of this development is to achieve bidirectional spoken dialogue translation using a new translation technique, TDMT, in which an example-based framework is fully utilized to translate the whole sentence. Although the translation coverage is presently restricted to conference registration, the system meets the requirements of spoken dialogue translation, such as two-way translation, high speed, and high accuracy with robust processing.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. INTRODUCTION </SectionTitle> <Paragraph position="0"> Transfer-Driven Machine Translation (TDMT)[1],[2],[9] is a translation technique which utilizes empirical transfer knowledge compiled from actual translation examples. The main part of translation is performed by the transfer module, which applies the transfer knowledge to each input sentence. Other modules, such as lexical processing, analysis, and generation, cooperate with the transfer module to improve translation performance. 
With this transfer-centered translation mechanism, together with the example-based framework[3],[4],[5], which conducts distance calculations between linguistic expressions using a semantic hierarchy, TDMT performs efficient and robust translation.</Paragraph> <Paragraph position="1"> TDMT is especially useful for spoken language translation, since spoken language expressions tend to deviate from conventional grammars, and since applications dealing with spoken language, such as automatic telephone interpretation[6],[7],[8], need efficient and robust processing to handle diverse inputs. A prototype TDMT system which performs bidirectional translation (Japanese to English and English to Japanese) has been implemented. This bidirectional translation system simulates dialogues between two speakers of different languages (Japanese and English) using an interpreting telephone system. Experimental results have shown TDMT to be promising for spoken dialogue translation.</Paragraph> </Section> <Section position="4" start_page="0" end_page="64" type="metho"> <SectionTitle> 2. TRANSFER-DRIVEN ARCHITECTURE </SectionTitle> <Paragraph position="0"> The bidirectional TDMT system, shown in Figure 1, translates English into Japanese and Japanese into English. Conversion of the translation direction is done simply by flipping the mode selection switch. Moreover, all of the sharable processing modules are used in both translation directions.</Paragraph> <Paragraph position="1"> This bidirectional translation capability, along with other features adopted for spoken language, shows the possibility of two-way dialogue translation.</Paragraph> <Paragraph position="2"> The transfer module, which is the heart of the TDMT system, transfers source language expressions into target language expressions using bilingual translation knowledge. 
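As a rough illustration of this transfer-centered, bidirectional design, the mode selection switch can be modeled as choosing which direction of a shared table of bilingual transfer knowledge is applied. Everything below (identifiers, table contents) is a hypothetical sketch, not the actual ATR implementation:

```python
# Hypothetical sketch of the bidirectional architecture: one shared
# transfer step, with the translation direction chosen by a mode switch.
# All names and table contents here are illustrative assumptions.

# Bilingual transfer knowledge pairs a source pattern with a target
# pattern for each direction; both directions live in one system.
TRANSFER_KNOWLEDGE = {
    ("ja", "en"): {"X no Y": "Y' of X'"},
    ("en", "ja"): {"Y of X": "X' no Y'"},
}

def transfer(pattern, direction):
    """Transfer a source pattern into a target pattern for the
    selected direction; returns None when no knowledge applies."""
    return TRANSFER_KNOWLEDGE[direction].get(pattern)

def flip(direction):
    """The mode selection switch: reverse the language pair."""
    src, tgt = direction
    return (tgt, src)

mode = ("ja", "en")             # Japanese -> English
out = transfer("X no Y", mode)  # "Y' of X'"
mode = flip(mode)               # now English -> Japanese
```

The point of the sketch is that flipping the switch changes only the lookup direction; the transfer machinery itself is shared between the two translation directions.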
When other language-dependent processing, such as lexical processing, analysis, or generation, is necessary to obtain a proper target expression, the required module is called by the transfer module. In other words, all modules in a TDMT system function as a part of, or a helper to, the transfer module. This transfer-centered architecture simplifies the configuration as well as the control of the machine translation system.</Paragraph> </Section> <Section position="5" start_page="64" end_page="66" type="metho"> <SectionTitle> 3. TRANSLATION MECHANISM </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="64" end_page="65" type="sub_section"> <SectionTitle> 3.1 Example-based analysis and transfer </SectionTitle> <Paragraph position="0"> The TDMT system utilizes an example-based framework to translate a sentence. The central mechanism of this framework is the distance calculation[4],[5], which is measured in terms of a thesaurus hierarchy. We adopt the calculations of Sumita[5].</Paragraph> <Paragraph position="1"> Figure 2 shows an example of the transfer knowledge for the Japanese pattern &quot;X no Y.&quot; (X and Y are lexical variables and &quot;no&quot; is an adnominal particle. X' represents the English translation for X, and the English translations are noted in braces after the Japanese words for X and Y.)</Paragraph> <Paragraph position="3"> X no Y -> Y' of X' ((ronbun {paper}, daimoku {title}), ...), Y' for X' ((heya {room}, yoyaku {reservation}), ...), Y' in X' ((Tokyo {Tokyo}, kaigi {conference}), ...), X' Y' ((en {yen}, heya {room}), ...) Fig. 2 An Example of Transfer Knowledge The first transfer knowledge entry, &quot;X no Y -> Y' of X' (ronbun {paper}, daimoku {title}),&quot; represents the translation example in which the set (ronbun {paper}, daimoku {title}) in structure &quot;X no Y&quot; is transferred into structure &quot;Y' of X'.&quot; Thus, pattern selection is conducted using such examples. 
When the source pattern &quot;X no Y&quot; is applied to an input, the transfer module compares the actual words for X and Y with the sets of examples, searches for the nearest example set by its distance score, and provides the most appropriate transferred pattern.</Paragraph> <Paragraph position="4"> For example, if the Japanese input is &quot;Kyoto no kaigi&quot; and the nearest example set (X, Y) is (Tokyo, kaigi), then the transfer module selects &quot;Y' in X'&quot; as the translation pattern and outputs &quot;conference in Kyoto.&quot; Thus, bilingual transfer knowledge consisting of patterns and translation examples is used in TDMT to select the most appropriate translation.</Paragraph> <Paragraph position="5"> To analyze a whole source sentence and to form its target structure, the transfer module applies various kinds of bilingual transfer knowledge to the source sentence[9]. Figure 3 is a list of the different types of bilingual transfer knowledge (patterns and words) that are used with examples to analyze the Japanese expression &quot;torokuyoushi wa sudeni o-mochi deshou ka&quot; and to form its transferred result &quot;do you already have the registration form.&quot; As shown in Figure 4, both the source and target structures are formed based on the distance calculation. The numbers in brackets represent the transfer knowledge pairs in Figure 3.</Paragraph> <Paragraph position="7"> Fig. 3 Various Kinds of Transfer Knowledge As we have seen, the example-based framework is employed in the transfer module of the TDMT system, in which bilingual transfer knowledge is used for both analyzing and transferring the input sentence cooperatively. 
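The nearest-example selection just described can be sketched as follows. The toy thesaurus, the example database, and the exact distance formula (here, how far down the hierarchy the most specific common abstraction lies) are simplified assumptions standing in for the resources of [4],[5], not Sumita's actual calculation:

```python
# Minimal sketch of example-based pattern selection for "X no Y".
# The thesaurus paths, example sets, and distance formula below are
# toy assumptions, not the actual ATR resources.

# Each word maps to its path of semantic classes, most general first.
THESAURUS = {
    "Tokyo":   ["entity", "place", "city"],
    "Kyoto":   ["entity", "place", "city"],
    "kaigi":   ["entity", "event", "meeting"],    # {conference}
    "ronbun":  ["entity", "document", "article"], # {paper}
    "daimoku": ["entity", "document", "label"],   # {title}
}

def word_distance(a, b):
    """0 for an exact class match, approaching 1 as the most specific
    common abstraction in the hierarchy becomes more general."""
    pa, pb = THESAURUS[a], THESAURUS[b]
    common = 0
    for ca, cb in zip(pa, pb):
        if ca != cb:
            break
        common += 1
    return (max(len(pa), len(pb)) - common) / max(len(pa), len(pb))

# Target patterns for "X no Y", each with its attested example sets.
EXAMPLES = [
    ("Y' of X'", [("ronbun", "daimoku")]),
    ("Y' in X'", [("Tokyo", "kaigi")]),
]

def select_pattern(x, y):
    """Choose the target pattern whose nearest example set is closest
    to the actual input words (x, y)."""
    _, pattern = min(
        ((word_distance(x, ex) + word_distance(y, ey)) / 2, pattern)
        for pattern, example_sets in EXAMPLES
        for ex, ey in example_sets
    )
    return pattern

# "Kyoto no kaigi": (Kyoto, kaigi) is nearest to (Tokyo, kaigi),
# so "Y' in X'" is selected, yielding "conference in Kyoto".
```

Note that the semantic classes do the work here: Kyoto and Tokyo share a class, so the (Tokyo, kaigi) example wins even though the input word Kyoto never appears in the example database.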
In other words, in the transfer module, both the source and target structures are formed by applying the bilingual transfer knowledge extracted from the example database.</Paragraph> <Paragraph position="8"> Fig. 4 Source and Target Structures for &quot;torokuyoushi wa sudeni o-mochi deshou ka&quot;</Paragraph> </Section> <Section position="2" start_page="65" end_page="65" type="sub_section"> <SectionTitle> 3.2 Structural disambiguation </SectionTitle> <Paragraph position="0"> Multiple source structures may be produced in accordance with the application of the bilingual transfer knowledge. In such cases, the most appropriate structure is chosen by computing the total distances for all possible combinations of partial translations and selecting the combination with the smallest total distance. The structure with the smallest total distance is judged to be the most consistent with the empirical knowledge, and is chosen as the most plausible structure. (See [9] for details of the distance and total distance calculations.) For instance, when the pattern &quot;X no Y&quot; is applied to the expression &quot;ichi-man en {10,000 yen} no heya {room} no yoyaku {reservation},&quot; there are two possible structures.</Paragraph> <Paragraph position="1"> 1) ichi-man en no (heya no yoyaku) 2) (ichi-man en no heya) no yoyaku The TDMT system calculates the total distance for each of structures 1) and 2) using the bilingual transfer knowledge stored in the system. The following are the target structures when the transfer knowledge in Figure 2 is applied. (Source structures 1 and 2 give target structures 1' and 2', respectively.) 1') 10,000 yen (reservation for room) 2') reservation for (10,000 yen room) In this case, (en {yen}, yoyaku {reservation}) in 1 is semantically distant from the examples of &quot;X no Y,&quot; which increases the total distance for structure 1. 
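The total-distance comparison between the two bracketings of "ichi-man en no heya no yoyaku" can be sketched like this. The thesaurus paths and example sets are toy assumptions; only the minimum-total-distance principle is taken from the text:

```python
# Sketch of structural disambiguation by total-distance minimization.
# Thesaurus paths and example sets are illustrative assumptions.

THESAURUS = {
    "en":     ["entity", "value", "money"],  # {yen}
    "heya":   ["entity", "place", "room"],   # {room}
    "yoyaku": ["entity", "act", "booking"],  # {reservation}
}

def word_distance(a, b):
    """0 for an exact class match, approaching 1 as the most specific
    common abstraction becomes more general."""
    pa, pb = THESAURUS[a], THESAURUS[b]
    common = 0
    for ca, cb in zip(pa, pb):
        if ca != cb:
            break
        common += 1
    return (max(len(pa), len(pb)) - common) / max(len(pa), len(pb))

# Example sets attested for the pattern "X no Y" (cf. Figure 2).
EXAMPLES = [("en", "heya"), ("heya", "yoyaku")]

def pair_distance(x, y):
    """Distance of the input pair (x, y) to its nearest example set."""
    return min((word_distance(x, ex) + word_distance(y, ey)) / 2
               for ex, ey in EXAMPLES)

def total_distance(applications):
    """Sum the distances over every pattern application (head words)."""
    return sum(pair_distance(x, y) for x, y in applications)

# 1) ichi-man en no (heya no yoyaku) -> pairs (en, yoyaku), (heya, yoyaku)
# 2) (ichi-man en no heya) no yoyaku -> pairs (en, heya),   (heya, yoyaku)
s1 = [("en", "yoyaku"), ("heya", "yoyaku")]
s2 = [("en", "heya"), ("heya", "yoyaku")]
best = min([s1, s2], key=total_distance)  # structure 2 wins
```

Here the pair (en, yoyaku) in structure 1 has no close example, which inflates that structure's total distance, so the bracketing "(ichi-man en no heya) no yoyaku" is chosen, mirroring the paper's outcome.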
Figure 5 illustrates the two sets of source and target structures generated by the transfer module of the TDMT system.</Paragraph> </Section> <Section position="3" start_page="65" end_page="66" type="sub_section"> <SectionTitle> 3.3 Sentence generation </SectionTitle> <Paragraph position="0"> The generation module completes the translation of the transferred sentence using target language knowledge that is not provided at the transfer stage. This module performs the following two tasks in cooperation with the transfer module.</Paragraph> <Paragraph position="1"> 1) Grammatical sentence generation: It determines the word order and morphological inflections, and generates lexically essential words, such as articles in English, so that the whole sentence is fully grammatical.</Paragraph> <Paragraph position="2"> 2) Natural sentence generation: It polishes the sentence by changing, adding, or deleting word(s) so that the whole sentence is as natural as a spoken dialogue sentence.</Paragraph> <Paragraph position="3"> Figure 6 shows an example of Japanese natural sentence generation in which the addition of a polite auxiliary verb and the deletion of redundant pronouns take place.</Paragraph> <Paragraph position="5"> Target Sentence: &quot;reservation for a 10,000 yen room&quot; Fig. 5 A Disambiguation Example for &quot;X no Y no Z&quot;</Paragraph> </Section> </Section> <Section position="6" start_page="66" end_page="67" type="metho"> <SectionTitle> 4. IMPLEMENTATION </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="66" end_page="66" type="sub_section"> <SectionTitle> 4.1 System specification </SectionTitle> <Paragraph position="0"> The bidirectional TDMT system has been developed in Common Lisp and runs on either a Sun workstation or a Symbolics Lisp Machine. 
The dictionaries and the rules are made by extracting entries from the ATR corpora[10] concerning conference registration.</Paragraph> </Section> <Section position="2" start_page="66" end_page="66" type="sub_section"> <SectionTitle> 4.2 System operation </SectionTitle> <Paragraph position="0"> Figure 7 shows a screen shot of the bidirectional TDMT system on a Symbolics Lisp machine. The system simulates the communication between an applicant (a Japanese speaker) and a secretary (an English speaker). The dialogue history is displayed at the bottom of the screen. On the screen, Terminal A is the applicant's terminal and Terminal B is the secretary's terminal. The translated sentences are displayed in reverse video (white on black).</Paragraph> </Section> <Section position="3" start_page="66" end_page="67" type="sub_section"> <SectionTitle> 4.3 System performance </SectionTitle> <Paragraph position="0"> The system has been trained with 825 Japanese sentences for J-E translation and 607 English sentences for E-J translation. These sentences were selected from the ATR corpora and from typical dialogues in the domain. The system can translate sentences with a vocabulary of 1,500 words. In J-E translation, the system, running on a Lisp machine with 10 MIPS performance, has achieved an average translation time of 1.9 seconds for sentences with an average length of 9.2 words, and a success rate of about 70% on open test sentences in this domain.</Paragraph> </Section> </Section> </Paper>