<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2088">
  <Title>Simultaneous English-Japanese Spoken Language Translation Based on Incremental Dependency Parsing and Transfer</Title>
  <Section position="4" start_page="683" end_page="684" type="metho">
    <SectionTitle>
2 Japanese in Simultaneous
English-Japanese Translation
</SectionTitle>
    <Paragraph position="0"> In this section, we describe the problem that the difference in word order between English and Japanese poses for incremental English-Japanese translation. We then outline an approach to simultaneous machine translation that exploits two linguistic phenomena characteristic of spoken Japanese: flexible word order and inversion.</Paragraph>
    <Section position="1" start_page="683" end_page="683" type="sub_section">
      <SectionTitle>
2.1 Difference of Word Order between
English and Japanese
</SectionTitle>
      <Paragraph position="0"> Let us consider the following English sentence: (E1) I want to fly from San Francisco to Denver next Monday.</Paragraph>
      <Paragraph position="1"> The standard Japanese translation of (E1) is (J1) raishu-no ('next') getsuyobi-ni ('Monday') San Francisco-kara ('from') Denver-he ('to') tobi-tai-to omoi-masu ('want to fly').</Paragraph>
      <Paragraph position="2"> Figure 1 shows the output timing when the translation is generated as incrementally as possible, given the word alignments between (E1) and (J1). In Fig. 1, time flows from top to bottom. In this study, we assume that the system translates input words chunk by chunk. We define a simple noun phrase (e.g. San Francisco), a predicate (e.g. want to fly) and each other word (e.g. I, from, to) as a chunk. The translation (J1) begins with &quot;raishu-no getsuyobi-ni&quot; ('next Monday'), whereas the corresponding phrase &quot;next Monday&quot; comes at the end of the sentence (E1). Thus, the system cannot output &quot;raishu-no getsuyobi-ni&quot; or anything that follows it until the whole sentence has been uttered. This is a fatal flaw for incremental English-Japanese translation, and it stems from an essential difference in word order between English and Japanese. The problem cannot be resolved as long as we assume (J1) to be the translation of (E1).</Paragraph>
    </Section>
    <Section position="2" start_page="683" end_page="684" type="sub_section">
      <SectionTitle>
2.2 Utilizing Flexible Word Order in
Japanese
</SectionTitle>
      <Paragraph position="0"> Japanese is a language with a relatively flexible word order. Thus, a Japanese translation can be acceptable even if it keeps the word order of the English sentence. Let us consider the following Japanese sentence: (J2) San Francisco-kara ('from') Denver-he ('to') tobi-tai-to omoi-masu ('want to fly') raishu-no ('next') getsuyobi-ni ('Monday').</Paragraph>
      <Paragraph position="1"> (J2) can be accepted as a translation of the sentence (E1) and keeps the word order as close as possible to that of (E1). Figure 2 shows the output timing when the translation is generated as incrementally as possible, given the word alignments between (E1) and (J2). The figure demonstrates that a translation system might be able to output &quot;San Francisco-kara ('from')&quot; when &quot;San Francisco&quot; is input and &quot;Denver-he ('to') tobi-tai-to omoi-masu ('want to fly')&quot; when &quot;Denver&quot; is input. If a translation system outputs the sentence (J2) as the translation of the sentence (E1), it can translate the sentence incrementally. The translation (J2) is not necessarily ideal, because its word order differs from that of the standard translation and it has an inverted sentence structure. However, (J2) can be easily understood owing to the high flexibility of word order in Japanese. Moreover, in spoken language machine translation, a high degree of incrementality is preferred over a high degree of quality. Therefore, our study actively exploits flexible word order and inversion to realize incremental English-Japanese translation while keeping the translation quality acceptable.</Paragraph>
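      <Paragraph position="2"> The output timings discussed in this section can be reproduced with a small sketch. The Python code below is illustrative only and is not the authors' implementation: the chunk indices, the alignment sets (read off the glosses of (J1) and (J2), with the subject "I" left untranslated) and the function name output_times are assumptions made for this example. It assumes that a Japanese chunk can be output once every English chunk aligned to it has been heard and every preceding Japanese chunk has been output.

```python
from typing import List, Set, Tuple

# English chunks of (E1), indexed by arrival order (0 = first chunk heard).
ENGLISH_CHUNKS = ["I", "want to fly", "from", "San Francisco",
                  "to", "Denver", "next Monday"]

# Each Japanese chunk is paired with the indices of the English chunks it
# translates. These alignments are assumptions taken from the glosses.
J1 = [("raishu-no getsuyobi-ni", {6}),
      ("San Francisco-kara", {2, 3}),
      ("Denver-he", {4, 5}),
      ("tobi-tai-to omoi-masu", {1})]
J2 = [("San Francisco-kara", {2, 3}),
      ("Denver-he", {4, 5}),
      ("tobi-tai-to omoi-masu", {1}),
      ("raishu-no getsuyobi-ni", {6})]


def output_times(japanese: List[Tuple[str, Set[int]]]) -> List[Tuple[str, int]]:
    """Return, for each Japanese chunk, the index of the English chunk
    after which it can first be output."""
    times = []
    earliest = 0
    for text, aligned in japanese:
        # A chunk must wait for its own source chunks and for all
        # Japanese chunks that precede it in the output order.
        earliest = max(earliest, max(aligned))
        times.append((text, earliest))
    return times


if __name__ == "__main__":
    for name, sentence in (("J1", J1), ("J2", J2)):
        print(name)
        for text, t in output_times(sentence):
            print(f"  {text!r} after {ENGLISH_CHUNKS[t]!r}")
```

Under these assumptions, every chunk of (J1) has to wait for the final English chunk, whereas (J2) releases chunks as soon as "San Francisco" and "Denver" arrive.</Paragraph>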
    </Section>
  </Section>
  <Section position="5" start_page="684" end_page="685" type="metho">
    <SectionTitle>
3 Japanese Generation based on
Dependency Structure
</SectionTitle>
    <Paragraph position="0"> When an English-Japanese translation system translates an input sentence incrementally by exploiting flexible word order and inversion, it may generate a grammatically incorrect Japanese sentence. The system therefore needs to generate translations whose quality remains acceptable as correct Japanese sentences.</Paragraph>
    <Paragraph position="1"> In this section, we describe how to generate an English-Japanese translation that retains the word order of the input sentence as much as possible while keeping the quality acceptable.</Paragraph>
    <Section position="2" start_page="684" end_page="684" type="sub_section">
      <SectionTitle>
3.1 Dependency Grammar in English and
Japanese
</SectionTitle>
      <Paragraph position="0"> Dependency grammar describes the syntactic structure of a sentence by linking individual words. In each link, a modifier (dependent) is connected to the word that it modifies (head). In Japanese, the dependency structure is usually defined in terms of the relations between phrasal units called bunsetsus. A bunsetsu is one of the linguistic units in Japanese and roughly corresponds to a basic phrase in English. It consists of one independent word and zero or more ancillary words. A dependency is a modification relation between two bunsetsus. Japanese dependencies are subject to constraints, including the following:</Paragraph>
      <Paragraph position="1">  * Each bunsetsu, except the last one, depends on only one bunsetsu.</Paragraph>
      <Paragraph position="2"> The translation (J1) satisfies these constraints, as shown in Fig. 3. A sentence satisfying these constraints is deemed a grammatically correct Japanese sentence. To meet this requirement, our method parses the dependency relations between input chunks and generates a translation that satisfies the Japanese dependency constraints.</Paragraph>
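      <Paragraph position="3"> As a rough illustration of such a check, the Python sketch below tests a candidate translation against two conditions: every bunsetsu except the last has exactly one head, and every head lies to the right of its dependent. The second condition is inferred from the definition of inversion in Section 3.2. The list encoding, the function name satisfies_written_constraints, the head indices for (J1) and (J2), and the treatment of "tobi-tai-to omoi-masu" as a single unit are assumptions made for this example, not details taken from the system.

```python
from typing import List, Optional


def satisfies_written_constraints(heads: List[Optional[int]]) -> bool:
    """heads[i] is the index of the unit that unit i depends on, or None
    if unit i depends on nothing. Storing a single index per unit already
    encodes 'depends on only one bunsetsu'."""
    last = len(heads) - 1
    for i, head in enumerate(heads):
        if i == last:
            # The final unit is the only one allowed to have no head.
            if head is not None:
                return False
        elif head is None or head > last or not head > i:
            # Every other unit needs a head located to its right.
            return False
    return True


# (J1): raishu-no, getsuyobi-ni, San Francisco-kara, Denver-he, predicate
J1_HEADS = [1, 4, 4, 4, None]
# (J2): San Francisco-kara, Denver-he, predicate, raishu-no, getsuyobi-ni
J2_HEADS = [2, 2, None, 4, 2]

if __name__ == "__main__":
    print(satisfies_written_constraints(J1_HEADS))
    print(satisfies_written_constraints(J2_HEADS))
```

With these assumed head indices, (J1) passes the check, while (J2) fails it because its predicate is not the final unit.</Paragraph>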
    </Section>
    <Section position="3" start_page="684" end_page="685" type="sub_section">
      <SectionTitle>
3.2 Inversion
</SectionTitle>
      <Paragraph position="0"> In this paper, we call dependency relations directed from right to left &quot;inversions&quot;. In Japanese, inversions occur more frequently in spontaneous speech than in written text. That is, some sentences in spoken Japanese do not satisfy the constraint that no dependency is directed from right to left.</Paragraph>
      <Paragraph position="1"> Translation (J2) does not satisfy this constraint, as shown in Fig. 4. We investigated inversions using the CIAIR corpus (Ohno et al., 2003) and found the following features: Feature 1: In 92.2% of the inversions, the head bunsetsu of the dependency relation is a predicate (predicate inversion). Feature 2: The more dependency relations depend on a predicate, the more frequently predicate inversions occur.</Paragraph>
      <Paragraph position="2"> Feature 3: No sentence contains three or more inversions.</Paragraph>
      <Paragraph position="3"> Based on Feature 1, our method uses predicate inversion to retain the word order of an input sentence. Based on Feature 2, it generates a predicate when the number of dependency relations that depend on the predicate exceeds a constant R. Based on Feature 3, if the translation would contain three or more inversions, the system cancels an inversion by restating a predicate.</Paragraph>
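      <Paragraph position="4"> The three rules can be sketched as simple predicates over a dependency structure. The Python code below is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the list encoding of heads, and the way dependents are counted are hypothetical, and the value of R is left to the caller because this section does not specify it.

```python
from typing import List, Optional


def count_inversions(heads: List[Optional[int]]) -> int:
    """Count dependencies whose head lies to the LEFT of the dependent.
    heads[i] is the index of the head of unit i, or None for no head."""
    return sum(1 for i, head in enumerate(heads)
               if head is not None and i > head)


def should_generate_predicate(num_dependents: int, r: int) -> bool:
    """Feature 2 rule: emit the predicate once the number of dependency
    relations that depend on it exceeds the constant R (passed as r)."""
    return num_dependents > r


def must_restate_predicate(heads: List[Optional[int]]) -> bool:
    """Feature 3 rule: three or more inversions are not allowed, so the
    predicate has to be restated to cancel one of them."""
    return count_inversions(heads) >= 3


# Assumed head indices for (J2): San Francisco-kara, Denver-he,
# predicate, raishu-no, getsuyobi-ni.
J2_HEADS = [2, 2, None, 4, 2]

if __name__ == "__main__":
    print(count_inversions(J2_HEADS))
    print(must_restate_predicate(J2_HEADS))
```

With the assumed head indices, (J2) contains a single inversion (getsuyobi-ni depending on the preceding predicate), so no restatement is needed.</Paragraph>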
    </Section>
  </Section>
  <Section position="6" start_page="685" end_page="686" type="metho">
    <SectionTitle>
4 System Configuration
</SectionTitle>
    <Paragraph position="0"> Figure 5 shows the configuration of our system.</Paragraph>
    <Paragraph position="1"> The system translates an English speech transcript into Japanese incrementally. It is composed of three modules: incremental parsing, transfer, and generation. In the parsing module, the parser determines the English dependency structure for the input words incrementally. In the transfer module, structure and lexicon transfer rules transform the English dependency structure into the Japanese case structure. In the generation module, the system judges whether the translation of each chunk can be output and, if so, outputs it. Figure 6 shows the processing flow when the fragment &quot;I want to fly from San Francisco to Denver&quot; from Section 2.1 is input. In the following subsections, we explain each module with reference to Fig. 6.</Paragraph>
    <Section position="1" start_page="685" end_page="686" type="sub_section">
      <SectionTitle>
4.1 Incremental Dependency Parsing
</SectionTitle>
      <Paragraph position="0"> First, the system performs POS tagging and chunking on the input words (see &quot;Chunk&quot; in Fig. 6).</Paragraph>
      <Paragraph position="1"> Next, we explain how the English phrase structure is parsed (see &quot;English phrase structure&quot; in Fig. 6). Parsing the phrase structure of input words incrementally raises the problem of ambiguity, because our method must commit to a single parsing result at each point in time. To resolve this problem, our system selects the phrase structure with the maximum likelihood at that time by using PCFG (Probabilistic Context-Free Grammar) rules. To limit the processing time, our system sets a cut-off value.</Paragraph>
      <Paragraph position="3"> Furthermore, the system transforms the English phrase structure into an English dependency structure (see &quot;English dependency structure&quot; in Fig. 6). The dependency structure of a sentence can be computed from the phrase structure of the input words by designating one category in each CFG rule as the &quot;head child&quot; (Collins, 1999). The head is indicated by an asterisk * in the phrase structure of Fig. 6. In the &quot;English phrase structure,&quot; the chunk in parentheses at each node is the head chunk of the node, as determined by the head information of the syntax rules. If the head chunk (e.g. &quot;from&quot;) of a child node (e.g. PP(from)) differs from that of its parent node (e.g. VP(want-to-fly)), the head chunk of the child node depends on the head chunk (e.g. &quot;want-to-fly&quot;) of the parent node.</Paragraph>
      <Paragraph position="4"> Some syntax rules are also annotated with subject and object information. Our system uses this information in the generation module to add Japanese function words to the translation of the subject chunk or the object chunk. To use predicate inversion in the generation module, the system has to recognize the predicate of an input sentence. It recognizes as the predicate the chunk (e.g. &quot;want to fly&quot;) on which the subject chunk (e.g. &quot;I&quot;) depends.</Paragraph>
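      <Paragraph position="7"> The head-child conversion described in this subsection can be sketched as a short recursive procedure. The Python code below is an illustration under stated assumptions and not the system's parser: the Node class, the function names, and the phrase structure assigned to the example fragment are hypothetical, and chunks are compared by their surface strings, which would need unique identifiers if a chunk occurred twice.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Node:
    label: str
    children: List[Union["Node", str]]  # sub-nodes or chunk strings
    head: int = 0                       # index of the head child


def head_chunk(node: Union[Node, str]) -> str:
    """Follow head children down to the chunk that heads this node."""
    if isinstance(node, str):
        return node
    return head_chunk(node.children[node.head])


def dependencies(node: Union[Node, str]) -> List[Tuple[str, str]]:
    """Return (dependent, head) pairs: a child whose head chunk differs
    from its parent's head chunk depends on the parent's head chunk."""
    if isinstance(node, str):
        return []
    deps = []
    parent_head = head_chunk(node)
    for child in node.children:
        child_head = head_chunk(child)
        if child_head != parent_head:
            deps.append((child_head, parent_head))
        deps.extend(dependencies(child))
    return deps


# Assumed phrase structure for "I want to fly from San Francisco to Denver".
TREE = Node("S", [
    Node("NP", ["I"]),
    Node("VP", [
        "want to fly",
        Node("PP", ["from", Node("NP", ["San Francisco"])]),
        Node("PP", ["to", Node("NP", ["Denver"])]),
    ]),
], head=1)

if __name__ == "__main__":
    for dependent, head in dependencies(TREE):
        print(f"{dependent} depends on {head}")
```

For the assumed tree, "from" and "to" depend on "want to fly", matching the PP(from) and VP(want-to-fly) example in the text, and "I" depends on "want to fly", which is how the predicate would be identified.</Paragraph>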
    </Section>
    <Section position="2" start_page="686" end_page="686" type="sub_section">
      <SectionTitle>
4.2 Incremental Transfer
</SectionTitle>
      <Paragraph position="0"> In the transfer module, structure and lexicon transfer rules transform the English dependency structure into the Japanese case structure (&quot;Japanese case structure&quot; in Fig. 6). In the structure transfer, the system assigns a relation type to each dependency relation according to the following rules.</Paragraph>
      <Paragraph position="1"> * If the dependent chunk of a dependency relation is a subject or object (e.g. &quot;I&quot;), then the type of that dependency relation is &quot;subj&quot; or &quot;obj&quot;, respectively.</Paragraph>
      <Paragraph position="2"> * If a chunk A (e.g. &quot;San Francisco&quot;) indirectly depends on another chunk B (e.g. &quot;want-to-fly&quot;) through a preposition (e.g. &quot;from&quot;), then the system creates a new dependency relation in which A depends on B directly, and the type of the relation is the preposition.</Paragraph>
      <Paragraph position="3"> * The type of all other relations is &quot;null&quot;.</Paragraph>
      <Paragraph position="4"> In the lexicon transfer, the system transforms each English chunk into its Japanese translation.</Paragraph>
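      <Paragraph position="5"> The structure transfer rules can be sketched as a single pass over the dependency relations. The Python code below is a minimal sketch, not the authors' implementation: the function name structure_transfer, the tuple encoding, and the inputs are hypothetical. It also assumes that the relation between a preposition and its own head is dropped once the preposition has been folded into a relation type, which the text does not state explicitly.

```python
from typing import Dict, List, Set, Tuple


def structure_transfer(deps: List[Tuple[str, str]],
                       prepositions: Set[str],
                       roles: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Turn (dependent, head) pairs into (dependent, head, type) triples."""
    head_of = dict(deps)
    typed = []
    for dependent, head in deps:
        if dependent in prepositions:
            # Assumption: the preposition survives only as a relation type.
            continue
        if dependent in roles:
            # Rule 1: subject or object chunks get "subj" or "obj".
            typed.append((dependent, head, roles[dependent]))
        elif head in prepositions and head in head_of:
            # Rule 2: A depends on B through a preposition, so attach A
            # to B directly and use the preposition as the type.
            typed.append((dependent, head_of[head], head))
        else:
            # Rule 3: every other relation has type "null".
            typed.append((dependent, head, "null"))
    return typed


# Assumed English dependency structure for sentence (E1).
DEPS = [("I", "want to fly"), ("from", "want to fly"),
        ("San Francisco", "from"), ("to", "want to fly"),
        ("Denver", "to"), ("next Monday", "want to fly")]

if __name__ == "__main__":
    for triple in structure_transfer(DEPS, {"from", "to"}, {"I": "subj"}):
        print(triple)
```

In this sketch, "San Francisco" ends up depending directly on "want to fly" with type "from", which is the information a later step needs in order to choose a particle such as "kara".</Paragraph>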
    </Section>
    <Section position="3" start_page="686" end_page="686" type="sub_section">
      <SectionTitle>
4.3 Incremental Generation
</SectionTitle>
      <Paragraph position="0"> In the generation module, the system transforms the Japanese case structure into the Japanese dependency structure by translating particles and predicates. To attach a particle (e.g. &quot;kara&quot; (from)) to the translation of a chunk (e.g. &quot;San Francisco&quot;), the system selects the particle using particle translation rules. It translates a predicate (e.g. &quot;want to fly&quot;) using predicate translation rules, and outputs the translation of each chunk using the method described in Section 3.</Paragraph>
    </Section>
    <Section position="4" start_page="686" end_page="686" type="sub_section">
      <SectionTitle>
4.4 Example of Translation Process
</SectionTitle>
      <Paragraph position="0"> Figure 7 shows the translation process for the English sentence, &quot;I want to fly from San Francisco to Denver next Monday.&quot; In Fig. 7, the underlined words are those that can be output at that point.</Paragraph>
    </Section>
  </Section>
</Paper>