<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1606">
  <Title>SPMT: Statistical Machine Translation with Syntactified Target Language Phrases</Title>
  <Section position="4" start_page="0" end_page="44" type="intro">
    <SectionTitle>
2 SPMT: Statistical Machine Translation with Syntactified Phrases
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="44" type="sub_section">
      <SectionTitle>
2.1 An intuitive introduction to SPMT
</SectionTitle>
      <Paragraph position="0"> After being exposed to 100M+ words of parallel Chinese-English texts, current phrase-based statistical machine translation learners induce reasonably reliable phrase-based probabilistic dictionaries. For example, our baseline statistical phrase-based system learns that, with high probability, the Chinese phrases "ASTRO- -NAUTS", "FRANCE AND RUSSIA" and "COMINGFROM" can be translated into English as "astronauts"/"cosmonauts", "france and russia"/"france and russian" and "coming from"/"from", respectively.¹ Unfortunately, when given Chinese sentence 1 as input, our phrase-based system produces the output shown in 2 and not the translation in 3, which correctly orders the phrasal translations into a grammatical sequence. We believe this happens because the distortion/reordering models used by state-of-the-art phrase-based systems, which exploit phrase movement and n-gram target language models (Och and Ney, 2004; Tillman, 2004), are too weak to help a phrase-based decoder reorder the target phrases into grammatical outputs.</Paragraph>
      <Paragraph position="1"> ¹ To increase readability, in this paper we represent Chinese words using fully capitalized English glosses and English words using lowercased letters.</Paragraph>
    </Section>
  </Section>
</Paper>