<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1039">
  <Title>Chunk-based Statistical Translation</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The framework of statistical machine translation formulates the problem of translating a source-language sentence J into a target-language sentence E as the maximization of the conditional probability</Paragraph>
    <Paragraph position="2"> $\hat{E} = \arg\max_{E} P(E|J)$. Applying Bayes' rule yields</Paragraph>
    <Paragraph position="4"> $\hat{E} = \arg\max_{E} P(E)P(J|E)$. The former term P(E) is called a language model, representing the likelihood of E. The latter term P(J|E) is called a translation model, representing the generation probability from E into J.</Paragraph>
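    <Paragraph> As a minimal sketch of the decomposition above, the following Python fragment scores each candidate E by log P(E) + log P(J|E) and returns the highest-scoring one. The probability tables, the example sentences, and the function names are hypothetical values invented for illustration; they are not taken from the experiments in this paper.

```python
import math

# Hypothetical toy probabilities, invented for illustration only.
LANGUAGE_MODEL = {
    "i want a room": 0.6,   # P(E): fluent English gets a higher score
    "room a want i": 0.1,
}
TRANSLATION_MODEL = {
    # P(J|E): probability of generating the source J from the candidate E
    ("heya ga hoshii", "i want a room"): 0.3,
    ("heya ga hoshii", "room a want i"): 0.4,
}


def noisy_channel_score(source: str, candidate: str) -> float:
    """Return log P(E) + log P(J|E) for one candidate translation E."""
    return math.log(LANGUAGE_MODEL[candidate]) + math.log(
        TRANSLATION_MODEL[(source, candidate)]
    )


def best_translation(source: str, candidates: list[str]) -> str:
    """Pick the candidate E that maximizes P(E) * P(J|E)."""
    return max(candidates, key=lambda e: noisy_channel_score(source, e))
```

In this toy setting the translation model alone would prefer the scrambled candidate, but the language model term outweighs it, so best_translation("heya ga hoshii", list(LANGUAGE_MODEL)) selects "i want a room".</Paragraph>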
    <Paragraph position="5"> As an implementation of P(J|E), word-alignment-based statistical translation (Brown et al., 1993) has been successfully applied to similar language pairs, such as French-English and German-English, but not to drastically different ones, such as Japanese-English. This failure is due to the limited representational power of word alignment and to a model structure too weak to handle complicated word correspondences.</Paragraph>
    <Paragraph position="6"> This paper presents chunk-based statistical translation as an alternative to word-alignment-based statistical translation. The translation process inside the translation model is structured as follows. A source sentence is first chunked, and then each chunk is translated into the target language with local word alignments. Next, the translated chunks are reordered to match the target language constraints.</Paragraph>
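    <Paragraph> The three steps described above (chunking, chunk-local translation, and chunk reordering) can be sketched as a simple pipeline. This is an illustrative toy under stated assumptions, not the statistical model of this paper: the particle-based chunking rule, the word table, and the hand-supplied chunk order are all hypothetical stand-ins for components that the actual model learns from data.

```python
# Hypothetical toy resources; none of these come from the paper.
WORD_TABLE = {"watashi": "i", "wa": "", "heya": "a room", "ga": "", "hoshii": "want"}
PARTICLES = {"wa", "ga", "o", "ni"}


def chunk(words):
    """Group words into chunks, closing a chunk after each particle (toy rule)."""
    chunks, current = [], []
    for word in words:
        current.append(word)
        if word in PARTICLES:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def translate_chunk(chunk_words):
    """Translate one chunk word by word, dropping words with empty translations."""
    translated = [WORD_TABLE.get(w, w) for w in chunk_words]
    return " ".join(t for t in translated if t)


def reorder(chunks, order):
    """Place translated chunks in the given target-side order."""
    return [chunks[i] for i in order]


def translate(sentence, order):
    """Chunk the source, translate each chunk, then reorder the chunks."""
    source_chunks = chunk(sentence.split())
    target_chunks = [translate_chunk(c) for c in source_chunks]
    return " ".join(reorder(target_chunks, order))
```

For example, translate("watashi wa heya ga hoshii", [0, 2, 1]) produces three chunks, translates them to "i", "a room", and "want", and reorders them into "i want a room". In this sketch the order is supplied by hand, whereas a statistical model would have to score candidate orders.</Paragraph>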
    <Paragraph position="7"> Based on this scenario, the chunk-based statistical translation model is structured with several components and trained by a variation of the EM algorithm. A translation experiment was carried out with a decoder based on left-to-right beam search. Translation quality improved from 46.5% to 52.1% in BLEU score and from 59.2% to 65.1% in subjective evaluation.</Paragraph>
    <Paragraph position="8"> The next section briefly reviews word-alignment-based statistical machine translation (Brown et al., 1993). Section 3 discusses an alternative approach, a chunk-based translation model, covering its structure, training procedure, and decoding algorithm. Section 4 then provides experimental results on Japanese-to-English translation in the travel domain, followed by discussion.</Paragraph>
  </Section>
</Paper>