<?xml version="1.0" standalone="yes"?> <Paper uid="P03-1039"> <Title>Chunk-based Statistical Translation</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The framework of statistical machine translation formulates the problem of translating a source-language sentence J into a target-language sentence E as the maximization of the conditional probability</Paragraph> <Paragraph position="2"> $\hat{E} = \arg\max_{E} P(E|J)$. Applying Bayes' rule, and dropping the denominator P(J) because it does not depend on E, gives</Paragraph> <Paragraph position="4"> $\hat{E} = \arg\max_{E} P(E)P(J|E)$. The former term P(E) is called a language model, representing the likelihood of E. The latter term P(J|E) is called a translation model, representing the generation probability from E into J.</Paragraph> <Paragraph position="5"> As an implementation of P(J|E), word alignment based statistical translation (Brown et al., 1993) has been successfully applied to similar language pairs, such as French-English and German-English, but not to drastically different ones, such as Japanese-English. This failure stems from the limited representation offered by word alignment and from a model structure too weak to handle complicated word correspondences.</Paragraph> <Paragraph position="6"> This paper presents chunk-based statistical translation as an alternative to word alignment based statistical translation. The translation process inside the translation model is structured as follows. A source sentence is first chunked, and then each chunk is translated into the target language with local word alignments. Next, the translated chunks are reordered to match the target language constraints.</Paragraph> <Paragraph position="7"> Based on this scenario, the chunk-based statistical translation model is structured with several components and trained by a variation of the EM algorithm. A translation experiment was carried out with a decoder based on left-to-right beam search. 
Translation quality improved from 46.5% to 52.1% in BLEU score and from 59.2% to 65.1% in subjective evaluation.</Paragraph> <Paragraph position="8"> The next section briefly reviews word alignment based statistical machine translation (Brown et al., 1993). Section 3 discusses an alternative approach, a chunk-based translation model, covering its structure, training procedure, and decoding algorithm. Section 4 then presents experimental results on Japanese-to-English translation in the travel domain, followed by discussion.</Paragraph> </Section> </Paper>