<?xml version="1.0" standalone="yes"?> <Paper uid="E06-1004"> <Title>Computational Complexity of Statistical Machine Translation</Title> <Section position="2" start_page="0" end_page="25" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Statistical Machine Translation (SMT) is a data-driven machine translation technique that uses probabilistic models of natural language for automatic translation (Brown et al., 1993; Al-Onaizan et al., 1999). The parameters of the models are estimated by iterative maximum-likelihood training on a large parallel corpus of natural language texts using the EM algorithm (Brown et al., 1993).</Paragraph> <Paragraph position="1"> The models are then used to decode, i.e. to translate texts from the source language to the target language (Tillman, 2001; Wang, 1997; Germann et al., 2003; Udupa et al., 2004). The models are independent of the language pair and can therefore be used to build a translation system for any language pair, as long as a parallel corpus of texts is available for training. Parallel corpora are increasingly becoming available for many language pairs, and SMT systems have been built for French-English, German-English, Arabic-English, Chinese-English, Hindi-English and other language pairs (Brown et al., 1993; Al-Onaizan et al., 1999; Udupa, 2004).</Paragraph> <Paragraph position="2"> In SMT, every English sentence e is considered a translation of a given French sentence f with probability Pr(f|e). Therefore, the problem of translating f can be viewed as a problem of finding the most probable translation of f:</Paragraph> <Paragraph position="3"> \[ \hat{e} = \arg\max_{e} \Pr(f|e)\,\Pr(e) \] </Paragraph> <Paragraph position="4"> The probability distributions Pr(f|e) and Pr(e) are known as the translation model and the language model respectively. In the classic work on SMT, Brown and his colleagues at IBM introduced the notion of alignment between a sentence f and its translation e and used it in the development of translation models (Brown et al., 1993). 
An alignment between f = f_1 ... f_m and e = e_1 ... e_l is a many-to-one mapping a : {1,...,m} → {0,...,l}. Thus, an alignment a between f and e associates the French word f_j with the English word e_{a_j}. The number of words of f mapped to e_i by a is called the fertility of e_i and is denoted by φ_i. Since Pr(f|e) = Σ_a Pr(f,a|e), the problem of finding the most probable translation of f can be rewritten as follows:</Paragraph> <Paragraph position="5"> \[ \hat{e} = \arg\max_{e} \sum_{a} \Pr(f,a|e)\,\Pr(e) \] </Paragraph> <Paragraph position="6"> Brown and his colleagues developed a series of five translation models, which have come to be known in the field of machine translation as the IBM models. For a detailed introduction to the IBM translation models, see (Brown et al., 1993). In practice, models 3-5 are known to give good results and models 1-2 are used to seed the EM iterations of the higher models. IBM model 3 is the prototypical translation model and it models</Paragraph> <Paragraph position="8"> Here, n(φ|e) is the fertility model, t(f|e) is the lexicon model and d(j|i,m,l) is the distortion model.</Paragraph> <Paragraph position="9"> The computational tasks involving IBM Models are the following:</Paragraph> </Section> </Paper>