<?xml version="1.0" standalone="yes"?> <Paper uid="W06-3107"> <Title>Searching for alignments in SMT. A novel approach based on an Estimation of Distribution Algorithm*</Title> <Section position="2" start_page="0" end_page="47" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Nowadays, the statistical approach to machine translation is one of the most promising approaches in the field. The rationale behind this approach is to learn a statistical model from a parallel corpus. A parallel corpus can be defined as a set of sentence pairs, each pair containing a sentence in a source language and a translation of this sentence in a target language. Word alignments are necessary to link the words in the source sentence with those in the target sentence. Statistical models for machine translation depend heavily on the concept of alignment, in particular the well-known IBM word-based models (Brown et al., 1993). As a result, different tasks on alignments in statistical machine translation have been proposed in the last few years (HLT-NAACL 2003 (Mihalcea and Pedersen, 2003) and ACL 2005 (Joel Martin, 2005)). *This work has been supported by the Spanish Projects JCCM (PBI-05-022) and HERMES 05/06 (Vic. Inv. UCLM).</Paragraph> <Paragraph position="1"> In this paper, we propose a novel approach to deal with alignments. Specifically, we address the problem of searching for the best word alignment between a source and a target sentence. As there is no efficient exact method to compute the optimal alignment (known as the Viterbi alignment) in most cases (specifically, for IBM Models 3, 4 and 5), we propose the use of a recently introduced family of meta-heuristic algorithms, Estimation of Distribution Algorithms (EDAs). Clearly, a heuristic-based method cannot guarantee that the optimal alignment is found. 
Nonetheless, we expect that the global search carried out by our algorithm will produce high-quality results in most cases, as previous experiments with this technique (Larrañaga and Lozano, 2001) on different optimization tasks have demonstrated. In addition, the results presented in section 5 support the approach presented here.</Paragraph> <Paragraph position="2"> This paper is structured as follows. First, statistical word alignments are described in section 2.</Paragraph> <Paragraph position="3"> Estimation of Distribution Algorithms (EDAs) are introduced in section 3. An implementation of the search for alignments using an EDA is described in section 4. In section 5, we discuss the experimental issues and show the results obtained. Finally, some conclusions and future work are discussed in section 6.</Paragraph> </Section> </Paper>