<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1045"> <Title>Improving Word Alignment Quality using Morpho-syntactic Information</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> In statistical machine translation, a translation model Pr(f_1^J | e_1^I) describes the correspondences between the words in the source language sentence f_1^J and the words in the target language sentence e_1^I. Statistical alignment models are created by introducing a hidden variable a_1^J representing a mapping from the source word f_j to the target word e_{a_j}. So far, most statistical machine translation systems are based on the single-word alignment models described in (Brown et al., 1993) and on the Hidden Markov alignment model (Vogel et al., 1996). The lexicon models used in these systems typically do not include any linguistic or contextual information, which often results in inadequate alignments between the sentence pairs.</Paragraph> <Paragraph position="1"> In this work, we propose an approach to improve the quality of the statistical alignments by taking into account the interdependencies of different derivations of the words. We make use of the hierarchical representation of the statistical lexicon model as proposed in (Niessen and Ney, 2001) in the conventional EM training procedure. Experimental results are reported for the German-English Verbmobil corpus. The evaluation compares the Viterbi alignments obtained after training conventional models, and models that use morpho-syntactic information, with a manually annotated reference alignment.</Paragraph> </Section> </Paper>