<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1006">
  <Title>Improved Word Alignment Using a Symmetric Lexicon Model</Title>
  <Section position="8" start_page="4" end_page="4" type="concl">
    <SectionTitle>
7 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have addressed the task of automatically generating word alignments for bilingual corpora. This problem is of great importance for many tasks in natural language processing, especially in the field of machine translation.</Paragraph>
    <Paragraph position="1"> We have presented lexicon symmetrization methods for statistical alignment models that are trained using the EM algorithm, in particular the five IBM models, the HMM, and Model 6. We have evaluated these methods on the Verbmobil task and the Canadian Hansards task and compared our results to the state-of-the-art system of Och and Ney (2003). We have shown that both the linear and the loglinear interpolation of lexicon counts after each iteration of the EM algorithm result in statistically significant improvements in alignment quality. For the Canadian Hansards task, the alignment error rate (AER) improved by about 30% relative; for the Verbmobil task, the improvement was about 25% relative.</Paragraph>
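The count interpolation summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weight `alpha`, the dictionary layout, and the small `floor` used to keep the loglinear case away from zero counts are all assumptions made for illustration, and the exact form of the interpolation in the paper may differ.

```python
from collections import defaultdict


def interpolate_counts(counts_st, counts_ts, alpha=0.5, mode="linear", floor=1e-10):
    """Combine lexicon counts collected in the two translation directions.

    counts_st, counts_ts: dicts mapping a (source_word, target_word) pair to
    the expected count from one direction's E-step (hypothetical layout).
    alpha: interpolation weight given to the second direction (assumed name).
    floor: assumed lower bound applied to counts in the loglinear case.
    """
    combined = {}
    for pair in set(counts_st) | set(counts_ts):
        a = counts_st.get(pair, 0.0)
        b = counts_ts.get(pair, 0.0)
        if mode == "linear":
            # Weighted arithmetic mean of the two directional counts.
            combined[pair] = (1.0 - alpha) * a + alpha * b
        elif mode == "loglinear":
            # Weighted geometric mean; the floor avoids zeroing out a pair
            # that was observed in only one direction.
            combined[pair] = max(a, floor) ** (1.0 - alpha) * max(b, floor) ** alpha
        else:
            raise ValueError("mode must be 'linear' or 'loglinear'")
    return combined


def normalize(counts):
    """Renormalize combined counts into p(source_word | target_word)."""
    totals = defaultdict(float)
    for (f, e), c in counts.items():
        totals[e] += c
    return {(f, e): c / totals[e] for (f, e), c in counts.items()}
```

In a training loop of the kind described here, such a function would presumably be called after each EM iteration's E-step, so that both directional models re-estimate their lexicon probabilities from the same combined counts.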
    <Paragraph position="2"> Additionally, we have described lexicon smoothing using word base forms. This smoothing resulted in statistically significant improvements, especially for highly inflected languages such as German.</Paragraph>
    <Paragraph position="3"> In the future, we plan to optimize the interpolation weights to balance the two translation directions. We will also investigate the possibility of directly generating an unconstrained alignment based on the symmetrized lexicon probabilities.</Paragraph>
  </Section>
</Paper>