<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1019">
  <Title>Minimum Bayes-Risk Word Alignments of Bilingual Texts</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The automatic determination of word alignments in bilingual corpora would be useful for Natural Language Processing tasks such as statistical machine translation, automatic dictionary construction, and multilingual document retrieval. The development of techniques in all these areas would be facilitated by automatic performance metrics, and alignment and translation quality metrics have been proposed (Och and Ney, 2000b; Papineni et al., 2002).</Paragraph>
    <Paragraph position="1"> However, given the difficulty of judging translation quality, it is unlikely that a single, global metric will be found for any of these tasks. It is more likely that specialized metrics will be developed to measure specific aspects of system performance. This is even desirable, as these specialized metrics could be used in tuning systems for particular applications.</Paragraph>
    <Paragraph position="2"> We have applied Minimum Bayes-Risk (MBR) procedures developed for automatic speech recognition (Goel and Byrne, 2000) to word alignment of bitexts. This is a modeling approach that can be used with statistical models of speech and language to develop algorithms that are optimized for specific loss functions. We will discuss loss functions that can be used for word alignment and show how the overall alignment process can be improved by the use of loss functions that incorporate linguistic features, such as parses and part-of-speech tags.</Paragraph>
  </Section>
</Paper>