<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2124">
  <Title>BiTAM: Bilingual Topic AdMixture Models for Word Alignment</Title>
  <Section position="4" start_page="969" end_page="969" type="intro">
    <SectionTitle>
2 Notations and Baseline
</SectionTitle>
    <Paragraph position="0"> In statistical machine translation, one typically uses parallel data to identify entities such as &quot;word-pair&quot;, &quot;sentence-pair&quot;, and &quot;document-pair&quot;. Formally, we define the following terms: * A word-pair (fj,ei) is the basic unit for word alignment, where fj is a French word and ei is an English word; j and i are the position indices in the corresponding French sentence f and English sentence e.</Paragraph>
    <Paragraph position="1"> * A sentence-pair (f,e) consists of a source sentence f of length J and a target sentence e of length I; the two sentences f and e are translations of each other.</Paragraph>
    <Paragraph position="2"> * A document-pair (F,E) refers to two documents which are translations of each other.</Paragraph>
    <Paragraph position="3"> Assuming that sentences correspond one-to-one, a document-pair has a sequence of N parallel sentence-pairs {(fn,en)}, where (fn,en) is the n-th parallel sentence-pair.</Paragraph>
    <Paragraph position="4"> * A parallel corpus C is a collection of M parallel document-pairs: {(Fd,Ed)}.</Paragraph>
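As a concrete sketch, the hierarchy of entities defined above (word-pair, sentence-pair, document-pair, corpus) can be expressed as simple data structures. The class and field names below are illustrative, not from the paper:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SentencePair:
    """A sentence-pair (f, e): f has length J, e has length I."""
    f: List[str]  # source (French) sentence
    e: List[str]  # target (English) sentence

@dataclass
class DocumentPair:
    """A document-pair (F, E): N parallel sentence-pairs."""
    sentence_pairs: List[SentencePair]

# A parallel corpus C is then simply a collection of M document-pairs.
Corpus = List[DocumentPair]

# A word-pair (fj, ei) is addressed by position indices into one pair:
pair = SentencePair(f=["la", "maison"], e=["the", "house"])
j, i = 0, 0
word_pair: Tuple[str, str] = (pair.f[j], pair.e[i])  # ("la", "the")
```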
    <Section position="1" start_page="969" end_page="969" type="sub_section">
      <SectionTitle>
2.1 Baseline: IBM Model-1
</SectionTitle>
      <Paragraph position="0"> The translation process can be viewed as operations of word substitutions, permutations, and insertions/deletions (Brown et al., 1993) in a noisy-channel modeling scheme at the parallel sentence-pair level. The translation lexicon p(f|e) is the key component in this generative process. An efficient way to learn p(f|e) is IBM-1:</Paragraph>
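The equation paragraph that followed here (Paragraph position="1") did not survive extraction; the standard IBM Model-1 likelihood it refers to (Brown et al., 1993) has the form:

```latex
p(\mathbf{f} \mid \mathbf{e}) \;=\; \frac{\epsilon}{(I+1)^{J}}
  \prod_{j=1}^{J} \sum_{i=0}^{I} p(f_j \mid e_i)
```

where the sum over i includes the empty (NULL) English word e_0 and \epsilon is a constant sentence-length factor.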
      <Paragraph position="2"> We use the English-French notation, i.e., e → f, although our models are tested, in this paper, on English-Chinese. We use the end-user terminology for source and target languages.</Paragraph>
      <Paragraph position="3"> IBM-1 has a global optimum; it is efficient and scales easily to large training data; and it is one of the most informative components for re-ranking translations (Och et al., 2004). We take IBM-1 as our baseline model, though higher-order alignment models can be embedded similarly within the proposed framework.</Paragraph>
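A minimal sketch of how the IBM-1 lexicon p(f|e) is learned by EM, assuming tokenized sentence-pairs and a NULL target word; the function and variable names are our own. Because the IBM-1 likelihood is concave, this procedure converges to the global optimum noted above:

```python
from collections import defaultdict

def train_ibm1(sentence_pairs, iterations=10):
    """EM training of the IBM-1 translation lexicon t(f|e).

    `sentence_pairs` is a list of (f_words, e_words) tuples; a NULL
    token is prepended to every target sentence e.
    """
    # Uniform initialization of t(f|e) over the observed source vocabulary.
    f_vocab = {f for fs, _ in sentence_pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))

    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        # E-step: collect expected alignment counts under current t.
        for fs, es in sentence_pairs:
            es = ["<NULL>"] + es
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: re-normalize expected counts into the new lexicon.
        t = defaultdict(float)
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

pairs = [(["la", "maison"], ["the", "house"]),
         (["la", "fleur"], ["the", "flower"])]
t = train_ibm1(pairs)
# "la" comes to prefer "the", since they co-occur in both pairs.
```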
    </Section>
  </Section>
</Paper>