<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1627">
  <Title>Efficient Search for Inversion Transduction Grammar</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The Inversion Transduction Grammar (ITG) of Wu (1997) is a syntactically motivated algorithm for producing word-level alignments of pairs of translationally equivalent sentences in two languages. The algorithm builds a synchronous parse tree for both sentences, and assumes that the trees have the same underlying structure but that the ordering of constituents may differ in the two languages. ITG imposes constraints on which alignments are possible, and these constraints have been shown to be a good match for real bitext data (Zens and Ney, 2003).</Paragraph>
    <Paragraph position="1"> A major motivation for the introduction of ITG was the existence of polynomial-time algorithms both for alignment and translation. Alignment, whether for training a translation model using EM or for finding the Viterbi alignment of test data, is O(n^6) (Wu, 1997), while translation (decoding) is O(n^7) using a bigram language model, and O(n^11) with trigrams. While polynomial-time algorithms are a major improvement over the NP-complete problems posed by the alignment models of Brown et al. (1993), the degree of these polynomials is high, making both alignment and decoding infeasible for realistic sentences without very significant pruning. In this paper, we explore use of the hook trick (Eisner and Satta, 1999; Huang et al., 2005) to reduce the asymptotic complexity of decoding, and the use of heuristics to guide the search.</Paragraph>
    <Paragraph position="2"> Our search heuristics are a conservative estimate of the outside probability of a bitext cell in the complete synchronous parse. Some estimate of this outside probability is a common element of modern statistical (monolingual) parsers (Charniak et al., 1998; Collins, 1999), and recent work has developed heuristics that are admissible for A* search, guaranteeing that the optimal parse will be found (Klein and Manning, 2003). We extend this type of outside probability estimate to include both word translation and n-gram language model probabilities. These measures have been used to guide search in word- or phrase-based MT systems (Wang and Waibel, 1997; Och et al., 2001), but in such models optimal search is generally not practical even with good heuristics. In this paper, we show that the same assumptions that make ITG polynomial-time can be used to efficiently compute heuristics which guarantee us that we will find the optimal alignment or translation, while significantly speeding the search.</Paragraph>
  </Section>
</Paper>