<?xml version="1.0" standalone="yes"?>
<Paper uid="J00-1004">
  <Title>Learning Dependency Translation Models as Collections of Finite-State Head Transducers</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> We will define a dependency transduction model in terms of a collection of weighted head transducers. Each head transducer is a finite-state machine that differs from &amp;quot;standard&amp;quot; finite-state transducers in that, instead of consuming the input string left to right, it consumes it &amp;quot;middle out&amp;quot; from a symbol in the string. Similarly, the output of a head transducer is built up middle out at positions relative to a symbol in the output string. The resulting finite-state machines are more expressive than standard left-to-right transducers. In particular, they allow long-distance movement with fewer states than a traditional finite-state transducer, a useful property for the translation task to which we apply them in this paper. (In fact, finite-state head transducers are capable of unbounded movement with a finite number of states.) In Section 2, we introduce head transducers and explain how input-output positions on state transitions result in middle-out transduction.</Paragraph>
    <Paragraph position="1"> When applied to the problem of translation, the head transducers forming the dependency transduction model operate on input and output strings that are sequences of dependents of corresponding headwords in the source and target languages. The dependency transduction model produces synchronized dependency trees in which each local tree is produced by a head transducer. In other words, the dependency * 180 Park Avenue, Florham Park, NJ 07932 t 180 Park Avenue, Florham Park, NJ 07932 180 Park Avenue, Florham Park, NJ 07932 @ 2000 Association for Computational Linguistics Computational Linguistics Volume 26, Number 1 model applies the head transducers recursively, imposing a recursive decomposition of the source and target strings. A dynamic programming search algorithm finds optimal (lowest total weight) derivations of target strings from input strings or word lattices produced by a speech recognizer. Section 3 defines dependency transduction models and describes the search algorithm.</Paragraph>
    <Paragraph position="2"> We construct the dependency transduction models for translation automatically from a set of unannotated examples, each example comprising a source string and a corresponding target string. The recursive decomposition of the training examples results from an algorithm for computing hierarchical alignments of the examples, described in Section 4.2. This alignment algorithm uses dynamic programming search guided by source-target word correlation statistics as described in Section 4.1.</Paragraph>
    <Paragraph position="3"> Having constructed a hierarchical alignment for the training examples, a set of head transducer transitions are constructed from each example as described in Section 4.3. Finally, the dependency transduction model is constructed by aggregating the resulting head transducers and assigning transition weights, which are log probabilities computed from the training counts by simple maximum likelihood estimation. We have applied this method of training statistical dependency transduction models in experiments on English-to-Spanish and English-to-Japanese translations of transcribed spoken utterances. The results of these experiments are described in Section 5; our concluding remarks are in Section 6.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML