<?xml version="1.0" standalone="yes"?> <Paper uid="N06-1004"> <Title>Segment Choice Models: Feature-Rich Models for Global Distortion in Statistical Machine Translation</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper presents a new approach to distortion (phrase reordering) in phrase-based machine translation (MT). Distortion is modeled as a sequence of choices during translation. The approach yields trainable, probabilistic distortion models that are global: they assign a probability to each possible phrase reordering. These &quot;segment choice&quot; models (SCMs) can be trained on &quot;segment-aligned&quot; sentence pairs; they can be applied during decoding or rescoring. The approach yields a metric called &quot;distortion perplexity&quot; (&quot;disperp&quot;) for comparing SCMs offline on test data, analogous to perplexity for language models. A decision-tree-based SCM is tested on Chinese-to-English translation, and outperforms a baseline distortion penalty approach at the 99% confidence level.</Paragraph> </Section> </Paper>