<?xml version="1.0" standalone="yes"?> <Paper uid="P04-3014"> <Title>Improving Bitext Word Alignments via Syntax-based Reordering of English</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Prior Work </SectionTitle> <Paragraph position="0"> The 2003 HLT-NAACL Workshop on Building and Using Parallel Texts (Mihalcea and Pedersen, 2003) reflected the increasing importance of the word-alignment task, and established standard performance measures and some benchmark tasks.</Paragraph> <Paragraph position="1"> There is prior work studying systematic cross-linguistic structural divergences, such as the DUSTer system (Dorr et al., 2002). While the focus on major classes of structural variation, such as manner-of-motion verb-phrase transformations, has facilitated both transfer and generation in machine translation, these divergences have not been integrated into a system that produces automatic word alignments, and this work has tended to focus on more local phrasal variation rather than more comprehensive sentential syntactic reordering.</Paragraph> <Paragraph position="2"> Complementary prior work (e.g., Wu, 1995) has also addressed syntactic transduction for bilingual parsing, translation, and word alignment. Much of this work depends on high-quality parsing of both target and source sentences, which may be unavailable for many &quot;lower density&quot; languages of interest. Tree-to-string models, such as that of Yamada and Knight (2001), remove this dependency, and such models are well suited for situations with large, cleanly translated training corpora. By contrast, our method retains the robustness of the underlying aligner towards loose translations, and can if necessary use knowledge of syntactic divergences even in the absence of any training corpora whatsoever, using only a translation lexicon.</Paragraph> </Section> </Paper>