File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/99/w99-0602_abstr.xml

Size: 3,382 bytes

Last Modified: 2025-10-06 13:49:51

<?xml version="1.0" standalone="yes"?>
<Paper uid="W99-0602">
  <Title>Text-Translation Alignment: Three Languages Are Better Than Two *</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this article, we show how a bilingual text-translation alignment method can be adapted to deal with more than two versions of a text.</Paragraph>
    <Paragraph position="1"> Experiments on a trilingual corpus demonstrate that this method yields better bilingual alignments than can be obtained with bilingual text-alignment methods. Moreover, for a given number of texts, the computational complexity of the multilingual method is the same as for bilingual alignment.</Paragraph>
    <Paragraph position="2"> Introduction. While bilingual text corpora have been part of the computational linguistics scene for over ten years now, we have recently witnessed the appearance of text corpora containing versions of texts in three or more languages, such as those developed within the CRATER (McEnery et al., 1997), MULTEXT (Ide and Véronis, 1994) and MULTEXT-EAST (Erjavec and Ide, 1998) projects. Access to corpora of this type raises a number of questions: Do they make new applications possible? Can methods developed for handling bilingual texts be applied to multilingual texts? More generally: is there anything to gain in viewing multilingual documents as more than just multiple pairs of translations? Bilingual alignments have so far been shown to play multiple roles in a wide range of linguistic applications, such as computer-assisted translation (Isabelle et al., 1993; Brown et al., 1990), terminology (Dagan and Church, 1994), lexicography (Langlois, 1996; Klavans and Tzoukermann, 1995; Melamed, 1996), and cross-language information retrieval (Nie et al., 1998). However, the case for trilingual and multilingual alignments is not as clear. True multilingual resources such as multilingual glossaries are not widely used, and when such resources exist, their real purpose is usually to provide bilingual resources for multiple pairs of languages in a compact way. * This research was funded by the Canadian Department of Foreign Affairs and International Trade (http://www.dfait-maeci.gc.ca/), via the Agence de la francophonie (http://www.francophonie.org).</Paragraph>
    <Paragraph position="3"> What we intend to show here is that while multilingual correspondences may not be interesting in themselves, multilingual text alignment techniques can be useful as a means of extracting information on bilingual correspondences. Our idea is that each additional version of a text should be viewed as valuable information that can be used to produce better alignments. In other words: whatever the intended application, three languages are better than two (and, more generally: the more languages, the merrier!).</Paragraph>
    <Paragraph position="4"> After going through some definitions and preliminary material (Section 1), we present a general method for aligning three versions of a text (Section 2). We then describe some experiments that were carried out to evaluate this approach (Section 3) and various possible optimizations (Section 4). Finally, we report on some disturbing experiments (Section 5), and conclude with directions for future work.</Paragraph>
  </Section>
</Paper>