File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/91/p91-1023_abstr.xml

Size: 949 bytes

Last Modified: 2025-10-06 13:47:17

<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1023">
  <Title>A PROGRAM FOR ALIGNING SENTENCES IN BILINGUAL CORPORA</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> Researchers in both machine translation (e.g., Brown et al., 1990) and bilingual lexicography (e.g., Klavans and Tzoukermann, 1990) have recently become interested in studying parallel texts, texts such as the Canadian Hansards (parliamentary proceedings) which are available in multiple languages (French and English). This paper describes a method for aligning sentences in these parallel texts, based on a simple statistical model of character lengths. The method was developed and tested on a small trilingual sample of Swiss economic reports. A much larger sample of 90 million words of Canadian Hansards has been aligned and donated to the ACL/DCI.</Paragraph>
  </Section>
</Paper>