<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1078">
  <Title>Alignment of Shared Forests for Bilingual Corpora</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The development of a machine translation (MT) system requires the lengthy manual preparation of bilingual lexicons and transfer rules. Research over the past few years using parallel sentence-aligned bilingual corpora suggests ways in which this manual effort can be partly replaced by corpus-based training. Some of this research has treated the sentences as unstructured word sequences to be aligned; this work has primarily involved the acquisition of bilingual lexical correspondences (Chen, 1993), although there has also been an attempt to create a full MT system based on such treatment (Brown et al., 1993). Recently, several groups have been exploring the possibility of aligning parallel syntactically analyzed sentences from the source and target languages (cf. (Sato and Nagao, 1990), (Klavans and Tzoukermann, 1990), (Grishman and Kosaka, 1992), (Kaji et al., 1992), (Matsumoto et al., 1993) and (Grishman, 1994)).</Paragraph>
    <Paragraph position="1"> This offers the potential for acquiring not just lexical but also structural correspondences between the two languages. The specific goal in aligning syntax trees is to identify the corresponding tree fragments in the source and target trees. By processing a substantial corpus, a large set of such corresponding fragments can be collected. These can then serve as the example base for a form of example-based MT (cf. (Nagao, 1984), (Sato and Nagao, 1990), (Kaji et al., 1992), (Matsumoto et al., 1993) and (Furuse and Iida, 1994)). This approach requires a fast tree alignment technique; research has been hampered by the lack of efficient algorithms. This paper describes an efficient algorithm for bilingual tree alignment.</Paragraph>
  </Section>
</Paper>