File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/96/c96-1078_intro.xml
Size: 2,045 bytes
Last Modified: 2025-10-06 14:06:00
<?xml version="1.0" standalone="yes"?> <Paper uid="C96-1078"> <Title>Alignment of Shared Forests for Bilingual Corpora</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The development of a machine translation (MT) system requires the lengthy manual preparation of bilingual lexicons and transfer rules. Research over the past few years using parallel sentence-aligned bilingual corpora suggests ways in which this manual effort can be partly replaced by corpus-based training. Some of this research has treated the sentences as unstructured word sequences to be aligned; this work has primarily involved the acquisition of bilingual lexical correspondences (Chen, 1993), although there has also been an attempt to create a full MT system based on such treatment (Brown et al., 1993). Recently, several groups have been exploring the possibility of aligning parallel syntactically analyzed sentences from the source and target languages (cf. (Sato and Nagao, 1990), (Klavans and Tzoukermann, 1990), (Grishman and Kosaka, 1992), (Kaji et al., 1992), (Matsumoto et al., 1993) and (Grishman, 1994)).</Paragraph> <Paragraph position="1"> This offers the potential for acquiring not just lexical but also structural correspondences between the two languages. The specific goal in aligning syntax trees is to identify the corresponding tree fragments in the source and target trees. By processing a substantial corpus, a large set of such corresponding fragments can be collected. These can then serve as the example base for a form of example-based MT (cf. (Nagao, 1984), (Sato and Nagao, 1990), (Kaji et al., 1992), (Matsumoto et al., 1993) and (Furuse and Iida, 1994)). This approach requires a fast tree alignment technique; research has been hampered by the lack of efficient algorithms.
This paper describes an efficient algorithm for bilingual tree alignment.</Paragraph> </Section> </Paper>