<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1154">
  <Title>Robust Sub-Sentential Alignment of Phrase-Structure Trees</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Related Research
</SectionTitle>
    <Paragraph position="0"> Several approaches to sub-structural alignment of tree representations have been proposed.</Paragraph>
    <Paragraph position="1"> (Matsumoto et al., 1993) and (Imamura, 2001) focus on using alignments to help resolve parsing ambiguities. Because we wish to develop an alignment process for use in MT rather than in parsing, their approaches are unsuitable for our purposes.</Paragraph>
    <Paragraph position="2"> (Eisner, 2003) presents a tree-mapping method for use on dependency trees which he claims can be adapted for use with PS trees. He uses dynamic programming to break tree pairs into pairs of aligned elementary trees, similar to DOT. However, he aims to estimate a translation model from unaligned data, whereas we wish to align our data off-line. Currently, he has used his algorithm to perform intra-lingual translation but has yet to develop and apply real models to inter-lingual MT.</Paragraph>
    <Paragraph position="3"> (Gildea, 2003) outlines an algorithm for use in syntax-based statistical models of MT, applying a statistical TSG with probabilities parameterized to generate the target tree conditioned on the structure of the source tree. His approach is unsuitable for DOT because it alters the shape of trees in order to impose isomorphism, and the algorithm does not always generate a complete target tree structure.</Paragraph>
    <Paragraph position="4"> However, unlike (Gildea, 2003), we treat the problem of alignment as a separate task rather than as part of a generative translation model.</Paragraph>
    <Paragraph position="5"> (Ding et al., 2003) and (Menezes and Richardson, 2003) also present approaches to the alignment of tree structures. Both deal with dependency structures rather than PS trees. (Ding et al., 2003) outline an algorithm to extract word-level alignments using structural information taken from parallel dependency trees. They fix the nodes of tree pairs based on word alignments deduced statistically and then proceed by partitioning the tree into treelet pairs with the fixed nodes as their roots. Their algorithm relies on the fact that, in dependency trees, subtrees are headed by words rather than syntactic labels, making it unsuitable for our use.</Paragraph>
    <Paragraph position="6"> (Menezes and Richardson, 2003) employ a best-first strategy and use a small alignment grammar to extract transfer mappings from bilingual corpora for use in translation. They use a bilingual dictionary and statistical techniques to supply translation pair candidates and to identify multi-word terms.</Paragraph>
    <Paragraph position="7"> Lexical correspondences are established using the lexicon of 98,000 translation pairs and a derivational morphology component to match other lexical items. Nodes are then aligned using these lexical correspondences along with structural information.</Paragraph>
    <Paragraph position="8"> Our algorithm uses a similar methodology. However, (Menezes and Richardson, 2003) use logical forms, which constitute a variation of dependency trees that normalize both the lexical and syntactic form of examples, whereas we align PS trees.</Paragraph>
    <Paragraph position="9"> Although the methods outlined above have achieved promising results, only the approach of (Menezes and Richardson, 2003) seems relevant to our goal, even though they deal with abstract dependency-type structures rather than PS trees.</Paragraph>
  </Section>
</Paper>