<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1154">
  <Title>Robust Sub-Sentential Alignment of Phrase-Structure Trees</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Data-Oriented Translation
</SectionTitle>
    <Paragraph position="0"> Data-Oriented Translation (DOT) (Poutsma, 2000; Hearne and Way, 2003), which is based on Data-Oriented Parsing (DOP) (Bod, 1998; Bod et al., 2003), comprises a context-rich, experience-based approach to translation, where new translations are derived with reference to grammatical analyses of previous translations. DOT exploits bilingual treebanks consisting of linguistic representations of previously seen translation pairs, as well as explicit links which map the translational equivalences present within these pairs at sub-sentential level; an example of such a linked translation pair can be seen in Figure 1(a). Analyses and translations of the input are produced simultaneously by combining source and target language fragment pairs derived from the treebank trees.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Fragmentation
</SectionTitle>
      <Paragraph position="0"> The tree fragment pairs used in Tree-DOT are called subtree pairs and are extracted from bilingual aligned treebank trees. The two decomposition operators, which are similar to those used in Tree-DOP but are refined to take the translational links into account, are as follows:
* the root operator, which takes any pair of linked nodes in a tree pair to be the roots of a subtree pair and deletes all nodes except these new roots and all nodes dominated by them;
* the frontier operator, which selects a (possibly empty) set of linked node pairs in the newly created subtree pairs, excluding the roots, and deletes all subtrees dominated by these nodes.</Paragraph>
      <Paragraph position="1"> Allowing the root operator to select the root nodes of the original treebank tree pair and then the frontier operator to select an empty set of node pairs ensures that the original treebank tree pair is always included in the fragment base - in Figure 1, fragment (a) exactly matches the original treebank tree pair from which fragments (a) - (f) were derived. Fragments (b) and (f) were also derived by allowing the frontier operator to select the empty set; the root operator selected node pairs &lt;A,N&gt; and &lt;NPadj,NPdet&gt; respectively. Fragments (c), (d) and (e) were derived by selecting all further possible combinations of node pairs by root and frontier.</Paragraph>
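The two operators above can be sketched in code. This is a toy illustration, not the authors' implementation: the Node class, the &lt;red car, voiture rouge&gt; example and the link choices are hypothetical; only the behaviour of the operators follows the definitions in the text.

```python
# Toy sketch of the two Tree-DOT decomposition operators.
from copy import deepcopy

class Node:
    """Minimal phrase-structure node (hypothetical helper)."""
    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)

    def fringe(self):
        # Leaf labels left-to-right; an open substitution site shows
        # up as its own (phrasal) label.
        if not self.children:
            return [self.label]
        return [leaf for c in self.children for leaf in c.fringe()]

def root_op(s_node, t_node):
    """Root operator: a linked node pair becomes the new roots; material
    outside the subtrees they dominate is discarded (here, by copying
    just those subtrees)."""
    return deepcopy(s_node), deepcopy(t_node)

def frontier_op(s_root, t_root, frontier_pairs):
    """Frontier operator: delete all subtrees below each selected linked
    node pair (excluding the roots), leaving open substitution sites."""
    for s_node, t_node in frontier_pairs:
        s_node.children, t_node.children = [], []
    return s_root, t_root

# Linked pair for <red car, voiture rouge>, with A<->A and N<->N links.
src = Node("NP", Node("A", Node("red")), Node("N", Node("car")))
tgt = Node("NP", Node("N", Node("voiture")), Node("A", Node("rouge")))

s_frag, t_frag = root_op(src, tgt)              # roots: the <NP,NP> link
frontier_op(s_frag, t_frag,
            [(s_frag.children[0], t_frag.children[1])])  # cut below <A,A>
```

After the frontier cut, the source fragment spells out "A_ car" and the target fragment "voiture A_", with the A nodes left as open substitution sites, mirroring fragment (e)-style pairs in Figure 1.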
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Translation
</SectionTitle>
      <Paragraph position="0"> The DOT composition operator is defined as follows. The composition of tree pairs &lt;s1,t1&gt; and &lt;s2,t2&gt; (&lt;s1,t1&gt; * &lt;s2,t2&gt;) is only possible if:
* the leftmost non-terminal frontier node of s1 is of the same syntactic category (e.g. S, NP, VP) as the root node of s2, and
* the leftmost non-terminal frontier node of s1's linked counterpart in t1 is of the same syntactic category as the root node of t2.</Paragraph>
      <Paragraph position="1"> The resulting tree pair consists of a copy of s1 where s2 has been inserted at the leftmost frontier node and a copy of t1 where t2 has been inserted at the node linked to s1's leftmost frontier node, as illustrated in Figure 2.</Paragraph>
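The licensing condition for composition can be sketched as a small check. This is a hedged illustration under a toy convention (an open substitution site is a childless node whose label starts with an uppercase letter; lexical leaves are lowercase); the Node class, the trees and the link map are assumptions, not the paper's data structures.

```python
# Sketch of the DOT composition check.
class Node:
    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)

def leftmost_open(node):
    """Return the leftmost non-terminal frontier node, or None."""
    if not node.children:
        return node if node.label[0].isupper() else None
    for child in node.children:
        site = leftmost_open(child)
        if site is not None:
            return site
    return None

def can_compose(s1, t1, s2, t2, links):
    """<s1,t1> o <s2,t2> is licensed iff the leftmost open node of s1
    has the category of root(s2), and its linked counterpart in t1 has
    the category of root(t2)."""
    site_s = leftmost_open(s1)
    site_t = links.get(id(site_s)) if site_s is not None else None
    return (site_t is not None
            and site_s.label == s2.label
            and site_t.label == t2.label)

# Toy English-French fragment pair with an open <NP,NP> slot.
s1 = Node("S", Node("NP"), Node("VP", Node("V", Node("sees"))))
t1 = Node("S", Node("NP"), Node("VP", Node("V", Node("voit"))))
links = {id(s1.children[0]): t1.children[0]}    # the <NP,NP> link

s2 = Node("NP", Node("N", Node("man")))
t2 = Node("NP", Node("N", Node("homme")))
```

Here can_compose(s1, t1, s2, t2, links) holds because both substitution sites and both roots are NPs; substituting a PP-rooted target fragment instead would be rejected by the second condition.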
      <Paragraph position="2"> The DOT probability of a translation derivation is the joint probability of choosing each of the subtree pairs involved in that derivation. The probability of selecting a subtree pair is its number of occurrences in the corpus divided by the number of pairs in the corpus with the same root nodes as it:</Paragraph>
      <Paragraph position="4"> The probability of a derivation in DOT is the product of the probabilities of the subtree pairs involved  in building that derivation. Thus, the probability of</Paragraph>
      <Paragraph position="6"> Again, a translation can be generated by many different derivations, so the probability of a translation ws --wt is the sum of the probabilities of its derivations:</Paragraph>
      <Paragraph position="8"> Selection of the most probable translation via Monte Carlo sampling involves taking a random sample of derivations and outputting the most frequently occurring translation in the sample.</Paragraph>
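The equation paragraphs referred to above did not survive extraction. The standard DOT definitions they describe can be reconstructed as follows (notation assumed; consistent with the surrounding prose and the DOP/DOT literature):

```latex
% Probability of selecting a subtree pair <s,t>: its corpus frequency
% divided by the total frequency of pairs with the same root categories.
P(\langle s,t\rangle) =
  \frac{\left|\langle s,t\rangle\right|}
       {\sum_{\langle u,v\rangle:\ \mathrm{root}(u)=\mathrm{root}(s),\
              \mathrm{root}(v)=\mathrm{root}(t)} \left|\langle u,v\rangle\right|}

% Probability of a derivation D: product over the subtree pairs used.
P(D) = \prod_{\langle s_i,t_i\rangle \in D} P(\langle s_i,t_i\rangle)

% Probability of a translation w_s -> w_t: sum over all derivations
% yielding that <source,target> sentence pair.
P(w_s \to w_t) = \sum_{D\ \text{yields}\ \langle w_s,w_t\rangle} P(D)
```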
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Our Algorithm
</SectionTitle>
    <Paragraph position="0"> The operation of a DOT system is dependent on the availability of bilingual treebanks aligned at sentential and sub-sentential level. Our novel algorithm attempts to fully automate sub-sentential alignment using an approach inspired by that of (Menezes and Richardson, 2003). The algorithm takes as input a pair of &lt;source,target&gt; PS trees and outputs a mapping between the nodes of the tree pair.</Paragraph>
    <Paragraph position="1"> As with the majority of previous approaches, the algorithm starts by finding lexical correspondences between the source and target trees. Our lexicon is built automatically using a previously developed word aligner based on the k-vec aligner as outlined by (Fung &amp; Church, 1994). This lexical aligner uses a combination of automatically extracted cognate information, mutual information and probabilistic measures to obtain one-to-one lexical correspondences between the source and target strings. During lexical alignment, function words are excluded because, as they are the most common words in a language, they tend to co-occur frequently with the content words they precede. This can lead to the incorrect alignment of content words with function words.</Paragraph>
    <Paragraph position="2"> The algorithm then proceeds from the aligned lexical terminal nodes in a bottom-up fashion, using a mixture of node label matching and structural information to perform language-independent linking between all &lt;source,target&gt; node pairs within the trees. As with (Menezes and Richardson, 2003), it uses a best-first approach. After each step, new linked node pairs are added to the current list of linked nodes. The links made between the nodes are fixed, thus restricting the freedom of alignment for the remaining unaligned nodes in the tree pair.</Paragraph>
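The best-first control described above can be sketched as a worklist loop. This is a hedged sketch, not the authors' code: the method signature (a function taking a linked pair and the current link set and returning proposed links) and the toy parent map are assumptions.

```python
# Sketch of the bottom-up, best-first alignment loop.
def align(initial_links, methods):
    links = list(initial_links)          # output, in creation order
    linked = set(initial_links)          # fixed links (never undone)
    agenda = list(initial_links)         # linked pairs still to process
    while agenda:
        pair = agenda.pop(0)
        for method in methods:
            for new_pair in method(pair, linked):
                if new_pair not in linked:   # links are fixed once made
                    linked.add(new_pair)
                    links.append(new_pair)
                    agenda.append(new_pair)
    return links

# Toy Parent-Align-style method over string node ids (hypothetical).
PARENT = {"n_s": "np_s", "n_t": "np_t"}
def toy_parent_align(pair, linked):
    s, t = pair
    if s in PARENT and t in PARENT:
        return [(PARENT[s], PARENT[t])]
    return []

result = align([("n_s", "n_t")], [toy_parent_align])
```

Because new links re-enter the agenda, a link made by one method can trigger further links by another, and the loop terminates exactly when no method can add a new pair, as the text describes.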
    <Paragraph position="3"> The methods of the algorithm are applied to each new linked node pair in turn until no new node pairs can be added. The algorithm consists of five main methods which are performed on each linked node pair in the list. Verb + Object Align (Figure 3): We have a linked source-target node pair &lt;s,t&gt;. s and t are both verbs, are the leftmost children in their respective trees, both have VP parent nodes and they have the same number of siblings which have similar syntactic labels. We align the corresponding siblings of s and t. This aligns the objects of the source verb with the equivalent objects of the target verb. We also align the parents of s and t.</Paragraph>
    <Paragraph position="4"> Figure 3 shows the links made by Verb + Object Align when the current linked node pair is &lt;MODAL,MODAL&gt;.</Paragraph>
    <Paragraph position="5"> Parent Align (Figure 4): We have a current linked source-target node pair &lt;s,t&gt; with unlinked parents pars and part respectively. If all the sister nodes of s are aligned with sister nodes of t, we link pars and part. If s and t each have one unlinked sister, but the remaining sisters of s are aligned with sister nodes of t, we link the unlinked sisters and link pars with part.</Paragraph>
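The Parent Align step can be sketched over hashable node ids. The `parent` and `children` maps and the determiner/noun example are assumptions for illustration, not the authors' data structures.

```python
# Sketch of Parent Align: link parents when the sisters line up.
def parent_align(s, t, parent, children, linked):
    ps, pt = parent.get(s), parent.get(t)
    if ps is None or pt is None or (ps, pt) in linked:
        return []
    sis_s = [n for n in children[ps] if n != s]
    sis_t = [n for n in children[pt] if n != t]
    aligned_t = {b for (a, b) in linked if a in sis_s}
    free_s = [n for n in sis_s if not any((n, m) in linked for m in sis_t)]
    free_t = [n for n in sis_t if n not in aligned_t]
    if not free_s and not free_t:
        return [(ps, pt)]                    # all sisters already aligned
    if len(free_s) == 1 and len(free_t) == 1:
        # one leftover sister on each side: link them, then the parents
        return [(free_s[0], free_t[0]), (ps, pt)]
    return []

# Toy NP with a determiner sister on each side; only the nouns start linked.
children = {"np_s": ["d_s", "n_s"], "np_t": ["d_t", "n_t"]}
parent = {"d_s": "np_s", "n_s": "np_s", "d_t": "np_t", "n_t": "np_t"}
new_links = parent_align("n_s", "n_t", parent, children,
                         linked={("n_s", "n_t")})
```

Starting from the noun-noun link, the single unlinked determiner sisters are linked to each other and then the NP parents are linked, matching the second case in the description.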
    <Paragraph position="6"> NP/VP Align (Figure 5): We have a linked source-target node pair &lt;s,t&gt; and s and t are both nouns. Traverse up the source tree to find the topmost NP node nps dominating s and traverse up the target tree to find the topmost target NP node npt dominating t. We link nps and npt.</Paragraph>
    <Paragraph position="7"> We then traverse down from nps and npt to the leftmost leaf nodes (ls and lt) in the source and target subtrees rooted at nps and npt. If ls and lt have similar labels, we link them. This helps to preserve the scope of noun-phrase modifiers. If s and t are both verbs, we perform a similar method, this time linking the topmost VP nodes in the source and target trees.</Paragraph>
    <Paragraph position="8"> Figure 5 shows the links made by NP Align when the current linked node pair is &lt;N,N&gt;. Child Align (Figure 6): This method is similar to that of Parent Align. We have a current linked source-target node pair &lt;s,t&gt;. Each node has the same number of children and these children have similar node labels. We link their corresponding children.</Paragraph>
    <Paragraph position="9"> Subtree Align: We have a linked source-target node pair &lt;s,t&gt;. If the subtrees rooted at s and at t are fully isomorphic, we link the corresponding nodes within the subtrees. This accounts for the fact that trees may not be completely isomorphic from their roots but may be isomorphic at subtree level.1 (1: Originally we used a method which checked for isomorphism from the roots downwards, assuming a root-root correspondence; however, this significantly decreased the performance of the aligner.) Once lexical correspondences have been established, the methods outlined above use structural information to align the &lt;source,target&gt; nodes. The comparison of &lt;source,target&gt; node labels during alignment ensures that sub-structures with corresponding syntactic categories are aligned. If the algorithm fails to find any alignments between the source and target tree pairs, due to the absence of initial lexical correspondences, we align the &lt;source,target&gt; root nodes.</Paragraph>
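The isomorphism test behind Subtree Align can be sketched recursively. This is a hedged illustration: the Node class and the determiner-noun example are hypothetical, and "isomorphic" is taken here as same tree shape (labels are paired, not required to match across languages).

```python
# Sketch of Subtree Align: link all corresponding nodes of two
# structurally isomorphic subtrees.
class Node:
    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)

def isomorphic_links(s, t):
    """Return the (source, target) label pairs for corresponding nodes
    if the two subtrees are isomorphic, else None."""
    if len(s.children) != len(t.children):
        return None
    pairs = [(s.label, t.label)]
    for cs, ct in zip(s.children, t.children):
        sub = isomorphic_links(cs, ct)
        if sub is None:
            return None
        pairs.extend(sub)
    return pairs

# Toy example: structurally parallel English/French NPs.
src = Node("NP", Node("D", Node("the")), Node("N", Node("printer")))
tgt = Node("NP", Node("D", Node("l'")), Node("N", Node("imprimante")))
links = isomorphic_links(src, tgt)
```

When the shapes diverge anywhere in the subtree, the function returns None and no links are made, which is why this step helps at subtree level even when whole trees are non-isomorphic.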
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Experiments and results
</SectionTitle>
    <Paragraph position="0"> Previous DOT experiments (Hearne and Way, 2003) were carried out on a subset of the HomeCentre corpus consisting of 605 English-French sentence pairs from Xerox documentation parsed into LFG c(onstituent)- and f(unctional)-structure representations and aligned at sentence level. This bilingual treebank constitutes a linguistically complex fragment base containing many 'hard' translation examples, including cases of nominalisations, passivisation, complex coordination and combinations thereof. Accordingly, the corpus would appear to present a challenge to any MT system.</Paragraph>
    <Paragraph position="1"> The insertion of the links denoting translational equivalence for the set of tree pairs used in the previous experiments was performed manually. We have applied our automatic sub-structural alignment algorithm to this same set of 605 tree pairs and evaluated performance using two distinct methods.</Paragraph>
    <Paragraph position="2"> Firstly, we used the manual alignments as a 'gold standard' against which we evaluated the output of the alignment algorithm in terms of precision, recall and f-score. The results of this evaluation are presented in Section 5.1. Secondly, we repeated the DOT experiments described in (Hearne and Way, 2003) using the automatically generated alignments in place of those determined manually. We evaluated the output translations in terms of IBM Bleu scores, precision, recall and f-score and present these results in Section 5.2.</Paragraph>
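The first evaluation reduces to set comparison between automatic and gold-standard link sets. A minimal sketch, with hypothetical node-pair ids, computing the precision, recall and f-score used throughout Section 5:

```python
# Score a predicted link set against the manual 'gold standard'.
def prf(predicted, gold):
    tp = len(predicted & gold)              # links found in both sets
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

gold = {(1, 1), (2, 2), (3, 3)}             # manual node-pair links (toy)
auto = {(1, 1), (2, 2), (4, 4)}             # automatic links (toy)
precision, recall, f_score = prf(auto, gold)
```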
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Evaluation of alignment quality
</SectionTitle>
      <Paragraph position="0"> Using the manually aligned tree pairs as a 'gold standard', we evaluated the performance of each of the five methods which constitute the alignment algorithm both individually and in combination.</Paragraph>
      <Paragraph position="1"> These evaluations are summarised in Figures 7 and 8 respectively.</Paragraph>
      <Paragraph position="2"> The alignment process is always initialised by finding word correspondences between the source</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Results: precision, recall and f-score
</SectionTitle>
    <Paragraph position="0"> and target trees, meaning that lexical alignment is carried out regardless of which other method or combination of methods is included. The low recall of 0.3057 achieved by the lexical alignment process, shown in Figure 7, can be largely attributed to the fact that it does not align function words. We achieve high precision relative to recall - as is generally preferred for automatic procedures - indicating that the alignments induced are more likely to be 'partial' than incorrect.</Paragraph>
    <Paragraph position="1"> When evaluated individually, the Parent Align method performs best, achieving an f-score of 0.5978. Overall, the highest f-score of 0.7064 is achieved by using all methods, including the additional subtree method, in combination.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Evaluation of translation quality
</SectionTitle>
      <Paragraph position="0"> In order to evaluate the impact of using automatically generated alignments on translation quality, we repeated the DOT experiments described in (Hearne and Way, 2003) using these alignments in place of manually determined translational equivalences. To ensure that differences in the results achieved could be attributed solely to the different sub-structural alignments imposed, we used precisely the same 8 training/test set splits as before, where each training set contained 545 parsed sentence pairs, each test set 60 sentences, and all words occurring in the source side of the test set also occurred in the source side of the training set (but not necessarily with the same lexical category). As before, all translations carried out were from English into French and the number of samples taken during the disambiguation process was limited to 5000.</Paragraph>
      <Paragraph position="1"> Due to constraints on time and memory, data-oriented language processing applications generally limit the size of the fragment base by excluding larger fragments. In these experiments, we increased the size of the fragment base incrementally by initially allowing only fragments of link depth (LD) 1 and then including those of LD 2, 3 and 4. We evaluated the output translations in terms of IBM Bleu scores using the NIST MT Evaluation Toolkit and in terms of precision, recall and f-score using the NYU General Text Matcher. We summarise our results and reproduce and extend those of (Hearne and Way, 2003) in Figures 9, 10 and 11.</Paragraph>
      <Paragraph position="2"> Results over the full set of output translations, summarised in Figure 9, show that using the manually linked fragment base results in significantly better overall performance at all link depths (LD1 - LD4) than using the automatic alignments. However, both metrics used assign score 0 in all instances where no translation was output by the system. The comparatively poor scores achieved using the automatically induced alignments reflect the fact that these alignments give poorer coverage at all depths than those determined manually (47.71% vs. 66.46% at depth 1, 56.39% vs. 67.92% at depths 2 - 4).</Paragraph>
      <Paragraph position="4"> The results in Figure 10 include scores only where a translation was produced. Here, translations produced using manual alignments score better only at LD 1; better performance is achieved at LD 2 - 4 using the automatically linked fragment base. Again, this may - at least in part - be an issue of coverage: many of the sentences for which only the manually aligned fragment base produces translations are translationally complex and, therefore, more likely to be only partially correct and achieve poor scores.</Paragraph>
      <Paragraph position="5"> Finally, we determined the subset of sentences for which translations were produced both when the manually aligned fragment bases were used and when the automatically linked ones were used. Figure 11 summarises the results achieved when evaluating only these translations. In terms of Bleu scores, translations produced using manual alignments score slightly better at all depths. However, as link depth increases the gap narrows consistently and at depth 4 the difference in scores is reduced to just 0.0125. In terms of f-scores, the translations produced using automatic alignments actually score better than those produced using manual alignments at depths 2 - 4.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.3 Discussion
</SectionTitle>
      <Paragraph position="0"> Our first evaluation method (Section 5.1) is, perhaps, the obvious one to use when evaluating alignment performance. However, the results of this evaluation, which show best f-scores of 70%, provide no insight into the effect using these alignments has on translation accuracy. Evaluating these alignments in context - by using them in the DOT system for which they were intended - gives us a true picture of their worth. Crucially, in Section 5.2 we showed that using automatic rather than manual alignments results in translations of extremely high quality, comparable to those produced using manual alignments.</Paragraph>
      <Paragraph position="1"> In many cases, translations produced using automatic alignments contain fewer errors involving local syntactic phenomena than those produced using manual alignment. This suggests that, as links between function words are infrequent in the automatic alignments, we achieve better modelling of phenomena such as determiner-noun agreement because the determiner fragments do not generally occur without context. For example, there are relatively few instances of 'D-the' aligned with 'D-le/la/l'/les' in the automatic alignment compared to the manual alignment.</Paragraph>
      <Paragraph position="2"> On the other hand, we achieve 10% less coverage when translating using the automatic alignments.</Paragraph>
      <Paragraph position="3"> The automatic alignments are less likely to identify non-local phenomena such as long-distance dependencies. Consequently, the sentences only translated when using manual alignments are generally longer and more complex than those translated by both. While a degree of trade-off between coverage and accuracy is to be expected, we would like to increase coverage while maintaining or improving translation quality. Improvements to lexical alignment should prove valuable in this regard. While we expect translation quality to improve as depth increases, experiments using the automatic alignments show disproportionately poor performance at depth 1. The majority of links in the depth 1 fragment base are inserted using the lexical aligner, indicating that these are less than satisfactory. We expect improvements to the lexical aligner to significantly improve the overall performance of the alignment algorithm and, consequently, the quality of the translations produced. Lexical alignment is crucial in identifying complex phenomena such as long distance dependencies. Using machine-readable bilingual dictionaries or, alternatively, manually established word-alignments to initiate the automatic sub-structural alignment algorithm may provide more accurate results.</Paragraph>
    </Section>
  </Section>
</Paper>