<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1628">
  <Title>Sydney, July 2006. (c) 2006 Association for Computational Linguistics. A Discriminative Model for Tree-to-Tree Translation</Title>
  <Section position="9" start_page="239" end_page="240" type="concl">
    <SectionTitle>
8 Conclusions and Future Work
</SectionTitle>
    <Paragraph position="0"> We have presented an approach to tree-to-tree translation that models a new representation, aligned extended projections (AEPs), within a discriminative, feature-based framework. Our model makes use of an explicit representation of syntax in the target language, together with constraints on the alignments between source and target parse trees.</Paragraph>
    <Paragraph position="1"> The current system presents many opportunities for future work. For example, improvements in accuracy may come from a tighter integration of modifier translation into the overall translation process. The current method, which uses an n-best reranking model to select the best candidate, chooses each modifier independently and then places it into the translation. We intend to explore an alternative method that combines finite-state machines representing the n-best output from the phrase-based system with finite-state machines representing the complementizers, verbs, modals, and other substrings of the translation derived from the AEP. Selecting modifiers using this representation would correspond to searching the finite-state network for the most likely path. A finite-state representation has many advantages, including the ability to easily incorporate an n-gram language model.</Paragraph>
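The search for the most likely path mentioned above can be illustrated with a minimal sketch. This is not the system described in the paper (which leaves the finite-state approach to future work); the edge format, the state numbering, and the function name best_path are all assumptions made for illustration. The lattice is assumed to be acyclic with states numbered in topological order, and each edge carries a word and a log-probability.

```python
from typing import Dict, List, Tuple

# Hypothetical edge format: (source state, target state, word, log-probability).
Edge = Tuple[int, int, str, float]


def best_path(edges: List[Edge], start: int, final: int) -> Tuple[float, List[str]]:
    """Return the highest-scoring path from start to final as (score, words).

    Assumes an acyclic lattice whose states are numbered so that every edge
    goes from a lower-numbered state to a higher-numbered one.
    Raises KeyError if the final state cannot be reached.
    """
    best: Dict[int, Tuple[float, List[str]]] = {start: (0.0, [])}
    # Sorting by source state visits edges in topological order, so the best
    # score of a state is settled before any edge leaving it is processed.
    for src, dst, word, logp in sorted(edges):
        if src not in best:
            continue  # source state is unreachable from the start state
        score = best[src][0] + logp
        if dst not in best or score > best[dst][0]:
            best[dst] = (score, best[src][1] + [word])
    return best[final]


lattice = [
    (0, 1, "in", -1.0),
    (0, 1, "at", -2.0),
    (1, 2, "the", -0.5),
    (0, 2, "the", -3.0),
    (2, 3, "house", -0.1),
]
score, words = best_path(lattice, start=0, final=3)
print(words)
```

In a sketch of this kind, n-gram language model scores could be folded into the edge log-probabilities, which is one reason a lattice representation is convenient.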
    <Paragraph position="2"> Future work may also consider expanded definitions of AEPs. For example, we might consider AEPs that include larger chunks of phrase structure, or we might consider AEPs that contain more detailed information about the relative ordering of modifiers. There is certainly room for improvement in the accuracy with which AEPs are predicted in our data; the feature-driven approach allows a wide range of features to be tested. For example, it would be relatively easy to incorporate a syntactic language model (i.e., a prior distribution over AEP structures) induced from a large amount of English monolingual data.</Paragraph>
    <Paragraph position="3"> Appendix A: Identification of Clauses. In the English parse trees, we identify clauses as follows. Any non-terminal labeled by the parser of (Collins, 1999) as SBAR or SBAR-A is labeled as a clause root. Any node labeled by the parser as S or S-A is also labeled as the root of a clause, unless it is directly dominated by a non-terminal labeled SBAR or SBAR-A. Any node labeled SG or SG-A by the parser is labeled as a clause root, unless (1) the node is directly dominated by SBAR or SBAR-A; or (2) the node is directly dominated by a VP, and the node is directly preceded by a verb (POS tag beginning with V) or modal (POS tag beginning with M). Any node labeled VP is marked as a clause root if (1) the node is not directly dominated by a VP, S, S-A, SBAR, SBAR-A, SG, or SG-A; or (2) the node is directly preceded by a coordinating conjunction (i.e., a POS tag labeled as CC).</Paragraph>
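The English rules above can be sketched as a single predicate over parse-tree nodes. This is a minimal illustration, not the authors' implementation: the Node class, the helper preceding_pos, and the reading of "directly preceded" as "the immediately preceding sibling is a POS-tagged word" are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SBAR = {"SBAR", "SBAR-A"}
S = {"S", "S-A"}
SG = {"SG", "SG-A"}


@dataclass(eq=False)
class Node:
    """Hypothetical parse-tree node; a node without children is a POS-tagged word."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, label: str) -> "Node":
        child = Node(label, parent=self)
        self.children.append(child)
        return child


def preceding_pos(node: Node) -> str:
    """POS tag of the sibling immediately to the left, or an empty string if
    there is no such sibling or if it is a phrase rather than a tagged word."""
    if node.parent is None:
        return ""
    siblings = node.parent.children
    index = next(i for i, s in enumerate(siblings) if s is node)
    if index == 0 or siblings[index - 1].children:
        return ""
    return siblings[index - 1].label


def is_clause_root(node: Node) -> bool:
    """Apply the English clause-root rules of Appendix A to one node."""
    parent = node.parent.label if node.parent is not None else None
    prev = preceding_pos(node)
    if node.label in SBAR:
        return True
    if node.label in S:
        return parent not in SBAR
    if node.label in SG:
        # Exception (2): SG inside a VP, directly after a verb or modal.
        after_verb_in_vp = parent == "VP" and prev[:1] in ("V", "M")
        return parent not in SBAR and not after_verb_in_vp
    if node.label == "VP":
        return parent not in ({"VP"} | S | SBAR | SG) or prev == "CC"
    return False


# Usage: (S (NP NN) (VP VBD (SG ...))) yields one clause root, the top S.
root = Node("S")
root.add("NP").add("NN")
vp = root.add("VP")
vp.add("VBD")
sg = vp.add("SG")
print([is_clause_root(n) for n in (root, vp, sg)])
```

Keeping the label sets as module-level constants makes each rule read close to its prose statement, at the cost of hard-coding the parser's label inventory.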
    <Paragraph position="4"> In German parse trees, we identify any nodes labeled as S or CS as clause roots. In addition, we mark any node labeled as VP as a clause root, provided that (1) it is preceded by a coordinating conjunction, i.e., a POS tag labeled as KON; or (2) it has one of the functional tags -mo, -re or -sb.</Paragraph>
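The German rules can be sketched in the same spirit. The label format is an assumption: a node label is taken to be a category optionally followed by a hyphen and a functional tag (for example "VP-mo"), and "preceded by" is read as "the immediately preceding sibling carries the POS tag KON". The function name and its parameters are hypothetical.

```python
from typing import Optional

# Functional tags that make a VP a clause root, as listed in the text.
CLAUSAL_FUNCTIONS = {"mo", "re", "sb"}


def is_german_clause_root(label: str, preceding_pos: Optional[str] = None) -> bool:
    """Apply the German clause-root rules to a node label.

    label: assumed to look like "S", "CS", "VP" or "VP-mo".
    preceding_pos: POS tag of the immediately preceding sibling, if any.
    """
    category, _, function = label.partition("-")
    if category in {"S", "CS"}:
        # Assumption: a functional tag on S or CS does not affect the rule.
        return True
    if category == "VP":
        return preceding_pos == "KON" or function in CLAUSAL_FUNCTIONS
    return False


print(is_german_clause_root("VP-mo"), is_german_clause_root("VP"))
```

Passing the preceding POS tag as an argument keeps the sketch independent of any particular tree data structure.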
    <Paragraph position="5">  The n-best reranking model for the translation of modifiers considers a list of candidate translations. We hand-labeled 800 examples, marking the element in each list that would lead to the best translation. The features of the n-best reranking algorithm are combinations of the basic features in Tables 3 and 4.</Paragraph>
    <Paragraph position="6"> Each list contained the n-best translations produced by the phrase-based system of Koehn et al. (2003). The lists also contained a supplementary candidate &quot;DELETED&quot;, signifying that the modifier should be deleted from the English translation. In addition, each candidate derived from the phrase-based system contributed one new candidate to the list signifying that the first word of the candidate should be deleted. These additional candidates were motivated by our observation that the optimal candidate in the n-best list produced by the phrase-based system often included an unwanted preposition at the beginning of the string.</Paragraph>
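The construction of the candidate list described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function name build_candidates, whitespace tokenization, and the ordering of the candidates are all hypothetical. An n-best list of size n yields 2n + 1 candidates.

```python
from typing import List

DELETED = "DELETED"  # supplementary candidate: drop the modifier entirely


def build_candidates(nbest: List[str]) -> List[str]:
    """Extend an n-best list with the extra candidates described in the text."""
    candidates = list(nbest)
    candidates.append(DELETED)
    for translation in nbest:
        # One extra candidate per n-best entry, with its first word removed
        # (for example, to strip an unwanted leading preposition).
        # A one-word entry produces an empty string here.
        candidates.append(" ".join(translation.split()[1:]))
    return candidates


print(build_candidates(["in the house", "the house"]))
```

The sketch does not remove duplicates; whether the original system did so is not stated in the text.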
    <Paragraph position="8">  AEP output used for making features in the n-best reranking model.</Paragraph>
  </Section>
</Paper>