<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3601">
  <Title>A Syntax-Directed Translator with Extended Domain of Locality</Title>
  <Section position="3" start_page="2" end_page="3" type="intro">
    <SectionTitle>
2 Previous Work
</SectionTitle>
    <Paragraph position="0"> It is helpful to compare this approach with recent efforts in statistical MT. Phrase-based models (Koehn et al., 2003; Och and Ney, 2004) are good at learning local translations, which are pairs of (consecutive) sub-strings, but are often insufficient for modeling the re-ordering of the phrases themselves, especially between language pairs with very different word orders. This is because the generative capacity of these models lies within the realm of finite-state machinery (Kumar and Byrne, 2003), which cannot handle the nested structures and long-distance dependencies of natural languages.</Paragraph>
    <Paragraph position="1"> Syntax-based models aim to alleviate this problem by exploiting the power of synchronous rewriting systems. Both Yamada and Knight (2001) and Chiang (2005) use SCFGs as the underlying model, so their translation schemata are syntax-directed as in Fig. 1, but their translators are not: both systems do parsing and transformation in a joint search, essentially over a packed forest of parse trees. In this sense, their translators are not directed by a syntactic tree. Although their method potentially considers more than the single parse tree used in our case, the packed representation of the forest restricts the scope of each transfer step to a one-level context-free rule, whereas our approach decouples the source-language analyzer from the recursive converter, so that the latter can have an extended domain of locality. In addition, our translator gains a speed-up from this decoupling, since each of the two stages has a smaller search space. In fact, the recursive transfer step can be done by a linear-time algorithm (see Section 5), and the parsing step is also fast with modern Treebank parsers, for instance those of Collins (1999) and Charniak (2000). In contrast, their decoding is reported to be computationally expensive, and Chiang (2005) uses aggressive pruning to make it tractable. There also exists a compromise between these two approaches, which uses a k-best list of parse trees (for a relatively small k) to approximate the full forest (see future work).</Paragraph>
    <Paragraph position="2"> Moreover, our model, being linguistically motivated, is also more expressive than the formally syntax-based models of Chiang (2005) and Wu (1997). Consider, again, the passive example in rule r3. In Chiang's SCFG, there is only one nonterminal X, so a corresponding rule would be &lt;was X(1) by X(2), bei X(2) X(1)&gt;, which can also pattern-match the English sentence: I was [asleep]1 by [sunset]2 .</Paragraph>
    <Paragraph position="3"> and translate it into Chinese as a passive construction. This produces a very odd Chinese translation, because here &quot;was A by B&quot; in the English sentence is not a passive construction. By contrast, our model applies rule r3 only if A is a past participle (VBN) and B is a noun phrase (NP-C). This example also shows that a one-level SCFG rule, even if informed by the Treebank as in Yamada and Knight (2001), is not enough to capture a common construction like this one, which is five levels deep (from VP to &quot;by&quot;).</Paragraph>
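The contrast between the untyped and the label-constrained rule can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's implementation: the chunked sentence layout, the hand-assigned labels, the second example sentence, and all function names are hypothetical.

```python
# Toy sketch only. Sentences are hand-chunked lists of (text, label) pairs;
# the labels are assigned by hand for illustration, not produced by a parser.

def split_was_by(chunks):
    """Return the (A, B) chunks of a 'was A by B' pattern, or None."""
    words = [text for text, _ in chunks]
    if "was" not in words or "by" not in words:
        return None
    i, j = words.index("was"), words.index("by")
    # Require exactly one chunk between 'was' and 'by', and one after 'by'.
    if j - i != 2 or len(chunks) != j + 2:
        return None
    return chunks[i + 1], chunks[j + 1]

def untyped_rule_applies(chunks):
    """Single nonterminal X: any chunk may fill either slot."""
    return split_was_by(chunks) is not None

def typed_rule_applies(chunks):
    """Slots are constrained by labels: A must be VBN and B must be NP-C."""
    slots = split_was_by(chunks)
    if slots is None:
        return False
    (_, label_a), (_, label_b) = slots
    return label_a == "VBN" and label_b == "NP-C"

# 'asleep' is tagged JJ here (an assumption); it is not a past participle.
not_passive = [("I", "NP"), ("was", "VBD"), ("asleep", "JJ"),
               ("by", "IN"), ("sunset", "NP-C")]
# A hypothetical true passive, for contrast.
passive = [("the gunman", "NP"), ("was", "VBD"), ("killed", "VBN"),
           ("by", "IN"), ("the police", "NP-C")]

for name, sentence in [("not_passive", not_passive), ("passive", passive)]:
    print(name, untyped_rule_applies(sentence), typed_rule_applies(sentence))
```

In this sketch the untyped check accepts both sentences, while the label-constrained check rejects the non-passive one because its first slot is not VBN. The real rule r3 spans several tree levels; the flat chunk list here only imitates its label constraints.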
    <Paragraph position="4"> There are also some variations of syntax-directed translators where dependency structures are used in place of constituent trees (Lin, 2004; Ding and Palmer, 2005; Quirk et al., 2005). Although they share with this work the basic motivations and a similar speed-up, it is difficult to specify re-ordering information within dependency elementary structures, so they resort either to heuristics (Lin) or to a separate ordering model for linearization (the other two</Paragraph>
  </Section>
</Paper>