<?xml version="1.0" standalone="yes"?>
<Paper uid="J99-4005">
  <Title>Squibs and Discussions Decoding Complexity in Word-Replacement Translation Models</Title>
  <Section position="6" start_page="612" end_page="614" type="concl">
    <SectionTitle>
5. Discussion
</SectionTitle>
    <Paragraph position="0"> The two proofs point up separate factors in MT decoding complexity. One is word-order selection. But even if any word order will do, there is still the problem of picking a concise decoding in the face of overlapping bilingual dictionary entries. The former is more closely tied to the source model, and the latter to the channel model, though the complexity arises from the interaction of the two.</Paragraph>
    <Paragraph position="1"> We should note that Model 1 is an intentionally simple translation model, one whose primary purpose in machine translation has been to allow bootstrapping into more complex translation models (e.g., IBM Models 2-5). It is easy to show that the intractability results also apply to stronger &amp;quot;fertility/distortion&amp;quot; models; we assign zero probability to fertilities other than 1, and we set up uniform distortion tables.</Paragraph>
    <Paragraph position="2"> Simple translation models like Model 1 find more direct use in other applications (e.g., lexicon construction, idiom detection, psychological norms, and cross-language information retrieval), so their computational properties are of wider interest.</Paragraph>
    <Section position="1" start_page="614" end_page="614" type="sub_section">
      <SectionTitle>
Knight Decoding Complexity
</SectionTitle>
      <Paragraph position="0"> The proofs we presented are based on a worst-case analysis. Real s, e, and b tables may have properties that permit faster optimal decoding than the artificial tables constructed above. It is also possible to devise approximation algorithms like those devised for other NP-complete problems. To the extent that word ordering is like solving the Traveling Salesman Problem, it is encouraging substantial progress continues to be made on Traveling Salesman algorithms. For example, it is often possible to get within two percent of the optimal tour in practice, and some researchers have demonstrated an optimal tour of over 13,000 U.S. cities. (The latter experiment relied on things like distance symmetry and the triangle inequality constraint, however, which do not hold in word ordering.) So far, statistical translation research has either opted for heuristic beam-search algorithms or different channel models. For example, some researchers avoid bag generation by preprocessing bilingual texts to remove word-order differences, while others adopt channels that eliminate syntactically unlikely alignments.</Paragraph>
      <Paragraph position="1"> Finally, expensive decoding also suggests expensive training from unannotated (monolingual) texts, which presents a challenging bottleneck for extending statistical machine translation to language pairs and domains where large bilingual corpora do not exist.</Paragraph>
  </Section>
class="xml-element"></Paper>