<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0832">
  <Title>Gaming Fluency: Evaluating the Bounds and Expectations of Segment-based Translation Memory</Title>
  <Section position="6" start_page="180" end_page="181" type="concl">
    <SectionTitle>
5 Discussion
</SectionTitle>
    <Paragraph position="0"> The maximum bleu score attained by a TM we describe (6.56) would place it in last place in the NIST 2002 evaluations, but by less than 0.5 bleu. Successive NIST competitions have exhibited impressive system progress, but each year there have been newcomers who score near (or in some cases lower than) our simple TM baseline.</Paragraph>
    <Paragraph position="1"> We have presented several experiments that quantitatively describe how well a simple TM performs when measured with a standard MT evaluation measure, bleu. We showed that the translation performance of a TM grows as a log-linear function of corpus size below 7.5 million segments. We also showed, somewhat surprisingly, that only 1000 IR returns need be evaluated by a rescorer to get within 1 bleu point of the maximum possible score attainable by the TM.</Paragraph>
    <Paragraph position="2"> In future work, we expect to validate these results with other language pairs. One question is: how well does this simple IR query expansion address segmented languages and languages that allow more liberal word order? Supervised training of n-best reranking schemes would also determine how far the oracle bound can be pushed.</Paragraph>
    <Paragraph position="3"> The computationally more expensive reranking procedure that attempts to optimize bleu on the entire document should be investigated to determine how much can be gained by better global management of the brevity penalty.</Paragraph>
    <Paragraph position="4"> Finally, we believe it's worth noting the degree to which high fluency of the TM output could potentially mislead target-language-only readers in their estimation of the system's performance.</Paragraph>
    <Paragraph position="5"> Table 1 is representative of system output, and is a good example of why translations should not be judged solely on the fluency of a few segments of target language output.</Paragraph>
  </Section>
</Paper>