File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/p06-2003_abstr.xml
Size: 1,043 bytes
Last Modified: 2025-10-06 13:45:07
<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2003"> <Title/> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> We present a comparative study on Machine Translation Evaluation according to two different criteria: Human Likeness and Human Acceptability. We provide empirical evidence that there is a relationship between these two kinds of evaluation: Human Likeness implies Human Acceptability, but the reverse is not true.</Paragraph> <Paragraph position="1"> From the point of view of automatic evaluation, this implies that metrics based on Human Likeness are more reliable for system tuning.</Paragraph> <Paragraph position="2"> Our results also show that current evaluation metrics are not always able to distinguish between automatic and human translations. To improve the descriptive power of current metrics, we propose the use of additional syntax-based metrics, and metric combinations inside the</Paragraph> </Section> </Paper>