<?xml version="1.0" standalone="yes"?>
<Paper uid="W01-0815">
  <Title>Evaluating text quality: judging output texts without a clear source</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>1 Introduction</SectionTitle>
    <Paragraph position="0">Probably the most critical questions that need to be addressed when evaluating automatically generated texts are: does the text actually say what it's supposed to say, and is it fluent, coherent, clear and grammatical? The answers to these questions say something important about how good the target texts are and -- perhaps more to the point -- how good the system that generated them is. There is no a priori reason why the target texts should be any better or worse when they result from natural language generation (NLG) or from machine translation (MT): indeed, they could result from the same language generator. Given this, it may be natural to assume that NLG could appropriately adopt evaluation methods developed for its more mature sister, MT. However, while this holds true for issues related to intelligibility (the second critical question), it does not apply as readily to issues of fidelity (the first question). We go beyond our recent experience of evaluating the AGILE system for producing multilingual versions of software user manuals (Hartley, Scott et al., 2000; Kruijff et al., 2000) and raise some open questions about how best to evaluate the faithfulness of an output text with respect to its input specification.</Paragraph>
  </Section>
</Paper>