<?xml version="1.0" standalone="yes"?> <Paper uid="A92-1023"> <Title>A Practical Methodology for the Evaluation of Spoken Language Systems</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> A meaningful evaluation methodology can advance the state of the art by encouraging mature, practical applications rather than &quot;toy&quot; implementations. Evaluation is also crucial to assessing competing claims and identifying promising technical approaches. While work in speech recognition (SR) has a history of evaluation methodologies that permit comparison among various systems, until recently no methodology existed for either developers of natural language (NL) interfaces or researchers in speech understanding (SU) to evaluate and compare the systems they developed.</Paragraph> <Paragraph position="1"> Recently, a number of groups involved in the DARPA Spoken Language Systems (SLS) program have made considerable progress toward agreeing on a methodology for comparative evaluation of SLS systems, and that methodology has been put into practice several times in comparative tests of several SLS systems. These evaluations are probably the only NL evaluations other than the series of Message Understanding Conferences (Sundheim, 1989; Sundheim, 1991) to have been developed and used by a group of researchers at different sites, although several excellent workshops have been held to study some of these problems (Palmer et al., 1989; Neal et al., 1991).</Paragraph> <Paragraph position="2"> This paper describes a practical &quot;black-box&quot; methodology for automatic evaluation of question-answering NL systems.</Paragraph> <Paragraph position="3"> While each new application domain will require some development of special resources, the heart of the methodology is domain-independent, and it can be used with either speech or text input.
The particular characteristics of the approach are described in the following section; subsequent sections present its implementation in the DARPA SLS community, along with some problems and directions for future development.</Paragraph> </Section> </Paper>