<?xml version="1.0" standalone="yes"?> <Paper uid="H92-1003"> <Title>Multi-Site Data Collection for a Spoken Language Corpus MADCOW *</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Following the February 1991 DARPA Speech and Natural Language Workshop, the DARPA Spoken Language contractors decided to institute a multi-site data collection paradigm in order to: * support a common evaluation on speech, natural language and spoken language; * maximize the amount of data collected; * provide some diversity in data collection paradigms; * reduce cost to any one site by sharing the data collection activity across multiple participating sites. To co-ordinate this effort, MADCOW was formed in May 1991 with a representative from each of the participating sites. The participants comprised the six sites planning to collect and evaluate on the data (AT&amp;T, BBN, CMU, MIT, SRI and Paramax, formerly Unisys); NIST, which was responsible for data validation, distribution, and the selection and scoring of test material; and the Annotation group at SRI, responsible under a separate contract for annotating the data with database reference answers.</Paragraph> <Paragraph position="1"> *This paper was written under the auspices of the Multi-</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Site ATIS Data Collection Working group (MADCOW) by L. </SectionTitle> <Paragraph position="0"> Hirschman, M. Bates, D. Dahl, W. Fisher, J. Garofolo, K. Hunicke-Smith, D. Pallett, C. Pao, P. Price, and A. Rudnicky. In addition, many other people made important contributions to this work and are listed in the Acknowledgements section.</Paragraph> <Paragraph position="1"> The charter of MADCOW was to implement the multi-site data collection paradigm, to monitor the distribution of the data, and to agree on a test paradigm for the multi-site data. 
The original goals for the data collection activity were to collect 10,000 training utterances and 1,000 test utterances, plus material for a dry-run test to be held in October 1991. Between May 1991 and February 1992, the following data have been collected under the MADCOW effort: Significant data collection and evaluation infrastructure was already in place prior to the formation of MADCOW. This included the definition of the air travel planning task [11], the database (a relational version of an eleven-city subset of the Official Airline Guide, containing airline, flight and ground transportation information, initially set up by C. Hemphill at TI and revised and extended by R. Moore and others at SRI), a comparator-based evaluation methodology for comparing database tuples to reference answers [1, 12], and several earlier ATIS corpora collected at TI [2] and SRI.</Paragraph> <Paragraph position="2"> To implement the multi-site data collection effort, each site agreed to collect a corpus of 2200 utterances and to provide this corpus to NIST in a standard format, including speech data, transcriptions, and a logfile recording the subject's interaction with the data collection system.</Paragraph> <Paragraph position="3"> David Pallett's group at NIST was responsible for validation and distribution of the training data as well as for running the common evaluation. As data were submitted, NIST checked the data for conformity to the standard formats and randomly set aside 20% of each site's incoming data for test sets. 
For the common evaluation, NIST was responsible for the release of the test data, and the collection, scoring and analysis of the results, as well as for adjudication of questions about reference answers on the Spoken Language and Natural Language tests.</Paragraph> <Paragraph position="4"> The Annotation group under Jared Bernstein at SRI was responsible for providing the database reference answers and for categorization of the data into context-independent (class A), context-dependent (class D) and unanswerable (class X) utterances. To facilitate timely agreement on specific issues, a special subgroup, chaired by Deborah Dahl, was formed under MADCOW, with responsibility for the Principles of Interpretation 1.</Paragraph> </Section> </Section> </Paper>