<?xml version="1.0" standalone="yes"?> <Paper uid="N03-2007"> <Title>Active Learning for Classifying Phone Sequences from Unsupervised Phonotactic Models</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> A major barrier to the rapid and cost-effective development of spoken language processing applications is the need for time-consuming and expensive human transcription and annotation of collected data. Extensive transcription of audio is generally undertaken to provide word-level labeling for training recognition models. Applications that use statistically trained classification as a component of an understanding system also require this transcribed text to train on, plus an assignment of a class label to each utterance.</Paragraph> <Paragraph position="1"> Recent work by Alshawi (2003), reported at this conference, developed new methods for unsupervised training of phone-string recognizers, removing the need for word-level transcription. The phone-string output of such recognizers has been used in classification tasks with the BoosTexter text classification algorithm, giving utterance classification accuracy that is surprisingly close to that obtained using conventionally trained word trigram models, which require transcription. The only manual annotation required to train classifiers with these recognition methods is the assignment of class labels to the audio files. The aim of the work described in this paper is to amplify this advantage by reducing the effort required to train classifiers for phone-based systems, by actively selecting which utterances are assigned class labels. Active learning has been applied to classification problems before (McCallum and Nigam, 1998; Tur et al., 2003), but not to classifying phone strings.</Paragraph> </Section> </Paper>