<?xml version="1.0" standalone="yes"?> <Paper uid="H90-1060"> <Title>A New Paradigm for Speaker-Independent Training and Speaker Adaptation</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper reports on two contributions to large vocabulary continuous speech recognition. First, we present a new paradigm for speaker-independent (SI) training of hidden Markov models (HMM), which uses a large amount of speech from a few speakers instead of the traditional practice of using a little speech from many speakers. In addition, combination of the training speakers is done by averaging the statistics of independently trained models rather than the usual pooling of all the speech data from many speakers prior to training. With only 12 training speakers for SI recognition, we achieved a 7.5% word error rate on a standard grammar and test set from the DARPA Resource Management corpus. This performance is comparable to our best condition for this test suite, using 109 training speakers.</Paragraph> <Paragraph position="1"> Second, we show a significant improvement for speaker adaptation (SA) using the new SI corpus and a small amount of speech from the new (target) speaker. A probabilistic spectral mapping is estimated independently for each training (reference) speaker and the target speaker. Each reference model is transformed to the space of the target speaker and combined by averaging. Using only 40 utterances from the target speaker for adaptation, the error rate dropped to 4.1% -- a 45% reduction in error compared to the SI result.</Paragraph> </Section> </Paper>
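<!--
Illustrative note (not part of the original abstract). The sketch below contrasts
the two ways of combining training speakers that the abstract mentions: pooling
all data before training versus averaging independently trained models. It is a
toy under stated assumptions, not the authors' implementation: each "model" is
assumed to be a single discrete output distribution for one HMM state, the
counts are hypothetical, and all function names are invented for illustration.
The last function checks the arithmetic behind the reported 45% reduction
(7.5% to 4.1% word error rate).

```python
def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def pooled_model(speaker_counts):
    """Pool the raw counts of all speakers, then estimate one distribution.

    Speakers with more data dominate the result.
    """
    pooled = [sum(column) for column in zip(*speaker_counts)]
    return normalize(pooled)


def averaged_model(speaker_counts):
    """Estimate one distribution per speaker, then average them.

    Every speaker gets equal weight, regardless of how much data it has.
    """
    per_speaker = [normalize(counts) for counts in speaker_counts]
    n = len(per_speaker)
    return [sum(column) / n for column in zip(*per_speaker)]


def relative_error_reduction(baseline, improved):
    """Fractional reduction in error rate relative to the baseline."""
    return (baseline - improved) / baseline


# Hypothetical codeword counts for one HMM state from three speakers.
counts = [[80, 20], [10, 10], [1, 3]]
pooled = pooled_model(counts)
averaged = averaged_model(counts)

# Error rates reported in the abstract: SI result and adapted result.
reduction = relative_error_reduction(7.5, 4.1)
print(pooled, averaged, round(reduction * 100))
```

With these toy counts the pooled estimate is pulled toward the first speaker,
who contributes most of the data, while the averaged estimate treats the three
speakers equally.
-->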