<?xml version="1.0" standalone="yes"?>
<Paper uid="H90-1067">
  <Title>Experiments with Tree-Structured MMI Encoders on the RM Task</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> This paper describes the tree-structured maximum mutual information (MMI) encoders used in SSI's Phonetic Engine (R) to perform large-vocabulary, continuous speech recognition.</Paragraph>
    <Paragraph position="1"> The MMI encoders are arranged into a two-stage cascade. At each stage, the encoder is trained to maximize the mutual information between a set of phonetic targets and corresponding codes. After each stage, the codes are compressed into segments. This step expands acoustic-phonetic context and reduces subsequent computation. We evaluated these MMI encoders by comparing them against a standard minimum distortion (MD) vector quantizer (encoder). Both encoders produced code streams, which were used to train speaker-independent discrete hidden Markov models in a simplified version of the Sphinx system [3]. We used data from the DARPA Resource Management (RM) task. The two-stage cascade of MMI encoders significantly outperforms the standard MD encoder in both speed and accuracy.</Paragraph>
  </Section>
</Paper>