<?xml version="1.0" standalone="yes"?>
<Paper uid="H89-2038">
  <Title>Large-Vocabulary Speaker-Independent Continuous Speech Recognition with Semi-Continuous Hidden Markov Models</Title>
  <Section position="6" start_page="278" end_page="278" type="concl">
    <SectionTitle>
4. CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"> Semi-continuous hidden Markov models based on multiple vector quantization codebooks take the advantages of both the discrete HMM and the continuous HMM. With the SCHMM, it is possible to model a mixture of a large number of probability density functions with a limited amount of training data and computational complexity. Robustness is enhanced by using multiple codewords and multiple codebooks for the semi-continuous output probability representation. In addition, the VQ codebook itself can be adjusted together with the HMM parameters in order to obtain the optimum maximum likelihood of the HMM. The applicability of the continuous mixture HMM or the SCHMM relies on appropriately chosen acoustic parameters and the assumption of the continuous probability density function. Acoustic features must be well represented if diagonal covariance is applied to the Gaussian probability density function. This is strongly indicated by the experimental results based on the bilinear transformed cepstrum and the cepstrum. With bilinear transformation, high frequency components are compressed in comparison to low frequency components \[2,3\].</Paragraph>
    <Paragraph position="1"> Such a transformation converts the linear frequency axis into a mel-scale-like one. The discrete HMM can be substantially improved by bilinear transformation.</Paragraph>
    <Paragraph position="2"> However, bilinear transformation introduces strong correlations, which are inappropriate for modeling under the diagonal Gaussian assumption.</Paragraph>
    <Paragraph position="3"> Using the cepstrum without bilinear transformation, the diagonal SCHMM can be substantially improved in comparison to the discrete HMM. All experiments conducted here were based on only 200 generalized triphones. As smoothing can play a more important role in those less-well-trained models, more improvement can be expected for 1000 generalized triphones (where the word accuracy for the discrete HMM is 91% with bilinear transformed data).</Paragraph>
    <Paragraph position="4"> In addition, removal of the diagonal covariance assumption by use of full covariance can be expected to further improve recognition accuracy \[1\]. Regarding the use of full covariance, the SCHMM has a distinctive advantage. Since the Gaussian probability density functions are tied to the VQ codebook, by choosing the M most significant codewords, the computational complexity can be several orders lower than that of the conventional continuous mixture HMM while maintaining the modeling power of large mixture components.</Paragraph>
    <Paragraph position="5"> Experimental results have clearly demonstrated that the SCHMM offers improved recognition accuracy in comparison to both the discrete HMM and the continuous mixture HMM in speaker-independent continuous speech recognition. We conclude that the SCHMM is indeed a powerful technique for modeling non-stationary stochastic processes with multi-modal probabilistic functions of Markov chains.</Paragraph>
  </Section>
</Paper>