
<?xml version="1.0" standalone="yes"?>
<Paper uid="H89-2049">
  <Title>SPEECH RECOGNITION IN PARALLEL</Title>
  <Section position="15" start_page="369" end_page="370" type="concl">
    <SectionTitle>
CONCLUSION
</SectionTitle>
    <Paragraph position="0"> By way of summary, the major goal of our research is to demonstrate that recognition accuracy and speed can be improved by composing independently-executing recognizers, and that each recognizer can itself be executed as a parallel process. Along the way, we hope to provide answers to the following questions: * How can we use more of the acoustic information present in the speech signal? Existing speech recognizers do not use all the available information; human listeners perform much better than computers even on tasks, like the E-set, that depend only on acoustic information. We hope to demonstrate that it is possible to use more acoustic information without requiting excessive amounts of training, and to use this information in a parallelizable (hence quickly computable) way.</Paragraph>
    <Paragraph position="1"> * How are different acoustic measurements correlated? It is unlikely that they all provide the same information to a recognizer. If they do, then it doesn't matter which set of features a recognizer uses,  but there is also something fundamentally wrong in the acoustic processing used by current recognizers, since the speech signal conveys enough information to allow people to understand speech much better so far than computers. If they do not, then we should be able to improve recognition accuracy by using more acoustic information.</Paragraph>
    <Paragraph position="2"> * How can we merge the acoustic information derived from different units of speech with different time alignments? As K.-F. Lee has pointed out, it is difficult to combine variable-width parameters and fixed-width paraameters in a single Markov model. We hope to show that it is possible to merge such information coming from different Markov models.</Paragraph>
    <Paragraph position="3"> * How can we merge acoustic information with higher level constraints? Our approach will include the incorporation of a syntactic natural language component to discriminate among different word candidates.</Paragraph>
    <Paragraph position="4"> * Can we speed up the computation involved in speech recognition by running parts of it in parallel in a cost effective manner? Certainly various speech recognition algorithms have been parallelized on a variety of parallel machine architectures. Can we provide the same on a tree architecture utilizing the multiple speech recognizer paradigm? * How might we balance the load among a number of independent speech recognizers to deliver maximum performance and utilization of the parallel processor? Can the load balancing be completely automated at compile time, or are dynamic runtime facilities required? * Can we speed up, either serially or in parallel, the DTW and Viterbi search, for example, by applying new advances in algorithms for dynamic programming?</Paragraph>
  </Section>
class="xml-element"></Paper>