<?xml version="1.0" standalone="yes"?>
<Paper uid="P99-1002">
  <Title>AUTOMATIC SPEECH RECOGNITION AND ITS APPLICATION TO INFORMATION EXTRACTION</Title>
  <Section position="3" start_page="0" end_page="11" type="intro">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> The field of automatic speech recognition has witnessed a number of significant advances in the past 5-10 years, spurred on by advances in signal processing, algorithms, computational architectures, and hardware. These advances include the widespread adoption of a statistical [...] Figure 1 shows the mechanism of state-of-the-art speech recognizers [2]. Common features of these systems are the use of cepstral parameters and their regression coefficients as speech features, triphone HMMs as acoustic models, vocabularies of several thousand to several tens of thousands of entries, and stochastic language models such as bigrams and trigrams. Such methods have been applied not only to English but also to French, German, Italian, Spanish, Chinese and Japanese. Although each language has its own specific characteristics, similar recognition results have been obtained.</Paragraph>
    <Paragraph position="1"> [...] world domain of obvious value has led to rapid technology transfer of speech recognition into other research areas and applications. Because variations in speaking style and accent, as well as in channel and environmental conditions, are totally unconstrained, broadcast news is a superb stress test that requires new algorithms to work across widely varying conditions. Algorithms must solve a specific problem without degrading performance under any other condition.</Paragraph>
    <Paragraph position="2"> Another advantage of this domain is that news is easy to collect and the supply of data is boundless. The data is found speech; it is completely uncontrived.</Paragraph>
    <Paragraph position="3"> Fig. 1 - Mechanism of state-of-the-art speech recognizers. The remainder of this paper is organized as follows. Section 2 describes recent progress in broadcast news dictation and its application to information extraction, and Section 3 describes human-computer dialogue systems. Despite remarkable recent progress, we are still far from our ultimate goal of understanding free conversational speech uttered by any speaker in any environment. Section 4 describes how to increase the robustness of speech recognition, and Section 5 describes the prospects for linguistic modeling in spontaneous speech recognition/understanding. Section 6 concludes the paper.</Paragraph>
  </Section>
</Paper>