<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1039">
  <Title>FACTORIZATION OF LANGUAGE CONSTRAINTS IN SPEECH RECOGNITION</Title>
  <Section position="9" start_page="304" end_page="304" type="concl">
    <SectionTitle>
5. Summary
</SectionTitle>
    <Paragraph position="0"> In most speech recognition and understanding tasks, the syntactic and semantic knowledge for the task is represented in an integrated manner by a finite state network (FSN).</Paragraph>
    <Paragraph position="1"> However, for more ambitious tasks, the FSN representation can become so large that performing speech recognition with it becomes computationally prohibitive. One way to circumvent this difficulty is to factor the language constraints: speech decoding is accomplished using a covering grammar with a smaller FSN representation, and language decoding is accomplished by imposing the complete set of task constraints in a post-processing mode, using multiple word and string hypotheses generated by the speech decoder as input. When testing on the DARPA resource management task using the word-pair grammar, we found (Lee, 1990/2) that most of the word errors involve short function words (60% of the errors, e.g. a, the, in) and confusions among morphological variants of the same lexeme (20% of the errors, e.g. six vs. sixth). These errors are not easily resolved at the acoustic level, but they can easily be corrected with a simple set of syntactic and semantic rules operating in a post-processing mode.</Paragraph>
    <Paragraph position="2"> The language constraint factoring scheme has been shown to be efficient and effective. For the DARPA RMT, we found that the proposed semantic post-processor improves both the word accuracy and the semantic accuracy significantly. However, in the current implementation, no acoustic information is used in disambiguating words; only the pronunciations of words are used to verify the values of the semantic variables when there is semantic ambiguity in finding the best matching string.</Paragraph>
    <Paragraph position="3"> Performance can be further improved if the acoustic matching information used in the recognition process is incorporated into the language decoding process.</Paragraph>
  </Section>
</Paper>