<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1056">
  <Title>LEXICAL ACCESS WITH A STATISTICALLY-DERIVED PHONETIC NETWORK</Title>
  <Section position="4" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> We describe a new approach to lexical access in a phone-based speech recognition system. By &quot;lexical access&quot; we mean taking a sequence (or, more generally, a lattice) of phones and durations that is output by a phone recognizer and mapping it onto a word sequence (or, more generally, a lattice).</Paragraph>
    <Paragraph position="1"> In conventional word-based speech recognizers, segmental durations, word co-articulation and alternative pronunciations are usually poorly modelled, if at all, since the architecture is not convenient or efficient for exploiting these constraints.</Paragraph>
    <Paragraph position="2"> Phone-based recognition offers an attractive alternative from this point of view. Our approach will be to create a probabilistic model that provides the likelihood that a particular word sequence gives rise to a particular phone sequence. This model will take into account allophonic variation, alternative pronunciation, word co-articulation and segmental durations.</Paragraph>
    <Paragraph position="3"> We then combine these lexical likelihoods with the acoustic likelihoods generated by the phone recognizer and priors from our language model to get an overall recognition model whose error rate we seek to minimize.</Paragraph>
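A minimal sketch of how the three knowledge sources named above might be combined. It assumes the acoustic likelihood, lexical likelihood and language-model prior are multiplied (added in the log domain) and that the recognizer keeps the highest-scoring word sequence. The text does not state the exact combination rule, so the function names and all numbers here are hypothetical illustrations rather than the authors' method.

```python
import math


def combined_log_score(acoustic_logp, lexical_logp, lm_logp):
    """Add the three log terms: acoustic likelihood, lexical likelihood, prior.

    Assumption: the terms are treated as independent factors of one product.
    """
    return acoustic_logp + lexical_logp + lm_logp


def best_hypothesis(hypotheses):
    """Return the word sequence with the highest combined log score.

    hypotheses maps a word sequence (tuple of str) to a triple of
    log probabilities in the order (acoustic, lexical, language model).
    """
    return max(hypotheses, key=lambda words: combined_log_score(*hypotheses[words]))


# Made-up probabilities, for illustration only.
hypotheses = {
    ("recognize", "speech"): (math.log(0.02), math.log(0.30), math.log(0.010)),
    ("wreck", "a", "nice", "beach"): (math.log(0.03), math.log(0.10), math.log(0.001)),
}

best = best_hypothesis(hypotheses)
print(best)
```

Working in the log domain avoids numerical underflow when many small probabilities are multiplied. In this toy example the second hypothesis has the better acoustic score, but the lexical and language-model terms outweigh it.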
    <Paragraph position="4"> We have taken this stochastic approach for two reasons.</Paragraph>
    <Paragraph position="5"> First, it provides a principled way to combine seemingly disparate information: (a) acoustic likelihoods, (b) segmental durations, (c) alternative pronunciations, and (d) the language model. Second, the availability of large speech corpora now allows the statistical estimation of these probabilities.</Paragraph>
  </Section>
</Paper>