File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/98/w98-1222_concl.xml

Size: 3,295 bytes

Last Modified: 2025-10-06 13:58:15

<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1222">
  <Title>Extracting Phoneme Pronunciation Information from Corpora</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>
4 Discussion
</SectionTitle>
    <Paragraph position="0"> We have analyzed a large corpus of sentences read by a large number of speakers with a view to determining possible mis-pronunciation of phonemes and the context in which such mis-pronunciations occur.</Paragraph>
    <Paragraph position="1"> The results of our analysis support our hypotheses that phonemes are mis-pronounced as phones in the same broad sound categories, and that the context of mis-pronunciation provides valuable information about the intended phoneme.</Paragraph>
    <Paragraph position="2"> As indicated in Figure 3, mis-pronunciation of affricates and fricatives is rare (over 84% certain matches), though when they are mis-pronounced, the uttered phone may be one of the stop consonants. Vowels are often mis-pronounced, but the uttered phone is almost always from the same broad sound category. The stop consonants d and t, the nasal en, and the semivowel hh are often mis-pronounced. Analysis of the decision trees generated for these mis-pronunciations indicates that the attributes of the phones surrounding the mis-pronounced phoneme do indeed provide information about the intended phoneme. These decision trees are particularly informative for vowels, owing to the large number of mis-pronunciations as well as the regularity of these mis-pronunciations. The attributes that are most useful vary for the different pronunciations of each phoneme. For instance, ay is the pronunciation for aa when the previous phone is a nasal, while ao is pronounced for aa when the preceding phone is a voiced obstruent and the next phone is a semivowel or glide.</Paragraph>
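    The two context rules quoted above for aa can be written out as a small lookup, which may make the role of the surrounding phones easier to see. This is only an illustrative sketch: the BROAD_CLASS table, the function name, and the default of returning aa unchanged are assumptions made for the example, not the feature inventory or the decision trees produced in the paper.

```python
# Hypothetical broad-class table. The class assignments are illustrative
# assumptions, not the attribute set used to build the paper's trees.
BROAD_CLASS = {
    "m": "nasal", "n": "nasal", "ng": "nasal",
    "b": "voiced_obstruent", "d": "voiced_obstruent",
    "g": "voiced_obstruent", "v": "voiced_obstruent",
    "z": "voiced_obstruent",
    "w": "semivowel_or_glide", "y": "semivowel_or_glide",
    "l": "semivowel_or_glide", "r": "semivowel_or_glide",
}


def predict_uttered_phone_for_aa(prev_phone, next_phone):
    """Return the phone expected to be uttered for an intended aa.

    Encodes only the two rules stated in the text; every other
    context falls back to aa itself (an assumption of this sketch).
    """
    prev_class = BROAD_CLASS.get(prev_phone)
    next_class = BROAD_CLASS.get(next_phone)

    # Rule 1: previous phone is a nasal, so aa is uttered as ay.
    if prev_class == "nasal":
        return "ay"

    # Rule 2: previous phone is a voiced obstruent and the next phone
    # is a semivowel or glide, so aa is uttered as ao.
    if prev_class == "voiced_obstruent" and next_class == "semivowel_or_glide":
        return "ao"

    # No rule fired: assume the phoneme is pronounced as intended.
    return "aa"
```

    A real tree would test many more attributes and would be learned from the corpus; the point here is only that each alternate pronunciation is selected by a different combination of neighbouring-phone attributes.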
    <Paragraph position="3"> The frequency of the matches in Figure 3 combined with the decision trees produced using the MML principle may be used to generate alternate pronunciations of phonemes in word models. This will assist in the recognition of mis-pronounced words during automatic speech understanding. The decision tree is weakest for uncommon contexts, because of a lack of training data for constructing the tree (the message length for encoding phonemes in such contexts is no better than an efficient encoding of the context classes using a Huffman code). In this case, the matrix in Figure 3 should be used to predict alternative pronunciations. However, for more common contexts, the decision trees are preferred, as they use more information than the matrix to determine the intended phoneme.</Paragraph>
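    The choice between the decision tree and the matrix can be sketched as a simple fallback. This is a minimal sketch under stated assumptions: the function name, the data layout, and the MIN_CONTEXT_COUNT threshold are all hypothetical, and the count threshold is a stand-in for the message-length comparison described above, which is not reproduced here. The numbers in the usage lines are made-up toy values, not figures from the paper.

```python
from collections import Counter

# Hypothetical cut-off: a context seen fewer times than this is treated
# as "uncommon". The paper judges this by message length instead.
MIN_CONTEXT_COUNT = 20


def alternate_pronunciations(phoneme, context, tree_predictions,
                             context_counts, confusion_matrix, top_k=2):
    """Return candidate uttered phones for an intended phoneme.

    tree_predictions: maps (phoneme, context) to the phone a decision
        tree predicts for that context.
    context_counts: maps (phoneme, context) to its training frequency.
    confusion_matrix: maps phoneme to {uttered phone: match count}.
    """
    key = (phoneme, context)

    # Common context: prefer the tree, which uses the neighbouring phones.
    if context_counts.get(key, 0) >= MIN_CONTEXT_COUNT and key in tree_predictions:
        return [tree_predictions[key]]

    # Uncommon context: fall back to the most frequent matches in the matrix.
    row = Counter(confusion_matrix.get(phoneme, {}))
    return [phone for phone, _ in row.most_common(top_k)]


# Toy values for illustration only.
confusion = {"aa": {"aa": 70, "ao": 20, "ay": 10}}
trees = {("aa", ("nasal", "stop")): "ay"}
counts = {("aa", ("nasal", "stop")): 50, ("aa", ("fricative", "stop")): 3}

common = alternate_pronunciations("aa", ("nasal", "stop"), trees, counts, confusion)
rare = alternate_pronunciations("aa", ("fricative", "stop"), trees, counts, confusion)
```

    With these toy inputs, the common context is answered by the tree and the rare one by the top rows of the matrix, mirroring the preference order described in the paragraph above.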
    <Paragraph position="4"> The system described in this paper investigates dependencies between an intended phoneme and a pronounced phone, but it may be easily adapted to determine relationships between an intended sound and a recognized sound, i.e., the output of a speech recognizer. Relationships determined in this manner may be used during speech recognition, and thus account for mis-recognition as well as mis-pronunciation.</Paragraph>
  </Section>
</Paper>