<?xml version="1.0" standalone="yes"?>
<Paper uid="H89-1020">
  <Title>GOATS TO SHEEP: CAN RECOGNITION RATE BE IMPROVED FOR POOR TANGORA SPEAKERS?</Title>
  <Section position="5" start_page="146" end_page="148" type="evalu">
    <SectionTitle>
RESULTS
</SectionTitle>
    <Paragraph position="0"> Initial performance for this group of users was comparable to previous results with the TANGORA system.</Paragraph>
    <Paragraph position="1"> Error rate for the first day of practice with the system ranged from 4.5% to 14.0%, with an average of 8.6%.</Paragraph>
    <Paragraph position="2"> This replicates the findings for first day performance reported by Brown et al. (in preparation).</Paragraph>
    <Paragraph position="3"> To assess changes in recognition performance over the four-week period, the error rate for each week was computed by averaging data from all sessions obtained with a given speaker model while that model was current. That is, for Weeks 1 and 2, averages were taken over Practice Session 1, Practice Session 2, Test Session 1 and Test Session 2, whereas for Weeks 3 and 4, averages were taken over the two Practice sessions and only one Test session. These data were then averaged over all 12 users. The resulting average error rate for the first week was 8.4%. It dropped to 7.5% in the second week, to 6.4% in the third week and to 5.6% in the final week. Thus, there was a 33% reduction in error rate from the first to the fourth week (see Figure 2). There was no reduction in error within a week. That is, performance was constant over the four (Weeks 1 &amp; 2) or three (Weeks 3 &amp; 4) sessions in which each speaker model was used as the current model. Data were collapsed over all four weeks and all users to produce an average error rate for the first day, for the second day, for the third day and for the fourth day. These error rates were, respectively, 6.1%, 5.9%, 6.3% and 6.2%.</Paragraph>
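The averaging scheme and the 33% figure can be checked with a minimal sketch. The function names `mean_of_means` and `relative_reduction` are illustrative and do not come from the paper. The per-session data are not reported in this section, so only the published weekly and daily averages are used.

```python
from statistics import mean


def mean_of_means(sessions_per_user):
    """Average each user's session error rates, then average over users.

    Assumed reading of the text: sessions are first averaged per speaker
    model (per user and week), and the result is averaged over all users.
    """
    return mean(mean(sessions) for sessions in sessions_per_user)


def relative_reduction(first, last):
    """Fractional drop in error rate from `first` to `last`."""
    return (first - last) / first


# Weekly average error rates (percent) as reported in the text.
weekly_error = {1: 8.4, 2: 7.5, 3: 6.4, 4: 5.6}
drop = relative_reduction(weekly_error[1], weekly_error[4])
print(f"Week 1 to Week 4 reduction: {drop:.0%}")  # prints 33%

# Daily averages collapsed over weeks and users (percent), from the text.
daily_error = [6.1, 5.9, 6.3, 6.2]
spread = max(daily_error) - min(daily_error)
print(f"Within-week spread: {spread:.1f} percentage points")
```

The within-week spread is small relative to the week-to-week drop, which is consistent with the statement that performance was constant while a given speaker model was current.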
    <Paragraph position="4"> Recognition accuracy for 11 of the 12 users improved from the first to the fourth week. Figure 3 shows first week error rate plotted against final week error rate for each user. The diagonal line represents no</Paragraph>
    <Paragraph position="6"> Re-training proved to be a successful technique for decreasing error rate (see Figure 4). During the third week, error rate for the one test session in which the current (i.e., third week) speaker model was used was 6.9%. However, use of the older speaker model (i.e., the one generated during the first week) during the third week produced an error rate of 10.3%. A similar pattern was observed during the fourth week. Error rate with the current, fourth week, model was 5.7%. It increased to 8.0% when the model from the second week was used.</Paragraph>
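The effect of re-training can be expressed as the gap between the older and the current speaker model in the same week. The following is a minimal sketch using only the four error rates quoted above. The helper `model_gap` is a hypothetical name, and the relative figures it prints are derived here, not reported in the paper.

```python
def model_gap(older, current):
    """Return (absolute gap in points, gap as a fraction of the older rate)."""
    gap = older - current
    return gap, gap / older


# Error rates (percent) from the text: older model vs. current model.
comparisons = {
    3: {"older": 10.3, "current": 6.9},  # Week 1 model used in Week 3
    4: {"older": 8.0, "current": 5.7},   # Week 2 model used in Week 4
}

for week, rates in comparisons.items():
    gap, rel = model_gap(rates["older"], rates["current"])
    print(f"Week {week}: {gap:.1f} points lower, {rel:.0%} relative")
```

Under these figures, the current model gives roughly a third fewer errors in the third week and a somewhat smaller relative gain in the fourth.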
    <Paragraph position="8"> A number of speech habits brought by users to the ASR situation were identified as contributing to poor recognition by TANGORA. These included: (a) an overly fast speech rate, (b) failure to pause between words, (c) hyper-correct articulation of the final phoneme in words, and (d) incomplete articulation of the first phoneme in words. Suggestions on how to modify speaking style were made to users and, in some cases, dramatic decreases in recognition error followed. Speaking too fast, which resulted in words that were not clearly articulated, was frequently observed with these users. The error was easily corrected by slowing down. Only one user had pervasive difficulties pausing between words. In trying to overcome this problem, in response to feedback, he developed the habit of emphasizing the final phoneme in words. This habit, referred to here as hyper-articulation, resulted in substitution errors composed of the word spoken by the user plus an unintended word ending (e.g., hearT -&gt; hearts; detail -&gt; detailed).</Paragraph>
    <Paragraph position="9"> Unclear or too short articulation of the first phoneme in words, particularly for words with unstressed initial syllables, frequently resulted in errors as well. Most users were able to modify their speaking style in response to feedback about the types of recognition errors the TANGORA system was making. These behavioral interventions were less successful for the four speakers who completed the study with the highest error rates (between 6.8% and 8.3%). Further analysis of their speech is needed to determine why this was the case.</Paragraph>
  </Section>
</Paper>