<?xml version="1.0" standalone="yes"?>
<Paper uid="A97-1009">
  <Title>Name pronunciation in German text-to-speech synthesis</Title>
  <Section position="5" start_page="53" end_page="54" type="evalu">
    <SectionTitle>
5 Evaluation
</SectionTitle>
    <Paragraph position="0"> We evaluated the name analysis system by comparing the pronunciation performance of two versions of the TTS system, one with and one without the name-specific module. We ran both versions on two lists of street names, one selected from the training material and the other from unseen data.</Paragraph>
    <Section position="1" start_page="53" end_page="53" type="sub_section">
      <SectionTitle>
5.1 General-purpose vs. name-specific analysis
</SectionTitle>
      <Paragraph position="0"> Two versions of the German TTS system were involved in the evaluation experiments, differing in the structure of the text analysis component. The first system contained the regular text analysis modules, including a general-purpose module that handles words that are not represented in the system's lexicon: typically compounds and names. This version will be referred to as the old system. The second version consisted purely of the name grammar transducer discussed in the previous section. It did not have any other lexical information at its disposal.</Paragraph>
      <Paragraph position="1"> This version will be referred to as the new system.</Paragraph>
      <Paragraph position="2"> [Table column headings: number of names; at least one system wrong; both systems wrong; total error rate]</Paragraph>
    </Section>
    <Section position="2" start_page="53" end_page="53" type="sub_section">
      <SectionTitle>
5.2 Training vs. test materials
</SectionTitle>
      <Paragraph position="0"> The textual materials used in the evaluation experiments consisted of two sets of data. The first set, henceforth training data, was a subset of the data that were used in building the name analysis grammar. For this set, the street names for each of the four cities Berlin, Hamburg, Köln and München were randomized. We then selected every 50th entry from the four files, yielding a total of 631 street names; thus, the training set also reflected the respective size of the cities.</Paragraph>
      <Paragraph position="1"> The second set, henceforth test data, was extracted from the databases of the cities Frankfurt am Main and Dresden. Using the procedure described above, we selected 206 street names. Besides being among the ten largest German cities, Frankfurt and Dresden also meet the requirement of a balanced geographical and dialectal coverage. These data were not used in building the name analysis system.</Paragraph>
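The selection procedure described in this subsection (randomize each city's list, then keep every 50th entry) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the fixed seed, the choice to start counting at the 50th entry, and the placeholder city lists are all hypothetical and are not taken from the paper.

```python
import random

def sample_every_nth(names, step=50, seed=0):
    """Shuffle one city's street names, then keep every `step`-th entry."""
    rng = random.Random(seed)        # fixed seed only to make the sketch reproducible
    shuffled = list(names)           # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[step - 1::step]  # entries 50, 100, 150, ... (assumed starting point)

# Hypothetical placeholder lists; the real city databases are not reproduced here.
cities = {
    "A": [f"A-street-{i}" for i in range(1000)],
    "B": [f"B-street-{i}" for i in range(250)],
}
sample = {city: sample_every_nth(names) for city, names in cities.items()}
sizes = {city: len(picked) for city, picked in sample.items()}
```

Because each list is sampled at the same fixed interval, a city with four times as many entries contributes four times as many names, which is how the sample comes to reflect the relative size of the cities.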
    </Section>
    <Section position="3" start_page="53" end_page="54" type="sub_section">
      <SectionTitle>
5.3 Results
</SectionTitle>
      <Paragraph position="0"> The old and the new versions of the TTS system were run on the training and the test set. Pronunciation performance was evaluated on the symbolic level by manually checking the correctness of the resulting transcriptions. A transcription was considered correct when no segmental errors or erroneous syllabic stress assignments were detected. Multiple mistakes within the same name were considered as one error. Thus, we made a binary decision between correct and incorrect transcriptions.</Paragraph>
      <Paragraph position="1"> Table 2 summarizes the results. On the training data, in 250 out of a total of 631 names (39.6%) at least one of the two systems was incorrect. In 72 out of these 250 cases (28.8%) both systems were wrong. Thus, for 72 out of 631 names (11.4%) no correct transcription was obtained by either system. On the test data, at least one of the two systems was incorrect in 82 out of a total of 206 names (39.8%), a result almost identical to that for the training data. However, in 26 out of these 82 cases (31.7%) both systems were wrong. In other words, no correct transcription was obtained by either system for 26 out of 206 names (12.6%), which is only slightly higher than for the training data.</Paragraph>
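The percentages quoted in this paragraph follow directly from the counts given in the text. The short sketch below recomputes them; the dictionary keys and the helper name `pct` are illustrative choices, not names from the paper.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts as stated in the text for each data set.
counts = {
    "training": {"names": 631, "at_least_one_wrong": 250, "both_wrong": 72},
    "test": {"names": 206, "at_least_one_wrong": 82, "both_wrong": 26},
}

rates = {}
for data_set, c in counts.items():
    rates[data_set] = (
        pct(c["at_least_one_wrong"], c["names"]),       # share of names with any error
        pct(c["both_wrong"], c["at_least_one_wrong"]),  # share of those where both systems fail
        pct(c["both_wrong"], c["names"]),               # total error rate for both systems
    )
```

Here `rates["training"]` evaluates to `(39.6, 28.8, 11.4)` and `rates["test"]` to `(39.8, 31.7, 12.6)`, matching the figures above.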
      <Paragraph position="2"> Table 3 compares the performances of the two text analysis systems. On the training data, the new system outperforms the old one in 138 of the 163 cases (84.7%) where one of the systems was correct and the other one was wrong; we disregard here all cases where both systems were correct as well as the 87 names for which no correct transcription was given by either system. But there were also 25 cases (15.3%) where the old system outperformed the new one. Thus, the net improvement by the name-specific system over the old one is 69.4% (84.7% minus 15.3%).</Paragraph>
      <Paragraph position="3"> On the test data set, the old system gives the correct solution in 15 of 50 cases (30.0%), compared to 35 names (70.0%) for which the new system gives the correct transcription; again, all cases were excluded in which both systems performed equally well or poorly. The net improvement by the name-specific system over the generic one on the test data is thus 40% (70.0% minus 30.0%).</Paragraph>
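The net improvement figures can be reproduced from the win counts, assuming that "net improvement" means the new system's share of the decided cases minus the old system's share. That reading is an inference from the reported numbers, not a definition given in the text, and the function name below is hypothetical.

```python
def net_improvement(new_wins, old_wins):
    """Shares of decided cases won by each system, and their difference."""
    decided = new_wins + old_wins                 # cases where exactly one system was correct
    new_share = round(100 * new_wins / decided, 1)
    old_share = round(100 * old_wins / decided, 1)
    # The difference is taken between the already rounded shares, which is what
    # reproduces the reported 69.4; the unrounded difference would round to 69.3.
    return new_share, old_share, round(new_share - old_share, 1)

training = net_improvement(138, 25)
test = net_improvement(35, 15)
```

With these inputs `training` is `(84.7, 15.3, 69.4)` and `test` is `(70.0, 30.0, 40.0)`.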
      <Paragraph position="4"> A detailed error analysis yielded the following  strings (hyper-correction over old system); e.g., Rim+par+straße [ri:mpa~] instead of Rimpar+straße [rimpa~].</Paragraph>
      <Paragraph position="5"> * Pronunciation rules: &quot;Holes&quot; in the general-purpose pronunciation rule set were revealed by orthographic substrings that do not occur in the regular lexicon. It has been shown for English (van Santen, 1992) that the frequency distribution of triphones in names is quite dissimilar to the one found in regular words.</Paragraph>
      <Paragraph position="6"> * Idiosyncrasies: Peculiar pronunciations that cannot be described by rules and that even native speakers quite often do not know or do not agree upon; e.g., Oeynhausen [C/~:nhauzon], Duisdorf [dy:sd~f] or [du:sd~f] or [du:isd~f].</Paragraph>
    </Section>
  </Section>
</Paper>