File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/93/h93-1068_evalu.xml

Size: 4,684 bytes

Last Modified: 2025-10-06 14:00:08

<?xml version="1.0" standalone="yes"?>
<Paper uid="H93-1068">
  <Title>PERCEIVED PROSODIC BOUNDARIES AND THEIR PHONETIC CORRELATES</Title>
  <Section position="5" start_page="342" end_page="343" type="evalu">
    <SectionTitle>
3. RESULTS
</SectionTitle>
    <Paragraph position="0"> For all three speakers, a high correlation was found between the PBS's obtained in the normal and delexicalized test versions (r = .78, p &lt; .01). This warrants the conclusion that, in this experiment, syntactic and semantic factors did not affect the listeners' judgments. The delexicalized test version is ignored in the rest of this paper.</Paragraph>
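As a rough illustration of the comparison reported above, the following minimal sketch correlates two sets of PBS values for the same word boundaries. It assumes Pearson's product-moment coefficient, which the text does not confirm as the statistic used; the function name and all PBS values are hypothetical and serve only to show the computation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical PBS values for the same five word boundaries,
# once in the normal and once in the delexicalized test version.
pbs_normal = [0.0, 1.5, 4.0, 7.5, 9.0]
pbs_delexicalized = [0.5, 1.0, 5.0, 6.5, 9.5]

r = pearson_r(pbs_normal, pbs_delexicalized)
print(round(r, 2))
```

A coefficient near 1 would indicate that boundaries judged strong in one version are also judged strong in the other, which is the pattern the paragraph describes.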
    <Section position="1" start_page="342" end_page="342" type="sub_section">
      <SectionTitle>
3.1. Perceptual Boundary Strength And
</SectionTitle>
      <Paragraph position="0"> made more extensive use of all three phonetic cues than the other two speakers and was the only one to employ declination resets in a systematic fashion. Not shown in the table is the fact that there were also clear differences between the speakers in preferred type of melodic discontinuity. Combinations of the three cues can be considered as possible phonetic strategies of the speakers to mark prosodic boundaries. Figure 2 shows the relation of these strategies to PBS. Generally speaking, PBS values are higher as more phonetic cues are associated with a given word boundary. While the speakers differ in their preferences for certain strategies, the impact of strategy on PBS is roughly the same across speakers.</Paragraph>
      <Paragraph position="1"> Additional trends not shown in Figure 2 are the following. First, there was a trend for longer pauses to be associated with greater PBS's for all speakers. As for the four types of melodic discontinuity, the main tendency is that melodic discontinuity involving a continuation rise '2' (a steep pitch rise very late in the pre-boundary syllable)</Paragraph>
      <Paragraph position="3"> declination reset, pse = pause, int = melodic cue, 0 and 1 = absence or presence of a cue.</Paragraph>
      <Paragraph position="4"> is associated with greater PBS's than other types. The data show strong interactions between the three phonetic cues. The main observations are that 1) the presence of a declination reset implies the presence of a pause in all cases, 2) the presence of a pause implies the presence of a melodic discontinuity in about 80% of the cases, for all speakers, 3) pauses not accompanied by a melodic cue are usually shorter than 100 ms, and 4) it is quite common for word boundaries to be marked only by a melodic cue.</Paragraph>
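The implication statements above can be read as conditional proportions over word boundaries. The sketch below shows one way such proportions might be computed, assuming each boundary is recorded as a set of 0/1 cue flags; the function name, the key names and the data are hypothetical and are not taken from the study.

```python
def conditional_rate(boundaries, given, then):
    """Share of boundaries carrying cue `given` that also carry cue `then`."""
    with_given = [b for b in boundaries if b[given]]
    if not with_given:
        raise ValueError("no boundary carries the conditioning cue")
    return sum(1 for b in with_given if b[then]) / len(with_given)


# Hypothetical word boundaries: 1 = cue present, 0 = cue absent.
boundaries = [
    {"reset": 1, "pause": 1, "melodic": 1},
    {"reset": 0, "pause": 1, "melodic": 1},
    {"reset": 0, "pause": 1, "melodic": 1},
    {"reset": 0, "pause": 1, "melodic": 1},
    {"reset": 0, "pause": 1, "melodic": 0},
    {"reset": 0, "pause": 0, "melodic": 1},
    {"reset": 0, "pause": 0, "melodic": 0},
]

# Proportion of declination resets that co-occur with a pause.
reset_implies_pause = conditional_rate(boundaries, "reset", "pause")
# Proportion of pauses that co-occur with a melodic discontinuity.
pause_implies_melodic = conditional_rate(boundaries, "pause", "melodic")
print(reset_implies_pause, pause_implies_melodic)
```

In this toy data set every reset is accompanied by a pause, and four of the five pauses are accompanied by a melodic cue, mirroring observations 1) and 2) in form only.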
    </Section>
    <Section position="2" start_page="342" end_page="343" type="sub_section">
      <SectionTitle>
3.2. Perceptual Boundary Strength And
Prosodic Boundaries
</SectionTitle>
      <Paragraph position="0"> The prosodic analysis of the test material consisted of the application of the latest version of the so-called Pros-3 algorithm [8]. This is a program currently under development at IPO to automatically determine accent and prosodic phrase structure of sentences on the basis of syntactic and metrical analysis. In this way, each word boundary was assigned to one of three predicted prosodic boundary categories: no boundary, Phi-boundary or I-boundary.</Paragraph>
      <Paragraph position="1"> As can be seen in Figure 3, word boundaries that were designated as I-boundaries by the Pros-3 algorithm have greater PBS's than Phi-boundaries, while these in turn are perceived as stronger than unlabelled boundaries. This effect is apparent for all speakers, but is clearest in the  be exploited tentatively in text-to-speech conversion and in automatic speech recognition.</Paragraph>
      <Paragraph position="2"> Finally, the gradience that can be observed in the PBS values of Figure 2 shows that listeners can do better than merely distinguishing between presence or absence of a boundary. On the basis of our limited set of data, it is not possible to determine exactly how many categories listeners can discriminate reliably. This will be partly determined by the number and the nature of the phonetic cues, and need not be limited to a maximum of three (no boundary, minor boundary, major boundary). Indeed, Figure 2 suggests that it is not unreasonable to assume that listeners can handle five PBS categories. Interestingly, such a five-level distinction is used in the TOBI labelling scheme [9]. However, an important difference between the two approaches is that the TOBI scheme obliges labelers to explicitly assess the nature of the phonetic cues, while PBS values are the result of a purely intuitive judgment.</Paragraph>
    </Section>
  </Section>
</Paper>