<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1099">
  <Title>Evaluating the Use of Prosodic Information in Speech Recognition and Understanding</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
PROJECT GOALS
</SectionTitle>
    <Paragraph position="0"> The goal of this project is to investigate the use of different levels of prosodic information in speech recognition and understanding. In particular, the current focus of the work is the use of prosodic phrase boundary information in parsing. The research involves determining a representation of prosodic information suitable for use in a speech understanding system, developing reliable algorithms for detection of the prosodic cues in speech, investigating architectures for integrating prosodic cues in a parser, and evaluating the potential improvements of prosody in the context of the SRI Spoken Language System. This research is sponsored jointly by DARPA and NSF.</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
RECENT RESULTS
</SectionTitle>
    <Paragraph position="0"> * Developed an algorithm for recognizing intonational cues (pitch accents or prominences and boundary tones) using a modified version of our previously developed break detection algorithm. Improved break detection performance by including probability of boundary tone as an additional feature.</Paragraph>
    <Paragraph position="1"> * Extended previous work in analysis/synthesis parse scoring by introducing a new probabilistic scoring technique that uses a decision tree to predict prosodic phrase breaks. Disambiguation results show that this automatically trainable synthesis technique yields performance comparable to the rule-based synthesis algorithms previously investigated, even though the syntactic structures represented in the training corpus were quite different from those in the testing corpus.</Paragraph>
    <Paragraph position="2"> * Investigated, in conjunction with SRI's SLS project, acoustic attributes of hypothesized repair locations, finding that relative durations of two repeated words and existence and duration of an intervening pause can be reliable cues to repairs.</Paragraph>
    <Paragraph position="3"> * Discovered that pause fillers in spontaneous speech were a frequent source of recognition errors, analyzed acoustic cues to pause fillers, and found a significant difference between the pitch of a pause filler and that of its local context, suggesting a simple detection algorithm.</Paragraph>
    <Paragraph position="4"> * Played a major role in organizing and leading a workshop aimed at developing a common core prosodic transcription standard; the impending availability of large corpora of data including syntactic annotations makes the need for agreement on prosodic standards especially critical.</Paragraph>
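    <Paragraph position="5"> The pause-filler result above (a pitch difference between a filler and its local context) can be illustrated with a minimal sketch. It is not the project's algorithm: the function names, the use of mean F0 in Hz, the absolute (direction-agnostic) difference, and the 20 Hz threshold are all assumptions made for illustration, since the text gives neither the direction nor the size of the pitch difference.

```python
from statistics import mean


def pitch_contrast(f0_segment, f0_context):
    """Absolute difference in mean F0 (Hz) between a segment and its context.

    Hypothetical helper: the source does not say how pitch was compared.
    """
    return abs(mean(f0_segment) - mean(f0_context))


def flag_pause_fillers(segments, threshold_hz=20.0):
    """Return labels of segments whose pitch contrast exceeds the threshold.

    segments: list of (label, f0_segment, f0_context) tuples, where the two
    F0 lists hold voiced-frame pitch values in Hz.
    threshold_hz: assumed cutoff, chosen only for illustration.
    """
    flagged = []
    for label, f0_segment, f0_context in segments:
        # Skip candidates with no voiced frames in the segment or its context.
        if not f0_segment or not f0_context:
            continue
        if pitch_contrast(f0_segment, f0_context) > threshold_hz:
            flagged.append(label)
    return flagged
```

In practice the threshold would have to be estimated from labeled data, and a speaker-normalized pitch measure may be preferable to raw Hz.</Paragraph>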
  </Section>
  <Section position="3" start_page="0" end_page="466" type="metho">
    <SectionTitle>
PLANS FOR THE COMING YEAR
</SectionTitle>
    <Paragraph position="0"> Evaluate the break index and prominence recognition algorithms on paragraphs of speech (as opposed to sentences) and on spontaneous speech as opposed to read speech. Investigate new acoustic features for improving recognition results.</Paragraph>
    <Paragraph position="1"> Extend the parse scoring algorithm to include prominence information, making use of tree-based prominence prediction algorithms.</Paragraph>
    <Paragraph position="2"> Utilize the parse scoring algorithm in speech understanding. Continue study of acoustic cues to repairs and other spontaneous speech effects; specifically, analyze the proportion of such events in different SLS configurations, the effect of such events on types of errors in different SLS configurations, and the role of intonation and duration patterns in their detection.</Paragraph>
    <Paragraph position="3"> Coordinate project with research on prosody at IPO, Eindhoven; this laboratory is the research center for the largest group in the world focused on prosody. Dr. Price will spend three months at this institution working on the current project half-time.</Paragraph>
  </Section>
</Paper>