File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/97/w97-1203_abstr.xml

Size: 1,453 bytes

Last Modified: 2025-10-06 13:49:10

<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1203">
  <Title>A compact Representation of prosodically relevant Knowledge in a Speech Dialogue System</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> The acceptance of speech dialogue systems by the user is critically dependent on the degree of &quot;naturalness&quot; realized. The speech generation and synthesis modules have to be able to run in real time and to produce high-quality speech output.</Paragraph>
    <Paragraph position="1"> To produce natural-sounding speech, the synthesizer needs to know not only the words to utter and the order in which they appear but also their structural relationships. The latter are expressed acoustically in the form of prosody, i.e. how the voice rises and falls during an utterance, the rhythm, where pauses are set, etc. Prosody is also influenced by the properties associated with given words in the context of an utterance, e.g. the focus of a sentence or certain emphatic elements. This article describes a compact representation for conveying this type of information from the generator to the synthesizer in a modular system and describes how parts of this information are derived in the EFFENDI system, the generation module of a speech dialogue system for train inquiries.</Paragraph>
  </Section>
</Paper>