<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2171"> <Title>From Information Structure to Intonation: A Phonological Interface for Concept-to-Speech</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The task of interfacing between a tactical generator and a speech synthesizer is twofold: a grammatical description enriched with semantic and pragmatic features has to be translated into a (qualitative) phonological description, which then has to be mapped onto the set of (quantitative) parameter values needed as input to the synthesizer.</Paragraph> <Paragraph position="1"> The requirements imposed by a concept-to-speech system differ from those imposed on both text generation and text-to-speech systems. * This work has been sponsored by the Fonds zur Förderung der wissenschaftlichen Forschung (FWF), Grant No. P10822.</Paragraph> <Paragraph position="2"> In text generation the generator produces a sequence of abstract descriptions of word forms, which are transformed into strings of graphemes, either by direct access to a lexicon or via a morphological component, and then output. With concept-to-speech the task is more complex.</Paragraph> <Paragraph position="3"> Not only is segmental information influenced by morphonology and post-lexical rules (covering, e.g., reduction and assimilation phenomena) but, more importantly, suprasegmental information must be provided as well.</Paragraph> <Paragraph position="4"> Compared to text-to-speech the task is at the same time easier and more difficult. Information from the pragmatic, semantic and syntactic layers is readily available. This eliminates the need to analyze an input text for the cues required to arrive at proper pronunciation and prosody. On the other hand, all this information must be properly accounted for to arrive at an adequate description of the utterance that, when fed into the synthesizer, produces high-quality output. 
In particular, pragmatic-semantic features must be mapped onto (abstract) prosodic features.</Paragraph> <Paragraph position="5"> We employ an extended version of two-level morphology (Trost 91) for this interface. The formalism proved to be very well suited for the task. The various almost independent subsystems can be kept conceptually separate, resulting in good transparency, while at the same time enabling the necessary amount of interaction between them.</Paragraph> </Section> </Paper>