<?xml version="1.0" standalone="yes"?>
<Paper uid="H93-1042">
  <Title>A SPEECH TO SPEECH TRANSLATION SYSTEM BUILT FROM STANDARD COMPONENTS</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> From standard components and a suite of generalizable customization techniques, we have developed an English-to-Swedish speech translation system in the air travel planning (ATIS) domain. The modular architecture consists of a pipelined series of processing phases, each of which outputs multiple hypotheses that are filtered by statistical preference mechanisms.[2] The statistical information used in the system is derived from automatic processing of domain corpora. The architecture provides greater robustness than a 1-best approach, yet it is more computationally tractable and more portable to new languages and domains than a tight integration, because of the modularity of its components: speech recognition, source language processing, source-to-target language transfer, target language processing, and speech synthesis. [2] ... put and speech synthesis has not yet been implemented.</Paragraph>
    <Paragraph position="1"> Some aspects of adaptation to the domain task were fairly simple: addition of new lexical entries was facilitated by existing tools, and grammar coverage required adding only a few very domain-specific phrase structure rules, as described in Section 3.1. Much of the effort in the project, however, has focussed on the development of well-specified methods for adapting and customizing other aspects of the existing modules, and on tools for guiding the process. In addition to the initial results (Section 5), the reported work makes several contributions to speech translation in particular and to language processing in general: (1) a general method for training statistical preferences to filter multiple hypotheses, for use in ranking both analysis and translation hypotheses (Section 3.2); (2) a method for rapid creation of a grammar for the target language by exploiting overlapping syntactic structures in the source and target languages (Section 3.3); and (3) an Explanation Based Learning (EBL) technique for automatically chunking the grammar into commonly occurring phrase-types, which has proven valuable in maximizing return on effort expended on coverage extension, together with a set of procedures for automatic testing and reporting that helps to ensure smooth integration across aspects of the effort performed at the various sites involved (Section 4).</Paragraph>
  </Section>
</Paper>