<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0409"> <Title>Interactive Speech Translation in the DIPLOMAT Project</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The DIPLOMAT project is designed to explore the feasibility of creating rapid-deployment, wearable bidirectional speech translation systems. By &quot;rapid-deployment&quot;, we mean being able to develop an MT system that performs initial translations at a useful level of quality between a new language and English within a matter of days or weeks, with continual, graceful improvement to a good level of quality over a period of months.</Paragraph> <Paragraph position="1"> The speech understanding component used is the SPHINX II HMM-based speaker-independent continuous speech recognition system (Huang et al., 1992; Ravishankar, 1996), with techniques for rapidly developing acoustic and language models for new languages (Rudnicky, 1995). The machine translation (MT) technology is the Multi-Engine Machine Translation (MEMT) architecture (Frederking and Nirenburg, 1994), described further below. The speech synthesis component is a newly developed concatenative system (Lenzo, 1997) based on variable-sized compositional units.</Paragraph> <Paragraph position="2"> This use of subword concatenation is especially important, since it is the only currently available method for rapidly bringing up synthesis for a new language. DIPLOMAT thus involves research in MT, speech understanding and synthesis, interface design, and wearable computer systems. 
While beginning our investigations into new semi-automatic techniques for both speech and MT knowledge-base development, we have already produced an initial bidirectional system for Serbo-Croatian ↔ English speech translation in less than a month, and are currently developing Haitian-Creole ↔ English and Korean ↔ English systems.</Paragraph> <Paragraph position="3"> A major concern in the design of the DIPLOMAT system has been to cope with the error-prone nature of both current speech understanding and MT technology, in order to produce an application that is usable by non-translators with a small amount of training. We attempt to achieve this primarily through user interaction: wherever feasible, the user is presented with intermediate results and allowed to correct them. In this paper, we will briefly describe the machine translation architecture used in DIPLOMAT (showing how it is well-suited for interactive user correction), describe our approach to rapid-deployment speech recognition, and then discuss our approach to interactive user correction of errors in the overall system.</Paragraph> </Section> </Paper>