<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3706">
  <Title>Converser(TM): Highly Interactive Speech-to-Speech Translation for Healthcare</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Spoken Translation, Inc. (STI) of Berkeley, CA has developed a commercial system for interactive speech-to-speech machine translation designed for both high accuracy and broad linguistic and topical coverage. Planned use is in situations requiring both of these features, for example in helping Spanish-speaking patients to communicate with English-speaking doctors, nurses, and other health-care staff.</Paragraph>
    <Paragraph position="1"> The twin goals of accuracy and broad coverage have until now been in opposition: speech translation systems have gained tolerable accuracy only by sharply restricting both the range of topics which can be discussed and the sets of vocabulary and structures which can be used to discuss them.</Paragraph>
    <Paragraph position="2"> The essential problem is that both speech recognition and translation technologies are still quite error-prone. While the error rates may be tolerable when each technology is used separately, the errors combine and even compound when they are used together. The resulting translation output is generally below the threshold of usability - unless restriction to a very narrow domain supplies sufficient constraints to significantly lower the error rates of both components.</Paragraph>
    <Paragraph position="3"> STI's approach has been to concentrate on interactive monitoring and correction of both technologies. First, users can monitor and correct the speaker-dependent speech recognition system to ensure that the text that will be passed to the machine translation component is completely correct. Voice commands (e.g. Scratch That or Correct &lt;incorrect text&gt;) can be used to repair speech recognition errors. While these commands are similar in appearance to those of IBM's ViaVoice or ScanSoft's Dragon NaturallySpeaking dictation systems, they are unique in that they will remain usable even when speech recognition operates at a server. Thus, they will provide for the first time the capability to interactively confirm or correct wide-ranging text dictated from anywhere. Next, during the MT stage, users can monitor, and if necessary correct, one especially important aspect of the translation - lexical disambiguation.</Paragraph>
    <Paragraph position="4"> STI's approach to lexical disambiguation is twofold: first, we supply a specially controlled back translation, or translation of the translation. Using this paraphrase of the initial input, even a monolingual user can make an initial judgment concerning the quality of the preliminary machine translation output. To make this technique effective, we use proprietary facilities to ensure that the lexical senses used during back translation are appropriate. In addition, in case uncertainty remains about the correctness of a given word sense, we supply a proprietary set of Meaning Cues(TM) - synonyms, definitions, etc. - which have been drawn from various resources, collated in a unique database (called SELECT(TM)), and aligned with the respective lexica of the relevant machine translation systems. With these cues as guides, the user can select the preferred meaning from among those available.</Paragraph>
    <Paragraph position="5"> Automatic updates of translation and back translation then follow.</Paragraph>
    <Paragraph position="6"> The result is an utterance that has been monitored and perhaps repaired by the user at two levels - those of speech recognition and translation. By employing these interactive techniques while integrating state-of-the-art dictation and machine translation programs - we work with Dragon NaturallySpeaking for speech recognition; with Word Magic MT (for the current Spanish system); and with ScanSoft for text-to-speech - we have been able to build the first commercial-grade speech-to-speech translation system which can achieve broad coverage without sacrificing accuracy.</Paragraph>
  </Section>
</Paper>