<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0713">
  <Title>Sharing Problems and Solutions for Machine Translation of Spoken and Written Interaction</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The informal, dialogic character of oral interaction imposes demands on translation systems that are not encountered in well-formed, monologic texts. These differences make it appear that any similarities between the machine translation of text and speech will be limited to core translation components, as opposed to pre- and post-processing operations that are linked to the medium.</Paragraph>
    <Paragraph position="1"> In this paper, we demonstrate that many challenges of translating spoken interaction are also encountered in translating written interaction such as chat or instant messaging.</Paragraph>
    <Paragraph position="2"> Consequently, we propose that solutions developed for these common problems can be shared by researchers applying machine translation technologies to both types of interaction. Specifically, preprocessing operations can address many of the problems that make dialogic interaction difficult to translate in both spoken and written media.</Paragraph>
    <Paragraph position="3"> After surveying the challenges that are shared in machine translation of spoken and written interaction, we identify several areas in which preprocessing solutions have been proposed that could be fruitfully adopted for either spoken or written input. The speech recognition problem of discriminating out-of-vocabulary words from unrecognized in-vocabulary words is equivalent to the problem of discriminating novel forms that emerge in chat environments from words that are unrecognized due to nonstandard spellings. We suggest that a solution based on templates like those used in example-based translation could be a useful approach to the problem for both spoken and written input. Similarly, other preprocessing operations that tag input for special processing can be used to facilitate translation of problematic phenomena such as discourse markers and vocatives. Finally, we explore the possibility that the complexity of translating interaction can be reduced by translating smaller packages of input and exploiting participants' strategies for packaging certain discourse functions in smaller turn units.</Paragraph>
  </Section>
</Paper>