<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1070">
  <Title>Incremental Translation Utilizing Constituent Boundary Patterns</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> A system dealing with spoken language requires a quick response in order to provide smooth communication between humans or between a human and a computer. Therefore, assuring efficiency in spoken-language translation is one of the most crucial tasks in devising such a system.</Paragraph>
    <Paragraph position="1"> In spoken language, the translation of lengthy utterances can yield a huge amount of structural ambiguity, which the system must process efficiently. To make spoken-language systems efficient, several techniques have been proposed, such as incremental generation (Finkler, 1992; Kempen, 1987) and marker-passing memory-based translation (Kitano, 1994). Many of these techniques adopt a left-to-right strategy to handle an input incrementally and a best-first strategy to avoid the explosion of structural ambiguity. These strategies can be achieved with bottom-up processing.</Paragraph>
    <Paragraph position="2"> We have already proposed Transfer-Driven Machine Translation (TDMT) for efficient and robust spoken-language translation (Furuse, 1994a; Furuse, 1994b). However, the top-down and breadth-first translation strategy in the earlier versions of TDMT, which yields a quick response for inputs with restricted lengths, may show poor efficiency when processing a very lengthy input or inputs having many competing structures.</Paragraph>
    <Paragraph position="3"> In a top-down and breadth-first application, all the possible structures are retained until the whole input string is parsed. This requires many computations and results in inefficient translation. For instance, the sentence below has many competing structures, mainly because of possible combinations within noun sequences. If this expression is combined with another expression, the structural ambiguity will be further compounded.</Paragraph>
    <Paragraph position="4"> With bacon chicken eggs lettuce and tomato on it.</Paragraph>
    <Paragraph position="5"> In contrast, if the structural ambiguities of substrings are always settled and never passed up to higher-level structures, the explosion of structural ambiguity can be constrained. Thus, an incremental strategy that fixes partial results is necessary for efficient processing, and it is achieved by bottom-up processing in left-to-right order.</Paragraph>
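As a rough illustration that is not taken from the paper, the growth of ambiguity in a noun sequence can be estimated by counting binary bracketings. The sketch assumes, purely for illustration, that ambiguity arises only from how adjacent items are grouped in pairs; the function name `count_bracketings` is hypothetical, and the paper's own patterns may license more or fewer structures.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_bracketings(n):
    """Count the binary bracketings of a sequence of n items (n >= 1)."""
    if n == 1:
        return 1
    # Split into a left part of k items and a right part of n - k items,
    # then multiply the number of ways to bracket each part.
    return sum(count_bracketings(k) * count_bracketings(n - k)
               for k in range(1, n))


# The five nouns of the example sentence above.
nouns = ["bacon", "chicken", "eggs", "lettuce", "tomato"]
print(count_bracketings(len(nouns)))
```

Under this assumption the count follows the Catalan numbers, so it grows rapidly with the length of the sequence, which is why retaining every competing structure until the end of the input becomes expensive.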
    <Paragraph position="6"> This paper proposes TDMT using an incremental strategy for achieving efficient translation of a lengthy input or one having a lot of structural ambiguity. In this method, several constituent boundary patterns are applied to an input string in a bottom-up fashion. This bottom-up application, based on the concept of chart parsing, can constrain the explosion of structural ambiguity by dealing with best-only substructures using semantic distance calculations.</Paragraph>
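The idea of keeping best-only substructures during bottom-up, left-to-right processing can be sketched as follows. This is a minimal sketch under stated assumptions, not the TDMT implementation: it combines adjacent spans pairwise instead of applying constituent boundary patterns, and `toy_distance` is a hypothetical stand-in for the semantic distance calculation, where a lower value is better.

```python
def best_only_parse(tokens, distance):
    """Bottom-up, left-to-right chart filling that keeps one structure per span.

    `distance(left, right)` is a hypothetical scoring function standing in
    for a semantic distance calculation; lower values are preferred.
    Returns (total_distance, structure) for the whole input.
    """
    chart = {}  # chart[(i, j)] = (total_distance, structure) for tokens[i:j]
    for j in range(1, len(tokens) + 1):
        # Read the next token from the left.
        chart[(j - 1, j)] = (0.0, tokens[j - 1])
        # Build every span that ends at position j, shortest first.
        for i in range(j - 2, -1, -1):
            candidates = []
            for k in range(i + 1, j):
                left_cost, left = chart[(i, k)]
                right_cost, right = chart[(k, j)]
                cost = left_cost + right_cost + distance(left, right)
                candidates.append((cost, (left, right)))
            # Settle the ambiguity of this span now: only the best candidate
            # is stored, so competing structures are never passed upward.
            chart[(i, j)] = min(candidates, key=lambda c: c[0])
    return chart[(0, len(tokens))]


def toy_distance(left, right):
    """Hypothetical distance: joining two single words is free, else costs 1."""
    both_words = isinstance(left, str) and isinstance(right, str)
    return 0.0 if both_words else 1.0


print(best_only_parse(["a", "b", "c", "d"], toy_distance))
```

Because each span stores a single entry, the chart holds at most one structure per pair of positions, regardless of how many bracketings the span would otherwise admit. The trade-off is that a structure discarded early cannot be recovered later, so the quality of the result depends on the scoring function.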
    <Paragraph position="7"> In this paper, we will first outline our new translation strategy. We will then explain how constituent boundary patterns can be used to describe the structure of an input string in TDMT.</Paragraph>
    <Paragraph position="8"> Then we will describe the bottom-up pattern application, based on chart parsing. Next, we will show how the explosion of structural ambiguity is constrained by dealing with the best-only substructures, based on semantic distance calculations. By comparing the preliminary experimental results from the former top-down method and those from our new method, we will demonstrate the usefulness of our new method. A summary of our approach will conclude the paper.</Paragraph>
  </Section>
</Paper>