<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0313"> <Title>Translation Spotting for Translation Memories</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Translation spotting is the term coined by Véronis and Langlais (2000) for the task of identifying the word-tokens in a target-language (TL) translation that correspond to some given word-tokens in a source-language (SL) text. Translation spotting (TS) takes as input a couple, i.e. a pair of SL and TL text segments that are known to be translations of one another, and an SL query, i.e. a subset of the tokens of the SL segment, on which the TS will focus its attention. The result of the TS process consists of two sets of tokens, one for each language.</Paragraph> <Paragraph position="1"> We call these sets the SL and TL answers to the query.</Paragraph> <Paragraph position="2"> In more formal terms: * The input to the TS process is a pair of SL and TL text segments ⟨S,T⟩, and a contiguous, non-empty sequence of word-tokens in S, q = s_i1 ... s_i2 (the query).</Paragraph> <Paragraph position="3"> * The output is a pair of sets of tokens ⟨r_q(S), r_q(T)⟩, the SL answer and TL answer respectively.</Paragraph> <Paragraph position="4"> Figure 1 shows some examples of TS, where the words in italics represent the SL query, and the words in bold are the SL and TL answers.</Paragraph> <Paragraph position="5"> As can be seen in these examples, the tokens in the query q and answers r_q(S) and r_q(T) may or may not be contiguous (examples 2 and 3), and the TL answer may be empty (example 4) when there is no satisfactory way of linking TL tokens to the query.</Paragraph> <Paragraph position="6"> Translation spotting has various applications, for example in bilingual concordancers, such as the TransSearch system (Macklovitch et al., 2000), and in example-based machine translation (Brown, 1996). 
In this article, we focus on a different application: a sub-sentential translation memory. We describe this application context in section 2 and discuss how TS fits into this type of system. In section 3, we propose a series of TS methods specifically adapted to this application context. In section 4, we present an empirical evaluation of the proposed methods.</Paragraph> </Section> </Paper>