<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1119">
  <Title>Back Transliteration from Japanese to English Using Target English Context</Title>
  <Section position="2" start_page="0" end_page="1" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In transliteration, a word in one language is converted into a character string of another language expressing how it is pronounced. In the case of transliteration into Japanese, special characters called katakana are used to show how a word is pronounced. For example, a personal name and its transliterated word are shown below.</Paragraph>
    <Paragraph position="1"> Cunningham kaningamu (ka ni n ga mu) [Transliteration] Here, the letters in italics are romanized Japanese katakana characters.</Paragraph>
    <Paragraph position="2"> New transliterated words such as personal names or technical terms in katakana are not always listed in dictionaries. It would be useful for cross-language information retrieval if these words could be automatically restored to the original English words.</Paragraph>
    <Paragraph position="3"> Back transliteration is the process of restoring transliterated words to the original English words. The following is an example of a back transliteration problem.</Paragraph>
    <Paragraph position="4"> ? kuratutihuirudo (English word) (ku ra cchi fi - ru do) [Back transliteration] There are many ambiguities in restoring a transliterated katakana word to its original English word. For example, should &quot;a&quot; in &quot;ku ra cchi fi - ru do&quot; be converted into the English letter &quot;a&quot; or &quot;u&quot;, or into some other letter or string? Resolving such ambiguities is difficult, which in turn makes back transliteration to the correct English word difficult.</Paragraph>
    <Paragraph position="5"> Using a pronunciation dictionary or limiting the output to a particular English word list prepared in advance can simplify the problem of back transliteration. However, these methods cannot produce a new English word that is not registered in the dictionary or the word list. Transliterated words are mainly proper nouns and technical terms, and such words are often not registered. Thus, a back transliteration framework capable of creating new words would be very useful.</Paragraph>
    <Paragraph position="6"> A number of back transliteration methods for selecting English words from an English pronunciation dictionary ... 1998), and Korean-to-English (Lin and Chen, 2002). (Footnote: Their English letter-to-sound WFST does not convert English words that are not registered in a pronunciation dictionary.)</Paragraph>
    <Paragraph position="7"> There are also methods that select English words from an English word list, e.g., Japanese-to-English (Fujii and Ishikawa, 2001) and Chinese-to-English (Chen et al., 1998).</Paragraph>
    <Paragraph position="8"> Moreover, there are back transliteration methods capable of generating new words, such as methods for back transliteration from Korean to English (Jeong et al., 1999; Kang and Choi, 2000).</Paragraph>
    <Paragraph position="9"> These previous works did not take the target English context into account when calculating the plausibility of matching target characters with source characters.</Paragraph>
    <Paragraph position="10"> This paper presents a method that takes the target English context into account when generating an English word from a Japanese katakana word.</Paragraph>
    <Paragraph position="11"> Our character-based method can produce new English words that are not listed in the learning corpus.</Paragraph>
    <Paragraph position="12"> This paper is organized as follows. Section 2 describes our method. Section 3 describes the experimental set-up and results. Section 4 discusses the performance of our method based on the experimental results. Section 5 concludes the paper.</Paragraph>
  </Section>
</Paper>