<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2220">
  <Title>Automatic English-Chinese name transliteration for development of multilingual resources</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In multilingual natural language processing systems which aim to cover both languages using the Roman alphabet and languages using other alphabets, the development of lexical resources must include mechanisms for handling words which do not have standard translations. These are words without any obvious semantic content, e.g. most Indo-European personal and place names, which therefore cannot simply be mapped to translation equivalents.</Paragraph>
    <Paragraph position="1"> In this paper, we examine the problem of generating Chinese characters which correspond to English personal and place names. Section 2 introduces the basic principles of English-Chinese transliteration, Section 3 identifies issues specific to the domain of name transliteration, and Section 4 introduces a rule-based algorithm for performing name transliteration automatically. In Section 5 we present an example of the application of the algorithm, and in Section 6 we discuss extensions to improve the robustness of the algorithm.</Paragraph>
    <Paragraph position="2"> Our need for automatic transliteration mechanisms stems from a multilingual text generation system which we are currently constructing on the basis of an English-language database containing descriptive information about museum objects (the POWER system; Verspoor et al. 1998). That database includes fields such as manufacturer, whose values are personal and place names. Place names and personal names do not fall into a well-defined set, nor do they have semantic content which can be expressed in other languages through words equivalent in meaning.</Paragraph>
    <Paragraph position="3"> As more objects are added to our database (as will happen as a museum acquires new objects), new names will be introduced, and these must also be added to the lexica for each language in the system. We require an automatic procedure for achieving this, and concentrate here on techniques for the creation of a Chinese lexicon.</Paragraph>
  </Section>
</Paper>