<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1630">
  <Title>Unsupervised Named Entity Transliteration Using Temporal and Phonetic Correlation</Title>
  <Section position="7" start_page="255" end_page="256" type="concl">
    <SectionTitle>
5 Conclusions and Future Work
</SectionTitle>
    <Paragraph position="0"> In this paper we have discussed the problem of name transliteration as one component of a system for finding matching names in comparable corpora. We have proposed two unsupervised methods for transliteration, one based on carefully designed measures of phonetic correspondence and the other based on the temporal distribution of words. We have shown that both methods yield good results, and that even better results can be achieved by combining them.</Paragraph>
    <Paragraph position="1"> One particular area that we will continue to work on is phonetic distance. We believe our hand-assigned costs are a reasonable starting point if one knows nothing about the particular pair of languages in question. However, one could also train such costs, either from an existing list of known transliterations or as part of an iterative bootstrapping method as, for example, in Yarowsky and Wicentowski's (2000) work on morphological induction.</Paragraph>
    <Paragraph position="2"> The work we report is ongoing and is part of a larger project on multilingual named entity recognition and transliteration. One of the goals of this project is to develop tools and resources for underresourced languages. Insofar as the techniques we have proposed have been shown to work on three language pairs involving one source language (English) and three unrelated and quite different target languages, one can reasonably claim that the techniques are language-independent. Furthermore, as the case of Hindi shows, even with data from completely different news agencies we are able to extract useful correspondences.</Paragraph>
  </Section>
</Paper>