File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/03/n03-2035_abstr.xml

Size: 1,201 bytes

Last Modified: 2025-10-06 13:42:48

<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-2035">
  <Title>A Context-Sensitive Homograph Disambiguation in Thai Text-to-Speech Synthesis</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Homograph ambiguity is a long-standing issue in Text-to-Speech (TTS) synthesis. To disambiguate homographs, several efficient approaches have been proposed, such as part-of-speech (POS) n-gram, Bayesian classifier, decision tree, and Bayesian-hybrid approaches. These methods rely on the words and/or POS tags surrounding the homograph in question.</Paragraph>
    <Paragraph position="1"> Some languages such as Thai, Chinese, and Japanese have no word-boundary delimiter.</Paragraph>
    <Paragraph position="2"> Therefore, before resolving homograph ambiguity, we need to identify word boundaries. In this paper, we propose a unique framework that solves the word segmentation and homograph ambiguity problems together.</Paragraph>
    <Paragraph position="3"> Our model employs both local and long-distance contexts, which are automatically extracted by a machine learning technique called Winnow.</Paragraph>
  </Section>
</Paper>