<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1104">
  <Title>Adaptive Compression-based Approach for Chinese Pinyin Input</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Chinese words comprise ideographic and pictographic characters. Unlike English letters, these characters cannot be entered directly from a keyboard.</Paragraph>
    <Paragraph position="1"> They have to be converted from keyboard input by an input method. There are two main approaches: phonetic-based input methods, such as Pinyin input, and structure-based input methods, such as WBZX. Pinyin input is the easiest to learn and the most widely used. WBZX is harder to learn, because the user has to memorize the radical components of each character, but it allows faster entry.</Paragraph>
    <Paragraph position="2"> Early products using Pinyin input methods were very slow because of the large number of homophones in the Chinese language: the user had to choose the correct character after each Pinyin syllable had been entered. Current products such as Microsoft IME for Chinese and Chinese Star have improved on this thanks to progress in language modelling (Goodman, 2001), but users are still not satisfied.</Paragraph>
  </Section>
</Paper>