<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3236">
  <Title>Chinese Part-of-Speech Tagging: One-at-a-Time or All-at-Once? Word-Based or Character-Based?</Title>
  <Section position="11" start_page="89" end_page="89" type="relat">
    <SectionTitle>
7 Related Work
</SectionTitle>
    <Paragraph position="0"> Much previous research on Chinese language processing has focused on word segmentation (Sproat et al., 1996; Teahan et al., 2000; Sproat and Emerson, 2003). Comparatively little work has been done on Chinese POS tagging. Kwong and Tsou (2003) discussed the implications of POS ambiguity in Chinese and possible approaches to tackling this problem when tagging a corpus for NLP tasks. Zhou and Su (2003) investigated an approach to building a Chinese analyzer that integrated word segmentation, POS tagging and parsing, based on a hidden Markov model. Jing et al. (2003) focused on Chinese named entity recognition, considering issues such as character-based versus word-based approaches.</Paragraph>
    <Paragraph position="1"> To our knowledge, our work is the first to systematically investigate issues of processing architecture and feature representation for Chinese POS tagging.</Paragraph>
    <Paragraph position="2"> Our maximum entropy word segmenter is similar to that of Xue and Shen (2003), but the additional features we used and the post-processing step gave improved word segmentation accuracy.</Paragraph>
    <Paragraph position="3"> The research most similar to ours is that of Luo (2003). Luo presented a maximum entropy character-based parser, which as a consequence of parsing also performed word segmentation and POS tagging. The all-at-once, character-based approach reported in this paper is essentially the approach proposed by Luo. While our investigation reveals that such an approach gives good accuracy, our findings also indicate that a one-at-a-time, character-based approach to POS tagging gives comparable accuracy at a much lower computational cost.</Paragraph>
  </Section>
</Paper>