<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1107">
  <Title>Chinese Chunking with another Type of Spec</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Spec is a critical issue for automatic chunking.</Paragraph>
    <Paragraph position="1"> This paper proposes an approach to Chinese chunking based on another type of spec, one that is not derived from a complete syntactic tree but is based only on an un-bracketed, POS-tagged corpus. With this spec, a chunked data set is built and an HMM is used to build the chunker. TBL-based error correction is then applied to further improve chunking performance. The average chunk length is about 1.38 tokens, the chunking F-measure reaches 91.13%, labeling accuracy alone reaches 99.80%, and the ratio of crossing brackets is 2.87%. We also find that the hardest part of Chinese chunking is identifying the chunk boundary inside</Paragraph>
  </Section>
</Paper>