<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-2039">
  <Title>Chinese Unknown Word Identification Using Character-based Tagging and Chunking</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>
7 Conclusion
</SectionTitle>
    <Paragraph position="0"> We proposed an &quot;all-purpose&quot; method for Chinese unknown word detection. Our method is based on a morphological analysis that generates segmentations and POS tags using Markov Models, followed by chunking based on character features using Support Vector Machines. We have also shown that character-based features yield better results than word-based features in the chunking process. Our experiments showed that the proposed method detects person names and organization names quite accurately and also performs satisfactorily on low-frequency unknown words in the corpus.</Paragraph>
  </Section>
</Paper>