<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0140">
  <Title>Chinese Named Entity Recognition with a Multi-Phase Model</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Named entity recognition (NER) is a fundamental component of many NLP applications, such as information extraction, text summarization, and machine translation. In recent years, much attention has been focused on recognizing Chinese named entities. Chinese NER is challenging: in addition to the difficulties it shares with its English counterpart, it poses two further ones. (1) In a Chinese document, names carry no &quot;boundary tokens&quot; such as the capitalized initial letters of a person name in an English document. (2) There are no spaces between words in Chinese text, so the text must be segmented before NER is performed. In this paper, we report a Chinese named entity recognition system using a multi-phase model that includes a basic segmentation phase and three named entity recognition phases.</Paragraph>
    <Paragraph position="1"> In our system, the basic segmentation components and the named entity recognition component are both based on conditional random fields (CRFs) (Lafferty et al., 2001). Finally, we apply a rule-based method to recognize some simple and short location names and organization names in the text. We describe each of these phases in more detail below.</Paragraph>
  </Section>
</Paper>