File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/92/c92-4199_abstr.xml
Size: 1,123 bytes
Last Modified: 2025-10-06 13:47:30
<?xml version="1.0" standalone="yes"?> <Paper uid="C92-4199"> <Title>Recognizing Unregistered Names for Mandarin Word Identification</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> Word identification (WI) has been an important and active issue in Chinese natural language processing.</Paragraph> <Paragraph position="1"> In this paper, a new mechanism, based on the concept of sublanguage, is proposed for identifying unknown words, especially personal names, in Chinese newspapers. The proposed mechanism includes title-driven name recognition, adaptive dynamic word formation, and identification of 2-character and 3-character Chinese names without titles. We show experimental results for two corpora and compare them with the results of NTHU's statistic-based system, the only system we know of that has attacked the same problem.</Paragraph> <Paragraph position="2"> The experimental results show significant improvements over WI systems without the name identification capability.</Paragraph> </Section> </Paper>