<?xml version="1.0" standalone="yes"?> <Paper uid="M98-1017"> <Title>DESCRIPTION OF THE NTU SYSTEM USED FOR MET2</Title> <Section position="10" start_page="12" end_page="12" type="concl"> <SectionTitle> CONCLUDING REMARKS </SectionTitle> <Paragraph position="0"> This paper proposes a pipeline model for extracting named entities from Chinese documents. The model draws on different types of information from different levels of text, including character conditions, statistical information, titles, punctuation marks, organization and location keywords, speech-act and locative verbs, a cache, and an n-gram model. The context considered ranges from very short to very long. The system achieves a recall rate of 83% and a precision rate of 77%. The major errors result from error propagation, keyword sets, character sets, rule coverage, and so on. How to integrate the different modules (including segmentation and recognition) in an interleaved way, and how to learn grammar rules, keyword sets and character sets automatically, remain to be studied further.</Paragraph> </Section> </Paper>