<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1141"> <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recognition</Title> <Section position="4" start_page="1121" end_page="1121" type="intro"> <SectionTitle> 2 Conditional Random Fields </SectionTitle> <Paragraph position="0"> We use a Conditional Random Field (CRF) (Lafferty et al., 2001; Sha and Pereira, 2003) since it represents the state of the art in sequence modeling and has also been very effective at Named Entity Recognition (NER). It offers both the discriminative training of CMMs and the bi-directional flow of probabilistic information across the sequence that HMMs allow, thereby giving us the best of both worlds. Because of this bi-directional flow of information, CRFs guard against the myopic, locally attractive decisions that CMMs make. It is customary to use the Viterbi algorithm to find the most probable state sequence during inference. A large number of possibly redundant and correlated features can be supplied without fear of further reducing the accuracy of a high-dimensional distribution. These are well-documented benefits (Lafferty et al., 2001).</Paragraph> <Section position="1" start_page="1121" end_page="1121" type="sub_section"> <SectionTitle> 2.1 Our Baseline CRF for Named Entity Recognition </SectionTitle> <Paragraph position="0"> Our baseline CRF is a sequence model in which the label of each token depends directly only on the labels of the previous and next tokens. 
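</Paragraph> <Paragraph position="1"> Viterbi decoding for a model of this kind, in which each label interacts only with its neighbouring labels, can be sketched as follows. This is a minimal illustration under stated assumptions, not our implementation: the function name viterbi, the two labels O and PER, and all score values are hypothetical, and the scores are assumed to be log-domain values already produced by a trained model.

```python
from typing import Dict, List


def viterbi(emission: List[Dict[str, float]],
            transition: Dict[str, Dict[str, float]],
            labels: List[str]) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain model.

    emission[t][y]   : log-score of label y at position t (assumed given)
    transition[a][b] : log-score of label a being followed by label b
    """
    if not emission:
        return []
    # best[t][y] is the best log-score of any path ending in label y at t.
    best = [{y: emission[0][y] for y in labels}]
    back = []  # back[t - 1][y] is the best predecessor of label y at t
    for t in range(1, len(emission)):
        scores = {}
        pointers = {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transition[p][y])
            pointers[y] = prev
            scores[y] = best[-1][prev] + transition[prev][y] + emission[t][y]
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical two-token example with made-up log-scores.
labels = ["O", "PER"]
emission = [{"O": -1.0, "PER": -1.2}, {"O": -3.0, "PER": -0.5}]
transition = {"O": {"O": -0.5, "PER": -3.0},
              "PER": {"O": -2.0, "PER": -0.2}}
print(viterbi(emission, transition, labels))  # ['PER', 'PER']
```

In this toy example, a purely local choice would label the first token O, since its emission score is higher there, whereas decoding over the whole sequence prefers PER for both tokens. </Paragraph> <Paragraph position="2">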
We use features that have been shown to be effective in NER, namely the current, previous and next words; character n-grams of the current word; the part-of-speech tags of the current word and surrounding words; the shallow parse chunk of the current word; the shape of the current word; the surrounding word shape sequence; and the presence of a word in a left window of size 5 and in a right window of size 5 around the current word. This gives us a competitive baseline CRF using local information alone, whose performance is close to that of the best published local CRF models for Named Entity Recognition.</Paragraph> </Section> </Section></Paper>