<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1221">
<Title>Biomedical Named Entity Recognition Using Conditional Random Fields and Rich Feature Sets</Title>
<Section position="4" start_page="105" end_page="106" type="evalu">
<SectionTitle>
4 Results and Discussion
</SectionTitle>
<Paragraph position="0"> Two experiments were completed in the time allotted: one CRF model using only the orthographic features described in section 3.1, and a second system using those features plus all the semantic lexicons from section 3.2. Detailed results are presented in table 2. The orthographic model achieves an overall F1 measure of 69.8 on the evaluation set (88.9 on the training set), converging after 230 training iterations and approximately 18 hours of computation. The complete model, however, reached an overall F1 of only 69.5 on the evaluation set (86.7 on the training set), converging after 152 iterations in approximately 9 hours.</Paragraph>
<Paragraph position="1"> The deleterious effect of the semantic lexicons is surprising and puzzling.7 However, even though the semantic lexicons slightly decrease overall performance, it is worth noting that adding lexicons actually improves both recall and precision for the RNA and CELL-LINE entities. These happen to be the two lowest-frequency class labels in the data, together comprising less than 10% of the mentions in either the training or evaluation set. Error analysis shows that several of the orthographic model's false negatives for these entities are of the form &quot;messenger accumulation&quot; (RNA) or &quot;nonadherent culture&quot; (CELL-LINE). It may be that keyword lexicons helped the model identify these low-frequency terms more accurately.
7Note, however, that these figures are from a single training/evaluation split without cross-validation, so the differences are likely not statistically significant.</Paragraph>
<Paragraph position="2"> Also of note is that, in both experiments, the CRF framework achieves roughly comparable performance across all entity classes. In a previous attempt to use a Hidden Markov Model to simultaneously recognize multiple biomedical entities (Collier et al., 2000), HMM performance for a particular entity seemed more or less proportional to its frequency in the data. The CRF's advantage here may stem from the fact that HMMs are generative models trained to learn the joint probability P(o,l) -- where data for l may be sparse -- and then apply Bayes' rule to predict the best label, whereas CRFs are discriminative models trained to maximize the conditional probability P(l|o) directly.</Paragraph>
</Section>
</Paper>
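The generative/discriminative contrast in the final paragraph can be written out explicitly. The following are the standard textbook formulations (not reproduced from this paper): an HMM models the joint probability of an observation sequence o = o_1 ... o_T and a label sequence l = l_1 ... l_T, while a linear-chain CRF models the conditional distribution over labels directly.

```latex
% HMM: generative model of the joint probability, factored by the
% first-order Markov and output-independence assumptions.
P(o, l) = \prod_{t=1}^{T} P(l_t \mid l_{t-1})\, P(o_t \mid l_t)

% Decoding applies Bayes' rule; since P(o) is constant for a given o,
% maximizing the conditional reduces to maximizing the joint.
\hat{l} = \arg\max_{l} P(l \mid o)
        = \arg\max_{l} \frac{P(o, l)}{P(o)}
        = \arg\max_{l} P(o, l)

% Linear-chain CRF: discriminative model of the conditional probability,
% with feature functions f_k and learned weights \lambda_k.
P(l \mid o) = \frac{1}{Z(o)}
  \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k\, f_k(l_{t-1}, l_t, o, t) \Big)

% Partition function: normalizes over all candidate label sequences l'.
Z(o) = \sum_{l'} \exp\!\Big( \sum_{t=1}^{T} \sum_{k}
  \lambda_k\, f_k(l'_{t-1}, l'_t, o, t) \Big)
```

Because CRF training maximizes the conditional log-likelihood, no modeling capacity is spent on P(o), and the model need not estimate emission distributions from sparse per-label data -- consistent with the paper's observation that CRF performance is less tied to label frequency than the HMM's.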