<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0432">
  <Title>Named Entity Recognition Using a Character-based Probabilistic Approach</Title>
  <Section position="6" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
5 Results
</SectionTitle>
    <Paragraph position="0"> Table 2 shows how the system performs in terms of recognition. There is a large discrepancy between recognition performance for English and German. For German, it appears that there is insufficient morphological information in a word and its immediate context to reliably discriminate between NEs and common nouns. Precision is markedly higher than recall across all tests. The most common error in English was the misclassification of a single-term entity as a non-entity, while multi-word entities were more successfully identified.</Paragraph>
    <Paragraph position="1"> Table 3 shows the overall performance difference between words present in the tagged training corpus and those that only occurred in the test set. For previously seen words, both recognition and classification perform well, aided by the variable depth of MD-tries. The progressive back-off model of tries is quite effective in classifying new tokens, achieving up to 85% accuracy when classifying unseen entities. It is interesting to note that, given a successful recognition phase, German NEs are more successfully classified than English NEs.</Paragraph>
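The progressive back-off described above can be sketched roughly as follows: an unseen token is classified by walking a character trie as far as its characters allow and backing off to the longest matching prefix that carries label evidence. This is an illustrative sketch only; the names (CharTrie, insert, classify) and the simple frequency-based decision are assumptions, not the paper's actual MD-trie implementation.

```python
class CharTrie:
    """Minimal character trie with per-prefix label counts (illustrative)."""

    def __init__(self):
        self.children = {}
        self.label_counts = {}  # NE label -> frequency at this prefix

    def insert(self, word, label):
        # Record the label at every prefix of the word.
        node = self
        for ch in word:
            node = node.children.setdefault(ch, CharTrie())
            node.label_counts[label] = node.label_counts.get(label, 0) + 1

    def classify(self, word):
        # Walk as deep as the trie allows, remembering the deepest node
        # with label evidence; stopping early and using that node is the
        # "back-off" to a shorter prefix for unseen tokens.
        node, best = self, None
        for ch in word:
            if ch not in node.children:
                break
            node = node.children[ch]
            if node.label_counts:
                best = node
        if best is None:
            return None
        return max(best.label_counts, key=best.label_counts.get)

trie = CharTrie()
trie.insert("London", "LOC")
trie.insert("Londonderry", "LOC")
trie.insert("Lombard", "PER")
print(trie.classify("Londinium"))  # unseen; backs off to prefix "Lond" -> "LOC"
```

A real MD-trie additionally varies its depth per branch during training; the sketch keeps full depth and only varies the match length at classification time.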
    <Paragraph position="2"> The effects of heuristically restoring case information can be seen in Table 4. The contribution of recapitalisation is limited by the proportion of entities in caseless positions. Both the word-based method and the trie-based method produced improvements. The higher accuracy of the trie-based approach gives better overall performance.</Paragraph>
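A word-based recapitalisation heuristic of the kind compared above can be sketched as follows: a word in a caseless position (e.g. sentence-initial) is restored to whichever case it most often took in informative mid-sentence positions in the training text. All names here are hypothetical; this is an assumed baseline, not the authors' exact method.

```python
from collections import Counter

def build_case_stats(sentences):
    """Count mid-sentence casing, skipping the caseless initial position."""
    stats = Counter()
    for sent in sentences:
        for word in sent[1:]:
            stats[word.lower(), word[0].isupper()] += 1
    return stats

def restore_case(word, stats):
    """Restore a caseless-position word to its dominant mid-sentence case."""
    upper = stats[word.lower(), True]
    lower = stats[word.lower(), False]
    return word.capitalize() if upper > lower else word.lower()

train = [["The", "president", "met", "Smith"],
         ["Reporters", "asked", "Smith", "about", "the", "talks"]]
stats = build_case_stats(train)
print(restore_case("Smith", stats))  # capitalised mid-sentence -> "Smith"
print(restore_case("The", stats))    # lowercase mid-sentence -> "the"
```

The trie-based method mentioned in the paragraph would instead decide case from character-prefix statistics, which generalises to words unseen in training; that difference is what gives it the higher accuracy reported.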
    <Paragraph position="3"> The final results for each language and dataset are given in Table 5. Both English datasets have the same performance profile: results for the PER and LOC categories were markedly better than the MISC and ORG categories. Since seen and unseen performance remained quite stable, the lower results for the second test set can be explained by a higher percentage of previously unseen words. While MISC is traditionally the worst-performing category, the lowest results were for ORG. This pattern of performance was different to that for German, in which MISC was consistently identified less well than the other categories.</Paragraph>
  </Section>
</Paper>