<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0427"> <Title>Memory-based one-step named-entity recognition: Effects of seed list features, classifier stacking, and unannotated data</Title> <Section position="6" start_page="0" end_page="0" type="evalu"> <SectionTitle> 5 Results </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.1 Initial classifier: Iterative deepening </SectionTitle> <Paragraph position="0"> Iterative deepening produced estimates of the optimal parameter settings for our initial systems for the two languages, displayed in the first and third rows of Table 2.</Paragraph> <Paragraph position="1"> With these settings we achieved an overall F-rate of 78.83 for English and 61.16 for German. Table 1 lists the full evaluation results.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.2 First and second extension: seed list features and stacking </SectionTitle> <Paragraph position="0"> We also performed iterative deepening in the experiment with the seed list information. This altered the best setting found by the iterative deepening process (the second and fourth rows of Table 2). The results on the English development set are slightly better than those of the initial system, as can be seen in Table 3.</Paragraph> <Paragraph position="1"> The classifier with seed list information performs worse on the English test set than the one without seed lists. The reverse effect is seen on the German data: on the development set, using the seed list information gave slightly lower performance, but on the test set it has a slightly positive effect.
Our second extension, stacking, improves all overall F-scores for both languages compared to the seed-list extended systems, as shown in Table 4.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.3 Third extension: Selecting instances from unannotated data </SectionTitle> <Paragraph position="0"> The three extensions (using seed list information, performing second-stage stacking, and adding information from unannotated data) are combined in the final experiment. This experiment achieves the highest result on the English development set and on both German test sets, as listed in Table 5. The positive effect of adding selected unannotated data on the German test sets is minimal, but we added only a very small amount of unlabeled material. The performance on the English test set is not better than that of the initial classifier.</Paragraph> </Section> </Section> </Paper>