<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0423">
  <Title>Named Entity Recognition with a Maximum Entropy Approach</Title>
  <Section position="5" start_page="0" end_page="0" type="concl">
    <SectionTitle>
4 Experiments
</SectionTitle>
    <Paragraph position="0"> The English training and test data are part of the Reuters Corpus, Volume 1. The German training and test data are part of the European Corpus Initiative, Multilingual Corpus 1. The best results obtained on the development and test sets of the two languages are shown in Table 2.</Paragraph>
    <Paragraph position="1"> Results in Table 1 are obtained by applying ME1, without the help of name lists, to the two languages.</Paragraph>
    <Paragraph position="2"> The best results for English are obtained using ME2, which makes use of name lists compiled from the Internet and the list provided with the training set (see Section 3.4). The best results for German are obtained by adding part-of-speech tags (provided in both training and test data) to the features used by ME1.</Paragraph>
    <Paragraph position="3"> For all experiments, features that occur only once in the training data are not used, and the GIS algorithm is run for 600 iterations. Running more iterations does not significantly improve accuracy.</Paragraph>
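The feature cutoff described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name prune_rare_features, the (feature list, label) instance format, and the toy data are all hypothetical, and counting features independently of their label is an assumption made here for simplicity.

```python
from collections import Counter


def prune_rare_features(instances, min_count=2):
    """Drop features that occur fewer than min_count times in the training data.

    instances is a list of (feature_list, label) pairs; this format is a
    hypothetical choice for illustration, not taken from the paper.
    With min_count=2, features that occur only once are discarded.
    """
    # Count every feature occurrence across all training instances.
    counts = Counter(f for feats, _ in instances for f in feats)
    # Keep only the features that meet the frequency threshold.
    kept = {f for f, c in counts.items() if c >= min_count}
    # Rebuild each instance with the surviving features, preserving order.
    pruned = [([f for f in feats if f in kept], label) for feats, label in instances]
    return pruned, kept


# Toy training data (invented for this example).
train = [
    (["w=Reuters", "cap=yes"], "ORG"),
    (["w=London", "cap=yes"], "LOC"),
    (["w=said", "cap=no"], "O"),
]
pruned, kept = prune_rare_features(train)
```

In this toy data only cap=yes occurs twice, so it is the only feature that survives the cutoff; the third instance is left with no features. The pruned instances would then be passed to whatever maximum entropy trainer is in use.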
    <Paragraph position="4"> Our system usually does well on the LOC and PER classes, but does not do as well on the MISC and ORG classes. The poor performance on the MISC class agrees with the observations of Carreras et al. (2002). We felt that the MISC class is particularly difficult due to its generality (it can refer to anything from movie titles to sports events).</Paragraph>
    <Paragraph position="5"> Acknowledgements: We would like to thank Yoong Keok Lee for helping us to apply boosting and feature selection to the maximum entropy algorithm, although these were not used in the final system.</Paragraph>
  </Section>
</Paper>