<?xml version="1.0" standalone="yes"?>
<Paper uid="W99-0613">
  <Title>Unsupervised Models for Named Entity Classification</Title>
  <Section position="8" start_page="108" end_page="109" type="evalu">
    <SectionTitle>
6 Evaluation
</SectionTitle>
    <Paragraph position="0"> 88,962 (spelling, context) pairs were extracted as training data. Of these, 1,000 were picked at random and labeled by hand to produce a test set. We chose one of four labels for each example: location, person, organization, or noise, where the noise category was used for items that fell outside the other three categories. The numbers falling into the location, person, and organization categories were 186, 289, and 402 respectively.</Paragraph>
    <Paragraph position="1"> 123 examples fell into the noise category. Of these, 38 were temporal expressions (either a day of the week or a month of the year). We excluded these from the evaluation, as they can be easily identified with a list of days and months. This left 962 examples, of which 85 were noise. Taking N_c to be the number of examples an algorithm classified correctly (where all gold standard items labeled noise were counted as being incorrect), we calculated two measures of accuracy: Accuracy:Noise = N_c / 962 and Accuracy:Clean = N_c / (962 - 85). See Tab. 2 for the accuracy of the different methods. Note that on some examples (around 2% of the test set) CoBoost abstained altogether; in these cases we labeled the test example with the baseline label, organization. Fig. 3 shows learning curves for CoBoost.</Paragraph>
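The scoring described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation script: it assumes the two accuracy measures divide the correct count by all evaluated examples and by the non-noise examples respectively, and the names `evaluate`, `NOISE`, and `BASELINE_LABEL` are hypothetical. An abstention is represented as `None` and replaced by the baseline label, organization.

```python
# Hypothetical helper; the label names and the counts used below come from the text.
NOISE = "noise"
BASELINE_LABEL = "organization"  # fallback when the classifier abstains


def evaluate(gold, predicted):
    """Return (accuracy_noise, accuracy_clean) for two parallel label lists.

    An entry of None in `predicted` means the classifier abstained.
    """
    n_correct = 0
    n_noise = 0
    for gold_label, guess in zip(gold, predicted):
        if guess is None:
            guess = BASELINE_LABEL
        if gold_label == NOISE:
            n_noise += 1  # noise items always count as incorrect
        elif gold_label == guess:
            n_correct += 1
    total = len(gold)
    # Assumed definitions: noise items kept in, then left out of, the denominator.
    return n_correct / total, n_correct / (total - n_noise)


# Test-set composition reported in the text, after removing temporal expressions.
gold = (["location"] * 186 + ["person"] * 289
        + ["organization"] * 402 + [NOISE] * 85)
# A classifier that abstains everywhere reduces to the organization baseline.
acc_noise, acc_clean = evaluate(gold, [None] * len(gold))
print(round(acc_noise, 3), round(acc_clean, 3))  # prints 0.418 0.458
```

Under these assumptions, always answering organization scores 402 correct out of 962 with noise included and 402 out of 877 without it.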
    <Paragraph position="2"> The figure gives the accuracy on the test set, the coverage (the proportion of examples on which both classifiers give a label rather than abstaining), and the proportion of these examples on which the two classifiers agree. With each iteration more examples are assigned labels by both classifiers, while a high level of agreement (&gt; 94%) is maintained between them. The test accuracy more or less asymptotes.</Paragraph>
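The coverage and agreement quantities defined above can be computed as in the sketch below. It is illustrative only: the function name `coverage_and_agreement` and the use of `None` to mark an abstention are assumptions, and the toy labels are invented for the example rather than taken from the experiments.

```python
def coverage_and_agreement(labels_a, labels_b):
    """Return (coverage, agreement) for two classifiers' parallel outputs.

    coverage:  fraction of examples on which both classifiers give a label
    agreement: fraction of those covered examples on which the labels match
    An entry of None means that classifier abstained on the example.
    """
    covered = [(a, b) for a, b in zip(labels_a, labels_b)
               if a is not None and b is not None]
    coverage = len(covered) / len(labels_a)
    if not covered:
        return coverage, 0.0  # agreement is undefined; report 0.0 by convention
    agreement = sum(a == b for a, b in covered) / len(covered)
    return coverage, agreement


# Toy data: the first classifier abstains on the second example.
labels_a = ["person", None, "location", "organization"]
labels_b = ["person", "person", "organization", "organization"]
coverage, agreement = coverage_and_agreement(labels_a, labels_b)
print(coverage, round(agreement, 3))  # prints 0.75 0.667
```

Agreement is measured only over the covered examples, so it can stay high while coverage is still low in early iterations.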
  </Section>
</Paper>