<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2151">
  <Title>Handling Sparse Data by Successive Abstraction</Title>
  <Section position="8" start_page="898" end_page="899" type="evalu">
    <SectionTitle>
6 Experiments
</SectionTitle>
    <Paragraph position="0"> A standard statistical trigram tagger has been implemented that uses linear successive abstraction for smoothing the trigram and bigram probabilities, as described in Section 3.1, and that handles unknown words using a reversed suffix tree, as described in Section 3.2, again using linear successive abstraction to improve the probability estimates. This tagger was tested on the Susanne Corpus, (Sampson 1995), using a reduced tag set of 62 tags. The size of the training corpus A was almost 130,000 words. There were three separate test corpora B, C and D consisting of approximately 10,000 words each.</Paragraph>
    <Paragraph position="1">  Tile performance of the tagger was compared with that of an tlMM-based trigram tagger that uses linear interpolation for N-gram smoothing, but where the back-off weights do not depend on the eonditionings. An optimal weight, setting was determined for each test corpus individually, and used in the experiments. Incidentally, this setting varied considerably from corpus to corpus. Thus, this represented the best possible setting of back-off weights obtainable by linear interpolation, and in particular by linear deleted interpolation, when these are not allowed to depend on the context.</Paragraph>
    <Paragraph position="2"> In contrast, the successive abstraction scheme determined the back-off weights automatically from the training corpus alone, and the same weight setting was nsed for all test corpora, yielding results that were at least on par with those obtained using linear interpolation with a globally optimal setting of contcxt-independent back-off weights determined a posteriori. Both taggers handled unknown words by inspecting the suffixes, but the HMM-based tagger did not smooth the probability distributions.</Paragraph>
    <Paragraph position="3"> The experimental results are shown in Figure 1.</Paragraph>
    <Paragraph position="4"> Note that the absolute performance of the trigram tagger is around 96 % accuracy in two cases and distinctly above 95 % accuracy in all cases, which is clearly state-of-the-art results. Since each test corpus consisted of about 10,000 words, and the error rates are between 4 and 5 %, the 5 percent significance level for differences in error rate is between 0.39 and 0.43 % depending on the error rate, and the 10 percent significance level is between 0.32 and 0,36 %.</Paragraph>
    <Paragraph position="5">  We see that the trigram tagger is better than the bigram tagger in all three cases and significantly better at significance level 10 percent, but not at 5 percent, in case C. So at this significance level, we can conclude that smoothed tri-gram statistics improve on bigram statistics alone. The trigram tagger performed better than the HMM-based one in all three cases, but not significantly better at any significance level below 10 percent. This indicates that the successive abstraction scheme yields back-off weights that are at least as good as the best ones obtainable through linear deleted interpolation with context-independent back-off weights.</Paragraph>
  </Section>
</Paper>