<?xml version="1.0" standalone="yes"?>
<Paper uid="P99-1023">
<Title>A Second-Order Hidden Markov Model for Part-of-Speech Tagging</Title>
<Section position="5" start_page="178" end_page="181" type="concl">
<SectionTitle>4 Experiment and Conclusions</SectionTitle>
<Paragraph position="0"> The new tagging model is tested in several different ways. The basic experimental technique is a 10-fold cross validation: the corpus in question is randomly split into ten sections, with nine of the sections combined to train the tagger and the tenth used for testing. The results of the ten possible training/testing combinations are merged to give an overall accuracy measure. The tagger was tested on two corpora: the Brown corpus (from the Treebank II CD-ROM (Marcus et al., 1993)) and the Wall Street Journal corpus (from the same source). Comparing results for taggers can be difficult, especially across different researchers. Care has been taken in this paper that, when comparing two systems, the comparisons are from experiments that were as similar as possible and that differences are highlighted in the comparison.</Paragraph>
<Paragraph position="1"> First, we compare the results on each corpus of four different versions of our HMM tagger: a standard (bigram) HMM tagger, an HMM using second-order lexical probabilities, an HMM using second-order contextual probabilities (a standard trigram tagger), and a full second-order HMM tagger. The results from both corpora for each tagger are given in Table 1. As might be expected, the full second-order HMM had the highest accuracy levels. The model using only second-order contextual information (a standard trigram model) was second best, the model using only second-order lexical information was third, and the standard bigram HMM had the lowest accuracies. The full second-order HMM reduced the number of errors on known words by around 16% over bigram taggers (raising the accuracy by about 0.6-0.7%), and by around 6% over conventional trigram taggers (an accuracy increase of about 0.2%). Similar results were seen in the overall accuracies. Unknown word accuracy rates were increased by around 2-3% over bigrams.</Paragraph>
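<Paragraph> For concreteness, the four variants differ only in the conditioning contexts of their two probability distributions. The following summary is a sketch in standard HMM notation, matching the descriptions above rather than reproducing the paper's own equations: </Paragraph>
<Paragraph>
\begin{align*}
\text{bigram HMM:} &\quad P(t_i \mid t_{i-1}) \cdot P(w_i \mid t_i) \\
\text{second-order lexical:} &\quad P(t_i \mid t_{i-1}) \cdot P(w_i \mid t_i, t_{i-1}) \\
\text{trigram contextual:} &\quad P(t_i \mid t_{i-1}, t_{i-2}) \cdot P(w_i \mid t_i) \\
\text{full second-order:} &\quad P(t_i \mid t_{i-1}, t_{i-2}) \cdot P(w_i \mid t_i, t_{i-1})
\end{align*}
</Paragraph>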
<Paragraph position="2"> The full second-order HMM tagger is also compared to other researchers' taggers in Table 2. It is important to note that both SNOW, a linear separator model (Roth and Zelenko, 1998), and the voting constraint tagger (Tür and Oflazer, 1998) used training data that contained full lexical information (i.e., no unknown words), as well as training and testing data that did not cover the entire WSJ corpus. This use of a full lexicon may have increased their accuracy beyond what it would have been if the models were tested with unknown words. The standard trigram tagger data is from (Weischedel et al., 1993). The MBT (Daelemans et al., 1996) did not include numbers in the lexicon, which accounts for the inflated accuracy on unknown words. Table 2 compares the accuracies of the taggers on known words, on unknown words, and overall. The table also contains two additional pieces of information. The first indicates whether the corresponding tagger was tested using a closed lexicon (one in which all words appearing in the testing data are known to the tagger) or an open lexicon (not all words are known to the system). The second indicates whether a hold-out method (such as cross-validation) was used, and whether the tagger was tested on the entire WSJ corpus or a reduced corpus.</Paragraph>
<Paragraph position="5"> Two cross-validation tests with the full second-order HMM were run: the first with an open lexicon (created from the training data), and the second with the entire WSJ lexicon used for each test set. These two tests allow more direct comparisons between our system and the others. As shown in the table, the full second-order HMM has improved overall accuracies on the WSJ corpus to state-of-the-art levels for training and testing with cross-validation: 96.9% is the greatest accuracy reported on the full WSJ for an experiment using an open lexicon. Finally, using a closed lexicon, the full second-order HMM achieved an accuracy of 98.05%, the highest reported for the WSJ corpus for this type of experiment.</Paragraph>
<Paragraph position="7"> The accuracy of our system on unknown words is 84.9%. This accuracy was achieved by creating separate classifiers for capitalized, hyphenated, and numeric digit words: tests on the Wall Street Journal corpus with the full second-order HMM show that the accuracy rate on unknown words without separating these types of words is only 80.2%, which is below even the performance of our bigram tagger that separates the classifiers. (Mikheev (1997) also separates suffix probabilities into different estimates, but provides no data illustrating the implied accuracy increase.) Unfortunately, unknown word accuracy is still below that of some of the other systems. This may be due in part to experimental differences. It should also be noted that some of these other systems use hand-crafted rules for unknown words, whereas our system uses only statistical data; adding such rules to our system could result in comparable performance. Improving our model on unknown words is a major focus of future research.</Paragraph>
<Paragraph position="8"> In conclusion, a new statistical model, the full second-order HMM, has been shown to improve part-of-speech tagging accuracies over current models. This model makes use of second-order approximations for a hidden Markov model and improves the state of the art for taggers with no increase in asymptotic running time over traditional trigram taggers based on the hidden Markov model (see the illustrative sketch below). A new smoothing method is also explained, which allows the use of second-order statistics while avoiding sparse data problems.</Paragraph>
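<Paragraph> To make the running-time claim concrete: a full second-order HMM can be decoded with the same dynamic program as a conventional trigram tagger by treating tag pairs as Viterbi states, which keeps the cost at O(T N^3) for T words and N tags. The sketch below is an illustrative reconstruction under that standard encoding, not code from the paper; the probability tables trans and lex are assumed inputs, and a real tagger would work in log space with smoothed estimates. </Paragraph>
<Paragraph>
def viterbi_second_order(words, tags, trans, lex, start="START"):
    """Viterbi decoding for a full second-order HMM over tag-pair states.

    trans[(t2, t1, t)] approximates P(t | t_{i-2}, t_{i-1})  (contextual)
    lex[(t1, t, w)]    approximates P(w | t_{i-1}, t_i)      (lexical)
    Unseen events fall back to a tiny constant here; a real tagger
    would substitute properly smoothed estimates instead.
    """
    def tr(t2, t1, t):
        return trans.get((t2, t1, t), 1e-12)

    def lx(t1, t, w):
        return lex.get((t1, t, w), 1e-12)

    # The state after reading word i is the pair (t_{i-1}, t_i),
    # padded with the start symbol at the beginning of the sentence.
    delta = {(start, t): tr(start, start, t) * lx(start, t, words[0])
             for t in tags}
    back = []

    for w in words[1:]:
        # Group current states by their newer tag so each new state
        # (t1, t) maximizes over the older tag t2: O(N^3) per word,
        # the same order as a conventional trigram tagger.
        preds = {}
        for (t2, t1) in delta:
            preds.setdefault(t1, []).append(t2)
        new_delta, pointers = {}, {}
        for t1, cands in preds.items():
            for t in tags:
                t2 = max(cands, key=lambda c: delta[(c, t1)] * tr(c, t1, t))
                new_delta[(t1, t)] = (delta[(t2, t1)] * tr(t2, t1, t)
                                      * lx(t1, t, w))
                pointers[(t1, t)] = t2
        back.append(pointers)
        delta = new_delta

    # Walk the back-pointers from the best final pair, then drop the
    # start-symbol padding from the front of the recovered sequence.
    t1, t = max(delta, key=delta.get)
    seq = [t1, t]
    for pointers in reversed(back):
        t1, t = pointers[(t1, t)], t1
        seq.insert(0, t1)
    return seq[1:]
</Paragraph>
</Section>
</Paper>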