<?xml version="1.0" standalone="yes"?>
<Paper uid="J01-2002">
  <Title>Improving Accuracy in Word Class Tagging through the Combination of Machine Learning Systems</Title>
  <Section position="9" start_page="225" end_page="226" type="concl">
    <SectionTitle>
7. Conclusion
</SectionTitle>
    <Paragraph position="0"> Our experiments have shown that, at least for the word class tagging task, combination of several different systems enables us to raise the performance ceiling that can be observed when using data-driven systems. For all tested data sets, combination provides a significant improvement over the accuracy of the best component tagger. The amount of improvement varies from 11.3% error reduction for WSJ to 24.3% for LOB.</Paragraph>
    <Paragraph position="1"> The data set that is used appears to be the primary factor in the variation, especially the data set's consistency.</Paragraph>
    <Paragraph position="2"> As for the type of combiner, all stacked systems using only the set of proposed tags as features reach about the same performance. They are clearly better than simple voting systems, at least as long as there is sufficient training data. In the absence of sufficient data, one has to fall back to less sophisticated combination strategies.</Paragraph>
    <Paragraph position="3"> Addition of word information does not lead to improved accuracy, at least with the current training set size. However, it might still be possible to get a positive effect by restricting the word information to the most frequent and ambiguous words only. Addition of context information does lead to improvements for most systems. WPDV and Maccent make the best use of the extra information, with WPDV having an edge for less consistent data (WSJ) and Maccent for material with a high error rate (Wotan).</Paragraph>
    <Paragraph position="4">  van Halteren, Zavrel, and Daelemans Combination of Machine Learning Systems Although the results reported in this paper are very positive, many directions for research remain to be explored in this area. In particular, we have high expectations for the following two directions. First, there is reason to believe that better results can be obtained by using the probability distributions generated by the component systems, rather than just their best guesses (see, for example, Ting and Witten \[1997a\]). Second, in the present paper we have used disagreement between a fixed set of component classifiers. However, there exist a number of dimensions of disagreement (inductive bias, feature set, data partitions, and target category encoding) that might fruitfully be searched to yield large ensembles of modular components that are evolved to cooperate for optimal accuracy.</Paragraph>
    <Paragraph position="5"> Another open question is whether and, if so, when, combination is a worthwile technique in actual NLP applications. After all, the natural language text at hand has to be processed by each of the base systems, and then by the combiner. Now none of these is especially bothersome at run-time (most of the computational difficulties being experienced during training), but when combining N systems, the time needed to process the text can be expected to be at least a factor N+ 1 more than when using a single system. Whether this is worth the improvement that is achieved, which is as yet expressed in percents rather than in factors, will depend very much on the amount of text that has to be processed and the use that is made of the results. There are a few clear-cut cases, such as a corpus annotation project where the CPU time for tagging is negligible in relation to the time needed for manual correction afterwards (i.e., do use combination), or information retrieval on very large text collections where the accuracy improvement does not have enough impact to justify the enormous amount of extra CPU time (i.e., do not use combination). However, most of the time, the choice between combining or not combining will have to be based on evidence from carefully designed pilot experiments, for which this paper can only hope to provide suggestions and encouragement.</Paragraph>
  </Section>
class="xml-element"></Paper>