<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1031">
  <Title>A Flexible POS Tagger Using an Automatically Acquired Language Model*</Title>
  <Section position="8" start_page="241" end_page="243" type="evalu">
    <SectionTitle>
6 Experiments and results
</SectionTitle>
    <Paragraph position="0"> The whole WSJ corpus contains 241 different classes of ambiguity. The 40 most representative classes t-&amp;quot; were selected for acquiring the corresponding decision trees. That produced 40 trees totaling up to 2995 leaf nodes, and covering 83.95% of the ambiguous words. Given that each tree branch produces as many constraints as tags its leaf involves, these trees were translated into 8473 context constraints.</Paragraph>
    <Paragraph position="1"> We also extracted the 1404 bigram restrictions and the 17387 trigram restrictions appearing in the training corpus.</Paragraph>
    <Paragraph position="2"> Finally, the model-tuning set was tagged using a bigram model. The most common errors commited by the bigram tagger were selected for manually writing the sample linguistic part of the model, consisting of a set of 20 hand-written constraints.</Paragraph>
    <Paragraph position="3"> From now on C will stands for the set of acquired context constraints. B for the bigram model, T for th.e trigram model, and H for the hand-written constraints. Any combination of these letters will indicate the joining of the corresponding models (BT, BC, BTC, etc.).</Paragraph>
    <Paragraph position="4"> In addition, ML indicates a baseline model conraining no constraints (this will result in a most-likely tagger) and HMM stands for a hidden Markov model bigram tagger (Elworthy, 1992).</Paragraph>
    <Paragraph position="5"> We tested the tagger on the 50 Kw test set using all the combinations of the language models. Results are reported below.</Paragraph>
    <Paragraph position="6"> The effect of the acquired rules on the number of errors for some of the most common cases is shown in table 1. XX/Y'Y stands for an error consisting of a word tagged ~t%_&amp;quot; when it should have been XX. Table 2 contains the meaning of all the involved tags. Figures in table 1 show that in all cases the learned constraints led to an improvement.</Paragraph>
    <Paragraph position="7"> It is remarkable that when using C alone, the number of errors is lower than with any bigram 12In terms of number of examples.</Paragraph>
    <Paragraph position="8">  and/or trigram model, that is, the acquired model performs better than the others estimated from the same training corpus.</Paragraph>
    <Paragraph position="9"> We also find that the cooperation of a bigram or trigram model with the acquired one, produces even better results. This is not true in the cooperation of bigrams and trigrams with acquired constraints (BTC), in this case the synergy is not enough to get a better joint result. This might be due to the fact that the noise in B and T adds up and overwhelms the context constraints.</Paragraph>
    <Paragraph position="10"> The results obtained by the baseline taggers can be found in table 3 and the results obtained using all the learned constraints together with the bi/trigram models in table 4.</Paragraph>
    <Paragraph position="11">  On the one hand. the results in tables 3 and 4 show that our tagger performs slightly worse than a HMM tagger in the same conditions 13, that is, when using only bigram information.</Paragraph>
    <Paragraph position="12"> 13Hand analysis of the errors commited by the algorithm suggest that the worse results may be due to noise in the training and test corpora, i.e., relaxation algorithm seems to be more noise-sensitive than a Markov model. Further research is required on this point.  tagger using every combination On the other hand, those results also show that since our tagger is more flexible than a HMM, it can easily accept more complex information to improve its results up to 97.39% without modifying the algorithm. null  of constraint kinds and hand written constraints Table 5 shows the results adding the hand written constraints. The hand written set is very small and only covers a few common error cases. That produces poor results when using them alone (H). but they are good enough to raise the results given by the automatically acquired models up to 97.-15%.</Paragraph>
    <Paragraph position="13"> Although the improvement obtained might seem small, it must be taken into .account that we are moving very close to the best achievable result with these techniques.</Paragraph>
    <Paragraph position="14"> First, some ambiguities can only be solved with semantic information, such as the Noun-Adjective ambiguity for word principal in the phrase lhe principal office. It could be an adjective, meaning the  main office, or a noun, meaning the school head ofrice, null Second, the WSJ corpus contains noise (mistagged words) that affects both the training and the test sets. The noise in the training set produces noisy -and so less precise- models. In the test set, it produces a wrong estimation of accuracy, since correct answers are computed as wrong and vice-versa.</Paragraph>
    <Paragraph position="15"> For instance, verb participle forms are sometimes tagged as such (VBIV) and also as adjectives (J J) in other sentences with no structural differences:</Paragraph>
    <Paragraph position="17"> Another structure not coherently tagged are noun chains when the nouns are ambiguous and can be</Paragraph>
    <Paragraph position="19"> All this makes that the performance cannot reach 100%, and that an accurate analysis of the noise in WS3 corpus should be performed to estimate the actual upper bound that a tagger can achieve on these data. This issue will be addressed in further work.</Paragraph>
  </Section>
</Paper>