<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1082">
  <Title>Tagging with Hidden Markov Models Using Ambiguous Tags</Title>
  <Section position="5" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
3.1.3 Results
</SectionTitle>
    <Paragraph position="0"> The results of the successive models are plotted in figure 1 and summarized in table 1, which also shows the results on the test corpus.</Paragraph>
    <Paragraph position="1"> For each iteration i, the recall and ambiguity rates of models Mi and Bi on the development corpus were computed. The results show, as expected, that both recall and ambiguity rate increase as more ambiguous tags are added to the tagset. This is true for both models Mi and Bi. The figure also shows that the recall of Bi, for a given i, is generally a bit lower than that of Mi, while its ambiguity is higher. Figure 2 shows that, for the same recall, Bi introduces more ambiguous tags than Mi.</Paragraph>
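The two measures discussed above can be sketched as follows. This is a minimal illustration, assuming that recall is the fraction of tokens whose gold tag belongs to the (possibly ambiguous) tag chosen by the tagger, and that the ambiguity rate is the average number of simple tags per token. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, Tuple


def recall_and_ambiguity(
    predicted: List[FrozenSet[str]], gold: List[str]
) -> Tuple[float, float]:
    """Return (recall, ambiguity rate) for one tagged corpus.

    Each predicted item is the set of simple tags covered by the tag the
    tagger chose: one element for a simple tag, several for an ambiguous tag.
    """
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and aligned")
    # A token counts as recalled when its gold tag is among the proposed tags.
    hits = sum(1 for tags, g in zip(predicted, gold) if g in tags)
    # Ambiguity rate: average number of simple tags proposed per token.
    total_tags = sum(len(tags) for tags in predicted)
    return hits / len(gold), total_tags / len(gold)


# Toy example (invented data): four tokens, one tagged with an ambiguous tag.
gold_tags = ["DT", "NN", "VBD", "IN"]
output = [
    frozenset({"DT"}),
    frozenset({"NN"}),
    frozenset({"VBD", "VBN"}),
    frozenset({"RB"}),
]
recall, ambiguity = recall_and_ambiguity(output, gold_tags)
print(recall, ambiguity)
```

Under these assumed definitions, adding an ambiguous tag can only raise the ambiguity rate, and it raises recall only when the added alternatives contain the gold tag.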
    <Paragraph position="2"> The list of the first 20 ambiguous tags created during the process is given below:  few figures here. For low values of n, the n best solutions have better recall for a given value of the ambiguity rate. For instance, the 4-best tagger output yields a recall of 0.9767 for an ambiguity rate of 1.12, while, for the same ambiguity rate, the iterative method obtains a recall of 0.9604. However, the recall of 0.982 that we attained at the end of the iterative ambiguous tag learning procedure, corresponding to an ambiguity rate of 1.23, was also reached by keeping the 7 best solutions of the tagger, with an ambiguity rate of 1.20 (only slightly better than ours).</Paragraph>
    <Paragraph position="3">  The original idea of our method is to correct errors made by M0 through the introduction of ambiguous tags. Ideally, we would like models Mi with i &gt; 0 to introduce an ambiguous tag only where M0 made a mistake. Unfortunately, this is not always the case. We have classified the uses of ambiguous tags into four situations, according to their influence on both recall and ambiguity rate, as indicated in table 2, where G stands for the gold standard.</Paragraph>
    <Paragraph position="4"> In situations 1 and 2 model M0 made a mistake. In situation 1, the mistake was corrected by the introduction of the ambiguous tag while in situation 2 it was not. In situations 3 and 4, model M0 did not make a mistake. In situation 3 the introduction of the ambiguous tag did not create a mistake while it did in situation 4.</Paragraph>
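The four situations can be expressed as a small decision rule. The sketch below assumes that each occurrence of an ambiguous tag is described by the gold tag G, the simple tag chosen by M0, and the set of simple tags covered by the ambiguous tag. The function names and the toy occurrences are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

Occurrence = Tuple[str, str, FrozenSet[str]]  # (gold, M0 tag, ambiguous tag)


def situation(gold: str, m0_tag: str, ambiguous_tag: FrozenSet[str]) -> int:
    """Map one occurrence of an ambiguous tag to situation 1, 2, 3 or 4."""
    m0_correct = m0_tag == gold
    covered = gold in ambiguous_tag
    if not m0_correct:
        # M0 made a mistake: corrected (1) or not corrected (2).
        return 1 if covered else 2
    # M0 was right: still right (3) or a new mistake was created (4).
    return 3 if covered else 4


def situation_shares(occurrences: List[Occurrence]) -> Dict[int, float]:
    """Relative frequency of each situation over a non-empty list."""
    if not occurrences:
        raise ValueError("at least one occurrence is required")
    counts = Counter(situation(g, m, a) for g, m, a in occurrences)
    return {s: counts[s] / len(occurrences) for s in (1, 2, 3, 4)}


# Toy occurrences (invented data) of one ambiguous tag covering VBD and VBN.
tag = frozenset({"VBD", "VBN"})
shares = situation_shares([
    ("VBN", "VBD", tag),  # situation 1: M0 wrong, error corrected
    ("JJ", "VBD", tag),   # situation 2: M0 wrong, error not corrected
    ("VBD", "VBD", tag),  # situation 3: M0 right, only ambiguity added
    ("VBD", "VBD", tag),  # situation 3 again
])
print(shares)
```

Only situation 1 improves recall. Situations 2 and 3 add ambiguity without changing recall, and situation 4 adds ambiguity while lowering recall.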
    <Paragraph position="5">  The frequency of each situation for some of the first 20 ambiguous tags is reported in table 3. The last column of the table indicates the frequency of the ambiguous tag (number of occurrences of this tag divided by the sum of occurrences of all ambiguous tags). The figures show that ambiguous tags are not very efficient: only a moderate proportion of their occurrences (24% on average) actually corrected an error.</Paragraph>
    <Paragraph position="6"> While we are very rarely confronted with situation 4, which decreases recall and increases ambiguity (0.5% on average), in the vast majority of cases ambiguous tags simply increase the ambiguity without correcting any mistakes.</Paragraph>
    <Paragraph position="7"> Ambiguous tags behave quite differently with respect to the four situations described above.</Paragraph>
    <Paragraph position="8"> In the best case (tag 6), 46% of the occurrences corrected an error, and the tag accounts for one in ten of the ambiguous tags selected by the tagger, as opposed to tag 19, which corrected errors in 48% of the cases but is not frequently used. The worst configuration is tag 9, which, although not chosen very often, corrects an error in only 13% of its occurrences and increases the ambiguity in 85% of them.</Paragraph>
    <Paragraph position="9"> A more detailed evaluation of the basic tagging mistakes has suggested a better adapted and more subtle method of using the ambiguous tags, which may also constitute a direction for future work. While the vast majority of mistakes are due to mixing up word classes, such as the -ing forms used as adjectives, as nouns or as verbs, about one third of the mistakes concern only 25 common words such as that, out, there, on, off, etc. Using the ambiguous tags for these words alone has yielded a recall of 0.965 on the test corpus (25% fewer errors than model M0) while keeping the ambiguity rate very low (1.04). With this procedure, 35% of the ambiguous tag occurrences corrected an error made by M0 and 59% increased the ambiguity. The result can be improved by designing two sets of ambiguous tags: one to be used for this set of words, and one for the word classes most often mistaken.</Paragraph>
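The "25% fewer errors" figure is a relative error reduction, which the following sketch makes explicit. It assumes that the error rate is simply one minus recall. The recall of M0 on the test corpus is not restated in this section, so the baseline computed here is only what the two rounded figures quoted above imply, not a value reported by the authors. Both function names are hypothetical.

```python
def relative_error_reduction(baseline_recall: float, new_recall: float) -> float:
    """Share of the baseline errors removed by the new model."""
    baseline_error = 1.0 - baseline_recall
    if baseline_error == 0.0:
        raise ValueError("baseline makes no errors; reduction is undefined")
    return (baseline_error - (1.0 - new_recall)) / baseline_error


def implied_baseline_recall(new_recall: float, reduction: float) -> float:
    """Baseline recall implied by a new recall and a relative error reduction."""
    if reduction == 1.0:
        raise ValueError("a 100% reduction leaves the baseline undetermined")
    return 1.0 - (1.0 - new_recall) / (1.0 - reduction)


# Figures quoted in the text: recall 0.965, 25% fewer errors than M0.
m0_recall = implied_baseline_recall(0.965, 0.25)
print(round(m0_recall, 3))
print(round(relative_error_reduction(m0_recall, 0.965), 2))
```

Because both quoted figures are rounded, the implied baseline should be read as an approximation only.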
  </Section>
</Paper>