<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1042">
  <Title>Learning Morphological Disambiguation Rules for Turkish</Title>
  <Section position="6" start_page="332" end_page="332" type="evalu">
    <SectionTitle>
4.5 Results
</SectionTitle>
    <Paragraph position="0"> The nal evaluation of the model was performed on a test data set of 958 instances. The possible parses for each instance were generated by the morphological analyzer and the correct one was picked manually. 40% of the instances were ambiguous, which on the average had 3.9 parses. The disambiguation accuracy of our model was 95.82%. The 95% con dence interval for the accuracy is [0.9457, 0.9708].</Paragraph>
    <Paragraph position="1"> An analysis of the mistakes in the test data show that at least some of them are due to incorrect tags in our training data. The training data was semi-automatically generated and thus contained some errors. Based on hand evaluation of the differences between the training data tags and the GPA generated tags, we estimate the accuracy of the training data to be below 95%. We ran two further experiments to see if we could improve on the initial results.</Paragraph>
    <Paragraph position="2"> In our rst experiment we used our original model to re-tag the training data. The re-tagged training data was used to construct a new model. The resulting accuracy on the test set increased to 96.03%, not a statistically signi cant improvement.</Paragraph>
    <Paragraph position="3"> In our second experiment we used only unambiguous instances for training. Decision list training requires negative examples, so we selected random unambiguous instances for positive and negative examples for each feature. The accuracy of the resulting model on the test set was 82.57%. The problem with selecting unambiguous instances is that certain common disambiguation decisions are never represented during training. More careful selection of negative examples and a sophisticated bootstrapping mechanism may still make this approach workable.</Paragraph>
    <Paragraph position="4"> Finally, we decided to see if our decision lists could be used for tagging rather than disambiguation, i.e. given a word in a context decide on the full tag without the help of a morphological analyzer.</Paragraph>
    <Paragraph position="5"> Even though the number of possible tags is unlimited, the most frequent 1000 tags cover about 99% of the instances. A single decision list trained with the full tags was able to achieve 91.23% accuracy using 10000 rules. This is a promising result and will be explored further in future work.</Paragraph>
  </Section>
class="xml-element"></Paper>