<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2609">
  <Title>Learning to Identify Definitions using Syntactic Features</Title>
  <Section position="8" start_page="710" end_page="710" type="evalu">
    <SectionTitle>
7 Evaluation
</SectionTitle>
    <Paragraph position="0"> We evaluated each configuration of Section 5 and each learning method of Section 6 on the dataset which consists of 1336 definitions and 963 non-definitions sentences. Table 7 reports the accuracy and standard error estimated from this experiment.</Paragraph>
    <Paragraph position="1"> In all experiment runs, all of the classifiers in all configurations outperform our baseline (75.9%).</Paragraph>
    <Paragraph position="2"> The best accuracy of each classifier (bold) is between 11.57% to 16.31% above the baseline.</Paragraph>
    <Paragraph position="3"> The bigram only attributes (config. 2) clearly outperform the simplest setting (bag-of-word only attributes) for all classifiers. The combination of both attributes (config. 3) achieves some improvement between 0.17% to 4.41% from configuration 2. It is surprising that naive Bayes shows the best and relatively high accuracy in this base configuration (89.82%) and even outperforms all other settings.</Paragraph>
    <Paragraph position="4"> Adding syntactic properties (config. 4) or position of sentences in documents (config. 6) to the base configuration clearly gives some improvement (in 4 and 5 classifiers respectively for each configuration). But, adding root forms (config.</Paragraph>
    <Paragraph position="5"> 7) does not significantly contribute to an improvement. These results show that in general, syntactic properties can improve the performance of most classifiers. The results also support the intuition that the position of sentences in documents plays important role in identifying definition sentences.</Paragraph>
    <Paragraph position="6"> Moreover, this intuition is also supported by the result that the best performance of naive Bayes is achieved at configuration 6 (90.26%). Compared to the syntactic features, sentence positions give better accuracy in all classifiers.</Paragraph>
    <Paragraph position="7"> The above results demonstrate an interesting finding that a simple attribute set which consists of bag-of-words, bigrams, and sentence position under a fast and simple classifier (e.g. naive Bayes) could give a relatively high accuracy. One explanation that we can think of is that candidate sentences have been syntactically very well extracted with our filter. Thus, the sentences are biased by the filter from which important words and bigrams of definitions can be found in most of the sen- null tences. For example, the word and bigrams is een (is a), een (a), zijn (are), is (is), zijn de (are the), and is van (is of) are good clues to definitions and consequently havehighinformation gain. Wehave to test this result ina future work oncandidate definition sentences which are extracted by filters using various other syntactic patterns.</Paragraph>
    <Paragraph position="8"> More improvement is shown when both syntactic properties and sentence position are added together (config. 8). Allof the classifiers in this configuration obtain more error reduction compared to the base configuration. Moreover, the best accuracy of this experiment is shown by maximum entropy at this configuration (92.21%). This may be asign that our proposed syntactic properties are good indicators to identify definition sentences.</Paragraph>
    <Paragraph position="9"> Other interesting findings can be found in the addition of named entity classes to configuration 3 (config. 5), to configuration 8 (config. 9) and to configuration 10 (config. 11). In these configurations, adding NEC increases accuracies of almost all classifiers. On the other hand, adding root forms to configuration 3 (config. 7) and to configuration 8 (config. 10) does not improve accuracies. However, the best accuracies of naive Bayes (90.26%) and maximum entropy (92.21%) areachieved whennamedentity androotformsare not included as attributes.</Paragraph>
    <Paragraph position="10"> We now evaluate the classifiers. It is clear from the table that SVM1 and SVM2 settings can not achieve better accuracy compared to the naive Bayessetting, whileSVM3setting marginally out-performs naive Bayes (on 6 out of 11 configurations). This result is contrary to the superiority of SVMs in many text classification tasks. Huang et al. (2003) reported that both classifiers show similar predictive accuracy and AUC (area under the ROC (Receiver Operating Characteristics) curve) scores. This performance of naive Bayes supports the motivation behind its renaisance in machine learning (Lewis, 1998).</Paragraph>
    <Paragraph position="11"> From the three SVM settings, SVM with RBF kernel appears as the best classifier for our task in which it outperforms other SVMs settings in all configurations. This result supports the above mentioned argument thatifthebestC andg canbe selected, we do not need to consider linear SVM (e.g. the svm1 setting).</Paragraph>
    <Paragraph position="12"> Among all of the classifiers, maximum entropy shows the best accuracy. It wins at 9 out of 11 configurations in all experiments. This result confirms previous reports e.g. in Nigam et al. (1999) that maximum entropy performs better than naive Bayes in some text classification tasks.</Paragraph>
  </Section>
class="xml-element"></Paper>