<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1153">
<Title>Learning Greek Verb Complements: Addressing the Class Imbalance</Title>
<Section position="9" start_page="2" end_page="2" type="evalu">
<SectionTitle> 6 Experimental results </SectionTitle>
<Paragraph position="0"> Unlike previous approaches, which evaluate their methodology using the accuracy metric, we evaluated classification using precision and recall for every class. In the confusion matrix, a and d are the correctly identified adjuncts and complements respectively, b are the adjuncts misclassified as complements, and c are the complements misclassified as adjuncts.</Paragraph>
<Paragraph position="1">
                     classified as adjunct   classified as complement
   true adjunct               a                         b
   true complement            c                         d
Precision and recall for each class follow directly:
   Precision(adjunct) = a / (a + c)        Recall(adjunct) = a / (a + b)
   Precision(complement) = d / (b + d)     Recall(complement) = d / (c + d)
</Paragraph>
<Paragraph position="2"> The f-measure for each class combines the previous two metrics into one:</Paragraph>
<Paragraph position="3"> F = (2 x Precision x Recall) / (Precision + Recall).</Paragraph>
<Paragraph position="4"> Table 1 shows the results for each classification algorithm and various window sizes using the initial dataset, before any attempt is made to reduce its size. The drop in performance on the minority class compared to the majority class is obvious. The scores corresponding to the best f-measure for the complement class are indicated in bold.</Paragraph>
<Paragraph position="5"> Because it explicitly stores and takes into account every training example, IB1 shows a drop in performance as the window size increases, due to data sparseness. The performance of C4.5 remains relatively stable regardless of the size of the instance vector. Naive Bayes leads to a significant number of adjunct instances being labeled as complements.</Paragraph>
<Paragraph position="6"> This is attributed to the fact that the Naive Bayes learner does not take conditional dependencies among features into account. For example, given that an instance is a complement, if the fp is an adjective in the nominative case, the verb is in reality very likely to be copular. This dependence is not captured by the Naive Bayes learner.</Paragraph>
<Paragraph position="7"> [Table 1: classification results for each algorithm over the context window sizes [0], [-1,0], [-2,0], [-2,+1] and [-3,+3], using the initial dataset.]</Paragraph>
<Paragraph position="8"> Tables 2 and 3 show the classification results after balancing the dataset using the Euclidean distance and VDM respectively. The increase in f-measure after reducing the dataset is notable and depends on the size of the fp context window.</Paragraph>
<Paragraph position="9"> When only the fp is taken into account, the highest increase, over 8% in complement-class f-measure, is achieved with the Euclidean distance.</Paragraph>
<Paragraph position="10"> When the context surrounding the fp is also considered, the positive impact of balancing the dataset is even stronger. As the fp window size increases, Naive Bayes performs better, reaching an f-measure of over 60% with [-3,+3] (as opposed to 53.4% before balancing). Recall with C4.5 increases by 14% in context [-3,+3] after balancing. Instance-based learning, as mentioned earlier, is not helped by extensive context information and reaches its highest score when only one phrase preceding the fp is considered. The increase in complement-class precision with IB1 exceeds 12% with VDM; this is the experiment that achieved the highest f-measure (73.7%). For larger context windows with IB1, the removal of noisy and redundant examples seems to compensate for the noise introduced by the increased number of features in the vector.</Paragraph>
<Paragraph position="11"> The increase in recall reaches 22%.</Paragraph>
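For concreteness, the per-class scores reported in Tables 1-3 can be computed directly from the four confusion counts defined at the start of this section. A minimal Python sketch follows; it is illustrative only, and the function and variable names are not the paper's:

   def prf(tp, fp, fn):
       """Precision, recall and their harmonic mean (f-measure)."""
       p = tp / (tp + fp) if tp + fp else 0.0
       r = tp / (tp + fn) if tp + fn else 0.0
       f = 2 * p * r / (p + r) if p + r else 0.0
       return p, r, f

   def per_class_metrics(a, b, c, d):
       """a: correct adjuncts, d: correct complements,
       b: adjuncts labeled complements, c: complements labeled adjuncts."""
       return {"adjunct": prf(tp=a, fp=c, fn=b),
               "complement": prf(tp=d, fp=b, fn=c)}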
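The comparison of VDM with the Euclidean distance (taken up again in the remarks below) rests on how VDM treats nominal features: two values are close when they co-occur with the classes in similar proportions. The following is a hedged sketch of a simplified Value Difference Metric, assuming exponent q = 1 and symbolic feature values; it is an illustration, not the paper's implementation:

   from collections import Counter, defaultdict

   def vdm_tables(instances, labels):
       """For every feature position, count how often each nominal value
       co-occurs with each class, so P(class | value) can be estimated."""
       tables = []
       for i in range(len(instances[0])):
           counts = defaultdict(Counter)   # value -> Counter({class: n})
           for inst, label in zip(instances, labels):
               counts[inst[i]][label] += 1
           tables.append(counts)
       return tables

   def vdm(x, y, tables, classes, q=1):
       """Simplified VDM: per feature, sum the differences between the
       class distributions of the two values; identical values add 0,
       values predictive of different classes add more."""
       dist = 0.0
       for table, v1, v2 in zip(tables, x, y):
           c1, c2 = table[v1], table[v2]
           n1 = sum(c1.values()) or 1      # guard against unseen values
           n2 = sum(c2.values()) or 1
           dist += sum(abs(c1[k] / n1 - c2[k] / n2) ** q for k in classes)
       return dist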
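One-sided Sampling (Kubat and Matwin, 1997) drives the balancing results in Tables 2 and 3: it keeps every complement (minority) example and removes redundant and borderline adjunct (majority) examples. A toy sketch under that standard formulation is given below; it is not the authors' code, and `distance` stands for either a Euclidean metric or the vdm() sketch above:

   def one_sided_sampling(majority, minority, distance):
       """Thin the majority class: a one-pass consistent-subset step
       removes redundant examples, then Tomek-link removal drops
       borderline or noisy ones."""
       def nearest(x, pool):
           return min(pool, key=lambda e: distance(x, e[0]))

       # Step 1: all minority examples plus one majority seed; a further
       # majority example is kept only if the 1-NN rule over the current
       # subset misclassifies it (condensed nearest neighbour).
       subset = [(x, "min") for x in minority] + [(majority[0], "maj")]
       for x in majority[1:]:
           if nearest(x, subset)[1] != "maj":
               subset.append((x, "maj"))

       # Step 2: drop majority examples in Tomek links, i.e. mutual
       # nearest neighbours of opposite classes, which blur the border.
       kept = []
       for e in subset:
           if e[1] == "min":
               kept.append(e)
               continue
           nn = nearest(e[0], [f for f in subset if f is not e])
           back = nearest(nn[0], [f for f in subset if f is not nn])
           if not (nn[1] == "min" and back is e):
               kept.append(e)
       return kept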
<Paragraph position="12"> As a general remark, instance-based learning performs best when the context surrounding the candidate complement is very restricted (at most one phrase preceding the fp), while Bayesian learning improves as the window grows.</Paragraph>
<Paragraph position="13"> In most of the experiments VDM leads to better results than the Euclidean distance, because it is more appropriate for nominal features, especially when the instance vector is small. When larger windows are considered, the two metrics have the same effect; occasional minor differences (~0.1%) in the results are attributed to the 10-fold experimental setup.</Paragraph>
<Paragraph position="14"> [Tables 2 and 3: classification results after One-sided Sampling with the Euclidean distance and VDM respectively, over the context window sizes [0], [-1,0], [-2,0], [-2,+1] and [-3,+3].]</Paragraph>
<Paragraph position="15"> Apart from the positive impact of One-sided Sampling on predicting positive examples, the tables show its positive (or at least non-negative) impact on predicting negative instances: non-complement accuracy either increases or remains the same after balancing.</Paragraph>
<Paragraph position="16"> Concerning the resolution of the ambiguities discussed in section 2, three classified examples of the verb asko (to exercise) with context environment [-1,fp] follow. The first class label is the true class and the second the predicted one. Example (a) has been classified correctly both with and without One-sided Sampling. Examples (b) and (c) are the same instance classified without (b) and with (c) One-sided Sampling; example (b) is erroneously tagged as an adjunct due to the class imbalance. The phrase preceding the fp helps resolve the ambiguity in (a) and (c): a punctuation mark before the fp (indicated by the triple NP,F,-) usually separates the fp syntactically from the verb, and the fp is then unlikely to be a complement.</Paragraph>
<Paragraph position="17">
a. asko, E, P, NC, A, F, F, PP,-,se, NP,F,-, A A
b. asko, E, P, NC, A, F, F, PP,-,se, NP,N,a, C A
c. asko, E, P, NC, A, F, F, PP,-,se, NP,N,a, C C
</Paragraph>
</Section>
</Paper>