<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1012"> <Title>Estimating Class Priors in Domain Adaptation for Word Sense Disambiguation</Title> <Section position="7" start_page="93" end_page="95" type="evalu"> <SectionTitle> 6 Discussion </SectionTitle> <Paragraph position="0"> The experimental results show that the sense priors estimated using the calibrated probabilities of naive Bayes are effective in increasing WSD accuracy. However, a learning algorithm which already gives well calibrated posterior probabilities may be even more effective in estimating the sense priors. One such algorithm is logistic regression, which directly optimizes for approximating the posterior probabilities.</Paragraph> <Paragraph position="1"> Hence, its probability estimates are already well calibrated (Zhang and Yang, 2004; Niculescu-Mizil and Caruana, 2005).</Paragraph> <Paragraph position="2"> In the rest of this section, we first conduct experiments to estimate sense priors using the predictions of logistic regression. Then, we perform significance tests to compare the various methods.</Paragraph> <Section position="1" start_page="93" end_page="94" type="sub_section"> <SectionTitle> 6.1 Using Logistic Regression </SectionTitle> <Paragraph position="0"> We trained logistic regression classifiers and evaluated them on the 4 datasets. However, the WSD accuracies of these unadjusted logistic regression classifiers are on average about 4% lower than those of the unadjusted naive Bayes classifiers.</Paragraph> <Paragraph position="1"> One possible reason is that, being a discriminative learner, logistic regression requires more training examples for its performance to catch up to, and possibly overtake, the generative naive Bayes learner (Ng and Jordan, 2001).</Paragraph> <Paragraph position="2"> Although the accuracy of logistic regression as a basic classifier is lower than that of naive Bayes, its predictions may still be suitable for estimating sense priors. (Footnote 1: Though not shown, we also calculated the accuracies of these binary classifiers without calibration, and found them to be similar to the accuracies of the multiclass naive Bayes shown in the column L under NB in Table 1.)</Paragraph> <Paragraph position="3"> To gauge how well the sense priors are estimated, we measure the KL divergence between the true sense priors and the sense priors estimated using the predictions of (uncalibrated) multiclass naive Bayes, calibrated naive Bayes, and logistic regression. These results are shown in Table 3; the column EM_LR shows that using the predictions of logistic regression to estimate sense priors consistently gives the lowest KL divergence.</Paragraph> <Paragraph position="4"> The results of the KL divergence test motivate us to use the sense priors estimated by logistic regression to adjust the predictions of the naive Bayes classifiers.</Paragraph> <Paragraph position="5"> To elaborate, we first use the probability estimates of logistic regression to estimate the sense priors; these estimated priors and the predictions of the calibrated naive Bayes classifier are then used in Equation (4) to obtain the adjusted predictions. The resulting WSD accuracy is shown in the column EM_LR under NBcal in Table 1. Corresponding results when the predictions of the multiclass naive Bayes classifier are used in Equation (4) are given in the column EM_LR under NB. The relative improvements against using the true sense priors, based on the calibrated probabilities, are given in the column EM_LR-L in Table 2. The results show that the sense priors provided by logistic regression are in general effective in further improving the results. In the case of DSO nouns, this improvement is especially significant.</Paragraph>
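To make the preceding discussion concrete, a few illustrative sketches follow; none are from the paper itself, and all names, data, and settings in them are our own assumptions. The first shows why logistic regression's outputs can be read directly as posterior estimates: its predict_proba output is a normalized distribution over classes. The features and labels here are synthetic stand-ins for the paper's WSD feature vectors.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Synthetic stand-ins for WSD feature vectors and sense labels; the
    # paper's actual features (surrounding words, collocations, parts of
    # speech) would be encoded similarly as fixed-length vectors.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = rng.integers(0, 3, size=200)  # three hypothetical senses

    clf = LogisticRegression(max_iter=1000).fit(X, y)
    posteriors = clf.predict_proba(X)  # each row sums to 1: estimates of p(sense | x)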
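Next, a minimal sketch of the EM-style procedure for estimating target-domain sense priors from a source-trained classifier's posteriors, together with the prior-ratio adjustment that we take Equation (4) to denote. The function name, convergence settings, and array layout are illustrative assumptions, not the paper's code.

    import numpy as np

    def em_estimate_priors(posteriors, train_priors, n_iter=100, tol=1e-6):
        # posteriors:   (N, C) array of p(sense | x) from the source-trained classifier
        # train_priors: (C,) array of sense priors in the old (training) domain
        posteriors = np.asarray(posteriors, dtype=float)
        p0 = np.asarray(train_priors, dtype=float)
        q = p0.copy()
        for _ in range(n_iter):
            # E-step: re-weight each posterior by the ratio of the current prior
            # estimate to the training prior, then renormalize each row
            # (the prior-ratio adjustment we take Equation (4) to denote).
            adjusted = posteriors * (q / p0)
            adjusted = adjusted / adjusted.sum(axis=1, keepdims=True)
            # M-step: the new prior estimate is the mean adjusted posterior.
            q_new = adjusted.mean(axis=0)
            if np.allclose(q_new, q, atol=tol):
                break
            q = q_new
        return q, adjusted

Note that the priors q can be estimated from one classifier's posteriors (e.g., logistic regression) and then plugged into the same adjustment applied to another classifier's predictions, which matches how the NB-EM_LR and NBcal-EM_LR settings combine the two classifiers.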
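The KL divergence used above to compare estimated and true sense priors is straightforward to compute for discrete distributions. A small sketch with made-up three-sense distributions; the smoothing constant eps is our own guard against zero estimates.

    import numpy as np

    def kl_divergence(p_true, p_est, eps=1e-12):
        # KL(p_true || p_est) for discrete distributions over senses.
        p = np.asarray(p_true, dtype=float) + eps
        q = np.asarray(p_est, dtype=float) + eps
        return float(np.sum(p * np.log(p / q)))

    true_priors = np.array([0.7, 0.2, 0.1])  # hypothetical values
    est_priors = np.array([0.6, 0.3, 0.1])
    print(kl_divergence(true_priors, est_priors))  # lower = better prior estimate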
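Finally, Section 6.2 below compares the methods with paired t-tests over per-instance results. A minimal sketch of such a test, using hypothetical 0/1 correctness vectors rather than the paper's data: scipy.stats.ttest_rel computes the t statistic of the paired differences and the corresponding p value.

    import numpy as np
    from scipy import stats

    # 1 = instance disambiguated correctly, 0 = incorrectly, for two methods
    # evaluated on the same test instances (hypothetical values).
    method_a = np.array([1, 0, 1, 1, 0, 1, 1, 1, 1, 0])
    method_b = np.array([1, 0, 0, 1, 0, 1, 0, 1, 1, 0])

    t_stat, p_value = stats.ttest_rel(method_a, method_b)
    print(t_stat, p_value)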
</Section> <Section position="2" start_page="94" end_page="95" type="sub_section"> <SectionTitle> 6.2 Significance Test </SectionTitle> <Paragraph position="0"> Paired t-tests were conducted to see if one method is significantly better than another. The t statistic of the difference between each test instance pair is computed, giving rise to a p value. The results of significance tests for the various methods on the 4 datasets are given in Table 4, where the symbols "~", ">", and ">>" correspond to p-value > 0.05, p-value in (0.01, 0.05], and p-value ≤ 0.01, respectively.</Paragraph> <Paragraph position="1"> The methods in Table 4 are represented in the form X-Y, where X denotes the classifier whose predictions are adjusted, and Y denotes how the sense priors are estimated. As an example, NBcal-EM_LR specifies that the sense priors estimated by logistic regression are used to adjust the predictions of the calibrated naive Bayes classifier, and corresponds to the accuracies in the column EM_LR under NBcal in Table 1. Based on the significance tests, the adjusted accuracies of EM_NB and EM_LR in Table 1 are significantly better than their respective unadjusted L accuracies, indicating that estimating the sense priors of a new domain via the EM approach presented in this paper significantly improves WSD accuracy compared to just using the sense priors from the old domain.</Paragraph> <Paragraph position="2"> NB-EM_NB represents our earlier approach in (Chan and Ng, 2005b). The significance tests show that our current approach of using calibrated naive Bayes probabilities to estimate sense priors, and then adjusting the calibrated probabilities by these estimated priors, performs significantly better than NB-EM_NB (refer to row 2 of Table 4). For DSO nouns, though the results are similar, the p value is a relatively low 0.06. Using sense priors estimated by logistic regression further improves performance. For example, row 1 of Table 4 shows that adjusting the predictions of multiclass naive Bayes classifiers by sense priors estimated by logistic regression (NB-EM_LR) performs significantly better than using sense priors estimated by multiclass naive Bayes (NB-EM_NB). Finally, using sense priors estimated by logistic regression to adjust the predictions of calibrated naive Bayes (NBcal-EM_LR) in general performs significantly better than most of the other methods, achieving the best overall performance.</Paragraph> <Paragraph position="3"> In addition, we implemented the unsupervised method of McCarthy et al. (2004), which calculates a prevalence score for each sense of a word to predict the predominant sense. As in our earlier work (Chan and Ng, 2005b), we normalized the prevalence score of each sense to obtain estimated sense priors for each word, which we then used to adjust the predictions of our naive Bayes classifiers. We found that the WSD accuracies obtained with the method of McCarthy et al. (2004) are on average 1.9% lower than those of our NBcal-EM_LR method, and the difference is statistically significant.</Paragraph> </Section> </Section> </Paper>