<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1208"> <Title>Question Classification using HDAG Kernel</Title> <Section position="9" start_page="0" end_page="0" type="concl"> <SectionTitle> 5 Discussion </SectionTitle> <Paragraph position="0"> First, using named-entity and semantic information improved performance over using words alone, which agrees with the result reported in (Li and Roth, 2002). This result indicates that high-performance question classification requires not only word features but also many other types of information in the question.</Paragraph> <Paragraph position="1"> Second, our proposed method showed higher performance than any method using BOW. This result indicates that the structural information in the question, which includes several levels of chunks and their relations, provides powerful features for classifying the target intention of a given question. We assume that such structural information provides shallow semantic information about the text.</Paragraph> <Paragraph position="2"> Therefore, it is natural that using the structural information in the manner of our proposed method improves performance in identifying the intention of the question.</Paragraph> <Paragraph position="3"> Table 4 shows the results of Qacc at each depth of the question taxonomy. The results at depth n represent the total performance measured by Qacc when only the upper n levels of question types in the question taxonomy are considered. At deeper levels, all results show worse performance. There are several reasons for this. One problem is unbalanced training data: question types at lower depths have fewer positively labeled samples (questions), as shown in Table 2. Moreover, misclassification compounds during the classification process. 
Consequently, if an upper-level classifier misclassifies a question, we can no longer obtain the correct answer, even if a lower-level classifier is able to classify it correctly. Thus, a machine learning approach (not only SVM) is not well suited to deeply hierarchical class labels. We should arrange a question taxonomy that is suitable for machine learning in order to improve the total performance of question classification.</Paragraph> <Paragraph position="4"> SVM performs better than SNoW, even when both use the same BOW features. One advantage of SNoW is that it learns and classifies much faster than SVM. We should thus select the approach that best fits the purpose, depending on whether speed or accuracy is needed.</Paragraph> </Section> </Paper>