<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0624">
<Title>Sparse Bayesian Classification of Predicate Arguments</Title>
<Section position="3" start_page="0" end_page="0" type="intro">
<SectionTitle>
1 Introduction
</SectionTitle>
<Paragraph position="0"> Generalized linear classifiers, in particular Support Vector Machines (SVMs), have recently been applied successfully to the task of semantic role identification and classification (Pradhan et al., 2005), inter alia.</Paragraph>
<Paragraph position="1"> Although the SVM approach has a number of attractive properties (above all, excellent software packages exist), it also has drawbacks. First, the resulting classifier is slow, since classifying an example requires evaluating the kernel function against every support vector. This is especially the case in the presence of noise, since each misclassified training example has to be stored as a bound support vector, and the number of support vectors typically grows with the number of training examples. Although optimization methods exist that speed up the computations, classification speed remains the main drawback of the SVM approach.</Paragraph>
<Paragraph position="2"> Second, the parameters (typically the regularization constant C and the kernel width γ) have to be tuned, which requires training repeatedly with cross-validation to find the best combination of parameter values.</Paragraph>
<Paragraph position="3"> Finally, the output of the SVM decision function is not probabilistic. There are methods that map the decision function onto a probability output using the sigmoid function, but they are considered somewhat ad hoc (see (Tipping, 2001) for a discussion). In this paper, we apply a recent learning paradigm, namely Sparse Bayesian learning, or more specifically the Relevance Vector learning method, to the problem of role classification. Its principal advantages compared to the SVM approach are that it produces sparser models (and hence faster classifiers), that its output is probabilistic, and that no parameters need to be tuned by cross-validation. Its significant drawback is that the training procedure relies heavily on dense linear algebra, and is thus difficult to scale up to large training sets and may be prone to numerical difficulties.</Paragraph>
<Paragraph position="4"> For a description of the task and the data, see (Carreras and Màrquez, 2005).</Paragraph>
</Section>
</Paper>
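
To make the SVM drawbacks discussed in the introduction concrete, the following is a minimal sketch in notation of our own choosing (the symbols f, x_i, alpha_i, y_i, K, b, A, and B are not taken from the paper): the kernel SVM decision function, whose evaluation cost grows with the number of support vectors, and the Platt-style sigmoid mapping that is the usual way of turning its non-probabilistic score into a probability.

% SVM decision function: one kernel evaluation per support vector,
% so classification cost grows with the size of the support vector set SV.
f(\mathbf{x}) = \sum_{i \in \mathrm{SV}} \alpha_i\, y_i\, K(\mathbf{x}_i, \mathbf{x}) + b

% Sigmoid mapping of the SVM score onto a probability; the parameters
% A and B are fitted on held-out data, which is the step often
% regarded as somewhat ad hoc.
P(y = 1 \mid \mathbf{x}) \approx \frac{1}{1 + \exp\bigl(A\, f(\mathbf{x}) + B\bigr)}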
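
For contrast, here is a minimal sketch of the Relevance Vector Machine model, following Tipping (2001); the notation (w_i, alpha_i, sigma) is again our own and is not drawn from the paper itself.

% RVM classification model: a linear combination of kernel basis
% functions passed through the logistic sigmoid, so the output is
% a probability directly, with no post-hoc calibration step.
P(y = 1 \mid \mathbf{x}) = \sigma\Bigl( w_0 + \sum_{i=1}^{N} w_i\, K(\mathbf{x}, \mathbf{x}_i) \Bigr),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}

% Each weight has a zero-mean Gaussian prior with its own precision
% hyperparameter alpha_i, re-estimated during training; most precisions
% diverge, pruning the corresponding weights and leaving a sparse set
% of "relevance vectors".
w_i \sim \mathcal{N}\bigl(0, \alpha_i^{-1}\bigr)

% The re-estimation step repeatedly builds and inverts a dense matrix
% over the currently active basis functions, which is the dense linear
% algebra that limits scaling to large training sets.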