<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2408">
  <Title>Modeling Category Structures with a Kernel Function</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 SVMs and Kernel Method
</SectionTitle>
    <Paragraph position="0"> In this section, we explain SVMs and the kernel method, which are the basis of our research. SVMs have achieved high accuracy in various tasks including text categorization (Joachims, 1998; Dumais et al., 1998).</Paragraph>
    <Paragraph position="1"> Suppose a set Dl of ordered pairs consisting of a feature vector and its label</Paragraph>
    <Paragraph position="3"> is given. Dl is called training data. I is the set of feature indices. In SVMs, a separating hyperplane (f(x) = w C/ x !b) with the largest margin (the distance between the hyperplane and its nearest vectors) is constructed.</Paragraph>
    <Paragraph position="4"> Skipping the details of SVMs' formulation, here we just show the conclusion that, using some real numbers fi/i (8i) and b/, the optimal hyperplane is expressed as follows:</Paragraph>
    <Paragraph position="6"> We should note that only dot-products of examples are used in the above expression.</Paragraph>
    <Paragraph position="7"> Since SVMs are linear classifiers, their separating ability is limited. To compensate for this limitation, the kernel method is usually combined with SVMs (Vapnik, 1998).</Paragraph>
    <Paragraph position="8"> In the kernel method, the dot-products in (2) are replaced with more general inner-products K(xi;x) (kernel functions). The polynomial kernel (xiC/xj+1)d (d 2 N+) and the RBF kernel expf!kxi ! xjk2=2 2g are often used. Using the kernel method means that feature vectors are mapped into a (higher dimensional) Hilbert space and linearly separated there. This mapping structure makes non-linear separation possible, although SVMs are basically linear classifiers.</Paragraph>
    <Paragraph position="9"> Another advantage of the kernel method is that although it deals with a high dimensional (possibly infinite) space, explicit computation of high dimensional vectors is not required. Only the general inner-products of two vectors need to be computed. This advantage leads to a relatively small computational overhead.</Paragraph>
  </Section>
class="xml-element"></Paper>