<?xml version="1.0" standalone="yes"?>
<Paper uid="J98-4002">
  <Title>Selective Sampling for Example-based Word Sense Disambiguation</Title>
  <Section position="6" start_page="589" end_page="590" type="intro">
    <SectionTitle>
4. Evaluation
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="589" end_page="590" type="sub_section">
      <SectionTitle>
4.1 Comparative Experimentation
</SectionTitle>
      <Paragraph position="0"> In order to investigate the effectiveness of our example sampling method, we conducted an experiment in which we compared four sampling methods: our method, uncertainty sampling, committee-based sampling, and random sampling. We elaborate on uncertainty sampling and committee-based sampling in Section 4.2. We compared these sampling methods by evaluating the relation between the number of training examples sampled and the performance of the system. We conducted sixfold cross-validation and carried out sampling on the training set. For the training/test data set, we used the same corpus as in the experiment described in Section 2.3. Each sampling method uses examples from IPAL to initialize the system, with an average of about 3.7 example case fillers for each case. For each sampling method, the system uses the Bunruigoihyo thesaurus for the similarity computation. In Table 2 (in Section 2.3), the &quot;accuracy&quot; column for &quot;BGH&quot; gives the accuracy of the system with the entire set of training data contained in the database. Each of the four sampling methods achieved this figure at the conclusion of training.</Paragraph>
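The sixfold cross-validation setup mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name k_fold_splits, the shuffling step, and the fixed seed are assumptions introduced here, and the text does not say how the corpus was partitioned.

```python
import random


def k_fold_splits(examples, k=6, seed=0):
    """Yield (train, test) pairs for k-fold cross-validation.

    Assumption: examples are shuffled once with a fixed seed and dealt
    round-robin into k folds; the source does not specify this detail.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)
    # Fold i receives items i, i + k, i + 2k, ...
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set is every fold except the held-out one.
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```

In each of the six rounds, sampling would be carried out on the train portion only, and accuracy would be measured on the held-out test portion.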
      <Paragraph position="1"> We evaluated the performance of each system by its accuracy, that is, the ratio of the number of correct outputs to the total number of inputs. For the purpose of this experiment, we set the sample size to 1 for each iteration, A = 0.5 for Equation (10), and k = 1 for Equation (13). In a preliminary experiment, increasing the value of k either did not improve the performance over that for k = 1, or lowered the overall performance. Figure 13 shows the relation between the number of training data sampled and the accuracy of the system. In Figure 13, zero on the x-axis represents the system using only the examples provided by IPAL. Figure 13 shows that, compared with random sampling and committee-based sampling, our sampling method reduced the number of training data required to achieve any given accuracy. For example, to achieve an accuracy of 80%, the number of training data required for our method was roughly one-third of that for random sampling. Although the accuracy of our method was surpassed by that of uncertainty sampling for larger sizes of training data, this minimal difference for larger data sizes is overshadowed by the considerable performance gain attained by our method for smaller data sizes.</Paragraph>
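As a rough sketch of the evaluation measure described above, accuracy and the "number of training data required to achieve a given accuracy" can be computed as follows. The function names accuracy and examples_needed, and the representation of a learning curve as (number sampled, accuracy) pairs, are hypothetical choices made for illustration; they do not come from the paper.

```python
def accuracy(predicted, gold):
    """Ratio of the number of correct outputs to the total number of inputs."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("no inputs to evaluate")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


def examples_needed(curve, target):
    """Smallest number of sampled examples at which accuracy reaches target.

    curve is a list of (num_sampled, accuracy) pairs (a hypothetical
    representation of a learning curve). Returns None if the target
    accuracy is never reached.
    """
    for num_sampled, acc in sorted(curve):
        if acc >= target:
            return num_sampled
    return None
```

Comparing examples_needed for two sampling methods at the same target accuracy gives the kind of ratio quoted in the text, such as the one-third figure at 80% accuracy.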
      <Paragraph position="2"> Since IPAL has, in a sense, been manually selectively sampled in an attempt to model the maximum verb sense coverage, the performance of each method is biased by the initial contents of the database. To counter this effect, we also conducted an experiment involving the construction of the database from scratch, without using examples from IPAL. During the initial phase, the system randomly selected one example for each verb sense from the training set, and a human expert provided the correct interpretation to initialize the system. Figure 14 shows the performance of the various methods, from which the same general tendency as seen in Figure 13 is observable.</Paragraph>
      <Paragraph position="3"> However, in this case, our method was generally superior to the other methods. From these comparative experiments, we conclude that our example sampling method is able to reduce the number of training data, i.e., the overhead for both supervision and searching, without degrading the system performance.</Paragraph>
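The from-scratch initialization used for the second experiment (one randomly selected example per verb sense) might be sketched as below. This is an illustrative sketch under stated assumptions: the name seed_one_per_sense is hypothetical, the training set is assumed to be a list of (example, sense) pairs, and the stored sense label stands in for the interpretation that the human expert provided.

```python
import random
from collections import defaultdict


def seed_one_per_sense(training_set, seed=0):
    """Pick one random example per verb sense to initialize the database.

    training_set is assumed to be a list of (example, sense) pairs, where
    the sense stands in for the expert-provided interpretation.
    Returns (database, pool): the seed examples and the remaining
    examples still available for sampling.
    """
    rng = random.Random(seed)
    indices_by_sense = defaultdict(list)
    for index, (_, sense) in enumerate(training_set):
        indices_by_sense[sense].append(index)
    # One randomly chosen example index for each verb sense.
    chosen = {rng.choice(indices) for indices in indices_by_sense.values()}
    database = [training_set[i] for i in sorted(chosen)]
    pool = [item for i, item in enumerate(training_set) if i not in chosen]
    return database, pool
```

Each sampling method would then draw further examples from pool, so that all methods start from a database that has not been shaped by the manually selected IPAL examples.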
    </Section>
  </Section>
</Paper>