<?xml version="1.0" standalone="yes"?> <Paper uid="P02-1064"> <Title>An Empirical Study of Active Learning with Support Vector Machines for Japanese Word Segmentation</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> We explore how well active learning with Support Vector Machines works for a non-trivial task in natural language processing, using Japanese word segmentation as a test case. In particular, we discuss how the size of the pool affects the learning curve. We find that, in the early stage of training, a larger pool requires more labeled examples to achieve a given level of accuracy than a smaller pool does. In addition, we propose a novel technique that makes effective use of a large number of unlabeled examples by adding them gradually to the pool. The experimental results show that our technique requires fewer labeled examples than the technique used in previous research. To achieve 97.0% accuracy, the proposed technique needs 59.3% of the labeled examples required by the previous technique and only 17.4% of those required by random sampling.</Paragraph> </Section> </Paper>