File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/01/n01-1023_intro.xml

Size: 2,240 bytes

Last Modified: 2025-10-06 14:01:13

<?xml version="1.0" standalone="yes"?>
<Paper uid="N01-1023">
  <Title>Applying Co-Training methods to Statistical Parsing</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The current crop of statistical parsers share a similar training methodology. They train on the Penn Treebank (Marcus et al., 1993), a collection of 40,000 sentences that are labeled with corrected parse trees (approximately a million word tokens). In this paper, we explore methods for statistical parsing that can be used to combine small amounts of labeled data with unlimited amounts of unlabeled data. In the experiment reported here, we use 9695 sentences of bracketed data (234467 word tokens). Such methods are attractive for the following reasons: • Bracketing sentences is an expensive process. A parser that can be trained on a small amount of labeled data will reduce this annotation cost.</Paragraph>
    <Paragraph position="1"> • Creating statistical parsers for novel domains and new languages will become easier.</Paragraph>
    <Paragraph position="2"> • Combining labeled data with unlabeled data allows exploration of unsupervised methods that can now be tested using evaluations compatible with supervised statistical parsing.</Paragraph>
    <Paragraph position="3"> In this paper we introduce a new approach that combines unlabeled data with a small amount of labeled (bracketed) data to train a statistical parser. We use a Co-Training method (Yarowsky, 1995; Blum and Mitchell, 1998; Goldman and Zhou, 2000) that has been used previously to train classifiers in applications like word-sense disambiguation (Yarowsky, 1995), document classification (Blum and Mitchell, 1998) and named-entity recognition (Collins and Singer, 1999) and apply this method to the more complex domain of statistical parsing.</Paragraph>
    <Paragraph position="4"> I would like to thank Aravind Joshi, Mitch Marcus, Mark Liberman, B. Srinivas, David Chiang and the anonymous reviewers for helpful comments on this work. This work was partially supported by NSF Grant SBR8920230, ARO Grant DAAH0404-94-G-0426, and DARPA Grant N66001-00-1-8915.</Paragraph>
  </Section>
</Paper>