<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1019">
  <Title>Partial Training for a Lexicalized-Grammar Parser</Title>
  <Section position="2" start_page="0" end_page="144" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> State-of-the-art statistical parsers require large amounts of hand-annotated training data, and are typically based on the Penn Treebank, the largest treebank available for English. Even robust parsers using linguistically sophisticated formalisms, such as TAG (Chiang, 2000), CCG (Clark and Curran, 2004b; Hockenmaier, 2003), HPSG (Miyao et al., 2004) and LFG (Riezler et al., 2002; Cahill et al., 2004), often use training data derived from the Penn Treebank. The labour-intensive nature of the tree-bank development process, which can take many years, creates a significant barrier for the development of parsers for new domains and languages.</Paragraph>
    <Paragraph position="1"> Previous work has attempted parser adaptation without relying on treebank data from the new domain (Steedman et al., 2003; Lease and Charniak, 2005). In this paper we propose the use of annotated data in the new domain, but only partially annotated data, which reduces the annotation effort required (Hwa, 1999). We develop a parsing model whichcanbetrainedusingpartialdata,byexploiting the properties of lexicalized grammar formalisms.</Paragraph>
    <Paragraph position="2"> The formalism we use is Combinatory Categorial Grammar (Steedman, 2000), together with a parsing model described in Clark and Curran (2004b) which we adapt for use with partial data.</Paragraph>
    <Paragraph position="3"> Parsing with Combinatory Categorial Grammar (CCG) takes place in two stages: first, CCG lexical categories are assigned to the words in the sentence, and then the categories are combined by the parser (Clark and Curran, 2004a). The lexical categories can be thought of as detailed part of speech tags and typically express subcategorization information. We exploit the fact that CCG lexical categories contain a lot of syntactic information, and can therefore be used for training a full parser, even though attachment information is not explicitly represented in a category sequence. Our partial training regime only requires sentences to be annotated with lexical categories, rather than full parse trees; therefore the data can be produced much more quickly for a new domain or language (Clark et al., 2004).</Paragraph>
    <Paragraph position="4"> The partial training method uses the log-linear dependency model described in Clark and Curran (2004b), which uses sets of predicate-argument de- null pendencies, ratherthanderivations, fortraining. Our novel idea is that, since there is so much information in the lexical category sequence, most of the correct dependencies can be easily inferred from the categories alone. More specifically, for a given sentence and lexical category sequence, we train on those predicate-argument dependencies which occur in k% of the derivations licenced by the lexical categories. By setting the k parameter high, we can produce a set of high precision dependencies for training. A similar idea is proposed by Carroll and Briscoe (2002) for producing high precision data for lexical acquisition.</Paragraph>
    <Paragraph position="5"> Using this procedure we are able to produce dependency data with over 99% precision and, remarkably, up to 86% recall, when compared against the complete gold-standard dependency data. The high recall figure results from the significant amount of syntactic information in the lexical categories, which reduces the ambiguity in the possible dependency structures. Since the recall is not 100%, we require a log-linear training method which works with partial data. Riezler et al. (2002) describe a partial training method for a log-linear LFG parsing model in which the &amp;quot;correct&amp;quot; LFG derivations for a sentence are those consistent with the less detailed gold standard derivation from the Penn Treebank.</Paragraph>
    <Paragraph position="6"> We use a similar method here by treating a CCG derivation as correct if it is consistent with the highprecisionpartialdependencystructure. Section3explains what we mean by consistency in this context. Surprisingly, the accuracy of the parser trained on partial data approaches that of the parser trained on full data: our best partial-data model is only 1.3% worse in terms of dependency F-score than the full-data model, despite the fact that the partial data does not contain any explicit attachment information.</Paragraph>
  </Section>
class="xml-element"></Paper>