<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1505"> <Title>Corrective Modeling for Non-Projective Dependency Parsing</Title> <Section position="3" start_page="0" end_page="43" type="intro"> <SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> Statistical parsing models have been shown to be successful in recovering labeled constituencies (Collins, 2003; Charniak and Johnson, 2005; Roark and Collins, 2004) and have also been shown to be adequate in recovering dependency relationships (Collins et al., 1999; Levy and Manning, 2004; Dubey and Keller, 2003). The most successful models are based on lexicalized probabilistic context-free grammars (PCFGs) induced from constituency-based treebanks. The linear-precedence constraint of these grammars restricts the types of dependency structures that can be encoded in such trees.</Paragraph>
<Paragraph position="1"> A shortcoming of the constituency-based paradigm for parsing is that it is inherently incapable of representing non-projective dependency trees (we define non-projectivity in the following section). This is particularly problematic when parsing free word-order languages, such as Czech, due to the frequency of sentences with non-projective constructions.</Paragraph>
<Paragraph position="2"> In this work, we explore a corrective model which recovers non-projective dependency structures by training a classifier to select correct dependency pairs from a set of candidates based on parses generated by a constituency-based parser. We chose this model due to the observation that the dependency errors made by the parsers are generally local. For nodes with incorrect dependency links in the parser output, the correct governor is often found within a local context of the proposed governor. By considering alternative dependencies based on local deviations from the parser output, we constrain the set of candidate governors for each node during the corrective procedure. We examine two state-of-the-art constituency-based parsers in this work: the Collins Czech parser (1999) and a version of the Charniak parser (2001) that was modified to parse Czech.</Paragraph>
<Paragraph position="3"> Alternative efforts to recover dependency structure from English are based on reconstructing the movement traces encoded in constituency trees (Collins, 2003; Levy and Manning, 2004; Johnson, 2002; Dubey and Keller, 2003). (In order to correctly capture the dependency structure, co-indexed movement traces are used in a form similar to Government and Binding theory, GPSG, etc.) In fact, the features we use in the current model are similar to those proposed by Levy and Manning (2004). However, the approach we propose discards the constituency structure prior to the modeling phase; we model corrective transformations of dependency trees.</Paragraph>
<Paragraph position="4"> [Figure 1: The tree on the left is projective. The tree on the right is non-projective.]</Paragraph>
<Paragraph position="6"> The technique proposed in this paper is similar to that of recent parser reranking approaches (Collins, 2000; Charniak and Johnson, 2005); however, while reranking approaches allow a parser to generate a likely candidate set according to a generative model, we consider a set of candidates based on local perturbations of the single most likely tree generated. The primary reason for such an approach is that we allow dependency structures which would never be hypothesized by the parser. Specifically, we allow for non-projective dependencies.</Paragraph>
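To make the candidate-generation and selection procedure concrete, here is a minimal sketch in Python. It is illustrative only: the particular neighborhood (the proposed governor, its own governor, and its other children) and all names (candidate_governors, correct, score) are our assumptions, not the paper's actual implementation or feature set.

# Sketch of a corrective procedure over a parser-proposed dependency tree.
# gov[i] is the index of the governor of word i; gov[0] = -1 marks the
# imaginary root. All names and the neighborhood choice are illustrative.

def candidate_governors(gov, node):
    """Candidate governors for `node`: the parser-proposed governor, its
    governor (if any), and its other children -- one simple notion of the
    'local context of the proposed governor'."""
    proposed = gov[node]
    cands = {proposed}
    if gov[proposed] >= 0:                       # grandparent, if not the root
        cands.add(gov[proposed])
    cands.update(j for j, g in enumerate(gov)    # siblings under the proposed governor
                 if g == proposed and j != node)
    cands.discard(node)
    return sorted(cands)

def correct(gov, score):
    """Independently re-select the highest-scoring governor for each node.
    `score(node, cand, gov)` stands in for a trained model; because each
    link is chosen independently, the result may be non-projective."""
    return [-1] + [max(candidate_governors(gov, n),
                       key=lambda c: score(n, c, gov))
                   for n in range(1, len(gov))]

# Toy usage with a dummy scorer that prefers closer governors.
parsed = [-1, 2, 0, 2]                           # parser output: w2 governs w1 and w3
print(correct(parsed, lambda n, c, g: -abs(n - c)))

Because every node's governor is chosen independently from a local candidate set, nothing forces the corrected links to be projective, which is precisely what permits the structures discussed above.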
<Paragraph position="7"> The corrective algorithm proposed in this paper shares the motivation of the transformation-based learning work (Brill, 1995). We do consider local transformations of the dependency trees; however, the technique presented here is based on a generative model that maximizes the likelihood of good dependents. We consider a finite set of local perturbations of the tree and use a fixed model to select the best tree by independently choosing optimal dependency links.</Paragraph>
<Paragraph position="8"> In the remainder of the paper we provide a definition of a dependency tree and the motivation for using such trees, as well as a description of the particular dataset that we use in our experiments, the Prague Dependency Treebank (PDT). In Section 3 we describe the techniques used to adapt constituency-based parsers to train from and generate dependency trees. Section 4 describes corrective modeling as used in this work, and Section 4.2 describes the particular features with which we have experimented. Section 5 presents the results of a set of experiments we performed on data from the PDT.</Paragraph>
<Paragraph position="9"> A dependency tree is a set of nodes Omega = {w_0, w_1, ..., w_k}, where w_0 is the imaginary root node (the imaginary root node simplifies notation), together with a set of dependency links G = {g_1, ..., g_k}, where g_i is an index into Omega representing the governor of w_i; for example, g_i = 1 indicates that the governor of w_i is w_1. The index of the nodes represents the surface order of the nodes in the sequence (i.e., w_i precedes w_j in the sentence if i < j). The dependency structures here are very similar to those described by Mel'čuk (1988); however, the nodes of the dependency trees discussed in this paper are limited to the words of the sentence and are always ordered according to the surface word-order. A tree is projective if for every three nodes w_a, w_b, and w_c, where a < b < c: if w_a is governed by w_c, then w_b is transitively governed by w_c, and if w_c is governed by w_a, then w_b is transitively governed by w_a (a node w_i is transitively governed by w_j if w_j is an ancestor of w_i in the dependency tree). Figure 1 shows examples of projective and non-projective trees. The rightmost tree, which is non-projective, contains a subtree consisting of a set of nodes that is not contiguous in the linear ordering of the nodes. Projectivity in a dependency tree is akin to the continuity constraint in a constituency tree; such a constraint is implicitly imposed by trees generated from context-free grammars (CFGs).</Paragraph>
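As a concrete companion to this definition, the sketch below (illustrative Python, not from the paper) tests projectivity directly: for every dependency link it checks that all words lying between a node and its governor are transitively governed by that governor.

def is_projective(gov):
    """gov[i] = index of the governor of word i; gov[0] = -1 marks the
    imaginary root. Returns True iff, for every link, all words between
    a node and its governor are transitively governed by that governor."""
    def dominated_by(i, j):
        # True if w_j lies on the path from w_i to the root.
        while i != -1:
            if i == j:
                return True
            i = gov[i]
        return False

    for child in range(1, len(gov)):
        g = gov[child]
        lo, hi = sorted((child, g))
        if any(not dominated_by(b, g) for b in range(lo + 1, hi)):
            return False
    return True

# Toy usage on governor arrays for a four-node sequence w0..w3.
print(is_projective([-1, 2, 0, 2]))    # True: all links span contiguous subtrees
print(is_projective([-1, 3, 0, 0]))    # False: the link w3 -> w1 crosses w2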
<Paragraph position="10"> Strict word-order languages, such as English, exhibit non-projective dependency structures in a relatively constrained set of syntactic configurations (e.g., right-node raising). Traditionally, these movements are encoded in syntactic analyses as traces. In languages with free word-order, such as Czech, constituency-based representations are overly constrained (Sgall et al., 1986). Syntactic dependency trees encode syntactic subordination relationships, allowing the structure to be non-specific about the underlying deep representation. The relationship between a node and its subordinates expresses a sense of syntactic (functional) entailment.</Paragraph>
<Paragraph position="11"> In this work we explore the dependency structures encoded in the Prague Dependency Treebank (Hajič, 1998; Böhmová et al., 2002). The PDT 1.0 analytical layer is a set of Czech syntactic dependency trees whose nodes contain the word forms, morphological features, and syntactic annotations. These trees were annotated by hand and are intended as an intermediate stage in the annotation of the Tectogrammatical Representation (TR), a deep-syntactic or syntacto-semantic theory of language (Sgall et al., 1986). All current automatic techniques for generating TR structures are based on syntactic dependency parsing.</Paragraph>
<Paragraph position="12"> When evaluating the correctness of dependency trees, we consider only the structural relationships between the words of the sentence (unlabeled dependencies). However, the model we propose contains features that are part of the dependency rather than of the nodes in isolation (e.g., agreement features). We do not propose a model for correctly labeling dependency structures in this work.</Paragraph> </Section> </Paper>