<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1504">
  <Title>Parsing with Soft and Hard Constraints on Dependency Length*</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Many modern parsers identify the head word of each constituent they find. This makes it possible to identify the word-to-word dependencies implicit in a parse.1 (Some parsers, known as dependency parsers, even return these dependencies as their primary output.) Why bother to identify these dependencies? The typical reason is to model the fact that some word pairs are more likely than others to engage in a dependency relationship.2 In this paper, we propose a different reason to identify dependencies in candidate parses: to evaluate not the dependency's word pair but its length (i.e., the string distance between the two words). Dependency lengths differ from typical parsing features in that they cannot be determined from tree-local information. Though lengths are not usually considered, we will see that bilexical dynamic-programming parsing algorithms can easily consider them as they build the parse.</Paragraph>
    <Paragraph position="1"> * This work was supported by NSF ITR grant IIS-0313193 to the first author and a fellowship from the Fannie and John Hertz Foundation to the second author. The views expressed are not necessarily endorsed by the sponsors. The authors thank Mark Johnson, Eugene Charniak, Charles Schafer, Keith Hall, and John Hale for helpful discussion and Elliott Drábek and Markus Dreyer for insights on (respectively) Chinese and German parsing. They also thank an anonymous reviewer for suggesting the German experiments.</Paragraph>
    <Paragraph position="2"> 1 In a phrase-structure parse, if phrase X headed by word token x is a subconstituent of phrase Y headed by word token y ≠ x, then x is said to depend on y. In a more powerful compositional formalism like LTAG or CCG, dependencies can be extracted from the derivation tree.</Paragraph>
    <Paragraph position="3"> 2 It has recently been questioned whether these &quot;bilexical&quot; features actually contribute much to parsing performance (Klein and Manning, 2003; Bikel, 2004), at least when one has only a million words of training data.</Paragraph>
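A dependency's length is its string distance between the two words. As a minimal sketch (the head-array representation and function name here are our own illustration, not the paper's), lengths can be read directly off a dependency parse:

```python
def dependency_lengths(heads):
    """Given heads[i] = index of word i's head (None for the root),
    return the string distance |i - head| for each non-root word."""
    return [abs(i - h) for i, h in enumerate(heads) if h is not None]

# "the big dog barked": "the" and "big" depend on "dog" (index 2);
# "dog" depends on "barked" (index 3), the root.
print(dependency_lengths([2, 2, 3, None]))  # [2, 1, 1]
```

Because each length depends on the absolute positions of two words, it is exactly the kind of non-tree-local quantity the paragraph describes.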
    <Paragraph position="4"> Soft constraints. Like any other feature of trees, dependency lengths can be explicitly used as features in a probability model that chooses among trees. Such a model will tend to disfavor long dependencies (at least of some kinds), as these are empirically rare. In the first part of the paper, we show that such features improve a simple baseline dependency parser.</Paragraph>
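One way to picture such a soft constraint is an edge score that adds a weighted length feature to whatever bilexical score the baseline parser assigns. The logarithmic penalty form and the weight below are illustrative assumptions, not the paper's trained model:

```python
import math

def edge_score(child, head, bilexical_score, length_weight=-0.5):
    """Illustrative score for a candidate dependency edge: the baseline
    bilexical score plus a weighted length feature. A negative weight
    disfavors long dependencies, which are empirically rare."""
    length = abs(child - head)
    return bilexical_score + length_weight * math.log(1 + length)

# Under equal bilexical scores, the shorter dependency scores higher:
# edge_score(0, 1, 0.0) > edge_score(0, 5, 0.0)
```

Since the feature is a function of the two endpoints alone, a dynamic-programming parser can add it to each edge's score as the edge is built.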
    <Paragraph position="5"> Hard constraints. If the bias against long dependencies is strengthened into a hard constraint that absolutely prohibits long dependencies, then the parser turns into a partial parser with only finite-state power. In the second part of the paper, we show how to perform chart parsing in asymptotic linear time with a low grammar constant. Such a partial parser does less work than a full parser in practice, and in many cases recovers a more precise set of dependencies (with little loss in recall).</Paragraph>
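The paper's linear-time chart parser is more involved than this; the sketch below (our own, hypothetical) only shows the counting argument behind the hard constraint: bounding dependency length by k cuts the candidate edges from O(n²) to O(nk), linear in the sentence length for fixed k.

```python
def allowed_edges(n, k):
    """Enumerate candidate (child, head) index pairs for an n-word
    sentence under a hard bound: no dependency may be longer than k.
    Only O(n * k) pairs survive, versus O(n**2) without the bound."""
    return [(c, h) for c in range(n) for h in range(n)
            if c != h and abs(c - h) <= k]

edges = allowed_edges(10, 2)  # 34 candidate edges instead of 90
```

A parser restricted to these edges cannot attach distant words to each other, which is why it becomes a partial parser with only finite-state power.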
  </Section>
</Paper>