<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1046">
  <Title>Parsing with generative models of predicate-argument structure</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> State-of-the-art statistical parsers, both for Penn Treebank-style phrase-structure grammars (Collins, 1999; Charniak, 2000) and for Categorial Grammar (Hockenmaier and Steedman, 2002b), include models of bilexical dependencies defined in terms of local trees. However, this paper demonstrates that such models would be inadequate for languages with freer word order. We use the example of Dutch ditransitives, but our argument applies equally to other languages such as Czech (see Collins et al. (1999)). We argue that this problem can be avoided if the bilexical dependencies in the predicate-argument structure are captured instead, and we propose a generative model for these dependencies.</Paragraph>
    <Paragraph position="1"> The focus of this paper is on models for Combinatory Categorial Grammar (CCG; Steedman, 2000). Due to CCG's transparent syntax-semantics interface, the parser has direct and immediate access to the predicate-argument structure, which includes not only local dependencies but also long-range dependencies arising through coordination, extraction and control. These dependencies can be captured by our model in a sound manner, and our experimental results for English demonstrate that their inclusion improves parsing performance. However, since the predicate-argument structure itself depends only to a degree on the grammar formalism, it is likely that parsers based on other grammar formalisms could equally benefit from such a model.</Paragraph>
    <Paragraph position="2"> The conditional model used by the CCG parser of Clark et al. (2002) also captures dependencies in the predicate-argument structure; however, their model is inconsistent.</Paragraph>
    <Paragraph position="3"> First, we review the dependency model proposed by Hockenmaier and Steedman (2002b). We then use the example of Dutch ditransitives to demonstrate its inadequacy for languages with a freer word order. This leads us to define a new generative model of CCG derivations, which captures word-word dependencies in the underlying predicate-argument structure. We show how this model can capture long-range dependencies and deal with the multiple dependencies that arise when long-range dependencies are present. In our current implementation, the probabilities of derivations are computed during parsing, and we discuss the difficulties of integrating the model into a probabilistic chart parsing regime. Since no CCG treebank is available for other languages, experimental results are presented for English, using CCGbank (Hockenmaier and Steedman, 2002a), a translation of the Penn Treebank to CCG. These results demonstrate that this model benefits greatly from the inclusion of long-range dependencies.</Paragraph>
  </Section>
</Paper>