<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2303">
  <Title>Robust Parsing of the Proposition Bank</Title>
  <Section position="2" start_page="0" end_page="11" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Recent successes in statistical syntactic parsing based on supervised learning techniques trained on a large corpus of syntactic trees (Collins, 1999; Charniak, 2000; Henderson, 2003) have raised the hope that the same approaches could be applied to the more ambitious goal of recovering the propositional content and the frame semantics of a sentence. Moving towards a shallow semantic level of representation is a first step towards the distant goal of natural language understanding and has immediate applications in question-answering and information extraction. For example, an automatic flight reservation system processing the sentence "I want to book a flight from Geneva to Trento" will need to know that "from Geneva" denotes the origin of the flight and "to Trento" denotes its destination. Knowing that these two phrases are prepositional phrases, which is the information a syntactic parser provides, is only moderately useful.</Paragraph>
    <Paragraph position="1"> The growing interest in learning deeper information is to a large extent due to the recent development of semantically annotated databases such as FrameNet (Baker et al., 1998) or the Proposition Bank (Palmer et al., 2005), which can be used as training resources for a number of supervised learning paradigms. We focus here on the Proposition Bank (PropBank). PropBank encodes propositional information by adding a layer of argument structure annotation to the syntactic structures of the Penn Treebank (Marcus et al., 1993). Verbal predicates in the Penn Treebank (PTB) receive the label REL. Those complements of the predicative verb that are considered arguments are annotated with the abstract semantic role labels A0-A5 or AA, while those complements that carry a semantic functional label in the original PTB receive the composite semantic role label AM-X, where X stands for labels such as LOC, TMP or ADV, for locative, temporal and adverbial modifiers respectively. A tree structure with PropBank labels for a sentence from the PTB (section 00) is shown in Figure 1 below. PropBank uses two levels of granularity in its annotation, at least conceptually.</Paragraph>
    <Paragraph position="2"> Arguments receiving labels A0-A5 or AA do not express consistent semantic roles and are specific to a verb, while arguments receiving an AM-X label are supposed to be adjuncts, and the roles they express are consistent across all verbs. (Footnote 1: There are thirteen semantic role labels for modifiers. See Palmer et al. (2005) for a detailed discussion of PropBank semantic role labels.) Recent approaches to learning semantic role labels are based on two-stage architectures. The first stage selects the elements to be labelled, while the second determines the labels to be assigned to the selected elements. While some of these models are based on full parse trees (Gildea and Jurafsky, 2002; Gildea and Palmer, 2002), other methods have been proposed that eschew the need for a full parse (CoNLL, 2004; CoNLL, 2005).</Paragraph>
    <Paragraph position="3"> Because of the way the problem has been formulated, as a pipeline of parsing (or chunking) feeding into labelling, integrated approaches that solve both the parsing and the semantic role labelling problems at the same time have not been specifically investigated.</Paragraph>
    <Paragraph position="4"> We present work to test the hypothesis that a current statistical parser (Henderson, 2003) can output richer information robustly, that is, without any significant degradation of the parser's accuracy on the original parsing task, by explicitly modelling semantic role labels as the interface between syntax and semantics.</Paragraph>
    <Paragraph position="5"> We achieve promising results both on the simple parsing task, where the accuracy of the parser is measured on the standard Parseval measures, and also on the parsing task where the more complex labels of PropBank are taken into account. We will call the former task Penn Treebank parsing (PTB parsing) and the latter task PropBank parsing below. These results have several consequences. On the one hand, we show that it is possible to build a single integrated robust system successfully. This is a meaningful achievement, as a task combining semantic role labelling and parsing is more complex than simple syntactic parsing. While the shallow semantics of a constituent and its structural position are often correlated, they sometimes diverge. For example, some nominal temporal modifiers occupy an object position without being objects, like Tuesday in Figure 1 below. On the other hand, our results indicate that the proposed models are robust. To model our task accurately, additional parameters must be estimated. However, given the current limited availability of annotated treebanks, this more complex task will have to be solved with the same overall amount of data, aggravating the difficulty of estimating the model's parameters due to sparse data. The sparse data problem is compounded by the high variability of the argumental labels A0-A5, whose semantics is specific to a given verb or a given verb sense. Solving this more complex problem successfully, then, indicates that the models used are robust.</Paragraph>
    <Paragraph position="6"> Finally, we achieve robustness without simplifying the parsing architecture. Specifically, robustness is achieved without resorting to strong independence assumptions to compensate for the limited availability and high variability of data. Consequently, this achievement demonstrates not only the robustness of the parsing model, but also its scalability and portability.</Paragraph>
  </Section>
</Paper>