<?xml version="1.0" standalone="yes"?> <Paper uid="P04-1040"> <Title>Enriching the Output of a Parser Using Memory-Based Learning</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Background and Motivation </SectionTitle> <Paragraph position="0"> State-of-the-art statistical parsers, e.g., parsers trained on the Penn Treebank, produce syntactic parse trees with bare phrase labels, such as NP, PP, S, although the training corpora are usually much richer and often contain additional grammatical and semantic information (distinguishing various modifiers, complements, subjects, objects, etc.), including non-local dependencies, i.e., relations between phrases not adjacent in the parse tree. While this information may be explicitly annotated in a treebank, it is rarely used or delivered by parsers. (Some notable exceptions are the CCG parser described by Hockenmaier (2003), which incorporates non-local dependencies into the parser's statistical model, and the parser of Collins (1999), which uses WH traces and argument/modifier distinctions.) The reason such information is rarely used is that bringing in more information of this type usually makes the underlying parsing model more complicated: more parameters need to be estimated and independence assumptions may no longer hold.</Paragraph> <Paragraph position="1"> Klein and Manning (2003), for example, mention that using functional tags of the Penn Treebank (temporal, location, subject, predicate, etc.) with a simple unlexicalized PCFG generally had a negative effect on the parser's performance. Currently, there are no parsers trained on the Penn Treebank that use the structure of the treebank in full and that are thus capable of producing syntactic structures containing all or nearly all of the information annotated in the corpus.</Paragraph> <Paragraph position="2"> In recent years there has been a growing interest in getting more information from parsers than just bare phrase trees. 
Blaheta and Charniak (2000) presented the first method for assigning Penn functional tags to constituents identified by a parser.</Paragraph> <Paragraph position="3"> Pattern-matching approaches were used in (Johnson, 2002) and (Jijkoun, 2003) to recover non-local dependencies in phrase trees. Furthermore, experiments described in (Dienes and Dubey, 2003) show that the latter task can be successfully addressed by shallow preprocessing methods.</Paragraph> </Section> </Paper>