<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1014">
<Title>Inducing History Representations for Broad Coverage Statistical Parsing</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle>
1 Introduction
</SectionTitle>
<Paragraph position="0"> Unlike most problems addressed with machine learning, parsing natural language sentences requires choosing among an unbounded (or even infinite) number of possible phrase structure trees. The standard approach to this problem is to decompose the choice into an unbounded sequence of choices among a finite number of possible parser actions. This sequence of actions constitutes the parse for the phrase structure tree. We can then define a probabilistic model of phrase structure trees by defining a probabilistic model of each parser action in its parse context, and apply machine learning techniques to learn this model of parser actions.</Paragraph>
<Paragraph position="1"> Many statistical parsers (Ratnaparkhi, 1999; Collins, 1999; Charniak, 2000) are based on a history-based model of parser actions. In these models, the probability of each parser action is conditioned on the history of previous actions in the parse. But here again we face a situation that is unusual for machine learning problems: conditioning on an unbounded amount of information.</Paragraph>
<Paragraph position="2"> A major challenge in designing a history-based statistical parser is choosing a finite representation of the unbounded parse history from which the probability of the next parser action can be accurately estimated. Previous approaches have used a hand-crafted finite set of features to represent the parse history (Ratnaparkhi, 1999; Collins, 1999; Charniak, 2000). In the work presented here, we automatically induce a finite set of real-valued features to represent the parse history.</Paragraph>
<Paragraph position="3"> We perform the induction of a history representation using an artificial neural network architecture called Simple Synchrony Networks (SSNs) (Lane and Henderson, 2001; Henderson, 2000). This machine learning method is specifically designed for processing unbounded structures. Unlike hand-crafted history features, it allows us to avoid making a priori independence assumptions. At the same time, it allows us to make use of our a priori knowledge by imposing structurally specified and linguistically appropriate biases on the search for a good history representation.</Paragraph>
<Paragraph position="4"> The combination of automatic feature induction and linguistically appropriate biases results in a history-based parser with state-of-the-art performance. When trained on just part-of-speech tags, the resulting parser achieves the best current performance of a non-lexicalized parser on the Penn Treebank. When a relatively small vocabulary of words is used, performance is only marginally below the best current parser accuracy. If either the biases are reduced or the induced history representations are replaced with hand-crafted features, performance degrades.</Paragraph>
</Section>
</Paper>
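
To make the decomposition in the opening paragraphs concrete: writing $d_1, \ldots, d_m$ for the sequence of parser actions that derives a tree $T$ (this notation is ours, not the paper's), a history-based model factors the tree probability as

\[ P(T) \;=\; P(d_1, \ldots, d_m) \;=\; \prod_{i=1}^{m} P(d_i \mid d_1, \ldots, d_{i-1}), \]

where the conditioning context $d_1, \ldots, d_{i-1}$, the parse history, grows without bound as the derivation proceeds. Each factor must therefore be estimated from some finite summary of that history, which is exactly the problem the rest of the paper addresses.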
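
The idea of an induced, real-valued history representation can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the toy action inventory, the function names, and the flat left-to-right recurrence are all ours, and the flat recurrence in particular is not the paper's SSN architecture, which propagates history vectors along the structure of the partial parse rather than along the raw action sequence.

    import numpy as np

    # Toy action inventory; a hypothetical stand-in for a real parser's
    # much larger action set.
    ACTIONS = ["shift", "project-NP", "project-VP", "attach"]

    def one_hot(action):
        v = np.zeros(len(ACTIONS))
        v[ACTIONS.index(action)] = 1.0
        return v

    def induced_history(actions, W_in, W_rec):
        # Fold the unbounded action sequence into a fixed-size
        # real-valued vector with a recurrent update. This flat-sequence
        # recurrence is an illustrative assumption only: SSNs instead
        # pass history vectors along structurally local connections in
        # the evolving parse tree.
        h = np.zeros(W_rec.shape[0])
        for a in actions:
            h = np.tanh(W_in @ one_hot(a) + W_rec @ h)
        return h

    def next_action_probs(h, W_out):
        # Estimate P(next action | history) from the induced
        # representation, via a numerically stable softmax.
        z = W_out @ h
        e = np.exp(z - z.max())
        return e / e.sum()

    # Usage with random (untrained) weights and a 10-dimensional history.
    rng = np.random.default_rng(0)
    W_in = rng.normal(size=(10, len(ACTIONS)))
    W_rec = rng.normal(size=(10, 10))
    W_out = rng.normal(size=(len(ACTIONS), 10))
    h = induced_history(["shift", "project-NP", "shift", "attach"], W_in, W_rec)
    print(next_action_probs(h, W_out))  # a distribution over the next action

The point the sketch shares with the paper is that the mapping from the unbounded history to the finite vector h is itself learned, rather than fixed in advance by a hand-picked feature set.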
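
The structurally specified biases described above can likewise be pictured as a restriction on which earlier states feed the computation of the current one. The sketch below is again only illustrative: the choice of the parent and preceding sibling as the local context, and the weight names W_par, W_sib, and W_lab, are our assumptions, not a description of the SSN itself.

    import numpy as np

    def node_history(parent_h, sibling_h, label_vec, W_par, W_sib, W_lab):
        # Induce the history vector for a new tree node from structurally
        # local context only (here: its parent and the preceding sibling),
        # rather than from the whole derivation. Information about distant
        # nodes can still flow in, but only through chains of such local
        # links, so the bias is soft rather than a hard independence
        # assumption.
        return np.tanh(W_par @ parent_h + W_sib @ sibling_h + W_lab @ label_vec)

Unlike a hand-crafted feature set, nothing here asserts that the next action is independent of more distant nodes; the choice of local connections merely biases the search for a good history representation toward structurally, and hence linguistically, plausible information paths.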