<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1014">
  <Title>Inducing History Representations for Broad Coverage Statistical Parsing</Title>
  <Section position="9" start_page="0" end_page="0" type="concl">
    <SectionTitle>
9 Conclusions
</SectionTitle>
    <Paragraph position="0"> This paper has presented a method for estimating the parameters of a history-based statistical parser which does not require any a priori independence assumptions. A neural network is trained simultaneously to estimate the probabilities of parser actions and to induce a finite representation of the unbounded parse history. The probabilities of parser actions are conditioned on this induced history representation, rather than being conditioned on a set of hand-crafted history features chosen a priori. A beam search is used to search for the most probable parse given the neural network's probability estimates. When trained and tested on the standard Penn Treebank datasets, the parser's performance (89.1% F-measure) is only 0.6% below the best current parsers for this task, despite using a smaller vocabulary and less prior linguistic knowledge.</Paragraph>
    <Paragraph position="1"> The neural network architecture we use, Simple Synchrony Networks, not only allows us to avoid imposing hard independence assumptions, it also allows us to impose linguistically appropriate soft biases on the learning process. SSNs are specifically designed for processing structures, which allows us to design the SSN so that the induced representations of the parse history are biased towards recording structurally local information about the parse. When we modify these biases so that some structurally local information tends to be ignored, performance degrades. When we introduce independence assumptions by cutting off access to information from more distant parts of the structure, performance degrades dramatically. On the other hand, we find that biasing the learning to pay more attention to lexical heads does not improve performance.</Paragraph>
  </Section>
</Paper>