<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2089">
  <Title>A Best-First Probabilistic Shift-Reduce Parser</Title>
  <Section position="7" start_page="696" end_page="697" type="concl">
    <SectionTitle>5 Conclusion</SectionTitle>
    <Paragraph position="0">We have presented a best-first classifier-based parser that achieves high levels of precision and recall, with fast parsing times and low memory requirements. One way to view the parser is as an extension of recent work on classifier-based deterministic parsing. It retains the modularity between parsing algorithms and learning mechanisms associated with deterministic parsers, making it simple to understand, implement, and experiment with.</Paragraph>
    <Paragraph position="1">Another way to view the parser is as a variant of probabilistic GLR parsers without an explicit LR table.</Paragraph>
    <Paragraph position="2">We have shown that our best-first strategy results in significant improvements in accuracy over deterministic parsing. Although the best-first search makes parsing slower, we have implemented a beam strategy that prunes much of the search space with very little cost in accuracy. This strategy involves a parameter that can be used to control the trade-off between accuracy and speed.</Paragraph>
    <Paragraph position="3">At one extreme, the parser is very fast (more than 1,000 words per second) and still moderately accurate (about 85% f-score, or 86% using gold-standard POS tags). This makes it possible to apply parsing to natural language tasks involving very large amounts of text (such as question answering or information extraction with large corpora). A less aggressive pruning setting results in an f-score of about 88% (or 89% using gold-standard POS tags), taking 17 minutes to parse the WSJ test set.</Paragraph>
    <Paragraph position="4">Finally, we have shown that by multiplying the probabilities assigned by our maximum entropy shift-reduce model to the probabilities of the 10-best trees produced for each sentence by the Charniak parser, we can rescore the trees to obtain more accurate results than those produced by either model in isolation. This simple combination of the two models produces an f-score of 90.8% for the standard WSJ test set.</Paragraph>
  </Section>
</Paper>
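To make the best-first-with-beam idea concrete, the following is a minimal Python sketch, not the authors' implementation: parser items sit in a priority queue ordered by derivation probability, successors are scored by a classifier's action probabilities, and a beam parameter caps how many items are expanded per derivation length, which is one way to realize the accuracy/speed trade-off described above. The state representation and the `action_probs`, `apply_action`, and `is_final` callbacks are placeholders assumed for illustration.

```python
import heapq
import math
from dataclasses import dataclass, field

@dataclass(order=True)
class Item:
    neg_log_prob: float                    # heap key: smaller = more probable
    steps: int = field(compare=False)      # number of parser actions applied so far
    state: object = field(compare=False)   # stack + input buffer (placeholder)

def best_first_parse(initial_state, action_probs, apply_action, is_final,
                     beam_width=100, max_pops=500_000):
    """Best-first shift-reduce search with per-step beam pruning (a sketch)."""
    heap = [Item(0.0, 0, initial_state)]
    expanded_at_step = {}                  # derivation length -> items expanded
    pops = 0
    while heap and pops < max_pops:
        item = heapq.heappop(heap)
        pops += 1
        if is_final(item.state):
            return item.state              # most probable complete parse found first
        # Beam: expand at most `beam_width` items for each derivation length.
        if expanded_at_step.get(item.steps, 0) >= beam_width:
            continue
        expanded_at_step[item.steps] = expanded_at_step.get(item.steps, 0) + 1
        for action, prob in action_probs(item.state):
            if prob <= 0.0:
                continue
            successor = apply_action(item.state, action)
            heapq.heappush(
                heap,
                Item(item.neg_log_prob - math.log(prob), item.steps + 1, successor),
            )
    return None                            # no parse within the search budget
```

With `beam_width` set very low the search degenerates toward greedy deterministic parsing (fast, less accurate); a larger beam explores more of the space, trading speed for accuracy as the paragraph above describes.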
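The rescoring combination in the final paragraph can likewise be sketched in a few lines: each tree in the Charniak n-best list is re-scored by multiplying its probability by the probability the shift-reduce model assigns to it, and the highest-scoring tree is kept. The function names `charniak_nbest` and `shift_reduce_prob` are hypothetical placeholders, not real APIs from either parser.

```python
import math

def rescore_sentence(nbest, shift_reduce_prob):
    """Select the tree maximizing P_charniak(tree) * P_shift_reduce(tree).

    `nbest` is a list of (tree, charniak_log_prob) pairs for one sentence;
    `shift_reduce_prob(tree)` returns the probability the shift-reduce model
    assigns to that tree. Both inputs are illustrative placeholders.
    """
    best_tree, best_score = None, float("-inf")
    for tree, charniak_log_prob in nbest:
        p = shift_reduce_prob(tree)
        if p <= 0.0:
            continue                       # skip trees the model rules out
        combined = charniak_log_prob + math.log(p)  # product in log space
        if combined > best_score:
            best_tree, best_score = tree, combined
    return best_tree
```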