<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1016">
  <Title>A* Parsing: Fast Exact Viterbi Parse Selection</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> PCFG parsing algorithms with worst-case cubic-time bounds are well-known. However, when dealing with wide-coverage grammars and long sentences, even cubic algorithms can be far too expensive in practice. Two primary types of methods for accelerating parse selection have been proposed. Roark (2001) and Ratnaparkhi (1999) use a beam-search strategy, in which only the best n parses are tracked at any moment. Parsing time is linear and can be made arbitrarily fast by reducing n. This is a greedy strategy, and the actual Viterbi (highest probability) parse can be pruned from the beam because, while it is globally optimal, it may not be locally optimal at every parse stage. Chitrao and Grishman (1990), Caraballo and Charniak (1998), Charniak et al. (1998), and Collins (1999) describe best-first parsing, which is intended for a tabular item-based framework. In best-first parsing, one builds a figure-of-merit (FOM) over parser items, and uses the FOM to decide the order in which agenda items should be processed. This approach also dramatically reduces the work done during parsing, though it, too, gives no guarantee that the first parse returned is the actual Viterbi parse (nor does it maintain a worst-case cubic time bound). We discuss best-first parsing further in section 3.3.</Paragraph>
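The claim that beam search can prune the Viterbi parse is easy to see on a toy example (not the parsers cited above; the hypotheses and scores below are invented for illustration): a candidate with the best total score can start with a locally poor step and fall off a narrow beam.

```python
# Toy two-step search. Scores are log-probs; step-2 cost depends on the
# step-1 choice. Choice "a" looks worst locally but is best globally.
step1 = {"a": -3.0, "b": -1.0, "c": -1.5}
step2 = {"a": -0.5, "b": -4.0, "c": -4.0}  # completion cost per step-1 choice

def viterbi():
    # Exhaustive search: the true highest-probability hypothesis.
    return max(((sym, s1 + step2[sym]) for sym, s1 in step1.items()),
               key=lambda kv: kv[1])

def beam_search(width):
    # Keep only the top-`width` partial hypotheses after step 1,
    # then complete each survivor and return the best.
    beam = sorted(step1.items(), key=lambda kv: kv[1], reverse=True)[:width]
    finished = [(sym, s1 + step2[sym]) for sym, s1 in beam]
    return max(finished, key=lambda kv: kv[1])

print(viterbi())       # ('a', -3.5): globally optimal despite a poor start
print(beam_search(2))  # ('b', -5.0): a beam of 2 prunes 'a' after step 1
```

With a beam wide enough to retain every hypothesis, the greedy result coincides with the Viterbi one, which is exactly the trade-off the paper describes: speed from a small beam, at the cost of exactness.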
    <Paragraph position="1"> Both of these speed-up techniques are based on greedy models of parser actions. The beam search greedily prunes partial parses at each beam stage, and a best-first FOM greedily orders parse item exploration. If we wish to maintain optimality in a search procedure, the obvious thing to try is A* methods (see for example Russell and Norvig, 1995). We apply A* search to a tabular item-based parser, ordering the parse items based on a combination of their known internal cost of construction and a conservative estimate of their cost of completion (see figure 1). A* search has been proposed and used for speech applications (Goel and Byrne, 1999; Corazza et al., 1994); however, it has been little used, certainly in the recent statistical parsing literature, apparently because of difficulty in conceptualizing and computing effective admissible estimates. The contribution of this paper is to demonstrate effective ways of doing this, by precomputing grammar statistics which can be used as effective A* estimates.</Paragraph>
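The ordering principle described here (known internal cost plus a conservative completion estimate) follows the standard A* pattern. As a minimal sketch of that pattern on a toy shortest-path problem rather than a parse chart (the graph and heuristic values are assumptions for illustration, not the paper's parser):

```python
import heapq

def a_star(start, goal, edges, h):
    """A* search: pop the item whose known cost-so-far plus an admissible
    (never-overestimating) completion estimate h is smallest. With such an
    estimate, the first time `goal` is popped its cost is optimal."""
    frontier = [(h(start), 0.0, start, [start])]
    best = {}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in best and best[node] <= cost:
            continue
        best[node] = cost
        for nxt, w in edges.get(node, []):
            heapq.heappush(frontier,
                           (cost + w + h(nxt), cost + w, nxt, path + [nxt]))
    return None

# Toy graph: edge weights play the role of negative log probabilities.
edges = {"S": [("A", 1.0), ("B", 4.0)],
         "A": [("G", 5.0)],
         "B": [("G", 1.0)]}
# Admissible heuristic: a lower bound on each node's true completion cost.
h = lambda n: {"S": 4.0, "A": 4.0, "B": 1.0, "G": 0.0}[n]

print(a_star("S", "G", edges, h))  # (5.0, ['S', 'B', 'G'])
```

In the parsing setting, the known cost corresponds to an edge's Viterbi inside score and the heuristic to an upper bound on its Viterbi outside score; the hard part, as the paragraph notes, is computing such estimates effectively, which is the paper's contribution.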
    <Paragraph position="2"> The A* formulation provides three benefits. First, it substantially reduces the work required to parse a sentence, without sacrificing either the optimality of the answer or the worst-case cubic time bounds on the parser.</Paragraph>
    <Paragraph position="3"> Second, the resulting parser is structurally simpler than a FOM-driven best-first parser. Finally, it allows us to easily prove the correctness of our algorithm, over a broad range of control strategies and grammar encodings.</Paragraph>
    <Paragraph position="4"> In this paper, we describe two methods of constructing A* bounds for PCFGs. One involves context summarization, which uses estimates of the sort proposed in Corazza et al. (1994), but considering richer summaries.</Paragraph>
    <Paragraph position="5"> The other involves grammar summarization, which, to our knowledge, is entirely novel. We present the estimates that we use, along with algorithms to efficiently calculate them, and illustrate their effectiveness in a tabular PCFG parsing algorithm, applied to Penn Treebank sentences.</Paragraph>
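The paper's actual estimates are defined later; purely to convey the flavor of a grammar-derived admissible bound, here is a simplified relaxation on a toy PCFG (both the grammar and the specific relaxation are assumptions, not the paper's method): an upper bound on each category's Viterbi outside score, obtained by keeping only rule log-probabilities on the path to the root and dropping the siblings' inside costs, each of which is at most zero.

```python
import math

# Toy PCFG for illustration: (parent, children, probability).
rules = [
    ("S",  ("NP", "VP"), 0.9),
    ("S",  ("VP",),      0.1),
    ("NP", ("DT", "NN"), 0.6),
    ("VP", ("VB", "NP"), 0.7),
]

def outside_upper_bounds(rules, root="S"):
    """For each category, compute an upper bound on its Viterbi outside
    log-probability by relaxation: propagate rule log-probs down from the
    root, ignoring sibling inside costs. Admissible because every dropped
    term is <= 0 and could only lower the true score further."""
    bound = {root: 0.0}
    changed = True
    while changed:  # terminates: log-probs are negative, so cycles only decay
        changed = False
        for parent, children, p in rules:
            if parent not in bound:
                continue
            for child in children:
                cand = bound[parent] + math.log(p)
                if cand > bound.get(child, -math.inf):
                    bound[child] = cand
                    changed = True
    return bound

bounds = outside_upper_bounds(rules)
print(bounds)  # e.g. bounds["NP"] == log 0.9, via the rule S -> NP VP
```

A tighter bound (closer to the true outside score) prunes more work but is costlier to precompute; the paper's context- and grammar-summarization estimates are precisely ways of managing that trade-off.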
    <Paragraph position="6"> [Figure 1 caption; beginning truncated in extraction] ...combination of the cost to build the edge (the Viterbi inside score b) and the cost to incorporate it into a root parse (the Viterbi outside score a). (b) In the corresponding hypergraph, we have exact values for the inside score from the explored hyperedges (solid lines), and use upper bounds on the outside score, which estimate the dashed hyperedges.</Paragraph>
  </Section>
</Paper>