
<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1013">
  <Title>LR RECURSIVE TRANSITION NETWORKS FOR EARLEY AND TOMITA PARSING</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
* Incremental constructions of Tomita's algorithm
</SectionTitle>
    <Paragraph position="0"> (Heering, Klint, and Rekers, 1990) may similarly be viewed as just one point along a continuum of methods.</Paragraph>
    <Paragraph position="1"> * This work was supported in part by grant R29 LM 04707 from the National Library of Medicine, and by the Pittsburgh NMR Institute.</Paragraph>
    <Paragraph position="2"> The apparent distinctions between these related methods follows from the distinct complex formal and mathematical apparati (Lang, 1974; Lang, 1991) currently employed to construct these CF parsing algorithms.</Paragraph>
    <Paragraph position="3"> To effect a uniform synthesis of these methods, in this paper we introduce LR Recursive Transition Networks (LR-RTNs) as a simpler framework on which to build CF parsing algorithms. While RTNs (Woods, 1970) have been widely used in Artificial Intelligence (AI) for natural language parsing, their representational advantages have not been fully exploited for efficiency. The LR-RTNs, however, are efficient, and shall be used to construct&amp;quot;  (1) a nondeterministic parser, (2) a basic LR(0) parser, (3) Earley's algorithm (and the chart parsers), and (4) incremental and compiled versions of Tomita's algorithm.</Paragraph>
    <Paragraph position="4"> Our uniform construction has advantages over the current highly formal, non-RTN-based, nonuniform approaches to CF parsing: * Clarity of algorithm construction, permitting LR, Earley, and Tomita parsers to be understood as a family of related parsing algorithm.</Paragraph>
    <Paragraph position="5"> * Computational motivation and justification for each algorithm in this family.</Paragraph>
    <Paragraph position="6"> * Uniform extensibility of these syntactic methods to semantic parsing.</Paragraph>
    <Paragraph position="7"> * Shared graphical representations, useful in building interactive programming environments for computational linguists.</Paragraph>
    <Paragraph position="8"> * Parallelization of these parsing algorithms. * All of the known advantages of RTNs, together  with efficiencies of LR parsing.</Paragraph>
    <Paragraph position="9"> All of these improvements will be discussed in the paper.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="99" type="metho">
    <SectionTitle>
2. LR RECURSIVE TRANSITION
NETWORKS
</SectionTitle>
    <Paragraph position="0"> A transition network is a directed graph, used as a finite state machine (Hopcroft and Ullman, 1979).</Paragraph>
    <Paragraph position="1"> The network's nodes or edges are labelled; in this paper, we shall label the nodes. When an input sentence is read, state moves from node to node. A sentence is accepted if reading the entire sentence directs the network traversal so as to arrive at an  Rule #1. C. Expanding the symbol node VP. D. Expanding the symbol node NP. E, Expanding the start node S. accepting node. To increase computational power from regular languages to context-free languages, recursive transition networks (RTNs) are introduced.</Paragraph>
    <Paragraph position="2"> instantiation of the Rule's chain indicates the partial progress in sequencing the Rule's right-hand-side symbols.</Paragraph>
    <Paragraph position="3"> An RTN is a forest of disconnected transition networks, each identified by a nonterminal label. All other labels are terminal labels. When, in traversing a transition network, a nonterminal label is encountered, control recursively passes to the beginning of the correspondingly labelled transition network. Should this labelled network be successfully traversed, on exit, control returns back to the labelled calling node.</Paragraph>
    <Paragraph position="4"> The linear text of a context-free grammar can be cast into an RTN structure (Perlin, 1989). This is done by expanding each grammar rule into a linear chain. The top-down expansion amounts to a partial evaluation (Futamura, 1971) of the rule into a computational expectation: an eventual bottom-up data-directed instantiation that will complete the expansion.</Paragraph>
    <Paragraph position="5"> Figure 1, for example, shows the expansion of the grammar rule #1 S---~NP VP. First, the nonterminal S, which labels this connected component, is expanded as a nonterminal node. One method for realizing this nonterminal node, is via Rule#l; its rule node is therefore expanded. Rule#1 sets up the expectation for the VP symbol node, which in turn sets up the expectation for the NP symbol node. NP, the first symbol node in the chain, creates the start node S. In subsequent processing, posting an instance of this start symbol would indicate an expectation to instantiate the entire chain of Rule#l, thereby detecting a nonterminal symbol S. Partial The expansion in Figure 1 constructs an LR-RTN.</Paragraph>
    <Paragraph position="6"> That is, it sets up a Left-to-fight parse of a Rightmost derivation. Such derivations are developed in the next Section. As used in AI natural language parsing, RTNs have more typically been LL-RTNs, for effecting parses of leftmost derivations (Woods, 1970), as shown in Figure 2A. (Other, more efficient, control structures have also been used (Kaplan, 1973).) Our shift from LL to LR, shown in Figure 2B, uses the chain expansion to set up a subsequent data-driven completion, thereby permitting greater parsing efficiency.</Paragraph>
    <Paragraph position="7"> In Figure 3, we show the RTN expansion of the simple grammar used in our first set of examples:</Paragraph>
    <Paragraph position="9"> Chains that share identical prefixes are merged (Perlin, 1989) into a directed acyclic graph (DAG) (Aho, Hopcroft, and Ullman, 1983). This makes our RTN a forest of DAGs, rather than trees. For example, the shared NP start node initiates the chains for Rules #2 and #3 in the NP component.</Paragraph>
    <Paragraph position="10"> In augmented recursive transition networks (ATNs) (Woods, 1970), semantic constraints may be expressed. These constraints can employ case grammars, functional grammars, unification, and so on (Winograd, 1983). In our RTN formulation, semantic testing occurs when instantiating rule nodes: failing a constraint removes a parse from further  processing. This approach applies to every parsing algorithm in this paper, and will not be discussed  connected components correspond to the three nonterminals in the grammar. Each symbol node in the RTN denotes a subsequence originating from its lefimost start symbol.</Paragraph>
  </Section>
  <Section position="6" start_page="99" end_page="99" type="metho">
    <SectionTitle>
3. NONDETERMINISTIC DERIVATIONS
</SectionTitle>
    <Paragraph position="0"> A grammar's RTN can be used as a template for parsing. A sentence (the data) directs the instantiation of individual rule chains into a parse tree. The RTN instances exactly correspond to parse.</Paragraph>
    <Paragraph position="1"> tree nodes. This is most easily seen with nondeterministic rightmost derivations.</Paragraph>
    <Paragraph position="2"> Given an input sentence of n words, we may derive a sentence in the language with the nondeterministic algorithm (Perlin, 1990): Put an instance of nonterminal node S into the last column.</Paragraph>
    <Paragraph position="3"> From right to left, for every column : From top to bottom, within the  column : (i) Recursively expand the column top-down by nondeterministic selection of rule instances.</Paragraph>
    <Paragraph position="4"> (2) Install the next (leftward) symbol instance.</Paragraph>
    <Paragraph position="5">  In substep (1), following selection, a rule node and its immediately downward symbol node are instantiated. The instantiation process creates a new object that inherits from the template RTN node, adding information about column position and local link connections.</Paragraph>
    <Paragraph position="6"> For example, to derive &amp;quot;I Saw A Man&amp;quot; we would nondeterministically select and instantiate the correct rule choices #1, #4, #2, and #3, as in Figure 4. Following the algorithm, the derivation is (two dimensionally) top-down: top-to-bottom and right-toleft. To actually use this nondeterministic derivation algorithm to obtain all parses, one might enumerate and test all possible sequences of rules. This, however, has exponential cost in n, the input size. A more efficient approach is to reverse the top-down derivation, and recursively generate the parse(s) bottom-up from the input data.</Paragraph>
    <Paragraph position="8"> tree) of &amp;quot;I Saw A Man&amp;quot;. Each parse-tree symbol node denotes a subsequence of a recognized RTN chain. Rule #0 connects a word to its terminal symbol(s).</Paragraph>
  </Section>
  <Section position="7" start_page="99" end_page="99" type="metho">
    <SectionTitle>
4. BASIC LR(0) PARSING
</SectionTitle>
    <Paragraph position="0"> To construct a parser, we reverse the above top-down nondeterministic derivation teChnique into a bottom-up deterministic algorithm. We first build an inefficient LR-parser, illustrating the reversal. For efficiency, we then introduce the Follow-Set, and modify our parser accordingly.</Paragraph>
    <Section position="1" start_page="99" end_page="99" type="sub_section">
      <SectionTitle>
4.1 AN INEFFICIENT BLR(0) PARSER
</SectionTitle>
      <Paragraph position="0"> A simple, inefficient parsing algorithm for computing all possible parse-trees is: Put an instance of start node S into the 0 column.</Paragraph>
      <Paragraph position="1"> From left to right, for every  column : From bottom to top, within the column : (i) Initialize the column with the input word.</Paragraph>
      <Paragraph position="2"> (2) Recursively complete the column bottom-up using the INSERT method.</Paragraph>
      <Paragraph position="3">  This reverses the derivation algorithm into bottom-up generation: bottom-to-top, and left-to-right. In the inner loop, the Step (1) initialization is straightforward; we elaborate Step (2).  (I) Link up with predecessor instances.</Paragraph>
      <Paragraph position="4"> (2) Install self.</Paragraph>
      <Paragraph position="5"> (3) ENQUEUE successor instances for insertion. }  In (1), links are constructed between the instance and its predecessor instances. In (2), the instance becomes available for cartesian product formation. In (3), the computationally nontrivial step, the instance enqueues any successor instances within its own column. Most of the INSERT action is done by instances of symbol and rule RTN nodes. Using our INSERT method, a new symbol instance in the parse-tree links with predecessor instances, and installs itself. If the symbol's RTN node leads upwards to a rule node, one new rule instance successor is enqueued; otherwise, not. Rule instances enqueue their successors in a more complicated way, and may require cartesian product formation. A rule instance must instantiate and enqueue all RTN symbol nodes from which they could possibly be derived. At most, this is the set</Paragraph>
      <Paragraph position="7"> the label of N is identical to the label of the rule's nonterminal successor node }.</Paragraph>
      <Paragraph position="8"> For every symbol node in SAME-LABEL(rule), instances may be enqueued. If X * SAME-LABEL(rule) immediately follows a start node, i.e., it begins a chain, then a single instance of it is enqueued.</Paragraph>
      <Paragraph position="9"> If Y e SAME-LABEL(rule) does not immediately follow a start node, then more effort is required. Let X be the unique RTN node to the left of Y. Every instantiated node in the parse tree is the root of some subtree that spans an interval of the input sentence. Let the left border j be the position just to left of this interval, and k be the rightmost position, i.e., the current column.</Paragraph>
      <Paragraph position="10"> Then, as shown in Figure 5, for every instance x of X currently in position j, an instance y (of Y) is a valid extension of subsequence x that has support from the input sentence data. The cartesian product</Paragraph>
      <Paragraph position="12"> forms the set of all valid predecessor pairs for new instances of Y. Each such new instance y of Y is enqueued, with some x and the rule instance as its two predecessors. Each y is a parse-tree node representing further progress in parsing a subsequence.</Paragraph>
      <Paragraph position="14"> symbol node X in the RTN. The instance y of Y is the root ofa parse-subtree that spans (j+l ak).</Paragraph>
      <Paragraph position="15"> Therefore, the rule instance r enqueues (at leasO all instances of y, indexed by the predecessor product: { x in column j } x {r }.</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="99" end_page="101" type="metho">
    <SectionTitle>
4.2. USING THE FOLLOW-SET
</SectionTitle>
    <Paragraph position="0"> Although a rule parse-node is restricted to enqueue successor instances of RTN nodes in SAME-LABEL(rule), it can be constrained further.</Paragraph>
    <Paragraph position="1"> Specifically, if the sentence data gives no evidence for a parse-subtree, the associated symbol node instance need never be generated. This restriction can be determined column-by-column as the parsing progresses.</Paragraph>
    <Paragraph position="2"> We therefore extend our bottom-up parsing algorithm to: Put an instance of start node S into the 0 column.</Paragraph>
    <Paragraph position="3"> From left to right, for every column: From bottom to top, within the  column : (I) Initialize the column with the input word.</Paragraph>
    <Paragraph position="4"> (2) Recursively complete the column bottom-up using the INSERT method.</Paragraph>
    <Paragraph position="5"> (3) Compute the column's (rightward) Follow-Set.</Paragraph>
    <Paragraph position="6">  With the addition of Step (3), this defines our Basic LR(O), or BLR(O), parser. We now describe the Follow-Set.</Paragraph>
    <Paragraph position="7"> Once an RTN node X has been instantiated in some column, it sets up an expectation for * The RTN node(s) Yg that immediately follow it; * For each immediate follower Yg, all those RTN symbol nodes Wg,h that initiate chains that could recursively lead up to Yg.</Paragraph>
    <Paragraph position="8"> This is the Follow-Set (Aho, Sethi, and Ullman, 1986). The Follow-Set(X) is computed directly from the RTN by the recursion:  display) of RTN node V consists of the immediately following nonterminal node NP, and the two nodes immediately following the start NP node, D and N. Since D and N are terminal symbols, the traversal halts.</Paragraph>
    <Paragraph position="9"> The set of symbol RTN nodes that a rule instance r spanning (j+l,k) can enqueue is therefore not SAME-LABEL(rule), but the possibly smaller set of RTN nodes  {r}, otherwise.</Paragraph>
    <Paragraph position="10"> Enqueue all members of PROD as instances of y.</Paragraph>
    <Paragraph position="11"> The cartesian product PROD is nonempty, since an instantiated rule anticipates those elements of PROD mandated by Follow-Sets of preceding columns. The pruning of Nodes by the Follow-Set eliminates all bottom-up parsing that cannot lead to a parse-subtree at column k.</Paragraph>
    <Paragraph position="12"> In the example in Figure 7, Rule instance r is in position 4, with j=3 and k=4. We have: SAME-LABEL(r) = {N 2, N 3 }, i.e, the two symbol nodes labelled N in the sequences of Rules #2 and #3, shown in the</Paragraph>
    <Paragraph position="14"/>
    <Paragraph position="16"> rule instance r can only instantiate the single successor instance N 2. r uses the RTN to find the left RTN neighbor D of N 2. r then computes the cartesian product of instance d with r as {d}x{r}, generating the successor instance of N 2 shown.</Paragraph>
  </Section>
  <Section position="9" start_page="101" end_page="102" type="metho">
    <SectionTitle>
5. EARLEY'S PARSING ALGORITHM
</SectionTitle>
    <Paragraph position="0"> Natural languages such as English are ambiguous.</Paragraph>
    <Paragraph position="1"> A single sentence may have multiple syntactic structures. For example, extending our simple grammar with rules accounting for Prepositions and  the sentence &amp;quot;I saw a man on the hill with a telescope through the window&amp;quot; has 14 valid derivations, In parsing, separate reconstructions of these different parses can lead to exponential cost.</Paragraph>
    <Paragraph position="2"> For parsing efficiency, partially constructed instance-trees can be merged (Earley, 1970). As before, parse-node x denotes a point along a parsesequence, say, v-w-x. The left-border i of this parsesequence is the left-border of the leftmost parse-node in the sequence. All parse-sequences of RTN symbol node X that cover columns i+l through k may be collected into a single equivalence class X(i,k). For  the purposes of (1) continuing with the parse and (2) disambiguating parse-trees, members of X(i,k) are indistinguishable. Over an input sentence of length n, there are therefore no more than O(n 2) equivalence classes of X.</Paragraph>
    <Paragraph position="3"> Suppose X precedes Y in the RTN. When an instance y of Y is added m position k, k.&lt;_n, and the cartesian product is formed, there are only O(k 2) possible equivalence classes of X for y to combine with. Summing over all n positions, there are no more than O(n 3) possible product formations with Y in parsing an entire sentence.</Paragraph>
    <Paragraph position="4"> Merging is effected by adding a MERGE step to  (1) Link up with predecessor instances.</Paragraph>
    <Paragraph position="5"> (2) Install self.</Paragraph>
    <Paragraph position="6"> (3) ENQUEUE successor instances for insertion. }  The parsing merge predicate considers two instantiated sequences equivalent when:  (1) Their RTN symbol nodes X are the same.</Paragraph>
    <Paragraph position="7"> (2) They are in the same column k.</Paragraph>
    <Paragraph position="8"> (3) They have identical left borders i.</Paragraph>
    <Paragraph position="9">  The total number of links formed by INSERT during an entire parse, accounting for every grammar RTN node, is O(n3)xO(IGI). The chart parsers are a family of algorithms that couple efficient parse-tree merging with various control organizations (Winograd, 1983).</Paragraph>
  </Section>
  <Section position="10" start_page="102" end_page="103" type="metho">
    <SectionTitle>
6. TOMITA'S PARSING ALGORITHM
</SectionTitle>
    <Paragraph position="0"> In our BLR(0) parsing algorithm, even with merging, the Follow-Set is computed at every column. While this computation is just O(IGI), it can become a bottleneck with the very large grammars used in machine translation. By caching the requisite Follow-Set computations into a graph, subsequent Follow-Set computation is reduced. This incremental construction is similar to (Heering, Klint, and Rekers, 1990)'s, asymptotically constructing Tomita's all-paths LR parsing algorithm (Tomita, 1986).</Paragraph>
    <Paragraph position="1"> The Follow-Set cache (or LR-table) can be dynamically constructed by Call-Graph Caching (Perlin, 1989) during the parsing. Every time a Follow-Set computation is required, it is looked up in the cache. When not present, the Follow-Set is computed and cached as a graph.</Paragraph>
    <Paragraph position="2"> Following DeRemer (DeRemer, 1971), each cached Follow-Set node is finely partitioned, as needed, into disjoint subsets indexed by the RTN label name, as shown in the graphs of Figure 8. The partitioning reduces the cache size: instead of allowing all possible subsets of the RTN, the cache graph nodes contain smaller subsets of identically labelled symbol nodes.</Paragraph>
    <Paragraph position="3">  dynamically constructed during parsing. Each cache node represents a subset of RTN symbol nodes. The numbers indicate order of appearance; the lettered nodes partition their preceding node by symbol name. Since the cache was created on an as-needed basis, its shape parallels the shape of the parse-tree. (C) Compressing the shape of (B).  Window&amp;quot; (A) without cache node merging, and (B) with merging. grammar symbol nodes as an already existing Follow-Set node, it is merged into the older node's equivalence class. This avoids redundant expansions, without which the cache would be an infinite tree of parse paths, rather than a graph. A comparison is shown in Figure 9. If the entire LR-table cache is needed, an ambiguous sentence containing all possible lexical categories at each position can be presented; convergence follows from the finiteness of the subset construction.</Paragraph>
  </Section>
  <Section position="11" start_page="103" end_page="103" type="metho">
    <SectionTitle>
7. IMPLEMENTATION AND
CURRENT WORK
</SectionTitle>
    <Paragraph position="0"> We have developed an interactive graphical programming environment for constructing LRparsers. It uses the color MAC/II computer in the Object LISP extension of Common LISP. The system is built on CACHE TM (Perlin, (c) 1990), a general Call-Graph Caching system for animating AI algorithms.</Paragraph>
    <Paragraph position="1"> The RTNs are built from grammars. A variety of LR-RTN-based parsers, including BLR(0), with or without merging, and with or without Follow-Set caching have been constructed. Every algorithm described in this paper is implemented. Visualization is heavily exploited. For example, selecting an LR-table cache node will select all its members in the RTN display. The graphical animation component automatically drew all the RTNs and parse-trees in the Figures, and has generated color slides useful in teaching.</Paragraph>
    <Paragraph position="2"> Fine-grained parallel implementations of BLR(0) on the Connection Machine are underway to reduce the costly cartesian product step to constant time. We are also adding semantic constraints.</Paragraph>
  </Section>
class="xml-element"></Paper>