<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1012">
  <Title>COMPOSE-REDUCE PARSING</Title>
  <Section position="4" start_page="0" end_page="88" type="metho">
    <SectionTitle>
I. COMPOSE-REDUCE PARSING
</SectionTitle>
    <Paragraph position="0"> Why couldn't a simple breadth-first chart parser achieve linear performance on an appropriate parallel system? If you provided enough processors to immediately process all agenda entries as they were created, would not this give the desired result? No, because the processing of a single word might require many serialised steps. Consider processing the word &amp;quot;park&amp;quot; in the sentence &amp;quot;The people who ran in the park got wet.&amp;quot; Given a simple traditional sort of grammar, that word completes an NP, which in turn completes a PP, which in turn completes a VP, which in turn completes an S, which in turn completes a REL, which in turn completes an NP.</Paragraph>
    <Paragraph position="1"> The construction/recognition of these constituents is necessarily serialised, so regardless of the number of processors available a constant-time step is impossible. (Note that this only precludes a real-time parse by this route, but not necessarily a linear one.) In the shift-reduce approach to parsing, all this means is that for non-linear grammars, a single shift step may be followed by many reduce steps. This in turn suggested the beginnings of a way out, based on categorial grammar, namely that multiple reduces can be avoided if composition is allowed. To return to our example above, in a simple shift-reduce parser we would have had all the words preceding the word &amp;quot;park&amp;quot; in the stack. When it was shifted in, there would follow six reduce steps. If alternatively following a shift step one was allowed (non-deterministically) a compose step, this could be reduced (!) to a single reduce step. Restricting ourselves to a simpler example, consider just &amp;quot;run in the park&amp;quot; as a VP, given the rules VP -> v PP, PP -> p NP and NP -> d n.</Paragraph>
    <Paragraph position="3"> With a composition step allowed, the parse would then proceed as follows:
Shift run as a v
Shift in as a p
Compose v and p to give [VP v [PP p * NP]]
where I use a combination of bracketed strings and the 'dotted rule' notation to indicate the result of composition. The categorial equivalent would have been to notate v as VP/PP, p as PP/NP, and the result of the composition as therefore VP/NP.</Paragraph>
    <Paragraph position="4"> Shift the as d
Compose the dotted VP with d to give [VP v [PP p [NP d * n]]]
Shift park as n
Reduce the dotted VP with n to give the complete result.</Paragraph>
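To make the mechanics concrete, here is a minimal Python sketch of a shift-reduce loop extended with composition, run on the example just traced. The flat (result, needed) encoding of dotted structures and the categorial-style raising table are simplifying assumptions of this sketch, not the paper's implementation.

```python
# Shift-reduce with a compose step, on "run in the park".
# RAISE maps each pre-terminal to its categorial raising X/Y,
# assumed from the rules VP -> v PP, PP -> p NP, NP -> d n.

RAISE = {"v": ("VP", "PP"), "p": ("PP", "NP"), "d": ("NP", "n")}
LEXICON = {"run": "v", "in": "p", "the": "d", "park": "n"}

def parse(words):
    current = None                      # (result, needed), e.g. ("VP", "PP")
    for word in words:
        cat = LEXICON[word]
        if cat in RAISE:                # shift, then raise: v => VP/PP etc.
            result, needed = RAISE[cat]
            if current is None:
                current = (result, needed)
            elif current[1] == result:  # compose: A/B + B/C => A/C
                current = (current[0], needed)
            else:
                return None             # this ND path dies
        elif current is not None and current[1] == cat:
            return current[0]           # reduce: A/B + B => A, complete
    return None

print(parse("run in the park".split()))   # -> VP
```

On this path each word triggers at most one compose or one reduce, so the work per word is constant, which is the point of allowing composition.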
    <Paragraph position="5"> Although a number of details remained to be worked out, this simple move of allowing composition was the enabling step to achieving O(n) parsing. Parallelism would arise by forking processors at each non-deterministic choice point, following the general model of Dixon's earlier work on parallelising the ATMS (Dixon &amp; de Kleer 1988).</Paragraph>
    <Paragraph position="6"> Simply allowing composition is not in itself sufficient to achieve O(n) performance. Some means of guaranteeing that each step is constant time must still be provided. Here we found two different ways forward.</Paragraph>
    <Paragraph position="7"> II. THE FIRST COMPOSE-REDUCE PARSER -- CR-I In this parser there is no stack.</Paragraph>
    <Paragraph position="8"> We have simply a current structure, which corresponds to the top node of the stack in a normal shift-reduce parser. This is achieved by extending the appeal to composition to include a form of left-embedded raising, which will be discussed further below.</Paragraph>
    <Paragraph position="9"> Special attention is also required to handle left-recursive rules.</Paragraph>
    <Paragraph position="10">  II.1 The Basic Parsing Algorithm The constant-time parsing step is given below (slightly simplified, in that empty productions and some unit productions are not handled). In this algorithm schema, and in subsequent discussion, the annotation &amp;quot;ND&amp;quot; will be used in situations where a number of alternatives are (or may be) described. The meaning is that these alternatives are to be pursued non-deterministically.</Paragraph>
  </Section>
  <Section position="5" start_page="88" end_page="89" type="metho">
    <SectionTitle>
Algorithm CR-I
</SectionTitle>
    <Paragraph position="0"> ... ND raise* the category wrt the non-unary rules in the grammar for which it is a left corner, and compose the result with the current structure.</Paragraph>
    <Paragraph position="1"> If reduction ever completes a category which is marked as the left corner of one or more left-recursive rules or rule sequences, ND raise* in place wrt those rules (sequences), and propagate the marking.</Paragraph>
    <Paragraph position="2"> Some of these ND steps may at various points produce complete structures. If the input is exhausted, then those structures are parses, or not, depending on whether or not they have reached the distinguished symbol. If the input is not exhausted, it is of course the incomplete structures, the results of composition or raising, which are carried forward to the next step.</Paragraph>
    <Paragraph position="3"> The operation referred to above as &amp;quot;raise*&amp;quot; is more than simple raising, as was involved in the simple example in section I. In order to allow for all possible compositions to take place, all possible left-embedded raising must be pursued. Consider the following grammar fragment:</Paragraph>
    <Paragraph position="5"> and the utterance &amp;quot;Kim told Robin that the child likes Kim&amp;quot;.</Paragraph>
    <Paragraph position="6"> If we ignore all the ND incorrect paths, the current structure after the &amp;quot;that&amp;quot; has been processed is</Paragraph>
    <Paragraph position="7"> In order for the next word, &amp;quot;the&amp;quot;, to be correctly processed, it must be raised all the way to S, namely we must have [S [NP [d the] * n] VP] to compose with the current structure. What this means is that for every entry in the normal bottom-up reachability table pairing a left corner with a top category, we need a set of dotted structures, corresponding to all the ways the grammar can get from that left corner to that top category. It is these structures which are ND made available in step 4b of the parsing step algorithm CR-I above.</Paragraph>
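A sketch of how such a table of dotted structures might be pre-computed: starting from each pre-terminal, climb every rule that has the current category as its left corner, recording the dotted structure built so far. The toy grammar is an assumption, and the depth bound merely stands in for the left-recursion handling described in the next section.

```python
# Pre-computing (left corner, top category) -> dotted structures.

GRAMMAR = {"S": [["NP", "VP"]], "NP": [["d", "n"]], "VP": [["v", "NP"]]}
PRE_TERMINALS = ["d", "n", "v"]

def raise_star(grammar, pre_terminals, max_depth=4):
    table = {}   # (left corner, top category) -> dotted structures
    def climb(corner, cat, structure, depth):
        if depth > max_depth:          # crude guard; see next section
            return
        for mother, rhss in grammar.items():
            for rhs in rhss:
                if rhs[0] == cat:      # cat is this rule's left corner
                    dotted = "[%s %s %s]" % (mother, structure, " ".join(rhs[1:]))
                    table.setdefault((corner, mother), []).append(dotted)
                    climb(corner, mother, dotted, depth + 1)
    for p in pre_terminals:
        climb(p, p, p + " *", 1)
    return table

for entry, structures in raise_star(GRAMMAR, PRE_TERMINALS).items():
    print(entry, structures)
# ('d', 'NP')  ['[NP d * n]']
# ('d', 'S')   ['[S [NP d * n] VP]']   -- "the" raised all the way to S
# ('v', 'VP')  ['[VP v * NP]']
```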
  </Section>
  <Section position="6" start_page="89" end_page="92" type="metho">
    <SectionTitle>
II.2 Handling Left Recursion
</SectionTitle>
    <Paragraph position="0"> Now this in itself is not sufficient to handle left recursive structures, since by definition there could be an arbitrary number of left-embeddings of a left-recursive structure. The final note in the description of algorithm CR-I above is designed to handle this.</Paragraph>
    <Paragraph position="1"> Glossing over some subtleties, left-recursion is handled by marking some of the structures introduced in step 3b, and ND raising in place if the marked structure is ever completed by reduction in the course of a parse. Consider the sentence &amp;quot;Robin likes the child's dog.&amp;quot; We add the following two rules to the grammar: D -> art and D -> NP poss,</Paragraph>
    <Paragraph position="3"> thereby transforming D from a pre-terminal to a non-terminal. When we shift &amp;quot;the&amp;quot;, we will raise to inter alia [NP [D [art the]] * n]r with the NP marked for potential reraising. This structure will be composed with the then current structure; shifting and reducing &amp;quot;child&amp;quot; then gives a structure ending in [NP [D [art the]] [n child]]r. The last reduction will have completed the marked NP introduced above, so we ND left-recursively raise in place, giving</Paragraph>
    <Paragraph position="5"> which will then take us through the rest of the sentence.</Paragraph>
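As an illustration of raising in place, here is a sketch assuming the directly left-recursive rule NP -> NP PP; the indirect case just exemplified works analogously through D.

```python
# ND raising in place when reduction completes a marked constituent.
# The rule NP -> NP PP is an assumed example.

LEFT_RECURSIVE = {"NP": [("NP", ("PP",))]}   # completed cat -> (mother, rest)

def raise_in_place(cat, structure, marked):
    alternatives = [(structure, marked)]     # ND choice: leave it alone
    if marked:
        for mother, rest in LEFT_RECURSIVE.get(cat, []):
            wrapped = "[%s %s * %s]" % (mother, structure, " ".join(rest))
            # the new node propagates the marking, so it can re-raise again
            alternatives.append((wrapped, True))
    return alternatives

for alt in raise_in_place("NP", "[NP [D [art the]] [n child]]", marked=True):
    print(alt)
# ('[NP [D [art the]] [n child]]', True)
# ('[NP [NP [D [art the]] [n child]] * PP]', True)
```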
    <Paragraph position="6"> One final detail needs to be cleared up. Although directly left-recursive rules, such as NP -> NP PP, are correctly dealt with by the above mechanism, indirectly left-recursive sets of rules, such as the one exemplified above, require one additional subtlety. Care must be taken not to introduce the potential for spurious ambiguity. We will introduce the full details in the next section.</Paragraph>
    <Paragraph position="7"> II.3 Nature of the required tables Steps 3 and 4b of CR-I require tables of partial structures: closures of unit productions up from pre-terminals, for step 3; left-reachable raisings up from (unit production closures of) pre-terminals, for step 4b. In this section we discuss the creation of the necessary tables, in particular Raise*, against the background of a simple exemplary grammar, given below as Table 1. We have grouped the rules according to type--two kinds of unit productions (from pre-terminals or non-terminals), two kinds of left-recursive rules (direct and indirect) and the remainder.</Paragraph>
    <Paragraph position="9"> As a first step towards computing the table which step 4b above would use, we can pre-compute the partial structures given in Table 2.</Paragraph>
    <Paragraph position="10"> Cl* contains all backbone fragments constructable from the unit productions, and is already essentially what we require for step 3 of the algorithm. LRdir contains all directly left-recursive structures. LRindir2 contains all indirectly left-recursive structures involving exactly two rules, and there might be LRindir3, LRindir4, ... as well. Rs* contains all non-recursive tree fragments constructable from left-embedding of binary or greater rules and non-terminal unit productions.</Paragraph>
    <Paragraph position="11"> The superscripts denote loci where left-recursion may be appropriate, and identify the relevant structures.</Paragraph>
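A sketch of the unit-production closure Cl* as a fixed-point computation; the unit rules below are assumptions in the spirit of the exemplary grammar.

```python
# Cl*: closure of unit productions up from pre-terminals (step 3's table).

UNIT_RULES = [("NP", "n"), ("VP", "v"), ("S", "VP")]   # mother <- daughter
PRE_TERMINALS = {"d", "n", "p", "v"}

def unit_closure(unit_rules, pre_terminals):
    cl = {p: {p} for p in pre_terminals}    # every category reaches itself
    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for mother, daughter in unit_rules:
            for reach in cl.values():
                if daughter in reach and mother not in reach:
                    reach.add(mother)
                    changed = True
    return cl

print(unit_closure(UNIT_RULES, PRE_TERMINALS)["v"])   # {'v', 'VP', 'S'}
```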
    <Paragraph position="12"> In order to get the full Raise* table needed for step 4b, first we need to project the non-terminal left daughters of rules such as [S NP1,2 VP] down to terminal left daughters. We achieve this by substituting terminal entries from Cl* wherever we can in LRdir, LRindir2 and Rs* to give us Table 3 from Table 2 (new embeddings are underlined).</Paragraph>
    <Paragraph position="13"> Left recursion has one remaining problem for us. Algorithm CR-I only checks for annotations and ND raises in place after a reduction completes a constituent. But in the last line of Rs* above there are unit constituents (ending in n]1,2) with annotations. Being already complete, they will never be completed, and consequently the annotations will never be checked. So we pre-compute the desired result, augmenting the above list with expansions of those units via the indicated left recursions. This gives us the final version of Raise*, now shown with dots included, in Table 4.</Paragraph>
    <Paragraph position="14"> This table is now suited to its role in the algorithm. Every entry has a lexical left daughter, all annotated constituents are incomplete, and all unit productions are factored in. It is interesting to note that with these tree fragments, taken together with the terminal entries in Cl*, as the initial trees, and LRdir, LRindir2, etc. as the auxiliary trees, we have a Tree Adjoining Grammar (Joshi 1985) which is strongly equivalent to the CF-PSG we started with. We might call it the left-lexical TAG for that CF-PSG, after Schabes et al. (1988). Note further that if a TAG parser respected the annotations as restricting adjunction, no spuriously ambiguous parses would be produced.</Paragraph>
    <Paragraph position="15"> Indeed it was via this relationship with TAGs that the details of how the annotations are distributed were worked out; they are not presented here to conserve space.</Paragraph>
    <Paragraph position="16"> II.4 Implementation and Efficiency Only a serial pseudo-parallel implementation has been written.</Paragraph>
    <Paragraph position="17"> Because of the high degree of pre-computation of structure, this version, even though serialised, runs quite efficiently. There is very little computation at each step, as it is straightforward to double-index the Raise* table so that only structures which will compose with the current structure are retrieved.</Paragraph>
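One plausible reading of this double indexing, sketched with toy Raise* entries (the exact scheme used in the implementation is not specified here): entries are keyed by their lexical left corner together with the category they yield, so a lookup touches only structures that can compose with the current one.

```python
from collections import defaultdict

RAISE_STAR = [                       # toy entries, assumed for illustration
    ("d", "NP", "[NP d * n]"),
    ("d", "S", "[S [NP d * n] VP]"),
    ("v", "VP", "[VP v * NP]"),
]

index = defaultdict(list)
for corner, result, structure in RAISE_STAR:
    index[corner, result].append(structure)

def candidates(next_cat, needed_by_current):
    # constant-time lookup: left corner must match the next word's category,
    # and the structure's top must be what the current structure is waiting for
    return index[next_cat, needed_by_current]

print(candidates("d", "NP"))         # ['[NP d * n]']
```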
    <Paragraph position="18"> The price one pays for this efficiency, whether in serial or parallel versions, is that only left-common structure is shared. Right-common structure, as for instance in PP attachment ambiguity, is not shared between analysis paths. This causes no difficulties for the parallel approach in one sense, in that it does not compromise the real-time performance of the parser. Indeed, it is precisely because no recombination is attempted that the basic parsing step is constant time.</Paragraph>
    <Paragraph position="19"> But it does mean that if the CF-PSG being parsed is the first half of a two-step process, in which additional constraints are solved in the second pass, then the duplication of structure will give rise to duplication of effort. Any parallel parser which adopts the strategy of forking at non-deterministic choice points will suffer from this weakness, including CR-II below.</Paragraph>
  </Section>
  <Section position="7" start_page="92" end_page="92" type="metho">
    <SectionTitle>
III. THE SECOND COMPOSE-REDUCE PARSER -- CR-II
</SectionTitle>
    <Paragraph position="0"> Our second approach to compose-reduce parsing differs from the first in retaining a stack and having a more complex basic parsing step, while requiring far less pre-processing of the grammar. In particular, no special treatment is required for left-recursive rules. Nevertheless, the basic step is still constant time, and despite the stack there is no potential processing 'balloon' at the end of the input.</Paragraph>
    <Paragraph position="1"> III.1 The Basic Parsing Algorithm
Algorithm CR-II
1. Shift the next word;
2. ND look up its pre-terminal category;
3. ND close the pre-terminal category under unit productions;
4. ND reduce the resulting category with the top of the stack--if results are complete and there is input remaining, pop the stack;
5. ND a) raise the result wrt the rules for which it is a left corner, and then either b) push it onto the stack, or c) compose it with the top of the stack, replacing it.</Paragraph>
    <Paragraph position="2"> This is not an easy algorithm to understand. In the next section we present a number of different ways of motivating it, together with an illustrative example.</Paragraph>
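The following runnable sketch treats CR-II as non-deterministic search over stacks of flat dotted items (mother, categories still needed). The flat encoding, the toy grammar and the lexicon are simplifications for illustration; real stack entries are trees.

```python
GRAMMAR = [("S", ("NP", "VP")), ("NP", ("d", "n")), ("VP", ("v", "NP"))]
LEXICON = {"the": "d", "child": "n", "likes": "v", "dog": "n"}

def raises(cat):
    # step 5a: raise cat wrt each rule for which it is the left corner
    return [(m, rhs[1:]) for m, rhs in GRAMMAR if rhs[0] == cat]

def nd_step(stack, cat, more_input):
    out, items = [], []
    # step 4: reduce the new category with the top of the stack
    if stack and stack[-1][1][:1] == (cat,):
        m, needed = stack[-1]
        if len(needed) == 1 and more_input:
            items.append((m, stack[:-1]))          # complete: pop, feed step 5
        else:
            out.append(stack[:-1] + ((m, needed[1:]),))
    items.append((cat, stack))                     # the pre-terminal itself
    # step 5: ND raise (5a), then push (5b) or compose with the top (5c)
    for c, base in items:
        for mother, rest in raises(c):
            out.append(base + ((mother, rest),))   # 5b: push
            if base and base[-1][1][:1] == (mother,):
                m, needed = base[-1]
                out.append(base[:-1] + ((m, rest + needed[1:]),))  # 5c
    return out

def parse(words):
    stacks = [()]
    for i, w in enumerate(words):
        more = i + 1 < len(words)
        stacks = [s2 for s in stacks for s2 in nd_step(s, LEXICON[w], more)]
    return [s for s in stacks if s == (("S", ()),)]

print(parse("the child likes the dog".split()))   # one surviving analysis
```

Tracing this on the example sentence reproduces the step sequences described below: the left-branching opening is raise and push, the verb and its complement proceed by raise and compose, and the last word is a plain reduce.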
  </Section>
  <Section position="8" start_page="92" end_page="93" type="metho">
    <SectionTitle>
III.2 CR-II Explained
</SectionTitle>
    <Paragraph position="0"> Let us first consider how CR-II will operate on purely left-branching and purely right-branching structures. In each case we will consider the sequence of algorithm steps along the non-deterministically correct path, ignoring the others. We will also restrict ourselves to considering binary branching rules, as pre-terminal unit productions are handled entirely by step 3 of the algorithm, and non-terminal unit productions must be factored into the grammar. On the other hand, interior daughters of non-binary nodes are all handled by step 4 without changing the depth of the stack.</Paragraph>
    <Paragraph position="1"> III.2.1 Left-branching analysis For a purely left-branching structure, the first word will be processed by steps 1, 2, 5a and 5b, producing a stack with one entry which we can schematise as in Figure 1, where filled circles are processed nodes and unfilled ones are waiting.</Paragraph>
    <Paragraph position="2">  All subsequent words except the last will be processed by steps 4, 5a and 5b (here and subsequently we will not mention steps 1 and 2, which occur for all words), effectively replacing the previous sole entry in the stack with the one given in Figure 2.</Paragraph>
    <Paragraph position="2">  It should be evident that the cycle of steps 4, 5a and 5b constructs a left-branching structure of increasing depth as the sole stack entry, with one right daughter, of the top node, waiting to be filled. The last input word of course is simply processed by step 4 and, as there is no further input, left on the stack as the final result. The complete sequence of steps for any left-branching analysis is thus raise--[reduce&amp;raise]*--reduce. An ordinary shift-reduce or left-corner parser would go through the same sequence of steps.</Paragraph>
  </Section>
  <Section position="9" start_page="93" end_page="95" type="metho">
    <SectionTitle>
III.2.2 Right-branching analysis
</SectionTitle>
    <Paragraph position="0"> The first word of a purely right-branching structure is analysed exactly as for a left-branching one, that is, via steps 1, 2, 5a and 5b, as in Figure 3. Subsequent words, except the last, are processed via steps 5a and 5c, with the result remaining as the sole stack entry, as in Figure 4.</Paragraph>
    <Paragraph position="1">  Again it should be evident that cycling steps 5a and 5c will construct a right-branching structure of increasing depth as the sole stack entry, with one right daughter, of the most embedded node, waiting to be filled. Again, the last input word will be processed by step 4. The complete sequence of steps for any right-branching analysis is thus raise--[raise&amp;compose]*--reduce. A categorial grammar parser with a compose-first strategy would go through an isomorphic sequence of steps.</Paragraph>
    <Paragraph position="2"> III.2.3 Mixed Left- and Right-branching Analysis All the steps in algorithm CR-II have now been illustrated, but we have yet to see the stack grow beyond one entry. This will occur where an individual word, as opposed to a completed complex constituent, is processed by steps 5a and 5b, that is, where steps 5a and 5b apply other than to the results of step 4.</Paragraph>
    <Paragraph position="3"> Consider for instance the sentence &amp;quot;the child believes that the dog likes biscuits.&amp;quot; With a grammar which I trust will be obvious, we would arrive at the structure shown in Figure 5 after processing &amp;quot;the child believes that&amp;quot;, having done raise--reduce&amp;raise--raise&amp;compose--raise&amp;compose, that is, a bit of left-branching analysis, followed by a bit of right-branching analysis.</Paragraph>
    <Paragraph position="4">  Nothing can now be done with &amp;quot;the&amp;quot; which will allow immediate integration with this structure. The ND correct path applies steps 5a and 5b, raise&amp;push, giving a stack as shown in Figure 6:  We can then apply steps 4, 5a and 5c, reduce&amp;raise&amp;compose, to &amp;quot;dog&amp;quot;, with the result shown in Figure 7.</Paragraph>
    <Paragraph position="5"> This puts us back on the standard right-branching path for the rest of the sentence.
III.3 CR-II and Chart Parsing Returning to a question raised earlier, we can now see how a chart parser could be modified in order to run in real-time given enough processors to empty the agenda as fast as it is filled. We can reproduce the processing of CR-II within the active chart parsing framework by two modifications to the fundamental rule (see e.g. Gazdar and Mellish 1989 or Thompson and Ritchie 1984 for a tutorial introduction to active chart parsing). First we restrict its normal operation, in which an active and an inactive edge are combined, to apply only in the case of pre-terminal inactive edges. This corresponds to the fact that in CR-II step 4, the reduction step, applies only to pre-terminal categories (continuing to ignore unit productions). Secondly we allow the fundamental rule to combine two active edges, provided the category to be produced by one is what is required by the other. This effects composition. If we now run our chart parser left-to-right, left-corner and breadth-first, it will duplicate CR-II. The maximum number of edges along a given analysis path which can be introduced by the processing of a single word is now at most four, corresponding to steps 2, 4, 5a and 5c of CR-II--the pre-terminal itself, a constituent completed by it, an active edge containing that constituent as left daughter, created by left-corner rule invocation, and a further active edge combining that one with one to its left. This in turn means that there is a fixed limit to the amount of processing required for each word.</Paragraph>
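A sketch of the two modified combination rules, under an assumed edge representation:

```python
from typing import NamedTuple, Tuple

class Edge(NamedTuple):
    start: int
    end: int
    mother: str
    needed: Tuple[str, ...]   # empty tuple = inactive (complete) edge

def fundamental_preterminal(active: Edge, preterm: Edge):
    # restricted classical rule: only pre-terminal inactive edges combine
    if active.end == preterm.start and preterm.needed == () \
            and active.needed[:1] == (preterm.mother,):
        return Edge(active.start, preterm.end, active.mother, active.needed[1:])

def fundamental_active(left: Edge, right: Edge):
    # composition: the right edge will produce what the left edge requires
    if left.end == right.start and left.needed[:1] == (right.mother,):
        return Edge(left.start, right.end, left.mother,
                    right.needed + left.needed[1:])

vp_active = Edge(2, 2, "VP", ("v", "NP"))     # [VP * v NP], just invoked
likes = Edge(2, 3, "v", ())                   # pre-terminal inactive edge
vp_edge = fundamental_preterminal(vp_active, likes)
print(vp_edge)                                # Edge(2, 3, 'VP', ('NP',))
s_edge = Edge(0, 2, "S", ("VP",))             # [S NP * VP] over "the child"
print(fundamental_active(s_edge, vp_edge))    # Edge(0, 3, 'S', ('NP',))
```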
    <Paragraph position="6"> III.4 Implementation and Efficiency Although clearly not benefiting from as much pre-computation of structure as CR-I, CR-II is also quite efficient. Two modifications can be added to improve efficiency--a reachability filter on step 5b, and a shaper test (Kuno 1965), also on 5b. For the latter, we need simply keep a count of the number of open nodes on the stack (equal to the number of stack entries if all rules are binary), and ensure that this number never exceeds the number of words remaining in the input, as each entry will require a number of words equal to the number of its open nodes to pop it off the stack. This test actually cuts down the number of non-deterministic paths quite dramatically, as the ND optionality of step 5b means that quite deep stacks would otherwise be pursued along some search paths. Again this reduction in search space is of limited significance in a true parallel implementation, but in the serial simulation it makes a big difference.</Paragraph>
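A sketch of the shaper test under the same flat stack encoding as above (an assumption): a path survives only if the remaining input could plausibly close all of its open nodes.

```python
def passes_shaper_test(stack, words_remaining):
    # each open node still needs at least one word to pop its entry
    open_nodes = sum(len(needed) for _, needed in stack)
    return open_nodes <= words_remaining

stack = [("S", ("VP",)), ("NP", ("n",))]   # two entries, two open nodes
print(passes_shaper_test(stack, 5))        # True: enough input remains
print(passes_shaper_test(stack, 1))        # False: prune this ND path
```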
    <Paragraph position="7"> Note also that no attention has been paid to unit productions, which we pre-compute as in CR-I. Furthermore, neither CR-I nor CR-II addresses empty productions, whose effect would also need to be pre-computed.</Paragraph>
  </Section>
</Paper>