<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1061">
  <Title>A structure-sharing parser for lexicalized grammars</Title>
  <Section position="3" start_page="0" end_page="372" type="metho">
    <SectionTitle>
2 Automaton-based parsing
</SectionTitle>
    <Paragraph position="0"> Conventional LTAG parsers (Vijay-Shanker and Joshi, 1985; Schabes and Joshi, 1988; Vijay-Shanker and Weir, 1993) maintain a parse table, a set of items corresponding to complete and partial constituents. Parsing proceeds by first seeding the table with items anchored on the input string, and then repeatedly scanning the table for parser actions. Parser actions introduce new items into the table licensed by one or more items already in the table. The main types of parser actions are: 1. extending a constituent by incorporating a complete subconstituent (on the left or right); 2. extending a constituent by adjoining a surrounding complete auxiliary constituent; 3. predicting the span of the foot node of an auxiliary constituent (to the left or right). Parsing is complete when all possible parser actions have been executed. [Footnote 1: However, due to lack of space, no proofs and only minimal informal descriptions are given in this paper.]</Paragraph>
    <Paragraph position="1"> In a completed parse table it is possible to trace the sequence of items corresponding to the recognition of an elementary tree from its lexical anchor upwards. Each item in the sequence corresponds to a node in the tree (with the sequence as a whole corresponding to a complete traversal of the tree), and each step corresponds to the parser action that licensed the next item, given the current one. From this perspective, parser actions can be restated relative to the items in such a sequence as:  1. substitute a complete subconstituent (on the left or right); 2. adjoin a surrounding complete auxiliary constituent; 3. predict the span of the tree's foot node (to  the left or right).</Paragraph>
    <Paragraph position="2"> The recognition of the tree can thus be viewed as the computation of a finite state automaton, whose states correspond to a traversal of the tree and whose input symbols are these relativised parser actions.</Paragraph>
    <Paragraph position="3"> This perspective suggests a re-casting of the conventional LTAG parser in terms of such automata (footnote 2). For this automaton-based parser, the grammar structures are not trees, but automata corresponding to tree traversals whose inputs are strings of relativised parser actions. Items in the parse table reference automaton states instead of tree addresses, and if the automaton state is final, the item represents a complete constituent. Parser actions arise as before, but are executed by relativising them with respect to the incomplete item participating in the action, and passing this relativised parser action as the next input symbol for the automaton referenced by that item. The resulting state of that automaton is then used as the referent of the newly licensed item.</Paragraph>
    <Paragraph position="4"> On a first pass, this re-casting is exactly that: it does nothing new or different from the original parser on the original grammar. [Footnote 2: Evans and Weir (1997) provides a longer informal introduction to this approach.]</Paragraph>
    <Paragraph position="5"> However, there are a number of subtle differences (footnote 3): * the automata are more abstract than the trees: the only grammatical information they contain is the input symbols and the root node labels, indicating the category of the constituent the automaton recognises; * automata for several trees can be merged together and optimised using standard well-studied techniques, resulting in a single automaton that recognises many trees at once, sharing as many of the common parser actions as possible.</Paragraph>
    <Paragraph position="6"> It is this final point which is the focus of this paper. By representing trees as automata, we can merge trees together and apply standard optimisation techniques to share their common structure. The parser will remain unchanged, but will operate more efficiently where structure has been shared. Additionally, because the automata are more abstract than the trees, capturing precisely the parser's view of the trees, sharing may occur between trees which are structurally quite different, but which happen to have common parser actions associated with them.</Paragraph>
  </Section>
  <Section position="4" start_page="372" end_page="373" type="metho">
    <SectionTitle>
3 Merging and minimising automata
</SectionTitle>
    <Paragraph position="0"> Combining the automata for several trees can be achieved using a variety of standard algorithms (Huffman, 1954; Moore, 1956). However, any transformations must respect one important feature: once the parser reaches a final state it needs to know what tree it has just recognised (footnote 4). When automata for trees with different root categories are merged, the resulting automaton needs to somehow indicate to the parser what trees are associated with its final states.</Paragraph>
    <Paragraph position="1"> [Footnote 3: A further difference is that the traversal encoded in the automaton captures part of the parser's control strategy. However, for simplicity we assume here a fixed parser control strategy (bottom-up, anchor-out) and do not pursue this point further; Evans and Weir (1997) offers some discussion.]</Paragraph>
    <Paragraph position="2"> [Footnote 4: For recognition alone it only needs to know the root category of the tree, but to recover the parse it needs to identify the tree itself.]</Paragraph>
    <Paragraph position="3"> In Evans and Weir (1997), we combined automata by introducing a new initial state with ε-transitions to each of the original initial states, and then determinising the resulting automaton to induce some sharing of structure. To recover trees, final automaton states were annotated with the number of the tree the final state is associated with, which the parser can then readily access.</Paragraph>
    <Paragraph position="4"> However, the drawback of this approach is that differently annotated final states can never be merged, which restricts the scope for structure sharing (minimisation, for example, is not possible since all the final states are distinct). To overcome this, we propose an alternative approach as follows: * each automaton transition is annotated with the set of trees which pass through it: when transitions are merged in automaton optimisation, their annotations are unioned; * the parser maintains for each item in the table the set of trees that are valid for the item: initially this is all the valid trees for the automaton, but gets intersected with the annotation of any transition followed; also if two paths through the automaton meet (i.e., an item is about to be added for a second time), their annotations get unioned.</Paragraph>
    <Paragraph position="5"> This approach supports arbitrary merging of states, including merging all the final states into one. The parser maintains a dynamic record of which trees are valid for states (in particular final states) in the parse table. This means that we can minimise our automata as well as determinising them, and so share more structure (for example, common processing at the end of the recognition process as well as the beginning).</Paragraph>
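The tree-set bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the parser's implementation: the function names `merge_annotations`, `follow` and `meet`, and the tree names `t1` to `t3`, are hypothetical.

```python
from typing import FrozenSet

TreeSet = FrozenSet[str]


def merge_annotations(a: TreeSet, b: TreeSet) -> TreeSet:
    """Annotation of a transition formed by merging two transitions."""
    return a | b


def follow(valid: TreeSet, annotation: TreeSet) -> TreeSet:
    """Trees still valid for an item after following an annotated transition."""
    return valid.intersection(annotation)


def meet(first: TreeSet, second: TreeSet) -> TreeSet:
    """Valid trees when two paths through the automaton reach the same item."""
    return first | second


# Two transitions, annotated {t1} and {t2}, are merged during optimisation.
shared = merge_annotations(frozenset({"t1"}), frozenset({"t2"}))

# An item starts with every tree of the automaton, follows the shared
# transition, and then follows a transition that only t2 passes through.
item = follow(frozenset({"t1", "t2", "t3"}), shared)
item = follow(item, frozenset({"t2"}))
```

After the second step only `t2` remains valid for the item, even though the first transition was shared with `t1`.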
  </Section>
  <Section position="5" start_page="373" end_page="373" type="metho">
    <SectionTitle>
4 Recognition and parse recovery
</SectionTitle>
    <Paragraph position="0"> We noted above that a parsing algorithm needs to be able to access the tree that an automaton has recognised. The algorithm we describe below actually needs rather more information than this, because it uses a two-phase recognition/parse-recovery approach.</Paragraph>
    <Paragraph position="1"> The recognition phase only needs to know, for each complete item, what the root label of the tree recognised is. This can be recovered from the 'valid tree' annotation of the complete item itself (there may be more than one valid tree, corresponding to a phrase which has more than one parse which happen to have been merged together). Parse recovery, however, involves running the recogniser 'backwards' over the completed parse table, identifying for each item, the items and actions which licensed it.</Paragraph>
    <Paragraph position="2"> A complication arises because the automata, especially the merged automata, do not directly correspond to tree structure. The recogniser returns the tree recognised, and a search of the parse table reveals the parser action which completed its recognition, but that information in itself may not be enough to locate exactly where in the tree the action took place. However, the additional information required is static, and so can be pre-compiled as the automata themselves are built up. For each action transition (the action, plus the start and finish states) we record the tree address that the transition reaches (we call this the action-site, or just a-site for short). During parse recovery, when the parse table indicates an action that licensed an item, we look up the relevant transition to discover where in the tree (or trees, if we are traversing several simultaneously) the present item must be, so that we can correctly construct a derivation tree.</Paragraph>
  </Section>
  <Section position="6" start_page="373" end_page="376" type="metho">
    <SectionTitle>
5 Technical details
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="373" end_page="374" type="sub_section">
      <SectionTitle>
5.1 Constructing the automata
</SectionTitle>
      <Paragraph position="0"> We identify each node in an elementary tree γ with an elementary address γ/i. The root of γ has the address γ/ε where ε is the empty string. Given a node γ/i, its n children are addressed from left to right with the addresses γ/i1, ..., γ/in, respectively. For convenience, let anchor(γ) and foot(γ) denote the elementary address of the node that is the anchor and foot node (if it has one) of γ, respectively; and label(γ/i) and parent(γ/i) denote the label of γ/i and the address of the parent of γ/i, respectively. In this paper we make the following assumptions about elementary trees. Each tree has a single anchor node and therefore a single spine (footnote 5). In the algorithms below we assume that nodes not on the spine have no children. In practice, not all elementary LTAG trees meet these conditions, and we discuss how the approach described here might be extended to the more general case in Section 6.</Paragraph>
      <Paragraph position="1"> Let γ/i be an elementary address of a node on the spine of γ with n children γ/i1, ..., γ/ik, ..., γ/in for n &gt; 1, where k is such that γ/ik dominates anchor(γ).</Paragraph>
      <Paragraph position="3"> next defines a function that traverses a spine, starting at the anchor. Traversal of an elementary tree during recognition yields a sequence of parser actions over node labels A, of five kinds: two actions indicate a substitution of a tree rooted with A to the left or right, respectively; two actions indicate the presence of the foot node, a node labelled A, to the left or right, respectively; finally, one action indicates an adjunction of a tree with root and foot labelled A. These actions constitute the input language of the automaton that traverses the tree. This automaton is defined as follows (note that we use ε-transitions between nodes to ease the construction; we assume these are removed using a standard algorithm).</Paragraph>
      <Paragraph position="4"> Let γ be an elementary tree with terminal and nonterminal alphabets V_T and V_N, respectively.</Paragraph>
      <Paragraph position="5"> Each state of the following automaton specifies the elementary address γ/i being visited. When the node is first visited we use the state ⊥[γ/i]; when ready to move on we use the state ⊤[γ/i].</Paragraph>
      <Paragraph position="6"> Define as follows the finite state automaton M = (Q, Σ, ⊥[anchor(γ)], δ, F). Q is the set of states, Σ is the input alphabet, q0 = ⊥[anchor(γ)] is the initial state, δ is the transition relation, and F is the set of final states.</Paragraph>
      <Paragraph position="8"/>
      <Paragraph position="10"> In order to recover derivation trees, we also define the partial function a-site(q, a, q') for (q, a, q') ∈ δ which provides information about the site within the elementary tree of actions occurring in the automaton.</Paragraph>
      <Paragraph position="11"> a-site(q, a, q') = γ/i if a ≠ ε and q' = ⊤[γ/i]; undefined otherwise.</Paragraph>
    </Section>
    <Section position="2" start_page="374" end_page="375" type="sub_section">
      <SectionTitle>
5.2 Combining Automata
</SectionTitle>
      <Paragraph position="0"> Suppose we have a set of trees Γ = {γ1, ..., γn}. Let M_γ1, ..., M_γn be the ε-free automata that are built from members of the set Γ using the above construction, where for 1 ≤ k ≤ n, M_γk = (Qk, Σk, qk, δk, Fk).</Paragraph>
      <Paragraph position="1"> Construction of a single automaton for Γ is a two step process. First we build an automaton that accepts all elementary computations for trees in Γ; then we apply the standard automaton determinization and minimization algorithms to produce an equivalent, compact automaton. The first step is achieved simply by introducing a new initial state with ε-transitions to each of the qk: Let M = (Q, Σ, q0, δ, F) where</Paragraph>
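One way to write out the components of M that is consistent with the prose description above is sketched here. It assumes that the state sets Q_k are pairwise disjoint and that q_0 is a fresh state not in any Q_k; both are assumptions of this sketch rather than statements taken from the text.

```latex
\begin{gather*}
Q = \{q_0\} \cup \bigcup_{1 \le k \le n} Q_k, \qquad
\Sigma = \bigcup_{1 \le k \le n} \Sigma_k, \qquad
F = \bigcup_{1 \le k \le n} F_k, \\
\delta = \{(q_0, \epsilon, q_k) \mid 1 \le k \le n\} \cup \bigcup_{1 \le k \le n} \delta_k
\end{gather*}
```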
      <Paragraph position="3"> We determinize and then minimize M using the standard set-of-states constructions to produce M_Γ = (Q', Σ, Q0, δ', F'). Whenever two states are merged in either the determinizing or minimizing algorithms the resulting state is named by the union of the states from which it is formed.</Paragraph>
      <Paragraph position="4"> For each transition (Q1, a, Q2) ∈ δ' we define the function a-sites(Q1, a, Q2) to be a set of elementary nodes as follows: a-sites(Q1, a, Q2) = ⋃ a-site(q1, a, q2), where the union ranges over all q1 ∈ Q1 and q2 ∈ Q2 for which a-site(q1, a, q2) is defined. Given a transition in M_Γ, this function returns all the nodes in all merged trees which that transition reaches.</Paragraph>
      <Paragraph position="5"> Finally, we define: cross(Q1, a, Q2) = { γ | γ/i ∈ a-sites(Q1, a, Q2) }. This gives that subset of those trees whose elementary computations take M_Γ through state Q1 to Q2. These are the transition annotations referred to above, used to constrain the parser's set of valid trees.</Paragraph>
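The two definitions above can be illustrated with a small Python sketch. The representation is an assumption made for the example: merged states are frozensets of original state names, the partial function a-site is a dictionary, elementary addresses are (tree name, address) pairs, and the state names, action label and tree names are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

Address = Tuple[str, str]            # (tree name, node address)
Transition = Tuple[str, str, str]    # (state, action, state) before merging


def a_sites(a_site: Dict[Transition, Address], q1s: FrozenSet[str],
            action: str, q2s: FrozenSet[str]) -> Set[Address]:
    """Union of a-site(q1, action, q2) over the members of two merged states."""
    sites = set()
    for q1 in q1s:
        for q2 in q2s:
            site = a_site.get((q1, action, q2))
            if site is not None:     # a-site is a partial function
                sites.add(site)
    return sites


def cross(a_site: Dict[Transition, Address], q1s: FrozenSet[str],
          action: str, q2s: FrozenSet[str]) -> Set[str]:
    """Names of the trees whose computations use the merged transition."""
    return {tree for tree, _ in a_sites(a_site, q1s, action, q2s)}


# Two trees, t1 and t2, each contribute one transition on the same action.
A_SITE = {
    ("p0", "subst-left-NP", "p1"): ("t1", "2"),
    ("r0", "subst-left-NP", "r1"): ("t2", "1"),
}
Q1 = frozenset({"p0", "r0"})
Q2 = frozenset({"p1", "r1"})
sites = a_sites(A_SITE, Q1, "subst-left-NP", Q2)
trees = cross(A_SITE, Q1, "subst-left-NP", Q2)
```

Here `sites` holds one address per tree and `trees` holds both tree names, which is the annotation the parser would intersect with an item's valid-tree set.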
    </Section>
    <Section position="3" start_page="375" end_page="376" type="sub_section">
      <SectionTitle>
5.3 The Recognition Phase
</SectionTitle>
      <Paragraph position="0"> This section illustrates a simple bottom-up parsing algorithm that makes use of minimized automata produced from sets of trees that anchor the same input symbol.</Paragraph>
      <Paragraph position="1"> The input to the parser takes the form of a sequence of minimized automata, one for each of the symbols in the input. Let the input string be w = a1 ... an and the associated automata be M1, ..., Mn where Mk = (Qk, Σk, qk, δk, Fk) for 1 ≤ k ≤ n. Let treesof(Mk) = Γk where Γk is the set of the names of those elementary trees that were used to construct the automaton Mk.</Paragraph>
      <Paragraph position="2"> During the recognition phase of the algorithm, a set I of items is created. An item has the form (T, q, [l, r, l', r']) where T is a set of elementary tree names, q is an automaton state and l, r, l', r' ∈ {0, ..., n, -} such that either l ≤ l' ≤ r' ≤ r or l &lt; r and l' = r' = -. The indices l, l', r', r are positions between input symbols (position 0 is before the first input symbol and position n is after the final input symbol) and we use w[p, p'] to denote that substring of the input w between positions p and p'. I can be viewed as a four-dimensional array, each entry of which contains a set of pairs, each comprising a set of elementary tree names and an automaton state.</Paragraph>
      <Paragraph position="3"> Roughly speaking, an item (T, q, [l, r, l', r']) is included in I when, for every γ ∈ T, anchored by some ak (where l &lt; k ≤ r and if l' ≠ - then k ≤ l' or r' &lt; k): q is a state in Qk, such that some elementary subcomputation reaching q from the initial state, qk, of Mk is an initial substring of the elementary computation for γ that reaches the elementary address γ/i; the subtree rooted at γ/i spans w[l, r]; and if γ/i dominates a foot node then that foot node spans w[l', r'], otherwise l' = r' = -.</Paragraph>
      <Paragraph position="4"> The input is accepted if an item (T, qf, [0, n, -, -]) is added to I where T contains some initial tree rooted in the start symbol S and qf ∈ Fk for some k.</Paragraph>
      <Paragraph position="5"> When adding items to I we use the procedure add(T, q, [l, r, l', r']) which is defined such that if there is already an entry (T', q, [l, r, l', r']) ∈ I for some T' then replace this with the entry (T ∪ T', q, [l, r, l', r']) (footnote 6); otherwise add the new entry (T, q, [l, r, l', r']) to I.</Paragraph>
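A minimal Python sketch of the add procedure follows. The table layout is an assumption of the sketch: I is modelled as a dictionary keyed by (state, span), None stands for the "-" index, and the boolean return value is a hypothetical convenience that lets a caller re-process an entry whose tree set has grown.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Index = Optional[int]                       # None plays the role of "-"
Span = Tuple[int, int, Index, Index]        # (l, r, l', r')
Key = Tuple[str, Span]                      # (automaton state, span)
Table = Dict[Key, FrozenSet[str]]


def add(table: Table, trees: FrozenSet[str], state: str, span: Span) -> bool:
    """Insert an item, unioning tree sets if the (state, span) entry exists.

    Returns True when the table changed, so that the changed entry can be
    treated as a new entry by the control strategy.
    """
    key = (state, span)
    old = table.get(key, frozenset())
    new = old | trees
    if key in table and new == old:
        return False                        # nothing new to record
    table[key] = new
    return True


# Two paths reach the same state over the same span with different trees.
items: Table = {}
first = add(items, frozenset({"t1"}), "q", (0, 1, None, None))
second = add(items, frozenset({"t2"}), "q", (0, 1, None, None))
third = add(items, frozenset({"t1"}), "q", (0, 1, None, None))
```

The third call leaves the table unchanged because `t1` is already recorded for that entry.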
      <Paragraph position="6"> I is initialized as follows. For each k ∈ {1, ..., n} call add(T, qk, [k-1, k, -, -]) where T = treesof(Mk) and qk is the initial state of the automaton Mk.</Paragraph>
      <Paragraph position="7"> We now present the rules with which the complete set I is built. These rules correspond closely to the familiar steps in existing bottom-up LTAG parsers; in particular, the way that we use the four indices is exactly the same as in other approaches (Vijay-Shanker and Joshi, 1985). As a result, a standard control strategy can be used to control the order in which these rules are applied to existing entries of I.</Paragraph>
      <Paragraph position="8"> 1. If (T, q, [l, r, l', r']), (T', qf, [r, r'', -, -]) ∈ I, qf ∈ Fk for some k, (q, a, q') ∈ δk' for some k', where a is the action substituting A on the right, label(γ'/ε) = A for some γ' ∈ T' and T'' = T ∩ cross(q, a, q'), then call add(T'', q', [l, r'', l', r']).</Paragraph>
      <Paragraph position="9"> 2. If (T, q, [l, r, l', r']), (T', qf, [l'', l, -, -]) ∈ I, qf ∈ Fk for some k, (q, a, q') ∈ δk' for some k', where a is the action substituting A on the left, label(γ'/ε) = A for some γ' ∈ T' and T'' = T ∩ cross(q, a, q'), then call add(T'', q', [l'', r, l', r']).</Paragraph>
      <Paragraph position="10"> 3. If (T, q, [l, r, -, -]) ∈ I, (q, a, q') ∈ δk for some k, where a is the action for a foot node labelled A on the right, and T' = T ∩ cross(q, a, q'), then for each r' such that r &lt; r' ≤ n call add(T', q', [l, r', r, r']).</Paragraph>
      <Paragraph position="11"> 4. If (T, q, [l, r, -, -]) ∈ I, (q, a, q') ∈ δk for some k, where a is the action for a foot node labelled A on the left, and T' = T ∩ cross(q, a, q'), then for each l' such that 0 ≤ l' &lt; l call add(T', q', [l', r, l', l]).</Paragraph>
      <Paragraph position="12"> 5. If (T, q, [l, r, l', r']), (T', qf, [l'', r'', l, r]) ∈ I, qf ∈ Fk for some k, (q, a, q') ∈ δk' for some k', where a is the action for an adjunction at A, label(γ'/ε) = A for some γ' ∈ T' and T'' = T ∩ cross(q, a, q'), then call add(T'', q', [l'', r'', l', r']). [Footnote 6: This replacement is treated as a new entry in the table. If the old entry has already licensed other entries, this may result in some duplicate processing. This could be eliminated by a more sophisticated treatment of tree sets.]</Paragraph>
      <Paragraph position="13"> The running time of this algorithm is O(n^6) since the last rule must be embedded within six loops each of which varies with n. Note that although the third and fourth rules both take O(n) steps, they need only be embedded within the l and r loops.</Paragraph>
    </Section>
    <Section position="4" start_page="376" end_page="376" type="sub_section">
      <SectionTitle>
5.4 Recovering Parse Trees
</SectionTitle>
      <Paragraph position="0"> Once the set of items I has been completed, the final task of the parser is to recover a derivation tree (footnote 7). This involves retracing the steps of the recognition process in reverse. At each point, we look for a rule that would have caused the inclusion of the item in I. Each of these rules involves some transition (q, a, q') ∈ δk for some k where a is one of the parser actions, and from this transition we consult the set of elementary addresses in a-sites(q, a, q') to establish how to build the derivation tree. We eventually reach items added during the initialization phase and the process ends. Given the way our parser has been designed, some search will be needed to find the items we need. As usual, the need for such search can be reduced through the inclusion of pointers in items, though this is at the cost of increasing parsing time. There are various points in the following description where nondeterminism exists. By exploring all possible paths, it would be straightforward to produce an AND/OR derivation tree that encodes all derivation trees for the input string.</Paragraph>
      <Paragraph position="1"> We use the procedure der((T, q, [l, r, l', r']), τ) which completes the partial derivation tree τ by backing up through the moves of the automaton in which q is a state.</Paragraph>
      <Paragraph position="2"> A derivation tree for the input is returned by the call der((T, qf, [0, n, -, -]), τ) where (T, qf, [0, n, -, -]) ∈ I such that T contains some initial tree γ rooted with the start non-terminal S and qf is the final state of some automaton Mk, 1 ≤ k ≤ n. τ is a derivation tree containing just one node labelled with name γ. In general, on a call to der((T, q, [l, r, l', r']), τ) we examine I to find a rule that has caused this item to be included in I. There are six rules to consider, corresponding to the five recogniser rules, plus lexical introduction, as follows. [Footnote 7: Derivation trees are labelled with tree names and edges are labelled with tree addresses.]</Paragraph>
      <Paragraph position="3"> 1. If (T', q', [l, r'', l', r']), (T'', qf, [r'', r, -, -]) ∈ I, qf ∈ Fk for some k, (q', a, q) ∈ δk' for some k', where a is the action substituting A on the right, γ is the label of the root of τ, γ ∈ T', label(γ'/ε) = A for some γ' ∈ T'' and γ/i ∈ a-sites(q', a, q), then let τ' be the derivation tree containing a single node labelled γ', and let τ'' be the result of attaching der((T'', qf, [r'', r, -, -]), τ') under the root of τ with an edge labelled the tree address i. We then complete the derivation tree by calling der((T', q', [l, r'', l', r']), τ''). 2. If (T', q', [r'', r, l', r']), (T'', qf, [l, r'', -, -]) ∈ I, qf ∈ Fk for some k, (q', a, q) ∈ δk' for some k', where a is the action substituting A on the left, γ is the label of the root of τ, γ ∈ T', label(γ'/ε) = A for some γ' ∈ T'' and γ/i ∈ a-sites(q', a, q), then let τ' be the derivation tree containing a single node labelled γ', and let τ'' be the result of attaching der((T'', qf, [l, r'', -, -]), τ') under the root of τ with an edge labelled the tree address i. We then complete the derivation tree by calling der((T', q', [r'', r, l', r']), τ''). 3. If r = r', (T', q', [l, l', -, -]) ∈ I and (q', a, q) ∈ δk for some k, where a is the action for a foot node labelled A on the right, γ is the label of the root of τ, γ ∈ T' and foot(γ) ∈ a-sites(q', a, q), then make the call der((T', q', [l, l', -, -]), τ).</Paragraph>
      <Paragraph position="4"> 4. If l = l', (T', q', [r', r, -, -]) ∈ I and (q', a, q) ∈ δk for some k, where a is the action for a foot node labelled A on the left, γ is the label of the root of τ, γ ∈ T' and foot(γ) ∈ a-sites(q', a, q), then make the call der((T', q', [r', r, -, -]), τ).</Paragraph>
      <Paragraph position="5"> 5. If (T', q', [l'', r'', l', r']), (T'', qf, [l, r, l'', r'']) ∈ I, qf ∈ Fk for some k, (q', a, q) ∈ δk' for some k', where a is the action for an adjunction at A, γ is the label of the root of τ, γ ∈ T', label(γ'/ε) = A for some γ' ∈ T'' and γ/i ∈ a-sites(q', a, q), then let τ' be the derivation tree containing a single node labelled γ', and let τ'' be the result of attaching der((T'', qf, [l, r, l'', r'']), τ') under the root of τ with an edge labelled the tree address i. We then complete the derivation tree by calling der((T', q', [l'', r'', l', r']), τ'').</Paragraph>
      <Paragraph position="6"> 6. If l + 1 = r, r' = l' = -, q is the initial state of M_r, γ is the label of the root of τ and γ ∈ T, then return the final derivation tree τ.</Paragraph>
    </Section>
  </Section>
</Paper>