<?xml version="1.0" standalone="yes"?>
<Paper uid="E95-1011">
  <Title>A Tractable Extension of Linear Indexed Grammars</Title>
  <Section position="4" start_page="0" end_page="77" type="metho">
    <SectionTitle>
2 From Stacks to Trees
</SectionTitle>
    <Paragraph position="0"> An Indexed Grammar (IG) can be viewed as a CFG in which each nonterminal is associated with a stack of indices. Productions specify not only how nonterminals can be rewritten but also how their associated stacks are modified. LIG, which were first described by Gazdar (1988), are constrained such that stacks are passed from the mother to at most a single daughter.</Paragraph>
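    <Paragraph> Purely as an illustration of this stack-passing constraint (the representation and names below are ours and not part of the formalism), a LIG object can be modelled in Python as a pair of a nonterminal and a stack, with a production that inspects and replaces only a bounded top portion of the stack and passes the unbounded remainder to exactly one daughter:
      def applies(obj, prod):
          """Does the production match the mother object (nonterminal, stack)?"""
          nonterm, stack = obj
          pop = tuple(prod["pop"])
          return nonterm == prod["mother"] and tuple(stack[len(stack) - len(pop):]) == pop

      def expand(obj, prod):
          """Rewrite a (nonterminal, stack) object into its list of daughter objects."""
          if not applies(obj, prod):
              return None
          _, stack = obj
          rest = tuple(stack[: len(stack) - len(prod["pop"])])       # unbounded remainder
          daughters = []
          for i, cat in enumerate(prod["daughters"]):
              if i == prod["stack_child"]:
                  daughters.append((cat, rest + tuple(prod["push"])))   # stack goes to one daughter
              else:
                  daughters.append((cat, ()))                           # the others get bounded (here empty) stacks
          return daughters

      # Hypothetical production A[..s] -> a B[..t]: pop s, push t, pass the remainder to daughter 1.
      prod = {"mother": "A", "pop": ("s",), "daughters": ("a", "B"), "stack_child": 1, "push": ("t",)}
      print(expand(("A", ("s0", "s")), prod))   # [('a', ()), ('B', ('s0', 't'))]
    </Paragraph>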
    <Paragraph position="1"> For LIG, the size of the domain of nonterminals and associated stacks (the analogue of the nonterminals in CFG) is not bounded by the grammar. However, Vijay-Shanker and Weir (1993) demonstrate that polynomial time performance can be achieved through the use of structure-sharing made possible by constraints in the way that LIG use stacks. Although stacks of unbounded size can arise during a derivation, it is not possible for a LIG to specify that two dependent, unbounded stacks must appear at distinct places in the derivation tree. Structure-sharing can therefore be used effectively because checking the applicability of rules at each step in the derivation involves the comparison of structures of limited size.</Paragraph>
    <Paragraph position="2"> Our goal is to generalize the constraints inherent in LIG to a formalism that manipulates feature structures rather than stacks. As a guiding heuristic we will avoid formalisms that generate tree sets with an unbounded number of unbounded, dependent branches. It appears that the structure-sharing techniques used with LIG cannot be generalized in a straightforward way to such formalisms.</Paragraph>
    <Paragraph position="3"> Suppose that we generalize LIG to allow the stack to be passed from the mother to two daughters. If this is done, recursion can be used to produce an unbounded number of unbounded, dependent branches. An alternative is to allow an unbounded stack to be shared between two (or more) daughters but not with the mother. Thus, rules may mention more than one unbounded stack, but the stack associated with the mother is still associated with at most one daughter. We refer to this extension as Partially Linear Indexed Grammars (PLIG).</Paragraph>
    <Paragraph position="4"> Example 1 The PLIG with the following productions generates the language { a^n b^m c^n d^m | n, m ≥ 1 } and the tree set shown in Figure 1. Because a single PLIG production may mention more than one unbounded stack, variables (x, y) are introduced to distinguish between them. The notation A\[xσ\] is used to denote the nonterminal A associated with any stack whose top symbol is σ.</Paragraph>
    <Paragraph position="6"> Ak\[xσ\] → ak Ak\[x\], Ak\[\] → ε.</Paragraph>
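    <Paragraph> The string language of Example 1 is easy to check directly; the following minimal Python sketch (ours, for illustration only) tests membership in { a^n b^m c^n d^m | n, m ≥ 1 }:
      import re

      def in_example1_language(s):
          """True iff s = a^n b^m c^n d^m with n, m at least 1."""
          m = re.fullmatch(r"(a+)(b+)(c+)(d+)", s)
          return bool(m) and len(m.group(1)) == len(m.group(3)) and len(m.group(2)) == len(m.group(4))

      assert in_example1_language("aabbbccddd")        # n = 2, m = 3
      assert not in_example1_language("aabcccd")       # counts do not match
    </Paragraph>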
    <Paragraph position="7"> In PLIG, stacks shared amongst siblings cannot be passed to the mother. As a consequence, there is no possibility that recursion can be used to increase the number of dependent branches. In fact, the number of dependent branches is bounded by the length of the right-hand-side of productions.</Paragraph>
    <Paragraph position="8"> By the same token, however, PLIG may only generate structural descriptions in which no branch is dependent on more than one of the branches originating at any one of its sibling nodes. Note that the tree shown in Figure 2 is unobtainable because the branch rooted at η1 is dependent on more than one of the branches originating at its sibling η2.</Paragraph>
    <Paragraph position="9"> This limitation can be overcome by moving to a formalism that manipulates trees rather than stacks. We consider an extension of CFG in which each nonterminal A is associated with a tree τ.</Paragraph>
    <Paragraph position="10"> Productions now specify how the tree associated with the mother is related to the trees associated with the daughters. We denote trees with first order terms. For example, the following production requires that the x and y subtrees of the mother's tree are shared with the B and C daughters, respectively. In addition, the daughters have in common the subtree z.</Paragraph>
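    <Paragraph> Since the production itself is elided in this copy, the following Python sketch (ours; the labels f, g and h are purely hypothetical) shows a production of the shape just described, with trees written as nested tuples: the mother's x and y subtrees go to the B and C daughters respectively, while a subtree z is shared between the daughters only.
      # Trees as first-order terms: ("label", child, ...); x, y, z stand for subtrees.
      # Hypothetical production of the described shape:  A[f(x, y)] -> B[g(x, z)] C[h(y, z)]

      def expand(mother_tree, z_tree):
          """Given the mother's tree f(x, y) and a choice of z, build the daughters' trees."""
          label, x, y = mother_tree
          if label != "f":
              raise ValueError("production does not apply")
          return [("B", ("g", x, z_tree)), ("C", ("h", y, z_tree))]

      print(expand(("f", ("a",), ("b",)), ("c",)))
      # [('B', ('g', ('a',), ('c',))), ('C', ('h', ('b',), ('c',)))]
    </Paragraph>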
    <Paragraph position="12"> There is a need to incorporate some kind of generalized notion of linearity into such a system.</Paragraph>
    <Paragraph position="13"> Corresponding to the linearity restriction in LIG we require that any part of the mother's tree is passed to at most one daughter. Corresponding to the partial linearity of PLIG, we permit subtrees that are not shared with the mother to be shared amongst the daughters. Under these conditions, the tree set shown in Figure 2 can be generated.</Paragraph>
    <Paragraph position="14">  The nodes η1 and η2 share the tree τn, which occurs twice at the node η2. At η2 the two copies of τn are distributed across the daughters.</Paragraph>
    <Paragraph position="15"> The formalism as currently described can be used to simulate arbitrary Turing Machine computations. To see this, note that an instantaneous description of a Turing Machine can be encoded with a tree as shown in Figure 3. Moves of the Turing Machine can be simulated by unary productions. The following production may be glossed: "if in state q and scanning the symbol X, then change state to q', write the symbol Y and move left": A\[q(W(x), X, y)\] → A\[q'(x, W, Y(y))\]. One solution to this problem is to prevent a single daughter sharing more than one of its subtrees with the mother. However, we do not impose this restriction because it still leaves open the possibility of generating trees in which every branch has the same length, thus violating the condition that trees have at most a bounded number of unbounded, dependent branches. Figure 4 shows how a set of such trees can be generated by illustrating the effect of the following production: A\[σ(σ(x, y), σ(x', y'))\] → A\[σ(z, x)\] A\[σ(z, y)\] A\[σ(z, x')\] A\[σ(z, y')\]. To see this, assume (by induction) that all four of the daughter nonterminals are associated with the full binary tree of height i (τi). All four of these trees are constrained to be equal by the production given above, which requires that they have identical left (i.e. z) subtrees (these subtrees must be the full binary tree τi-1). Passing the right subtrees (x, y, x' and y') to the mother as shown allows the construction of a full binary tree of height i + 1 (τi+1). This can be repeated an unbounded number of times so that all full binary trees are produced.</Paragraph>
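    <Paragraph> To make the encoding concrete, here is a small Python sketch (ours; the tuple representation and symbol names are illustrative only) that writes an instantaneous description as the term q(left, X, right) and applies the left-move production quoted above as a rewrite on terms:
      # An instantaneous description q(left, X, right):
      #   left  = tape to the left of the head as nested unary terms W(x), head-adjacent symbol outermost
      #   X     = scanned symbol
      #   right = tape to the right of the head, likewise
      # The production A[q(W(x), X, y)] -> A[q'(x, W, Y(y))] then becomes:

      def move_left(id_term, q, X, q_new, Y):
          """In state q scanning X: write Y, move left, enter q_new."""
          state, left, scanned, right = id_term
          if state != q or scanned != X or left == ():   # nothing to the left of the head
              return None
          W, x = left                                     # left = W(x)
          return (q_new, x, W, (Y, right))                # new description q'(x, W, Y(y))

      ident = ("q", ("W", ()), "X", ())                   # ...W | X...
      print(move_left(ident, "q", "X", "q2", "Y"))        # ('q2', (), 'W', ('Y', ()))
    </Paragraph>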
    <Paragraph position="16"> To overcome both of these problems we impose the following additional constraint on the productions of a grammar. We require that subtrees of the mother that are passed to daughters that share subtrees with one another must appear as siblings in the mother's tree. Note that this condition rules out the production responsible for building full binary trees since the x, y, x' and y' subtrees are not siblings in the mother's tree despite the fact that all of the daughters share a common subtree z.</Paragraph>
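    <Paragraph> The sibling condition can be stated operationally; the following Python sketch (our own simplification, reusing the tuple encoding of trees above) checks, for a production given as a mother pattern and daughter patterns with string leaves as variables, that the variables passed up to the mother by any set of daughters linked by shared subtrees occur as children of a single node of the mother's pattern. It checks only the sibling condition, not linearity itself, and it correctly rejects the full-binary-tree production discussed above.
      def variables(pat):
          """All string leaves of a term pattern, treated as variables."""
          if isinstance(pat, str):
              return {pat}
          out = set()
          for child in pat[1:]:
              out |= variables(child)
          return out

      def parent_paths(pat, var, path=()):
          """Paths of the nodes of pat that have the variable var as a child."""
          if isinstance(pat, str):
              return set()
          hits = set()
          for i, child in enumerate(pat[1:]):
              if child == var:
                  hits.add(path)
              else:
                  hits |= parent_paths(child, var, path + (i,))
          return hits

      def share_groups(dvars):
          """Partition daughter indices into groups linked (transitively) by shared variables."""
          group_of = list(range(len(dvars)))
          for i in range(len(dvars)):
              for j in range(i + 1, len(dvars)):
                  if dvars[i].intersection(dvars[j]):
                      gi, gj = group_of[i], group_of[j]
                      group_of = [gi if g == gj else g for g in group_of]
          groups = {}
          for i, g in enumerate(group_of):
              groups.setdefault(g, set()).add(i)
          return list(groups.values())

      def respects_sibling_condition(mother, daughters):
          dvars = [variables(d) for d in daughters]
          mvars = variables(mother)
          for group in share_groups(dvars):
              shared = set().union(*(dvars[i] for i in group)).intersection(mvars)
              paths = set()
              for v in shared:
                  paths |= parent_paths(mother, v)
              if len(paths) > 1:        # the subtrees passed up are not siblings in the mother
                  return False
          return True

      # The full-binary-tree production violates the condition:
      mother = ("s", ("s", "x", "y"), ("s", "x2", "y2"))
      daughters = [("s", "z", "x"), ("s", "z", "y"), ("s", "z", "x2"), ("s", "z", "y2")]
      print(respects_sibling_condition(mother, daughters))   # False
    </Paragraph>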
    <Paragraph position="17"> Moreover, since a daughter shares subtrees with itself, a special case of the condition is that subtrees occurring within some daughter can only appear as siblings in the mother. This condition also rules out the Turing Machine simulation. We refer to this formalism as Partially Linear Tree Grammars (PLTG). As a further illustration of the constraints placed on shared subtrees, Figure 5 shows a local tree that could appear in a derivation tree. This local tree is licensed by the following production which respects all of the constraints on PLTG productions.</Paragraph>
    <Paragraph position="19"> Note that in Figure 5 the daughter nodes labelled B and D share a common subtree and the subtrees shared between the mother and the B and D daughters appear as siblings in the tree associated with the mother.</Paragraph>
    <Paragraph position="21"/>
    <Paragraph position="23"> and the tree set shown in Figure 2.</Paragraph>
    <Paragraph position="25"> Example 5 The PLTG with the following productions generates the language of strings consisting of k copies of strings of matching parentheses, i.e., the language { w^k | w ∈ D }, where k ≥ 1 and D is the set of strings in { (, ) }* that have balanced brackets, i.e., the Dyck language.</Paragraph>
    <Paragraph position="27"/>
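    <Paragraph> As with Example 1, the string language itself is easy to check directly; the following Python sketch (ours, for illustration only) tests membership in the k-copy Dyck language just described:
      def is_dyck(w):
          """Balanced strings over '(' and ')'."""
          depth = 0
          for ch in w:
              if ch == "(":
                  depth += 1
              elif ch == ")":
                  if depth == 0:
                      return False
                  depth -= 1
              else:
                  return False
          return depth == 0

      def in_copy_dyck_language(s, k):
          """True iff s consists of k identical copies of a balanced-bracket string (k at least 1)."""
          if k >= 1 and len(s) % k == 0:
              piece = s[: len(s) // k]
              return s == piece * k and is_dyck(piece)
          return False

      assert in_copy_dyck_language("(())()(())()", 2)
      assert not in_copy_dyck_language("(())()()(())", 2)
    </Paragraph>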
  </Section>
  <Section position="5" start_page="77" end_page="78" type="metho">
    <SectionTitle>
3 Trees to Feature Structures
</SectionTitle>
    <Paragraph position="0"> Finally, we note that acyclic feature structures without re-entrancy can be viewed as trees with branches labelled by feature names and atomic values only found at leaf nodes (interior nodes being unlabelled). Based on this observation, we can consider the constraints we have formulated for the tree system PLTG as constraints on a unification-based grammar formalism such as PATR. We will call this system Partially Linear PATR (PLPATR). Having made the move from trees to feature structures, we consider the possibility of re-entrancy in PLPATR.</Paragraph>
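    <Paragraph> The correspondence can be made explicit with a short Python sketch (ours): a non-re-entrant feature structure, given as nested dictionaries with atoms at the leaves, is converted into a tree whose branches carry the feature names.
      def fs_to_tree(fs):
          """Nested-dict feature structure to a list of (feature, subtree) branches; atoms stay at the leaves."""
          if not isinstance(fs, dict):
              return fs                      # atomic value at a leaf
          return [(feat, fs_to_tree(val)) for feat, val in sorted(fs.items())]

      # Hypothetical feature structure [f: [g: a, h: b], g: c]
      print(fs_to_tree({"f": {"g": "a", "h": "b"}, "g": "c"}))
      # [('f', [('g', 'a'), ('h', 'b')]), ('g', 'c')]
    </Paragraph>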
    <Paragraph position="1"> Note that the feature structure at the root of a PLPATR derivation tree will not involve re-entrancy. However, for the following reasons we believe that this does not constitute as great a limitation as it might appear. In unification-based grammar, the feature structure associated with the root of the tree is often regarded as the structure that has been derived from the input (i.e., the output of a parser). As a consequence there is a tendency to use the grammar rules to accumulate a single, large feature structure giving a complete encoding of the analysis. To do this, unbounded feature information is passed up the tree in a way that violates the constraints developed in this paper. Rather than giving such prominence to the root feature structure, we suggest that the entire derivation tree should be seen as the object that is derived from the input, i.e., this is what the parser returns. Because feature structures associated with all nodes in the tree are available, feature information need only be passed up the tree when it is required in order to establish dependencies within the derivation tree. When this approach is taken, there may be less need for re-entrancy in the root feature structure. Furthermore, re-entrancy in the form of shared feature structures within and across nodes will be found in PLPATR (see for example Figure 5).</Paragraph>
  </Section>
  <Section position="6" start_page="78" end_page="78" type="metho">
    <SectionTitle>
4 Generative Capacity
</SectionTitle>
    <Paragraph position="0"> LIG are more powerful than CFG and are known to be weakly equivalent to Tree Adjoining Grammar, Combinatory Categorial Grammar, and Head Grammar (Vijay-Shanker and Weir, 1994). PLIG are more powerful than LIG since they can generate the k-copy language for any fixed k (see Example 2). Slightly more generally, PLIG can generate the language { w^k | w ∈ R } for any k ≥ 1 and regular language R. We believe that the language involving copies of strings of matching brackets described in Example 5 cannot be generated by PLIG but, as shown in Example 5, it can be generated by PLTG and therefore PLPATR. Slightly more generally, PLTG can generate the language</Paragraph>
    <Paragraph position="2"> { w^k | w ∈ L } for any k ≥ 1 and context-free language L. It appears that the class of languages generated by PLTG is included in those languages generated by Linear Context-Free Rewriting Systems (Vijay-Shanker et al., 1987), since the construction involved in a proof of this underlies the recognition algorithm discussed in the next section.</Paragraph>
    <Paragraph position="1"> As is the case for the tree sets of IG, LIG and Tree Adjoining Grammar, the tree sets generated by PLTG have path sets that are context-free languages. In other words, the set of all strings labelling root to frontier paths of derivation trees is a context-free language. While the tree sets of LIG and Tree Adjoining Grammars have independent branches, PLTG tree sets exhibit dependent branches, where the number of dependent branches in any tree is bounded by the grammar.</Paragraph>
    <Paragraph position="2"> Note that the number of dependent branches in the tree sets of IG is not bounded by the grammar (e.g., they generate sets of all full binary trees).</Paragraph>
  </Section>
  <Section position="7" start_page="78" end_page="79" type="metho">
    <SectionTitle>
5 Tractable Recognition
</SectionTitle>
    <Paragraph position="0"> In this section we outline the main ideas underlying a polynomial time recognition algorithm for PLPATR that generalizes the CKY algorithm (Kasami, 1965; Younger, 1967). The key to this algorithm is the use of structure-sharing techniques similar to those used to process LIG efficiently (Vijay-Shanker and Weir, 1993). To understand how these techniques are applied in the case of PLPATR, it is therefore helpful to consider first the somewhat simpler case of LIG.</Paragraph>
    <Paragraph position="1"> The CKY algorithm is a bottom-up recognition algorithm for CFG. For a given grammar G and input string a1 ... an the algorithm constructs an array P, having n^2 elements, where element P\[i, j\] stores all and only those nonterminals of G that derive the substring ai...aj. A naive adaptation of this algorithm for LIG recognition would involve storing a set of nonterminals and their associated stacks. But since stack length is at least proportional to the length of the input string, the resultant algorithm would exhibit exponential space and time complexity in the worst case. Vijay-Shanker and Weir (1993) showed that the behaviour of the naive algorithm can be improved upon. In LIG derivations the application of a rule cannot depend on more than a bounded portion of the top of the stack. Thus, rather than storing the whole of the potentially unbounded stack in a particular array entry, it suffices to store just a bounded portion together with a pointer to the residue.</Paragraph>
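    <Paragraph> For reference, a compact Python sketch (ours) of the standard CKY recognizer for a CFG in Chomsky normal form, in which P\[i\]\[j\] holds the nonterminals deriving the substring between positions i and j:
      def cky_recognize(tokens, lexical, binary, start="S"):
          """CKY recognition for a CNF grammar.

          lexical: dict terminal -> set of nonterminals (rules A -> a)
          binary:  dict (B, C)   -> set of nonterminals (rules A -> B C)
          """
          n = len(tokens)
          P = [[set() for _ in range(n + 1)] for _ in range(n + 1)]   # P[i][j]: nonterminals deriving tokens[i:j]
          for i, tok in enumerate(tokens):
              P[i][i + 1] = set(lexical.get(tok, set()))
          for span in range(2, n + 1):
              for i in range(n - span + 1):
                  j = i + span
                  for k in range(i + 1, j):
                      for B in P[i][k]:
                          for C in P[k][j]:
                              P[i][j] |= binary.get((B, C), set())
          return start in P[0][n]

      # Toy grammar: S -> A B, A -> "a", B -> "b"
      print(cky_recognize(["a", "b"], {"a": {"A"}, "b": {"B"}}, {("A", "B"): {"S"}}))   # True
    </Paragraph>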
    <Paragraph position="2"> Consider Figure 6. Tree (a) shows a LIG derivation of the substring ai...aj from the object A\[ασσ'\]. In this derivation tree, the node labelled B\[ασ\] is a distinguished descendant of the root2 and is the first point below A\[ασσ'\] at which the top symbol (σ) of the (unbounded) stack ασ is exposed. This node is called the terminator of the node labelled A\[ασσ'\]. It is not difficult to show that only that portion of the derivation below the terminator node is dependent on more than the top of the stack ασ. It follows that for any stack α'σ, if there is a derivation of the substring ap...aq from B\[α'σ\] (see tree (b)), then there is a corresponding derivation of ai...aj from A\[α'σσ'\] (see tree (c)). This captures the sense in which LIG derivations exhibit "context-freeness". Efficient storage of stacks can therefore be achieved by storing in P\[i, j\] just that bounded amount of information (nonterminal plus top of stack) relevant to rule application, together with a pointer to any entry in P\[p, q\] representing a subderivation from an object B\[α'σ\].</Paragraph>
    <Paragraph position="3"> 2The stack ασ associated with B is "inherited" from the stack associated with A at the root of the tree.</Paragraph>
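    <Paragraph> Schematically (this sketch is ours and purely illustrative), such a chart entry records only the bounded information needed to check rule application, plus a pointer to an entry in P\[p, q\] that stands for the unbounded residue of the stack:
      from dataclasses import dataclass
      from typing import Optional, Tuple

      @dataclass(frozen=True)
      class LIGItem:
          """A chart entry: bounded information plus a pointer to the residue's entry."""
          nonterminal: str
          top: str                                                 # bounded top of the stack
          residue: Optional[Tuple[int, int, "LIGItem"]] = None     # (p, q, entry) in P[p, q]

      # A spanning (i, j) with visible top s', whose stack residue is represented
      # by the entry for B (visible top s) spanning the hypothetical cell (3, 5).
      b_item = LIGItem("B", "s")
      a_item = LIGItem("A", "s'", residue=(3, 5, b_item))
      print(a_item)
    </Paragraph>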
    <Paragraph position="4"> Before describing how we adapt this technique to the case of PLPATR we discuss the sense in which PLPATR derivations exhibit a "context-freeness" property. The constraints on PLPATR which we have identified in this paper ensure that feature values can be manipulated independently of one another and that they behave in a stack-like way. As a consequence, the storage technique used effectively for LIG recognition may be generalized to the case of PLPATR.</Paragraph>
    <Paragraph position="5"> Suppose that we have the derived tree shown in Figure 7, where the nodes at the root of the subtrees τ1 and τ2 are the so-called f-terminator and g-terminator of the tree's root, respectively.</Paragraph>
    <Paragraph position="6"> Roughly speaking, the f-terminator of a node is the node from which it gets the value for the feature f. Because of the constraints on the form of PLPATR productions, the derivations between the root of τ and these terminators cannot in general depend on more than a bounded part of the feature structures \[1\] and \[2\]. At the root of the figure the feature structures \[1\] and \[2\] have been expanded to show the extent of the dependency in this example. In this case, the value of the feature f in \[1\] must be a, whereas the feature g is not fixed. Furthermore, the value of the feature g in \[2\] must be b, whereas the feature f is not fixed.</Paragraph>
    <Paragraph position="7"> This means, for example, that the applicability of the productions used on the path from the root of τ1 to the root of τ depends on the feature f in \[1\] having the value a but does not depend on the value of the feature g in \[1\]. Note that in this tree the value of the feature g in \[1\] is the feature structure F1 and the value of the feature f in \[2\] is the feature structure F2. Suppose that, in addition to the tree shown in Figure 7, the grammar also generates the trees shown in Figure 8. Notice that while the feature structures at the roots of τ3 and τ4 are not compatible with \[1\] and \[2\], they do agree with respect to those parts that are fully expanded at τ's root node. The "context-freeness" of PLPATR means that given the three trees shown in Figures 7 and 8 the tree shown in Figure 9 will also be generated by the grammar.</Paragraph>
    <Paragraph position="8"> This gives us a means of efficiently storing the potentially unbounded feature structures associated with nodes in a derivation tree (derived feature structures). By analogy with the situation for LIG, derived feature structures can be viewed as consisting of a bounded part (relevant to rule application) plus unbounded information about the values of features. For each feature, we store in the recognition array a bounded amount of information about its value locally, together with a pointer to a further array element. Entries in this element of the recognition array that are compatible (i.e. unifiable) with the bounded, local information correspond to different possible values for the feature. For example, we can use a single entry in the recognition array to store the fact that all of the feature structures that can appear at the root of the trees in Figure 9 derive the substring ai...aj. This entry would be underspecified; for example, the value of the feature f (i.e. \[1\]) would be specified to be any feature structure stored in the array entry for the substring ap...aq whose feature f had the value a.</Paragraph>
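    <Paragraph> Schematically (our own illustration, not the paper's data structures), such an underspecified entry is a bounded local feature structure in which some feature values are replaced by pointers into other cells of the recognition array, each pointer carrying the bounded constraint that candidate values must satisfy:
      from dataclasses import dataclass, field
      from typing import Dict, Union

      @dataclass(frozen=True)
      class Pointer:
          """The unbounded value of a feature: any structure stored in the array
          cell for span (p, q) that is unifiable with `constraint`."""
          p: int
          q: int
          constraint: Dict[str, str] = field(default_factory=dict)

      # Bounded local information for a root spanning a hypothetical cell (0, 9): the value of
      # feature f is deferred to cell (2, 5) and required to have f = "a" there, the value of
      # feature g to cell (6, 9) with g = "b" (f, g, a, b as in the running example).
      root_entry: Dict[str, Union[str, Pointer]] = {
          "f": Pointer(2, 5, {"f": "a"}),
          "g": Pointer(6, 9, {"g": "b"}),
      }
      print(root_entry)
    </Paragraph>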
    <Paragraph position="9"> However, this is not the end of the story. In contrast to LIG, PLPATR licenses structure sharing on the right-hand side of productions. That is, partial linearity permits feature values to be shared between daughters where they are not also shared with the mother. But in that case, it appears that checking the applicability of a production at some point in a derivation must entail the comparison of structures of unbounded size. In fact, this is not so. The PLPATR recognition algorithm employs a second array (called the compatibility array), which encodes information about the compatibility of derived feature structures. Tuples of compatible derived feature structures are stored in the compatibility array using exactly the same approach used to store feature structures in the main recognition array. The presence of a tuple in the compatibility array (the indices of which encode which input substrings are spanned) indicates the existence of derivations of compatible feature structures. Due to the "context-freeness" of PLPATR, new entries can be added to the compatibility array in a bottom-up manner based on existing entries without the need to reconstruct complete feature structures.</Paragraph>
  </Section>
</Paper>