<?xml version="1.0" standalone="yes"?>
<Paper uid="P96-1014">
  <Title>Computing Optimal Descriptions for Optimality Theory Grammars with Context-Free Position Structures</Title>
  <Section position="4" start_page="102" end_page="103" type="metho">
    <SectionTitle>
3 The Dynamic Programming Table
</SectionTitle>
    <Paragraph position="0"> The Dynamic Programming (DP) Table is here a three-dimensional, pyramid-shaped data structure.</Paragraph>
    <Paragraph position="1"> It resembles the tables used for context-free chart parsing (Kay, 1980) and maximum likelihood computation for stochastic context-free grammars (Lari and Young, 1990; Charniak, 1993). Each cell of the table contains a partial description (a part of a structural description) and the Harmony of that partial description. A partial description is much like an edge in chart parsing, covering a contiguous substring of the input. A cell is identified by three indices and is denoted with square brackets (e.g., [X,a,c]). The first index (X) indicates the cell category of the cell. The other two indices (a and c) indicate the contiguous substring of the input string covered by the partial description contained in the cell (input segments i_a through i_c).</Paragraph>
    <Paragraph position="2"> In chart parsing, the set of cell categories is precisely the set of non-terminals in the grammar, and thus a cell contains a subtree with a root non-terminal corresponding to the cell category, and with leaves that constitute precisely the input substring covered by the cell. In the algorithm presented here, the set of cell categories consists of the non-terminals of the position structure grammar, along with a category for each left-aligned substring of the right hand side of each position grammar rule. Example 5 gives the set of cell categories for the position structure grammar in Example 1.</Paragraph>
    <Paragraph position="3"> Example 5: The Set of Cell Categories: S, F, Y, M, P, MF. The last category in Example 5, MF, comes from the rule Y ⇒ MFM of Example 1, which has more than two non-terminals on the right hand side. Each such category corresponds to an incomplete edge in normal chart parsing; having a table cell for each such category eliminates the need for a separate data structure containing edges. The cell [MF,a,c] may contain an ordered pair of subtrees, the first with root M covering input [a,b], and the second with root F covering input [b+1,c].</Paragraph>
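    <Paragraph> The construction of cell categories can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name cell_categories and the encoding of rules as (lhs, rhs) pairs are hypothetical, only rules whose right hand side is a string of non-terminals are passed in, and only the single rule Y ⇒ MFM mentioned above is used as input, not the full grammar of Example 1.

```python
def cell_categories(rules):
    """Collect cell categories from rules given as (lhs, rhs) pairs.

    rhs is a tuple of non-terminal names. Every non-terminal is a category,
    and so is every left-aligned prefix of length 2 .. len(rhs)-1 of a
    right hand side (the analogue of an incomplete edge in chart parsing).
    """
    cats = set()
    for lhs, rhs in rules:
        cats.add(lhs)
        cats.update(rhs)
        for n in range(2, len(rhs)):
            cats.add("".join(rhs[:n]))
    return cats

# The rule Y => M F M contributes the extra category "MF".
print(sorted(cell_categories([("Y", ("M", "F", "M"))])))
```

    A rule with exactly two non-terminals on its right hand side contributes no extra category, because the inner loop is then empty.</Paragraph>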
    <Paragraph position="4"> The DP Table is perhaps best envisioned as a set of layers, one for each category. A layer is a set of all cells in the table indexed by a particular cell category.</Paragraph>
    <Paragraph position="6"> For each substring length, there is a collection of rows, one for each category, which will collectively be referred to as a level. The first level contains the cells which cover only one input segment; the number of cells in this level will be the number of input segments multiplied by the number of cell categories.</Paragraph>
    <Paragraph position="7"> Level two contains cells which cover input substrings of length two, and so on. The top level contains one cell for each category. One other useful partition of the DP table is into blocks. A block is a set of all cells covering a particular input subsequence. A block has one cell for each cell category.</Paragraph>
    <Paragraph position="8"> A cell of the DP Table is filled by comparing the results of several operations, each of which tries to fill the cell. The operation producing the partial description with the highest Harmony actually fills the cell. The operations themselves are discussed in Section 4.</Paragraph>
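    <Paragraph> One way to picture a cell and the "most harmonic candidate wins" rule is the sketch below. The encoding is an assumption made for illustration, not something fixed by the text: a cell is keyed by (category, a, c), Harmony is represented by a tuple of violation counts ordered from the highest-ranked constraint down, and a lexicographically smaller tuple is treated as more harmonic. The helper name put and the example descriptions are hypothetical.

```python
from typing import Dict, Tuple

Profile = Tuple[int, ...]                  # marks per constraint, highest-ranked first
Cell = Tuple[str, Profile]                 # (partial description, its marks)
Table = Dict[Tuple[str, int, int], Cell]   # key: (category, a, c)

def put(table: Table, key, entry: Cell) -> bool:
    """Store entry in the cell only if it is more harmonic than the occupant."""
    old = table.get(key)
    if old is None or old[1] > entry[1]:
        table[key] = entry
        return True
    return False

table: Table = {}
first = put(table, ("P", 1, 1), ("P(p filled with i1)", (0, 0)))
second = put(table, ("P", 1, 1), ("P(p unfilled) (i1)", (1, 1)))
print(first, second, table[("P", 1, 1)][0])
```

    The second candidate carries more marks, so it is rejected and the cell keeps its first entry.</Paragraph>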
    <Paragraph position="9"> The algorithm presented in Section 6 fills the table cells level by level: first, all the cells covering only one input segment are filled, then the cells covering two consecutive segments are filled, and so forth. When the table has been completely filled, cell [S,1,J] will contain the optimal description of the input, and its Harmony. The table may also be filled in a more left-to-right manner, bottom-up, in the spirit of CKY. First, the cells covering only segment i_1, and then i_2, are filled. Then, the cells covering the first two segments are filled, using the entries in the cells covering each of i_1 and i_2. The cells of the next diagonal are then filled.</Paragraph>
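    <Paragraph> The level-by-level order can be made concrete with a small generator. This is a sketch only; the name fill_order is hypothetical, and spans are 1-indexed and inclusive to match the [X,a,c] notation.

```python
def fill_order(J):
    """Yield (a, c) spans of an input of length J, shortest spans first."""
    for d in range(J):                  # d is the span length minus one
        for a in range(1, J - d + 1):   # leftmost segment of the span
            yield (a, a + d)

# For an input of three segments: level one, then level two, then the top.
print(list(fill_order(3)))
```

    Each span produced here stands for one block of the table, that is, one cell per cell category.</Paragraph>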
  </Section>
  <Section position="5" start_page="103" end_page="103" type="metho">
    <SectionTitle>
4 The Operations Set
</SectionTitle>
    <Paragraph position="0"> The Operations Set contains the operations used to fill DP Table cells. The algorithm proceeds by considering all of the operations that could be used to fill a cell, and selecting the one generating the partial description with the highest Harmony to actually fill the cell. There are three main types of operations, corresponding to underparsing, parsing, and overparsing actions. These actions are analogous to the three primitive actions of sequence comparison (Sankoff and Kruskal, 1983): deletion, correspondence, and insertion.</Paragraph>
    <Paragraph position="1"> The discussion that follows makes the assumption that the right hand side of every production is either a string of non-terminals or a single terminal. Each parsing operation generates a new element of structure, and so is associated with a position structure grammar production. The first type of parsing operation involves productions which generate a single terminal (e.g., P ⇒ p). Because we are assuming that an input segment may be parsed into at most one position, and that a position may have at most one input segment parsed into it, this parsing operation may only fill a cell which covers exactly one input segment. In our example, cell [P,1,1] could be filled by an operation parsing i_1 into a p position, giving the partial description P(p filled with i_1).</Paragraph>
    <Paragraph position="2"> The other kinds of parsing operations are matched to position grammar productions in which a parent non-terminal generates child non-terminals. One of these kinds of operations fills the cell for a category by combining cell entries for two factor categories, in order, so that the substrings covered by each of them combine (concatenatively, with no overlap) to form the input substring covered by the cell being filled. For rule Y ⇒ MFM, there will be an operation of this type combining entries in [M,a,b] and [F,b+1,c], creating the concatenated structure (footnote 2) [M,a,b]+[F,b+1,c], to fill [MF,a,c]. The final type of parsing operation fills a cell for a category which is a single non-terminal on the left hand side of a production, by combining two entries which jointly form the entire right hand side of the production. This operation would combine entries in [MF,a,c] and [M,c+1,d], creating the structure Y([MF,a,c],[M,c+1,d]), to fill [Y,a,d]. Each of these operations involves filling a cell for a target category by using the entries in the cells for two factor categories.</Paragraph>
    <Paragraph position="3"> The resulting Harmony of the partial description created by a parsing operation will be the combination of the marks assessed to each of the partial descriptions for the factor categories, plus any additional marks incurred as a result of the structure added by the production itself. This is true because the constraints must be local: any new constraint violations are determinable on the basis of the cell category of the factor partial descriptions, and not any other internal details of those partial descriptions.</Paragraph>
    <Paragraph position="4"> Footnote 2: This partial description is not a single tree, but an ordered pair of trees. In general, such concatenated structures will be ordered lists of trees.</Paragraph>
    <Paragraph position="5"> All possible ways in which the factor categories, taken in order, may combine to cover the substring, must be considered. Because the factor categories must be contiguous and in order, this amounts to considering each of the ways in which the substring can be split into two pieces. This is reflected in the parsing operation descriptions given in Section 6.2.</Paragraph>
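    <Paragraph> The split enumeration for a two-factor parsing operation might look like the following sketch. Several things here are assumptions for illustration only: marks are tuples of violation counts (highest-ranked constraint first, smaller is more harmonic), combining marks is elementwise addition, and the names add_marks, best_combination and rule_marks are hypothetical. The loop tries every split point b with the left factor covering a..b and the right factor covering b+1..c.

```python
def add_marks(p, q):
    """Combine two violation profiles by adding counts constraint by constraint."""
    return tuple(x + y for x, y in zip(p, q))

def best_combination(table, left, right, a, c, rule_marks):
    """Most harmonic way to build span a..c from a left and a right factor."""
    best = None
    for b in range(a, c):
        l = table.get((left, a, b))
        r = table.get((right, b + 1, c))
        if l is None or r is None:
            continue                      # one of the factor cells is empty
        marks = add_marks(add_marks(l[1], r[1]), rule_marks)
        cand = (l[0] + "+" + r[0], marks)
        if best is None or best[1] > cand[1]:
            best = cand
    return best

table = {
    ("M", 1, 1): ("M11", (0, 1)), ("M", 1, 2): ("M12", (0, 0)),
    ("F", 2, 3): ("F23", (0, 0)), ("F", 3, 3): ("F33", (1, 0)),
}
print(best_combination(table, "M", "F", 1, 3, (0, 0)))
```

    Only the categories of the two factor cells and their stored marks are consulted, which is the locality assumption stated above.</Paragraph>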
    <Paragraph position="6"> Underparsing operations are not matched with position grammar productions. A DP Table cell which covers only one input segment may be filled by an underparsing operation which marks the input segment as underparsed. In general, any partial description covering any substring of the input may be extended to cover an adjacent input segment by adding that additional segment marked as underparsed. Thus, a cell covering a given substring of length greater than one may be filled in two mirror-image ways via underparsing: by taking a partial description which covers all but the leftmost input segment and adding that segment as underparsed, and by taking a partial description which covers all but the rightmost input segment and adding that segment as underparsed.</Paragraph>
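    <Paragraph> The two mirror-image underparsing moves for a cell covering more than one segment can be sketched as follows. This is an assumption-laden illustration: parse_mark is a hypothetical violation profile charged for leaving one segment unparsed, marks are tuples of counts added elementwise, and an underparsed segment is written (i_n) as plain text.

```python
def underparse_candidates(table, cat, a, c, parse_mark):
    """Return up to two underparsing candidates for cell [cat,a,c], with c > a."""
    def add(p, q):
        return tuple(x + y for x, y in zip(p, q))

    out = []
    rest = table.get((cat, a + 1, c))     # covers all but the leftmost segment
    if rest is not None:
        desc, marks = rest
        out.append(("(i%d) " % a + desc, add(marks, parse_mark)))
    rest = table.get((cat, a, c - 1))     # covers all but the rightmost segment
    if rest is not None:
        desc, marks = rest
        out.append((desc + " (i%d)" % c, add(marks, parse_mark)))
    return out

table = {("P", 1, 1): ("P(i1)", (0, 0)), ("P", 2, 2): ("P(i2)", (0, 0))}
cands = underparse_candidates(table, "P", 1, 2, (1, 0))
print(cands)
```

    Both candidates then compete with the parsing and overparsing candidates for the same cell.</Paragraph>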
    <Paragraph position="7"> Overparsing operations are discussed in Section 5.</Paragraph>
  </Section>
  <Section position="6" start_page="103" end_page="105" type="metho">
    <SectionTitle>
5 The Overparsing Operations
</SectionTitle>
    <Paragraph position="0"> Overparsing operations consume no input; they only add new unfilled structure. Thus, a block of cells (the set of cells each covering the same input substring) is interdependent with respect to overparsing operations, meaning that an overparsing operation trying to fill one cell in the block is adding structure to a partial description from a different cell in the same block. The first consequence of this is that the overparsing operations must be considered after the underparsing and parsing operations for that block.</Paragraph>
    <Paragraph position="1"> Otherwise, the cells would be empty, and the overparsing operations would have nothing to add on to.</Paragraph>
    <Paragraph position="2"> The second consequence is that overparsing operations may need to be considered more than once, because the result of one overparsing operation (if it fills a cell) could be the source for another overparsing operation. Thus, more than one pass through the overparsing operations for a block may be necessary.</Paragraph>
    <Paragraph position="3"> In the description of the algorithm given in Section 6.3, each Repeat-Until loop considers the overparsing operations for a block of cells. The number of loop iterations is the number of passes through the overparsing operations for the block. The loop iterations stop when none of the overparsing operations  is able to fill a cell (each proposed partial description is less harmonic than the partial description already in the cell).</Paragraph>
    <Paragraph position="4"> In principle, an unbounded number of overparsing operations could apply, and in fact descriptions with arbitrary numbers of unfilled positions are contained in the output space of Gen (as formally defined). The algorithm does not have to explicitly consider arbitrary amounts of overparsing, however.</Paragraph>
    <Paragraph position="5"> A necessary property of the faithfulness constraints, given constraint locality, is that a partial description cannot have overparsed structures repeatedly added to it until the resulting partial description falls into the same cell category as the original prior to overparsing, and be more Harmonic. Such a sequence of overparsing operations can be considered an overparsing cycle. Thus, the faithfulness constraints must ban overparsing cycles. This is not solely a computational requirement, but is necessary for the grammar to be well-defined: overparsing cycles must be harmonically suboptimal, otherwise arbitrary amounts of overparsing will be permitted in optimal descriptions. In particular, the constraints should prevent overparsing from adding an entire overparsed non-terminal more than once to the same partial description while passing through the overparsing operations. In Example 2, the constraints FILL m and FILL p effectively ban overparsing cycles: no matter where these constraints are ranked, a description containing an overparsing cycle will be less harmonic (due to additional FILL violations) than the same description with the cycle removed.</Paragraph>
    <Paragraph position="6"> Given that the universal constraints meet this criterion, the overparsing operations may be repeatedly considered for a given level until none of them increase the Harmony of the entries in any of the cells.</Paragraph>
    <Paragraph position="7"> Because each overparsing operation maps a partial description in one cell category to one for another cell category, a partial description cannot undergo more consecutive overparsing operations than there are cell categories without repeating at least one cell category, thereby creating a cycle. Thus, the number of cell categories places a constant bound on the number of passes made through the overparsing operations for a block.</Paragraph>
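    <Paragraph> The repeated passes over a block can be sketched as a loop that stops once a full pass changes nothing. The sketch simplifies heavily and its names are hypothetical: a block maps each category to a (description, marks) pair, each overparsing operation is reduced to a triple (source category, target category, extra marks), and marks are tuples of violation counts where a smaller tuple is more harmonic. The two operations in the example are invented for illustration and are not taken from Example 1.

```python
def overparse_block(block, ops):
    """Apply overparsing ops to one block until a pass changes no cell.

    Returns the number of passes made, counting the final pass that
    found nothing to improve.
    """
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for src, tgt, extra in ops:
            if src not in block:
                continue                     # nothing to add structure to yet
            desc, marks = block[src]
            new_marks = tuple(x + y for x, y in zip(marks, extra))
            cand = (tgt + "(" + desc + ")", new_marks)
            if tgt not in block or block[tgt][1] > cand[1]:
                block[tgt] = cand
                changed = True
    return passes

block = {"P": ("P", (0, 0))}
ops = [("M", "Y", (0, 1)), ("P", "M", (1, 0))]
passes = overparse_block(block, ops)
print(passes, block["Y"])
```

    Because a cell is replaced only by a strictly more harmonic candidate, a cycle of operations that adds marks can never keep the loop running.</Paragraph>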
    <Paragraph position="8"> A single non-terminal may dominate an entire subtree in which none of the syllable positions at the leaves of the tree are filled. Thus, the optimal &quot;unfilled structure&quot; for each non-terminal, and in fact each cell category, must be determined, for use by the overparsing operations. The optimal overparsing structure for category X is denoted with [X,0], and such an entity is referred to as a base overparsing structure. A set of such structures must be computed, one for each category, before filling input-dependent DP table cells. Because these values are not dependent upon the input, base overparsing structures may be computed and stored in advance. Computing them is just like computing other cell entries, except that only overparsing operations are considered. First, consider (once) the overparsing operations for each non-terminal X which has a production rule permitting it to dominate a terminal x: each tries to set [X,0] to contain the corresponding partial description with the terminal x left unfilled.</Paragraph>
    <Paragraph position="9"> Next consider the other overparsing operations for each cell, choosing the most Harmonic of those operations' partial descriptions and the prior value of
      for each production X^t ⇒ x^k
        create X^t(x^k filled with i_a)
      Parsing operations for [X^t,a,c], where c&gt;a and all X are cell categories:
        for each production X^t ⇒ X^k X^m
          for b = a+1 to c-1
            create X^t([X^k,a,b],[X^m,b+1,c])
        for each production X^u ⇒ X^k X^m X^n...</Paragraph>
    <Paragraph position="10"> where X^t = X^k X^m:
          for b = a+1 to c-1
            create [X^k,a,b]+[X^m,b+1,c]
      Overparsing operations for [X^t,0]:
        for each production X^t ⇒ x^k
          create X^t(x^k unfilled)
        for each production X^t ⇒ X^k X^m
          create X^t([X^k,0],[X^m,0])
        for each production X^u ⇒ X^k X^m X^n...</Paragraph>
    <Paragraph position="11"> where X^t = X^k X^m:
          create [X^k,0]+[X^m,0]
      Overparsing operations for [X^t,a,a]:
        same as for [X^t,a,c]
      Overparsing operations for [X^t,a,c]:
        for each production X^t ⇒ X^k
          create X^t([X^k,a,c])
        for each production X^t ⇒ X^k X^m</Paragraph>
    <Paragraph position="13">
          create [X^k,a,c]+[X^m,0]
          create [X^k,0]+[X^m,a,c]</Paragraph>
    <Section position="1" start_page="105" end_page="105" type="sub_section">
      <SectionTitle>
6.3 The Main Algorithm
</SectionTitle>
      <Paragraph position="0">
        /* create the base overparsing structures */
        Repeat
          For each X^t, Set [X^t,0] to maxH{[X^t,0], overparsing ops for [X^t,0]}
        Until no [X^t,0] has changed during a pass
        /* fill the cells covering only a single segment */
        For a = 1 to J
          For each X^t, Set [X^t,a,a] to maxH{underparsing ops for [X^t,a,a]}
          For each X^t, Set [X^t,a,a] to maxH{[X^t,a,a], parsing ops for [X^t,a,a]}
          Repeat
            For each X^t, Set [X^t,a,a] to maxH{[X^t,a,a], overparsing ops for [X^t,a,a]}
          Until no [X^t,a,a] has changed during a pass
        /* fill the rest of the cells */
        For d = 1 to (J-1)
          For a = 1 to (J-d)
            For each X^t, Set [X^t,a,a+d] to maxH{underparsing ops for [X^t,a,a+d]}
            For each X^t, Set [X^t,a,a+d] to maxH{[X^t,a,a+d], parsing ops for [X^t,a,a+d]}
            Repeat
              For each X^t, Set [X^t,a,a+d] to maxH{[X^t,a,a+d], overparsing ops for [X^t,a,a+d]}
            Until no [X^t,a,a+d] has changed during a pass
        Return [S,1,J] as the optimal description</Paragraph>
    </Section>
    <Section position="2" start_page="105" end_page="105" type="sub_section">
      <SectionTitle>
6.4 Complexity
</SectionTitle>
      <Paragraph position="0"> Each block of cells for an input subsequence is processed in time linear in the length of the subsequence. This is a consequence of the fact that in general parsing operations filling such a cell must consider all ways of dividing the input subsequence into two pieces. The number of overparsing passes through the block is bounded from above by the number of cell categories, due to the fact that overparsing cycles are suboptimal. Thus, the number of passes is bounded by a constant, for any fixed position structure grammar. The number of such blocks is the number of distinct, contiguous input subsequences (equivalently, the number of cells in a layer), which is on the order of the square of the length of the input. If N is the length of the input, the algorithm has computational complexity O(N^3).</Paragraph>
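      <Paragraph> The cubic bound can be checked by counting split points directly. The sketch below is an illustration, not part of the algorithm: it assumes one unit of work per way of dividing a span into two pieces and ignores the constant factors contributed by the grammar and by the overparsing passes. The function name split_work is hypothetical.

```python
def split_work(N):
    """Total number of split points examined over all spans of an input of length N."""
    total = 0
    for d in range(1, N):       # spans covering d + 1 segments
        spans = N - d           # how many such spans exist
        total += spans * d      # each can be divided in two in d ways
    return total

# The sum has the closed form (N - 1) * N * (N + 1) / 6, which grows as N cubed.
for N in (4, 8, 16):
    print(N, split_work(N), (N - 1) * N * (N + 1) // 6)
```

      Doubling N multiplies the count by roughly eight, as expected for cubic growth.</Paragraph>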
    </Section>
  </Section>
  <Section position="7" start_page="105" end_page="106" type="metho">
    <SectionTitle>
7 Discussion
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="105" end_page="105" type="sub_section">
      <SectionTitle>
7.1 Locality
</SectionTitle>
      <Paragraph position="0"> That locality helps processing should be no great surprise to computationalists; the computational significance of locality is widely appreciated. Further, locality is often considered a desirable property of principles in linguistics, independent of computational concerns. Nevertheless, locality is a sufficient but not necessary restriction for the applicability of this algorithm. The locality restriction is really a special case of a more general sufficient condition.</Paragraph>
      <Paragraph position="1"> The general condition is a kind of &quot;Markov&quot; property. This property requires that, for any substring of the input for which partial descriptions are constructed, the set of possible partial descriptions for that substring may be partitioned into a finite set of classes, such that the consequences in terms of constraint violations for the addition of structure to a partial description may be determined entirely by the identity of the class to which that partial description belongs. The special case of strict locality is easy to understand with respect to context-free structures, because it states that the only information needed about a subtree to relate it to the rest of the tree is the identity of the root non-terminal, so that the (necessarily finite) set of non-terminals provides the relevant set of classes.</Paragraph>
    </Section>
    <Section position="2" start_page="105" end_page="106" type="sub_section">
      <SectionTitle>
7.2 Underparsing and Derivational Redundancy
</SectionTitle>
      <Paragraph position="0"> The treatment of the underparsing operations given above creates the opportunity for the same partial description to be arrived at through several different paths. For example, suppose the input is i_a...i_b i_c i_d...i_e, and there is a constituent in [X,a,b] and a constituent in [Y,d,e]. Further suppose the input segment i_c is to be marked underparsed, so that the final description [S,a,e] contains [X,a,b] (i_c) [Y,d,e]. That description could be arrived at either by combining [X,a,b] and (i_c) to fill [X,a,c], and then combining [X,a,c] and [Y,d,e], or by combining (i_c) and [Y,d,e] to fill [Y,c,e], and then combining [X,a,b] and [Y,c,e]. The potential confusion stems from the fact that an underparsed segment is part of the description, but is not a proper constituent of the tree.</Paragraph>
      <Paragraph position="1"> This problem can be avoided in several ways. An obvious one is to only permit underparsings to be added to partial descriptions on the right side. One exception would then have to be made to permit input segments prior to any parsed input segments to be underparsed (i.e., if the first input segment is underparsed, it has to be attached to the left side of some constituent because it is to the left of everything in the description).</Paragraph>
    </Section>
  </Section>
</Paper>