<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1502">
  <Title>Predict A f[vectorB] := Ps Ph [?] &lt;Ps&gt; [A f[ * vectorB]; Ph; </Title>
  <Section position="3" start_page="0" end_page="11" type="metho">
    <SectionTitle>
1 Introductory definitions
</SectionTitle>
    <Paragraph position="0"> A record is a structure $\Gamma = \{r_1 = a_1; \ldots; r_n = a_n\}$, where all $r_i$ are distinct. It can therefore be seen as a set of feature-value pairs. This means that we can define a simple version of record unification $\Gamma_1 \sqcup \Gamma_2$ as the union $\Gamma_1 \cup \Gamma_2$, provided that there is no label $r$ defined in both records such that $\Gamma_1.r \neq \Gamma_2.r$. We sometimes denote a sequence $X_1, \ldots, X_n$ by the more compact $\vec{X}$. To update the $i$th record in a list of records, we write $\vec{\Gamma}[i := \Gamma]$. To substitute a variable $B_k$ for a record $\Gamma_k$ in any data structure $\Gamma$, we write $\Gamma[B_k/\Gamma_k]$.</Paragraph>
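    <Paragraph><![CDATA[The record operations above can be illustrated with a minimal Python sketch. It assumes that records are modelled as dictionaries from labels to values; the function names unify and update are hypothetical, and list positions are 0-based here purely for convenience.

```python
from typing import Optional

Record = dict  # label -> value (assumption: a record is a plain dictionary)


def unify(g1: Record, g2: Record) -> Optional[Record]:
    """Return the union of two records, or None if a shared label disagrees."""
    for label in g1.keys() & g2.keys():
        if g1[label] != g2[label]:
            return None  # conflicting values: unification fails
    return {**g1, **g2}


def update(records: list, i: int, g: Record) -> list:
    """Return a copy of the list in which the record at index i is replaced by g."""
    return records[:i] + [g] + records[i + 1:]
```

Unification is only defined when the two records agree on every label they share, which is why the sketch returns None instead of silently overwriting a value.
]]></Paragraph>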
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 Decorated Context-Free Grammars
</SectionTitle>
      <Paragraph position="0"> The context-free approximation described in section 4 uses a form of CFG with decorated rules of the form $f : A \to \alpha$, where $f$ is the name of the rule, and $\alpha$ is a sequence of terminals and categories subscripted with information needed for post-processing of the context-free parse result. In all other respects a decorated CFG can be seen as a straightforward CFG.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="11" type="sub_section">
      <SectionTitle>
1.2 Linear Context-Free Rewriting Systems
</SectionTitle>
      <Paragraph position="0"> A linear context-free rewriting system (LCFRS; Vijay-Shanker et al., 1987) is a linear, non-erasing multiple context-free grammar (MCFG; Seki et al., 1991). An MCFG rule is written $A \to f[B_1 \ldots B_\delta] := \{r_1 = \alpha_1; \ldots; r_n = \alpha_n\}$, where $A$ and $B_i$ are categories, $f$ is the name of the rule, $r_i$ are record labels and $\alpha_i$ are sequences of terminals and argument projections of the form $B_i.r$. The language $L(A)$ of a category $A$ is a set of string records, and is defined recursively as $L(A) = \{\Phi[B_1/\Gamma_1, \ldots, B_\delta/\Gamma_\delta] \mid A \to f[B_1 \ldots B_\delta] := \Phi,\ \Gamma_1 \in L(B_1), \ldots, \Gamma_\delta \in L(B_\delta)\}$.</Paragraph>
      <Paragraph position="2"> It is the possibility of discontinuous constituents that makes LCFRS/MCFG more expressive than context-free grammars. If the grammar only consists of single-label records, it generates a context-free language.</Paragraph>
      <Paragraph position="3"> Example A small example grammar is shown in figure 1, and generates the language $L(S) = \{s\,s^{hm} \mid s \in (a \cup b)^{+}\}$,</Paragraph>
      <Paragraph position="5"> where $s^{hm}$ is the image of $s$ under the homomorphic mapping that translates each a to c and each b to d. Examples of generated strings are ac, abcd and bbaddc. However, neither abc nor abcdabcd will be generated. The language is not context-free since it contains a combination of multiple and crossed agreement with duplication.
        Figure 1: Example grammar for the language $\{s\,s^{hm} \mid s \in (a \cup b)^{+}\}$.
        $S \to f[A] := \{s = A.p\ A.q\}$
        $A \to g[A_1\ A_2] := \{p = A_1.p\ A_2.p;\ q = A_1.q\ A_2.q\}$
        $A \to ac[\,] := \{p = a;\ q = c\}$
        $A \to bd[\,] := \{p = b;\ q = d\}$</Paragraph>
      <Paragraph position="6"> If there is at most one occurrence of each possible projection $B_i.r$ in a linearization record, the MCFG rule is linear. If all rules are linear, the grammar is linear. A rule is erasing if there are argument projections that have no realization in the linearization. A grammar is erasing if it contains an erasing rule. It is possible to transform an erasing grammar to non-erasing form (Seki et al., 1991). Example The example grammar is both linear and non-erasing. However, given that grammar, the rule</Paragraph>
      <Paragraph position="8"> is both non-linear (since A.p occurs more than once) and erasing (since it does not mention A.q).</Paragraph>
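      <Paragraph><![CDATA[As a rough illustration, the following Python sketch encodes the four rules of figure 1, checks linearity and erasure, and enumerates the strings derivable with small derivation trees. The tuple encoding of rules, the LABELS table, the function names and the rule h (a hypothetical stand-in for a rule that is both non-linear and erasing) are assumptions made for this sketch only.

```python
from itertools import product

# Each rule: (name, category, argument categories, linearization record).
# A row is a list of terminals (str) and projections (argument index, label).
GRAMMAR = [
    ("f", "S", ["A"], {"s": [(0, "p"), (0, "q")]}),
    ("g", "A", ["A", "A"], {"p": [(0, "p"), (1, "p")], "q": [(0, "q"), (1, "q")]}),
    ("ac", "A", [], {"p": ["a"], "q": ["c"]}),
    ("bd", "A", [], {"p": ["b"], "q": ["d"]}),
]

# Labels of each category, read off the rules above.
LABELS = {"S": {"s"}, "A": {"p", "q"}}


def projections(rule):
    """All argument projections mentioned in the rule's linearization record."""
    return [sym for row in rule[3].values() for sym in row if isinstance(sym, tuple)]


def is_linear(rule):
    """Linear: no projection occurs more than once."""
    used = projections(rule)
    return len(used) == len(set(used))


def is_erasing(rule):
    """Erasing: some projection of some argument is never realized."""
    possible = {(i, lab) for i, cat in enumerate(rule[2]) for lab in LABELS[cat]}
    return bool(possible - set(projections(rule)))


def language(cat, depth):
    """String records derivable for cat with derivation trees of height <= depth."""
    if depth == 0:
        return []
    records = []
    for _, lhs, args, lin in GRAMMAR:
        if lhs != cat:
            continue
        for daughters in product(*(language(b, depth - 1) for b in args)):
            record = {
                label: "".join(s if isinstance(s, str) else daughters[s[0]][s[1]] for s in row)
                for label, row in lin.items()
            }
            if record not in records:
                records.append(record)
    return records


# Hypothetical rule that repeats A.p and never mentions A.q.
h = ("h", "E", ["A"], {"r1": [(0, "p")], "r2": [(0, "p")]})
strings = {record["s"] for record in language("S", 4)}
```

Under these assumptions, strings contains ac, abcd and bbaddc but neither abc nor abcdabcd, every rule of figure 1 passes is_linear and fails is_erasing, and the rule h is reported as both non-linear and erasing.
]]></Paragraph>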
    </Section>
    <Section position="3" start_page="11" end_page="11" type="sub_section">
      <SectionTitle>
1.3 Ranges
</SectionTitle>
      <Paragraph position="0"> Given an input string $w$, a range $\rho$ is a pair of indices $(i,j)$ where $0 \leq i \leq j \leq |w|$ (Boullier, 2000). The entire string $w = w_1 \ldots w_n$ spans the range $(0,n)$. The word $w_i$ spans the range $(i-1,i)$ and the substring $w_{i+1} \ldots w_j$ spans the range $(i,j)$. A range with identical indices, $(i,i)$, is called an empty range and spans the empty string.</Paragraph>
      <Paragraph position="1"> A record containing label-range pairs, $\Gamma = \{r_1 = \rho_1; \ldots; r_n = \rho_n\}$,</Paragraph>
      <Paragraph position="3"> is called a range record. Given a range $\rho = (i,j)$, the ceiling of $\rho$ returns an empty range for the right index, $\lceil\rho\rceil = (j,j)$; and the floor of $\rho$ does the same for the left index, $\lfloor\rho\rfloor = (i,i)$. Concatenation of two ranges is partial: $(i,j) \cdot (j',k) = (i,k)$ is defined only if $j = j'$.</Paragraph>
      <Paragraph position="5"> In order to retrieve the ranges of any substring $s$ in a sentence $w = w_1 \ldots w_n$ we define range restriction of $s$ with respect to $w$ as $\langle s \rangle^w = \{(i,j) \mid s = w_{i+1} \ldots w_j\}$, i.e. the set of all occurrences of $s$ in $w$. If $w$ is understood from the context we simply write $\langle s \rangle$.</Paragraph>
      <Paragraph position="6"> Range restriction of a linearization record $\Phi$ is written $\langle \Phi \rangle$, which is a set of records, where every terminal token $s$ is replaced by a range from $\langle s \rangle$. The range restriction of two terminals next to each other fails if range concatenation fails for the resulting ranges. Any unbound variables in $\Phi$ are unaffected by range restriction.</Paragraph>
      <Paragraph position="7"> Example Given the string $w = abba$, range restricting the terminal a yields $\langle a \rangle^w = \{(0,1), (3,4)\}$.</Paragraph>
      <Paragraph position="9"> The other possible solutions fail since they cannot be range concatenated.</Paragraph>
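      <Paragraph><![CDATA[The range operations of this section can be sketched in Python as follows. The sketch assumes that the input is a string whose characters are the tokens; the function names, including restrict_row, are hypothetical and not taken from the paper.

```python
def range_restrict(s, w):
    """All ranges (i, j) with w[i:j] == s, for a non-empty token sequence s."""
    n = len(s)
    return [(i, i + n) for i in range(len(w) - n + 1) if w[i:i + n] == s]


def concat(r1, r2):
    """Partial concatenation: (i, j) . (j', k) = (i, k), defined only if j == j'."""
    (i, j), (j2, k) = r1, r2
    return (i, k) if j == j2 else None


def ceiling(r):
    """Empty range at the right index."""
    return (r[1], r[1])


def floor(r):
    """Empty range at the left index."""
    return (r[0], r[0])


def restrict_row(terminals, w):
    """Range-restrict a non-empty row of adjacent terminals.

    Combinations whose ranges cannot be concatenated are dropped.
    """
    results = [None]  # None stands for "nothing recognized yet"
    for t in terminals:
        step = []
        for r in results:
            for r2 in range_restrict(t, w):
                joined = r2 if r is None else concat(r, r2)
                if joined is not None:
                    step.append(joined)
        results = step
    return results
```

With w = abba, range_restrict("a", w) gives the two occurrences (0, 1) and (3, 4), while restrict_row(["b", "b"], w) keeps only (1, 3): of the four ways to pick a range for each b, three fail because the ranges are not adjacent.
]]></Paragraph>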
    </Section>
  </Section>
  <Section position="4" start_page="11" end_page="12" type="metho">
    <SectionTitle>
2 Parsing as deduction
</SectionTitle>
    <Paragraph position="0"> The idea behind parsing as deduction (Shieber et al., 1995) is to deduce parse items by inference rules. A parse item is a representation of a piece of information that the parsing algorithm has acquired. An inference rule is written $\frac{\gamma_1 \;\ldots\; \gamma_n}{\gamma}\; C$</Paragraph>
    <Paragraph position="2"> where $\gamma$ is the consequence of the antecedents $\gamma_1 \ldots \gamma_n$, given that the side conditions in $C$ hold.</Paragraph>
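    <Paragraph><![CDATA[To make the deductive view concrete, here is a minimal agenda-driven sketch in Python. It is not one of the algorithms of this paper: the toy grammar (in Chomsky normal form, generating a^n b^n), the item format (category, i, j) and all function names are assumptions chosen for illustration.

```python
from collections import deque


def deduce(axioms, infer):
    """Close a set of axioms under the inference function (agenda-driven)."""
    chart = set()
    agenda = deque(axioms)
    while agenda:
        item = agenda.popleft()
        if item in chart:
            continue
        chart.add(item)
        # infer yields every consequence that uses item as one antecedent
        # and items already in the chart as the remaining antecedents.
        agenda.extend(infer(item, chart))
    return chart


# Toy grammar: S -> A B | X B, X -> A S, A -> a, B -> b.
BINARY = {("A", "B"): "S", ("X", "B"): "S", ("A", "S"): "X"}
LEXICAL = {"a": "A", "b": "B"}


def combine(item, chart):
    """Inference rule: [B, i, j] and [C, j, k] give [A, i, k] if A -> B C."""
    cat, i, j = item
    for other_cat, i2, j2 in list(chart):
        if j == i2 and (cat, other_cat) in BINARY:
            yield (BINARY[(cat, other_cat)], i, j2)
        if j2 == i and (other_cat, cat) in BINARY:
            yield (BINARY[(other_cat, cat)], i2, j)


def recognize(words):
    """True if the start category spans the whole input."""
    axioms = [(LEXICAL[w], i, i + 1) for i, w in enumerate(words)]
    return ("S", 0, len(words)) in deduce(axioms, combine)
```

The side condition of the rule (the existence of a grammar rule A -> B C) is checked inside combine, and the chart plays the role of the set of items deduced so far. For example, recognize("aabb") succeeds while recognize("abab") does not.
]]></Paragraph>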
    <Section position="1" start_page="11" end_page="12" type="sub_section">
      <SectionTitle>
2.1 Parsing decorated CFG
</SectionTitle>
      <Paragraph position="0"> Decorated CFG can be parsed in much the same way as standard CFG. For our purposes it suffices to say that the algorithm returns items of the form $[f : A/\rho \to B_1/\rho_1 \ldots B_n/\rho_n]$,</Paragraph>
      <Paragraph position="2"> saying that $A$ spans the range $\rho$, and each daughter $B_i$ spans $\rho_i$.</Paragraph>
      <Paragraph position="3"> The standard inference rule combine might look like this for decorated CFG:</Paragraph>
      <Paragraph position="5"> Note that the subscript $x$ in $B_x$ is the decoration that will only be used in post-processing.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="12" end_page="12" type="metho">
    <SectionTitle>
3 The Naive algorithm
</SectionTitle>
    <Paragraph position="0"> Seki et al. (1991) give an algorithm for MCFG, which can be seen as an extension of the CKY algorithm (Younger, 1967). The problem with that algorithm is that it has to find items for all daughters at the same time. We modify this basic algorithm to be able to find one daughter at a time.</Paragraph>
    <Paragraph position="1"> There are two kinds of items. A passive item $[A; \Gamma]$ has the meaning that the category $A$ has been found spanning the range record $\Gamma$. An active item for the rule $A \to f[\vec{B}\ \vec{B}'] := \Psi$ has the form $[A \to f[\vec{B} \bullet \vec{B}'];\ \Phi;\ \vec{\Gamma}]$</Paragraph>
    <Paragraph position="3"> in which the categories to the left of the dot, $\vec{B}$, have been found with the linearizations in the list of range records $\vec{\Gamma}$. $\Phi$ is the result of substituting the projections in $\Psi$ with ranges for the categories found in $\vec{B}$.</Paragraph>
    <Section position="1" start_page="12" end_page="12" type="sub_section">
      <SectionTitle>
3.1 Inference rules
</SectionTitle>
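      <Paragraph position="0"> There are three inference rules, Predict, Combine and Convert.</Paragraph>
      <Paragraph position="2"> Predict: for a rule $A \to f[\vec{B}] := \Psi$ and a range restriction $\Phi \in \langle \Psi \rangle$, infer the item $[A \to f[\bullet \vec{B}];\ \Phi;\ ]$. Prediction gives an item for every rule in the grammar, where the range restriction $\Phi$ is what has been found from the beginning. The list of daughters is empty since none of the daughters in $\vec{B}$ have been found yet.</Paragraph>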
      <Paragraph position="4"> An active item looking for $B_k$ and a passive item that has found $B_k$ can be combined into a new active item. In the new item we substitute $B_k$ for $\Gamma_k$ in the linearization record. We also add $\Gamma_k$ to the new item's list of daughters.</Paragraph>
      <Paragraph position="6"> Convert yields the passive item $[A; \Gamma]$. Every fully instantiated active item is converted into a passive item. Since the linearization record $\Phi$ is fully instantiated, it is equivalent to the range record $\Gamma$.</Paragraph>
      <Paragraph position="8"> Figure 2 (notes): The subscripted numbers are for distinguishing the two categories from each other, since they are equivalent.</Paragraph>
      <Paragraph position="9"> Here $A.q$ is a context-free category of its own, not a record projection.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="12" end_page="13" type="metho">
    <SectionTitle>
4 The Approximative algorithm
</SectionTitle>
    <Paragraph position="0"> Parsing is performed in two steps in the approximative algorithm. First we parse the sentence using a context-free approximation. Then the resulting context-free chart is recovered into an LCFRS chart.</Paragraph>
    <Paragraph position="1"> The LCFRS is converted by creating a decorated context-free rule for every row in a linearization record. Thus, the rule $A \to f[\vec{B}] := \{r_1 = \alpha_1; \ldots; r_n = \alpha_n\}$ will give $n$ context-free rules $f : A.r_i \to \alpha_i$. The example grammar from figure 1 is converted to a decorated CFG in figure 2.</Paragraph>
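    <Paragraph><![CDATA[The conversion just described can be sketched in Python as below. The tuple encoding of rules and the function name decorate are assumptions of this sketch, and decorating each projection with the 1-based position of its daughter is one plausible reading of the subscripts used in the decorated rules.

```python
# Each LCFRS rule: (name, category, argument categories, linearization record).
# A row is a list of terminals (str) and projections (argument index, label).
GRAMMAR = [
    ("f", "S", ["A"], {"s": [(0, "p"), (0, "q")]}),
    ("g", "A", ["A", "A"], {"p": [(0, "p"), (1, "p")], "q": [(0, "q"), (1, "q")]}),
    ("ac", "A", [], {"p": ["a"], "q": ["c"]}),
    ("bd", "A", [], {"p": ["b"], "q": ["d"]}),
]


def decorate(grammar):
    """Create one context-free rule f : A.r -> alpha per linearization row."""
    cf_rules = []
    for name, cat, args, lin in grammar:
        for label, row in lin.items():
            rhs = []
            for sym in row:
                if isinstance(sym, str):
                    rhs.append(sym)  # terminals are kept as they are
                else:
                    i, lab = sym
                    # The projection B.r becomes a context-free category,
                    # decorated with the position of the daughter (1-based).
                    rhs.append((args[i] + "." + lab, i + 1))
            cf_rules.append((name, cat + "." + label, rhs))
    return cf_rules


RULES = decorate(GRAMMAR)
```

For the grammar of figure 1 this yields seven context-free rules, for instance a rule named g with left-hand side A.p and right-hand side (A.p, 1) (A.p, 2). The two rows of a rule become independent context-free rules, which is why the resulting grammar over-generates.
]]></Paragraph>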
    <Paragraph position="2"> Parsing is now initiated by a context-free parsing algorithm returning decorated items as in section 2.1. Since the categories of the decorated grammar are projections of LCFRS categories, the final items will be of the form $[f : (A.r)/\rho \to \ldots (B.r')_x/\rho' \ldots]$.</Paragraph>
    <Paragraph position="4"> Since the decorated CFG is over-generating, the returned parse chart is unsound. We therefore need to retrieve the items from the decorated CFG parse chart and check them against the LCFRS to get the discontinuous constituents and mark them for validity.</Paragraph>
    <Paragraph position="5"> The initial parse items are of the form $[A \to f[\vec{B}];\ r = \rho;\ \vec{\Gamma}]$, where $\vec{\Gamma}$ is extracted from a corresponding decorated item $[f : (A.r)/\rho \to \beta]$ by partitioning the daughters in $\beta$ such that $\Gamma_i = \{r = \rho \mid (B.r)_i/\rho \in \beta\}$. In other words, $\Gamma_i$ will consist of all $r = \rho$ such that $B.r$ is subscripted by $i$ in the decorated item.</Paragraph>
    <Paragraph position="6"> Example Given $\beta = (A.p)_2/\rho'\ (B.q)_1/\rho''\ (A.q)_2/\rho'''$, we get the two range records $\Gamma_1 = \{q = \rho''\}$ and $\Gamma_2 = \{p = \rho'; q = \rho'''\}$.</Paragraph>
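    <Paragraph><![CDATA[The partitioning of decorated daughters into range records can be sketched in Python as follows. The encoding of a decorated daughter as ((category, label, index), range), the function name partition and the concrete ranges in the example are assumptions made for illustration.

```python
def partition(daughters, arity):
    """Build one range record per daughter from a decorated right-hand side.

    Each element of daughters is ((category, label, index), range), where
    index is the 1-based subscript carried by the decorated category.
    """
    records = [dict() for _ in range(arity)]
    for (category, label, index), rng in daughters:
        records[index - 1][label] = rng
    return records


# The example from the text, with made-up ranges standing in for the three ranges.
beta = [
    (("A", "p", 2), (0, 1)),
    (("B", "q", 1), (1, 2)),
    (("A", "q", 2), (2, 3)),
]
GAMMAS = partition(beta, 2)
```

Here GAMMAS holds the record with the single label q for the first daughter and the record with labels p and q for the second daughter, mirroring the two range records of the example.
]]></Paragraph>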
    <Paragraph position="8"> Apart from the initial items, we use three kinds of parse items. From the initial parse items we first build LCFRS items, of the form $[A \to f[\vec{B}];\ \Gamma \bullet r_i \ldots r_n;\ \vec{\Gamma}]$ where $r_i \ldots r_n$ is a list of labels, $\vec{\Gamma}$ is a list of $|\vec{B}|$ range records, and $\Gamma$ is a range record for the labels $r_1 \ldots r_{i-1}$. In order to recover the chart we use mark items</Paragraph>
    <Paragraph position="10"> The idea is that $\vec{\Gamma}$ has been verified as range records spanning the daughters $\vec{B}$. When all daughters have been verified, a mark item is converted to a passive item $[A; \Gamma]$.</Paragraph>
    <Section position="1" start_page="13" end_page="13" type="sub_section">
      <SectionTitle>
4.1 Inference rules
</SectionTitle>
      <Paragraph position="0"> There are five inference rules, Pre-Predict, Pre-Combine, Mark-Predict, Mark-Combine and Convert.</Paragraph>
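      <Paragraph position="2"> Pre-Predict: every rule in the grammar is predicted as an LCFRS item. Since the context-free items contain information about $\alpha_1 \ldots \alpha_n$, we only need to use the labels $r_1, \ldots, r_n$. $\vec{\Gamma}_\delta$ is a list of $|\vec{B}|$ empty range records.</Paragraph>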
      <Paragraph position="4"> Pre-Combine yields the item $[R;\ \{\Gamma; r = \rho\} \bullet r_i \ldots r_n;\ \vec{\Gamma}'']$. If there is an initial parse item for the rule $R$ with label $r$, we can combine it with an LCFRS item looking for $r$, provided the daughters' range records can be unified.</Paragraph>
      <Paragraph position="6"> When all record labels have been found, we can start to check if the items have been derived in a valid way by marking the daughters' range records for correctness.</Paragraph>
      <Paragraph position="8"> Convert: an item that has marked all daughters as correct is converted to a passive item $[A; \Gamma]$.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="13" end_page="14" type="metho">
    <SectionTitle>
5 The Active algorithm
</SectionTitle>
    <Paragraph position="0"> The active algorithm parses without using any context-free approximation. Compared to the Naive algorithm the dot is used to traverse the linearization record of a rule instead of the categories in the right-hand side.</Paragraph>
    <Paragraph position="1"> For this algorithm we use a special kind of range, $\rho^{\varepsilon}$, which denotes simultaneously all empty ranges $(i,i)$. Range restricting the empty string gives $\langle \varepsilon \rangle = \rho^{\varepsilon}$. Concatenation is defined as $\rho \cdot \rho^{\varepsilon} = \rho^{\varepsilon} \cdot \rho = \rho$. Both the ceiling and the floor of $\rho^{\varepsilon}$ are identities, $\lceil \rho^{\varepsilon} \rceil = \lfloor \rho^{\varepsilon} \rfloor = \rho^{\varepsilon}$. There are two kinds of items. Passive items $[A; \Gamma]$ say that we have found category $A$ inside the range record $\Gamma$.</Paragraph>
    <Paragraph position="2"> An active item for the rule $A \to f[\vec{B}] := \{\Phi; r = \alpha\beta; \Psi\}$ is of the form $[A \to f[\vec{B}];\ \Gamma,\ r = \rho \bullet \beta,\ \Psi;\ \vec{\Gamma}]$ where $\Gamma$ is a range record corresponding to the linearization rows in $\Phi$ and $\alpha$ has been recognized spanning $\rho$. We are still looking for the rest of the row, $\beta$, and the remaining linearization rows $\Psi$. $\vec{\Gamma}$ is a list of range records containing information about the daughters $\vec{B}$.</Paragraph>
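    <Paragraph><![CDATA[The special empty range introduced for the Active algorithm can be sketched in Python by adding a sentinel value to ordinary (i, j) pairs. The sentinel EPS and the function names are assumptions of this sketch.

```python
EPS = "eps"  # sentinel standing for the range that denotes every empty range (i, i)


def concat(r1, r2):
    """Range concatenation in which EPS acts as a neutral element."""
    if r1 == EPS:
        return r2
    if r2 == EPS:
        return r1
    (i, j), (j2, k) = r1, r2
    return (i, k) if j == j2 else None  # undefined unless the ranges are adjacent


def ceiling(r):
    """Empty range at the right index; EPS is mapped to itself."""
    return EPS if r == EPS else (r[1], r[1])


def floor(r):
    """Empty range at the left index; EPS is mapped to itself."""
    return EPS if r == EPS else (r[0], r[0])
```

Because EPS is neutral for concatenation, a linearization row can be started before anything is known about its position in the input; the position is fixed by the first ordinary range that is concatenated to it.
]]></Paragraph>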
    <Section position="1" start_page="13" end_page="14" type="sub_section">
      <SectionTitle>
5.1 Inference rules
</SectionTitle>
      <Paragraph position="0"> There are five inference rules, Predict, Complete, Scan, Combine and Convert.</Paragraph>
      <Paragraph position="2"> For every rule in the grammar, predict a corresponding item that has found the empty range. $\vec{\Gamma}_\delta$ is a list of $|\vec{B}|$ empty range records since nothing has been found yet.</Paragraph>
      <Paragraph position="3"> Complete $\frac{[R;\ \Gamma,\ r = \rho \bullet \varepsilon,\ \{r' = \alpha; \Phi\};\ \vec{\Gamma}]}{[R;\ \{\Gamma; r = \rho\},\ r' = \rho^{\varepsilon} \bullet \alpha,\ \Phi;\ \vec{\Gamma}]}$ When an item has found an entire linearization row we continue with the next row by starting it off with the empty range.</Paragraph>
      <Paragraph position="5"> When the next symbol to read is a terminal, its range restriction is concatenated with the range for what the row has found so far.</Paragraph>
      <Paragraph position="6"> If the next thing to find is a projection on $B_i$, and there is a passive item where $B_i$ is the category and where $\Gamma'$ is consistent with $\Gamma_i$, we can move the dot past the projection. $\Gamma_i$ is updated with $\Gamma'$, since it might contain more information about the $i$th daughter. Convert $\frac{[A \to f[\vec{B}];\ \Gamma,\ r = \rho \bullet \varepsilon,\ \{\};\ \vec{\Gamma}]}{[A;\ \{\Gamma; r = \rho\}]}$ An active item that has fully recognized all its linearization rows is converted to a passive item.</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="14" end_page="14" type="metho">
    <SectionTitle>
6 The Incremental algorithm
</SectionTitle>
    <Paragraph position="0"> An incremental algorithm reads one token at a time and calculates all possible consequences of the token before the next token is read.$^{2}$ The Active algorithm as described above is not incremental, since we do not know in which order the linearization rows of a rule are recognized. To be able to parse incrementally, we have to treat the linearization records as sets of feature-value pairs rather than as sequences.</Paragraph>
    <Paragraph position="1"> The items for a rule $A \to f[\vec{B}] := \Phi$ have the same form as in the Active algorithm: $[A \to f[\vec{B}];\ \Gamma,\ r = \rho \bullet \beta,\ \Psi;\ \vec{\Gamma}]$. However, the order between the linearization rows does not have to be the same as in $\Phi$. Note that in this algorithm we do not use passive items. Also note that since we always know where in the input we are, we cannot make use of a distinguished $\varepsilon$-range. Another consequence of knowing the current input position is that there are fewer possible matches for the Combine rule.</Paragraph>
    <Paragraph position="2"> $^{2}$ See e.g. the ACL 2004 workshop &quot;Incremental Parsing: Bringing Engineering and Cognition Together&quot;.</Paragraph>
    <Section position="1" start_page="14" end_page="14" type="sub_section">
      <SectionTitle>
6.1 Inference rules
</SectionTitle>
      <Paragraph position="0"> There are four inference rules, Predict, Complete, Scan and Combine.</Paragraph>
      <Paragraph position="2"> An item is predicted for every linearization row $r$ and every input position $k$. $\vec{\Gamma}_\delta$ is a list of $|\vec{B}|$ empty range records.</Paragraph>
      <Paragraph position="3"> Complete $\frac{[R;\ \Gamma,\ r = \rho \bullet \varepsilon,\ \{\Phi; r' = \alpha; \Psi\};\ \vec{\Gamma}]}{[R;\ \{\Gamma; r = \rho\},\ r' = (k,k) \bullet \alpha,\ \{\Phi; \Psi\};\ \vec{\Gamma}]}\; \lceil\rho\rceil \leq k \leq |w|$ Whenever a linearization row $r$ is fully traversed, we predict an item for every remaining linearization row $r'$ and every remaining input position $k$.</Paragraph>
    </Section>
    <Section position="2" start_page="14" end_page="14" type="sub_section">
      <SectionTitle>
Scan
</SectionTitle>
      <Paragraph position="0"> $\frac{[R;\ \Gamma,\ r = \rho \bullet s\,\alpha,\ \Phi;\ \vec{\Gamma}]}{[R;\ \Gamma,\ r = \rho' \bullet \alpha,\ \Phi;\ \vec{\Gamma}]}\; \rho' \in \rho \cdot \langle s \rangle$ If the next symbol in the linearization row is a terminal, its range restriction is concatenated to the range for the partially recognized row.</Paragraph>
      <Paragraph position="1"> Combine, first antecedent: $[R;\ \Gamma,\ r = \rho \bullet B_i.r'\,\alpha,\ \Phi;\ \vec{\Gamma}]$</Paragraph>
      <Paragraph position="3"> Consequence: $[R;\ \Gamma,\ r = \rho'' \bullet \alpha,\ \Phi;\ \vec{\Gamma}[i := \{\Gamma'; r' = \rho'\}]]$ If the next item is a record projection $B_i.r'$, and there is an item for $B_i$ which has found $r'$, then move the dot forward. The information in $\Gamma_i$ must be consistent with the information found for the $B_i$ item, $\{\Gamma'; r' = \rho'\}$.</Paragraph>
    </Section>
  </Section>
</Paper>