<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2038">
  <Title>Speeding Up Full Syntactic Parsing by Leveraging Partial Parsing Decisions</Title>
  <Section position="4" start_page="295" end_page="296" type="intro">
    <SectionTitle>
2 Methods
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="295" end_page="295" type="sub_section">
      <SectionTitle>
2.1 Software Optimizations
</SectionTitle>
      <Paragraph position="0"> While each of the following optimizations individually had a smaller effect on our parser's speed than the CYK restrictions, collectively these simple software-engineering improvements resulted in the largest speed increase to our syntactic parser.</Paragraph>
      <Paragraph position="1"> In the experiments section, we will refer to this as the Optimized version.</Paragraph>
      <Paragraph position="2"> Optimization of the training data and internal symbolization: We discovered that our parser was bound by the number of probability hash-table lookups. We changed the format of our training data/hash keys to make them as short as possible, eliminating delimiters and using integers to represent the closed set of POS tags seen in the training data, reducing two- to four-byte POS tags such as VP or ADJP to single-byte integers. In the most extreme cases, this reduces the length of non-word characters in a hash key from 28 characters to 6. The training data takes up less space, hashes faster, and many string comparisons are reduced to simple integer comparisons. Optimization of our hash-table implementation: The majority of lookups in the hash table at runtime were for non-existent keys. We put a Bloom filter on each hash bucket so that such lookups would often be trivially rejected, instead of having to compare the lookup key with every key in the bucket. We also switched to the Fowler/Noll/Vo (Noll, 2005) hash function, which is faster and has fewer collisions than our previous hash function.</Paragraph>
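As an illustration of the hashing change, here is a minimal sketch of the 32-bit FNV-1a variant of the Fowler/Noll/Vo hash; the paper does not say which FNV variant or bit width the parser adopted, so this choice is an assumption for illustration only:

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash (Fowler/Noll/Vo), assumed variant."""
    h = 0x811C9DC5                           # FNV-1a 32-bit offset basis
    for byte in data:
        h ^= byte                            # xor in the next input byte
        h = (h * 0x01000193) & 0xFFFFFFFF    # multiply by FNV prime, keep 32 bits
    return h

# Shorter keys hash faster: a single-byte tag id vs. a multi-byte string label.
fnv1a_32(b"\x07")   # hypothetical integer-coded POS tag
fnv1a_32(b"ADJP")   # original string tag
```

The single-byte encoding also shortens every hash-key traversal, which is the effect the symbolization change exploits.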
      <Paragraph position="3"> Optimization of critical areas: There were several areas in our code that were optimized after profiling our parser.</Paragraph>
      <Paragraph position="4"> Rule-based pre/post-processing: We were able to obtain very minor increases in precision, recall, and speed by adding hard-coded rules to our parser to handle constructions it otherwise handled poorly, specifically parenthetical phrases and quotations.</Paragraph>
    </Section>
    <Section position="2" start_page="295" end_page="296" type="sub_section">
      <SectionTitle>
2.2 CYK restrictions
</SectionTitle>
      <Paragraph position="0"> In this section, we describe modifications that restrict the chart search based on the output of a partial parser (in this case, a chunker) that marks groups of constituents.</Paragraph>
      <Paragraph position="1"> First, we define a span to be a pair c = (s, t), where s is the index of the first word in the span and t is the index of the last word in the span. We then define a set S, where S is the set of spans c1, . . . , cn that represent the restrictions placed on the CYK parse. We say that c1 and c2 overlap iff s1 &lt; s2 ≤ t1 &lt; t2 or s2 &lt; s1 ≤ t2 &lt; t1, and we denote this as c1 ∼ c2.2 When using the output of a chunker, S is the set of spans that describe the non-VP, non-PP chunks where ti − si &gt; 0.</Paragraph>
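The crossing condition above translates directly into a predicate; a minimal sketch in Python (function and variable names are ours, not the paper's):

```python
def overlaps(c1, c2):
    """True iff spans c1 = (s1, t1) and c2 = (s2, t2) cross, i.e.
    s1 < s2 <= t1 < t2 or s2 < s1 <= t2 < t1 (the paper's definition)."""
    (s1, t1), (s2, t2) = c1, c2
    return s1 < s2 <= t1 < t2 or s2 < s1 <= t2 < t1
```

Note that nested spans and spans sharing a boundary do not count as overlapping: overlaps((1, 4), (2, 3)) is False, while the crossing pair overlaps((1, 3), (3, 4)) is True.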
      <Paragraph position="2"> During the CYK parse, after a span's start and end points are selected, but before iterating across all splits of that span and their generative rules, we propose that the span in question be checked to make sure that it does not overlap with any span in the set S. We give the pseudocode in Algorithm 1, which is a modification of the parse() function given in Appendix B of (Collins, 1999).
Algorithm 1 The modified parse() function
initialize()
for span = 2 to n do
  for start = 1 to n - span + 1 do</Paragraph>
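The restriction check can be sketched as a plain loop over chart spans; this is a simplified illustration, not the paper's implementation (the split iteration and generative-rule scoring are omitted, and the overlap predicate is repeated for self-containment):

```python
def overlaps(c1, c2):
    """True iff the two spans cross, per the definition in the text."""
    (s1, t1), (s2, t2) = c1, c2
    return s1 < s2 <= t1 < t2 or s2 < s1 <= t2 < t1

def restricted_spans(n, S):
    """Enumerate the chart spans a restricted CYK pass would fill for an
    n-word sentence, skipping any span that crosses a chunk span in S."""
    kept = []
    for span in range(2, n + 1):              # span length, as in Algorithm 1
        for start in range(1, n - span + 2):  # start = 1 .. n - span + 1
            end = start + span - 1
            if any(overlaps((start, end), c) for c in S):
                continue                      # rejected before trying any splits
            kept.append((start, end))         # splits and rules would be tried here
    return kept
```

On the five-word example below (S = {(1, 3)}), this loop skips exactly the four spans (2, 4), (2, 5), (3, 4), and (3, 5) named in the text.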
      <Paragraph position="4"> For example, given the chunk parse: [The red balloon]NP [flew]VP [away]ADVP, S = {(1, 3)} because there is only one chunk with a length greater than 1.</Paragraph>
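Building S from chunker output is a simple filter; a sketch assuming chunks arrive as hypothetical (label, start, end) triples:

```python
def restriction_set(chunks):
    """Build S from chunker output: keep non-VP, non-PP chunks
    that span more than one word (t - s > 0)."""
    return {(s, t) for label, s, t in chunks
            if label not in ("VP", "PP") and t - s > 0}

# The example chunk parse above, as (label, start, end) triples:
chunks = [("NP", 1, 3), ("VP", 4, 4), ("ADVP", 5, 5)]
```

Here restriction_set(chunks) yields {(1, 3)}: the VP is excluded by label, and the single-word ADVP fails the t - s &gt; 0 test.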
      <Paragraph position="5"> Suppose we are analyzing the span (3, 4) in the example sentence above. This span will be rejected, as it overlaps with the chunk (1, 3); the leaf nodes balloon and flew are not going to be children of the same parse-tree parent node. Thus, this method does not compute the generative rules for any of the splits of the spans {(2, 4), (2, 5), (3, 4), (3, 5)}. This also reduces the number of calculations done when computing higher spans: when computing (1, 4) in this example, time is saved since the spans (2, 4) and (3, 4) were not considered. This example is visualized in Figure 2.</Paragraph>
      <Paragraph position="6"> A more complex, real-world example from section 23 of the Treebank is visualized in Figure 3. (Footnote 2: This notation was originally used in (Carreras et al., 2002).)</Paragraph>
      <Paragraph position="7"> Figure 3 uses the sentence Under an agreement signed by the Big Board and the Chicago Mercantile Exchange, trading was temporarily halted in Chicago. This sentence has three usable chunks: [an agreement]NP, [the Big Board]NP, and [the Chicago Mercantile Exchange]NP. This example shows the effects of the above algorithm on a longer sentence with multiple chunks.</Paragraph>
      <Paragraph position="8"> (Figure caption, partially recovered:) ...spans are not calculated, while half-toned box spans do not have to calculate as many possibilities because they depend on an uncalculated span.</Paragraph>
    </Section>
  </Section>
</Paper>