<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0307">
  <Title>A Statistical Constraint Dependency Grammar (CDG) Parser</Title>
  <Section position="7" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
5 Evaluation and Discussion
</SectionTitle>
    <Paragraph position="0"> All of the evaluations were performed on the Wall Street Journal Penn Treebank task. Following the traditional data setup, sections 02-21 are used for training our parser, section 23 is used for testing, and section 24 is used as the development set for parameter tuning and debugging. As in (Ratnaparkhi, 1999; Charniak, 2000; Collins, 1999), we evaluate on all sentences with length * 40 words (2,245 sentences) and length * 100 words (2,416 sentences).</Paragraph>
    <Paragraph position="1"> For training our probabilistic CDG parser on this task, the CFG bracketing of the training set is trans-</Paragraph>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
BASIC PARSING ALGORITHM
</SectionTitle>
    <Paragraph position="0"> 1. Using SuperARV tagging on word sequence w1; : : : ; wn, obtain a set of N-best SuperARV sequences with each element consisting of n (word, SuperARV) tuples, denoted hw1; s1i; : : : ;hwn; sni, which we will call an assignment.</Paragraph>
    <Paragraph position="1"> 2. For each SuperARV assignment, initialize the stack of parse prefixes with this assignment: =/ From left-to-right, process each hword; tagi of the assignment and generate parse prefixes /= for k: = 1; n do =/ Step a: /= /* decide left dependents of hwk; ski from the nearest to the farthest */ for c from 0 to N(L(sk)) ! 1 do =/ Choose a position for the (c + 1)th left dependent of hwk; ski from the set of possible positions</Paragraph>
    <Paragraph position="3"> =/ In the following equations, different left dependent assignments will generate different parse prefixes, each of which is stored in the stack / = for each dep(k;!(c + 1)) from positions C = f1; : : : ; dep(k;!c) ! 1g =/ Check whether the lexical category of the choice matches the modifiee lexical category of the (c + 1)th left dependent of hwk; ski/ =</Paragraph>
    <Paragraph position="5"> =/ End of choosing left dependents of hwk; ski for this parse prefix /= =/ Step b: /= =/ For the word/tag pair hwk; ski, check whether it could be a right dependent of any previously seen word within a parse prefix of hw1; s1i; : : : ;hwk!1; sk!1i/= for p: = 1; k ! 1 do =/ If hwp; spi still has right dependents left unspecified, then try outhwk; ski as a right dependent */ if D(R(sp)) 6= N(R(sp)) then d : = D(R(sp)) =/ If the lexical category of hwk; ski matches the modifiee lexical category of the(d + 1)th right dependent of hwp; spi; then sk might be hwp; spi's (d + 1)th right dependent / = if Cat(sk) == ModCat(sp; d + 1) then Pr(T) : = Pr(T) PS Pr(link(sk; sp; d + 1)jsyn;H), where H = hw; sip;hw; sik;hw; sidep(p;d)dep(p;1) Sort the parse prefixes in the stack according to logPr(T) and apply pruning using the thresholds. 3. After processing w1; : : : ; wn, pick the parse with the highest logPr(T) in the stack as the parse for that sentence.  prefix hypotheses incrementally when processing each input word.</Paragraph>
    <Paragraph position="6"> formed into CDG annotations using a CFG-to-CDG transformer (Wang, 2003). Note that the soundness of the CFG-to-CDG transformer was evaluated by examining the CDG parses generated from the transformer on the Penn Treebank development set to ensure that they were correct given our grammar definition.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Contribution of Model Factors
</SectionTitle>
      <Paragraph position="0"> First, we investigate the contribution of the model additions described in Section 3 to parse accuracy.</Paragraph>
      <Paragraph position="1"> Since these factors are independent of the coupling between the SuperARV tagger and modifiee specification, we investigate their impact on a loosely integrated SCDG parser by comparing four models: (1) the basic loosely integrated model; (2) the basic model with crossing dependencies; (3) model 2 with distance and barrier information; (4) model 3 with SuperARVs augmented with additional modifiee lexical feature constraints. Each model uses a trigram SuperARV tagger to generate 40-best SuperARV sequences prior to modifiee specification.</Paragraph>
      <Paragraph position="2"> Table 2 shows the results for each of the four models including SuperARV tagging accuracy (%) and role value labeled precision and recall (%). Allowing crossing dependencies improves the overall parsing accuracy, but using distance information with verb barrier and punctuation heuristics produces an even greater improvement especially on the longer sentences. The accuracy is further improved by the additional modifiee lexical feature constraints added to the SuperARVs. Note that RLR is lower than RLP in these investigations possibly due to SuperARV tagging errors and the use of a tight stack pruning threshold.</Paragraph>
      <Paragraph position="3"> Next, we evaluate the impact of increasing the context of the SuperARV tagger to a 4-gram while increasing the size of the N-best list passed from the tagger to the modifiee specification step of the parser. For this evaluation, we use model (4)  bank for four loosely-coupled model variations. The evaluation metrics, RLR and RLP, are our dependency-based role value labeled precision and recall. Note: Model (1) denotes the basic model, Model (2) denotes (1)+crossing dependencies, Model (3) denotes (2)+distance (punctuation) model, and Model (4) denotes  from Table 2, the most accurate model so far. We also evaluate whether a tight integration of left-to-right SuperARV tagging and modifiee specification produces a greater parsing accuracy than the best loosely coupled counterpart. Table 3 shows the SuperARV tagging accuracy (%) and role value labeled precision and recall (%) for each model.</Paragraph>
      <Paragraph position="4"> Consistent with our intuition, a stronger SuperARV tagger and a larger search space of SuperARV sequences produces greater parse accuracy. However, tightly integrating SuperARV prediction with modifiee specification achieves the greatest overall accuracy. Note that SuperARV tagging accuracy and parse accuracy improve in tandem, as can be seen in Tables 2 and 3. These results are consistent with the observations of (Collins, 1999) and (Eisner, 1996). It is important to note that each of the factors contributing to improved parse accuracy in these two experiments also improved the word prediction capability of the corresponding parser-based LM (Wang and Harper, 2003).</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Comparing to Other Parsers
</SectionTitle>
      <Paragraph position="0"> Charniak's state-of-the-art PCFG parser (Charniak, 2000) has achieved the highest PARSEVAL LP/LR when compared to Collins' Model 2 and Model</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>