<?xml version="1.0" standalone="yes"?>
<Paper uid="H94-1051">
  <Title>AUTOMATIC GRAMMAR ACQUISITION</Title>
  <Section position="7" start_page="269" end_page="270" type="evalu">
    <SectionTitle>
5. EXPERIMENTAL RESULTS
</SectionTitle>
    <Paragraph position="0"> Each of the grammars was learned from a training set of 731 sentences (16,733 words) from the Wall Street Journal Treebank corpus. A separate test set of 49 sentences (1,289 words) was compiled from the same corpus. Parse quality was evaluated using Parseval, which reports three measures of correctness: recall, precision, and crossings. Each parse tree to be evaluated (the candidate parse) is compared against the corresponding parse found in the Treebank (the standard parse).</Paragraph>
    <Paragraph position="1"> Recall measures the percentage of the constituents in the standard parse which are present in the candidate parse.</Paragraph>
    <Paragraph position="2"> Precision measures the percentage of the constituents in the candidate parse which are correct (i.e., present in the standard parse). Crossings measures the number of constituents in the candidate parse which are incompatible with the constituents in the standard parse, where incompatibility means that the constituent crosses brackets with a constituent in the standard parse.</Paragraph>
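    The three measures defined above can be illustrated with a minimal sketch. It is not the Parseval tool itself: it assumes each constituent is represented as an unlabeled (start, end) word span with an exclusive end, and the function name parseval_scores and the example spans are hypothetical.

```python
def parseval_scores(candidate, standard):
    """Return (recall, precision, crossings) for one sentence.

    Assumption: each parse is given as a collection of (start, end)
    word spans, end exclusive, one span per constituent.
    """
    cand = set(candidate)
    gold = set(standard)
    matched = cand.intersection(gold)

    # Recall: share of standard constituents found in the candidate.
    recall = len(matched) / len(gold) if gold else 0.0
    # Precision: share of candidate constituents found in the standard.
    precision = len(matched) / len(cand) if cand else 0.0

    def crosses(a, b):
        # Two spans cross when they overlap but neither contains the other.
        (s1, e1), (s2, e2) = a, b
        return (s2 > s1 and e1 > s2 and e2 > e1) or (s1 > s2 and e2 > s1 and e1 > e2)

    # Crossings: candidate constituents that cross at least one standard one.
    crossings = sum(1 for c in cand if any(crosses(c, g) for g in gold))
    return recall, precision, crossings


# Hypothetical five-word sentence.
standard_parse = [(0, 5), (0, 2), (2, 5), (3, 5)]
candidate_parse = [(0, 5), (0, 2), (1, 3), (3, 5)]
print(parseval_scores(candidate_parse, standard_parse))  # (0.75, 0.75, 1)
```

    In this example three of the four candidate spans also appear in the standard parse, so recall and precision are both 0.75, and the span (1, 3) crosses the standard span (0, 2), giving one crossing. Reporting a per-sentence crossing count and averaging over the test set is one plausible reading of the average number of crossings per sentence.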
    <Paragraph position="3"> For more details on the evaluation procedure, see [Black, et al., 91]. The results of the test are shown in Table 2. As expected, the performance of the simple context-free grammar is substantially worse than the performance of both the context-dependent grammar (CDG) and the probabilistic context-free grammar (P-CFG). It is interesting to note that although recall for the P-CFG and CDG is essentially equal, the P-CFG has a higher precision. This suggests that probabilistic modeling is more successful at reducing overgeneration than simple examination of context.</Paragraph>
    <Paragraph position="4"> The P-CFG also shows a lower average number of crossings per sentence.</Paragraph>
  </Section>
</Paper>