<?xml version="1.0" standalone="yes"?>
<Paper uid="M92-1015">
  <Title>The &quot;ALL TEMPLATES&quot; results of our &quot;official&quot; runs were as follows: RECALL PRECISION</Title>
  <Section position="8" start_page="125" end_page="126" type="evalu">
    <SectionTitle>
GRAMMAR EVALUATION
</SectionTitle>
    <Paragraph position="0"> To understand why some systems did better than others, we need some glass-box evaluation of individual components. As we know, it is very hard to define any glass-box evaluation which can be applied across systems.</Paragraph>
    <Paragraph position="1">  We have experimented with one aspect of this, grammar (parse) evaluation, which can at least be applied across those systems which generate a full sentence parse.</Paragraph>
    <Paragraph position="2"> We use as our standard for comparison the Univ. of Pennsylvania Tree Bank, which includes parse trees for a portion of the MUC terrorist corpus. We take our parse trees, restructure them (automatically) to conform better to the Penn parses, strip labels from brackets, and then compare the bracket structure to that of the Tree Bank. The result is a recall/precision score which should be meaningful across systems.</Paragraph>
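The unlabeled bracket comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes parses are nested lists whose leaves are word strings, and it collapses duplicate spans (e.g. from unary chains) into a set for simplicity.

```python
# Hedged sketch of unlabeled (label-stripped) bracket scoring against a
# gold Tree Bank parse. Tree representation is an assumption: nested
# Python lists, with strings as leaf words.

def bracket_spans(tree, start=0):
    """Collect (start, end) word-index spans for every constituent."""
    spans = []
    pos = start
    if isinstance(tree, str):           # a leaf covers one word
        return spans, pos + 1
    for child in tree:
        child_spans, pos = bracket_spans(child, pos)
        spans.extend(child_spans)
    spans.append((start, pos))          # span of this constituent
    return spans, pos

def bracket_recall_precision(candidate, gold):
    """Compare the candidate's bracket structure to the gold parse.

    Recall    = matched brackets / gold brackets
    Precision = matched brackets / candidate brackets
    """
    cand = set(bracket_spans(candidate)[0])
    ref = set(bracket_spans(gold)[0])
    matched = len(cand & ref)
    return matched / len(ref), matched / len(cand)
```

For example, scoring a candidate that posits an extra one-word constituent yields full recall but reduced precision, since the spurious bracket has no match in the gold parse.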
    <Paragraph position="3"> We have experimented with a number of parsing strategies, and found that parse recall is well correlated with template recall [2].</Paragraph>
    <Paragraph position="4"> In principle, we would like to try to extend these comparisons to &quot;deeper&quot; relations, such as functional subject/object relations. These will be harder to define, but may be applicable over a broader range of systems.</Paragraph>
  </Section>
</Paper>