<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1063">
  <Title>A NEW APPROACH TO TEXT UNDERSTANDING</Title>
  <Section position="5" start_page="318" end_page="319" type="metho">
    <SectionTitle>
&quot;POLICE HAVE REPORTED THAT TERRORISTS
TONIGHT BOMBED THE EMBASSIES OF THE PRC
AND THE SOVIET UNION. THE BOMBS CAUSED
</SectionTitle>
    <Paragraph position="0"> Precision, the percentage of all extracted information that was extracted correctly, should be relatively unaffected in a compositional, domain-independent system. That is, if the lexicon is declarative rather than itself containing rules, the quality of the answers produced should be unaffected.</Paragraph>
    <Paragraph position="1"> In tests corresponding to the plotted recall data, the difference in precision between having only 20% of the lexicon and having the full lexicon was only 2%.</Paragraph>
    <Section position="1" start_page="319" end_page="319" type="sub_section">
      <SectionTitle>
3.2 Deterministic parser and grammar of English versus Fragment combining
</SectionTitle>
      <Paragraph position="1"> In the experiment reported here, only a small set of fragment combining rules was tested: those deemed most useful for extracting information from MUC3. No attempt has been made to cover the full variety of English syntax. The fragment combining rules, ranked by frequency of occurrence in the experiment, are as follows:
- PP attachment to an NP (55%)
- PP attachment to a VP (14%)
- merging of several N's into a single NP (13%)
- combining appositive NPs (7%)
- attaching a conjoined NP (6%)
- PP attachment to an ADJP (3%)
- attaching time NP to VP (1%)
- repairing dates (&lt; 1%)
To evaluate the relative contribution of the deterministic parser and the fragment combining component, we used recently developed grammar evaluation software [Black, et al., 1991]. This software uses TREEBANK parse trees as a reference answer. To factor out most grammatical idiosyncrasies where legitimate theoretical differences may exist, a TREEBANK tree is reduced by a homomorphism to essential phrase bracketings, such as that in Figure 5. The user of the evaluation software then writes a homomorphism component that reduces his/her parser's output to a similar bracketed form. Then a comparator in the evaluation software counts three things:
* Recall, the number of bracketed phrases in both answers divided by the number of bracketed phrases in the reference answer</Paragraph>
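      <Paragraph position="2"> The recall measure described above can be illustrated with a minimal Python sketch. It is not the evaluation software itself: the nested-list tree encoding, the function names bracket_spans and bracket_recall, and the example sentence are all assumptions made for illustration. Trees are reduced to unlabeled token spans, and recall is the share of reference spans that also appear in the parser's output.

```python
def bracket_spans(tree, start=0):
    """Reduce a nested-list tree to its set of (start, end) token spans.

    A tree is a list whose items are tokens (str) or subtrees (list).
    Node labels are deliberately absent, so only the bracketing remains.
    Returns the span set and the position just past the last token.
    """
    spans = set()
    pos = start
    for child in tree:
        if isinstance(child, str):
            pos += 1  # a token occupies one position
        else:
            child_spans, pos = bracket_spans(child, pos)
            spans |= child_spans
    spans.add((start, pos))  # the bracket for this node itself
    return spans, pos


def bracket_recall(reference_tree, parser_tree):
    """Brackets found in both answers divided by brackets in the reference."""
    reference_spans, _ = bracket_spans(reference_tree)
    parser_spans, _ = bracket_spans(parser_tree)
    shared = reference_spans.intersection(parser_spans)
    return len(shared) / len(reference_spans)


# Hypothetical sentence: "the police reported the bombing of the embassy"
reference = [["the", "police"],
             ["reported", [["the", "bombing"], ["of", ["the", "embassy"]]]]]
# Hypothetical parser output that leaves the PP as an unattached fragment.
candidate = [["the", "police"],
             ["reported", ["the", "bombing"]],
             ["of", ["the", "embassy"]]]

recall = bracket_recall(reference, candidate)
print(f"bracket recall = {recall:.3f}")
```

 In this hypothetical case the parser output shares 5 of the 7 reference brackets, so leaving a single PP unattached already costs two reference phrases.</Paragraph>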
    </Section>
    <Section position="2" start_page="319" end_page="319" type="sub_section">
      <SectionTitle>
Parser Evaluation
</SectionTitle>
      <Paragraph position="0"> In using the evaluation software, it became readily apparent that the absolute numbers output for our deterministic parser were not particularly informative, though the relative performance change from one parser run to another was instructive. To see this, consider the example in Figure 5. The input sentence contains a prepositional phrase whose attachment is ambiguous.</Paragraph>
      <Paragraph position="1"> Therefore, the system, by design, closes the constituents up until the prepositional phrase; however, the evaluator counts this as three crossings (three errors) for the one design feature. Since permanent predictable ambiguity occurs frequently in the long, textual sentences of the MUC corpus, this multiplicative penalty is applied very often.</Paragraph>
      <Paragraph position="2"> However, comparing successive runs of a parser to measure system improvement (or retrenchment) over time is valuable. For instance, on a test set of 900 sentences, our fragment combining component found 1,000 more phrases than the deterministic parser alone, eliminated 250 incorrect structures, and reduced the total number of crossings by 300.</Paragraph>
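      <Paragraph position="3"> The multiplicative penalty discussed above can be sketched in Python under one common reading of a crossing: a parser bracket that overlaps a reference bracket without either containing the other. The function names, the span encoding, and the example sentence are assumptions made for illustration; this is not the sentence of Figure 5 and not the evaluation software's own code.

```python
def spans_cross(span_a, span_b):
    """True if two (start, end) spans overlap and neither contains the other."""
    tokens_a = set(range(span_a[0], span_a[1]))
    tokens_b = set(range(span_b[0], span_b[1]))
    if not tokens_a.intersection(tokens_b):
        return False  # disjoint spans never cross
    nested = tokens_a.issubset(tokens_b) or tokens_b.issubset(tokens_a)
    return not nested


def count_crossings(reference_spans, parser_spans):
    """Number of parser brackets that cross at least one reference bracket."""
    return sum(
        1
        for candidate in parser_spans
        if any(spans_cross(candidate, ref) for ref in reference_spans)
    )


# Hypothetical sentence, token positions 0..6:
#   terrorists have bombed the embassy in Lima
# Reference: the PP "in Lima" attaches to the NP "the embassy".
reference_spans = [(3, 7),   # NP  the embassy in Lima
                   (5, 7),   # PP  in Lima
                   (2, 7),   # VP  bombed ...
                   (1, 7),   # VP  have bombed ...
                   (0, 7)]   # S
# Parser output: every constituent is closed before the PP.
parser_spans = [(3, 5),      # NP  the embassy
                (2, 5),      # VP  bombed the embassy
                (1, 5),      # VP  have bombed the embassy
                (0, 5),      # S   fragment
                (5, 7)]      # PP  fragment

print(count_crossings(reference_spans, parser_spans))
```

 Here a single decision to leave the PP unattached makes three parser brackets cross the reference NP, which is the kind of repeated penalty that makes the absolute numbers hard to interpret.</Paragraph>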
    </Section>
  </Section>
</Paper>