<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1064">
  <Title>NEAL-MONTGOMERY NLP SYSTEM EVALUATION METHODOLOGY</Title>
  <Section position="4" start_page="323" end_page="323" type="metho">
    <SectionTitle>
2. PROJECT SELF-ASSESSMENT
</SectionTitle>
    <Paragraph position="0"> In March and September of 1991, rigorous project assessments provided valuable feedback on the design of the Neal-Montgomery NLP System Evaluation Methodology. For each assessment, three people applied the methodology to each of three NLP systems, for a total of eighteen applications. The assessment personnel, who were knowledgeable about interface technology but were not trained linguists, were distinct from the methodology development team.</Paragraph>
    <Paragraph position="1"> The consistency of system profiles resulting from these applications, the examination of test inputs composed during the assessments, records of oral commentary by evaluators, and responses to a post-evaluation questionnaire have been used as measures of the accuracy of methodology results. For the September assessment phase, Figure 3 shows, for each section of the methodology, the percentage of items for which the assessment team gave the same score to each system. For example, the data points for the adverbial section indicate that all three people gave the same assessment of System 2's adverbial skills (they agreed in every instance), that they agreed 60% of the time on System 1's adverbial skills, and that they agreed only 20% of the time on System 3's adverbial skills. The inconsistency of scores in this section has prompted the development team to refine the methodology's adverbial section.</Paragraph>
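The agreement percentages described above can be illustrated with a minimal sketch. It assumes that an item counts as agreed only when every evaluator gave it the same score. The function name, the score labels, and the five-item example are hypothetical and are not taken from the methodology or from the assessment data.

```python
from typing import Sequence


def percent_full_agreement(scores_by_rater: Sequence[Sequence[str]]) -> float:
    """Return the percentage of items on which every rater gave the same score.

    scores_by_rater[r][i] is the score that rater r assigned to item i.
    """
    if not scores_by_rater:
        raise ValueError("need at least one rater")
    n_items = len(scores_by_rater[0])
    if n_items == 0:
        raise ValueError("need at least one item")
    if any(len(row) != n_items for row in scores_by_rater):
        raise ValueError("every rater must score the same items")
    # zip(*rows) regroups the data item by item: one tuple of scores per item.
    agreed = sum(
        1 for item_scores in zip(*scores_by_rater) if len(set(item_scores)) == 1
    )
    return 100.0 * agreed / n_items


# Hypothetical scores from three raters on five items of one methodology
# section for one system (illustrative only, not data from the paper).
example = [
    ["yes", "yes", "no", "yes", "no"],
    ["yes", "yes", "no", "no", "no"],
    ["yes", "yes", "no", "yes", "yes"],
]
print(percent_full_agreement(example))  # prints 60.0
```

Computing this value once per methodology section and per system would give the kind of data points plotted in Figure 3. A stricter or more lenient definition of agreement (for example, pairwise agreement) would give different numbers.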
    <Paragraph position="2"> NLP systems used for assessments to date have included three NL database query systems and two MUC-3 systems.</Paragraph>
    <Paragraph position="3"> Focusing on reliability rather than on feedback into methodology design, four people will apply the Neal-Montgomery NLP System Evaluation Methodology to each of two systems for the third (and final) project self-assessment in April 1992.</Paragraph>
  </Section>
  <Section position="5" start_page="323" end_page="324" type="metho">
    <SectionTitle>
3. TOWARD THE FUTURE
</SectionTitle>
    <Paragraph position="0"> Evaluation &quot;standards&quot; are not developed and adopted without a period of review, rumination, and tweaking by the relevant user community. It is our hope, therefore, in distributing the Neal-Montgomery NLP System Evaluation Methodology to the technical community, to stir interest that may lead to the eventual consideration of the methodology as the basis for a standard evaluation tool for NLP system capabilities.</Paragraph>
    <Paragraph position="1"> The Neal-Montgomery NLP System Evaluation Methodology is due for completion and delivery to Rome Laboratory in May of 1992. It will become available to all interested parties immediately at that time. Requests should be made to the author of this paper. Reviewer comments, critiques, and suggestions for the methodology are invited.</Paragraph>
    <Section position="1" start_page="324" end_page="324" type="sub_section">
      <SectionTitle>
3.1 What-questions
3.1.1 What as Pronoun
</SectionTitle>
      <Paragraph position="0"/>
      <Paragraph position="2"> c) [item text illegible in source]</Paragraph>
    </Section>
    <Section position="2" start_page="324" end_page="324" type="sub_section">
      <SectionTitle>
3.2 Who-questions
</SectionTitle>
      <Paragraph position="0"> a) with verb; b) with [remainder illegible in source]</Paragraph>
    </Section>
    <Section position="3" start_page="324" end_page="324" type="sub_section">
      <SectionTitle>
3.3 Where-questions
</SectionTitle>
      <Paragraph position="0"> My sincere thanks to Jeannette Neal of the Calspan Corporation and to Beth Sundheim for their valuable critique on early versions of this paper.</Paragraph>
    </Section>
  </Section>
</Paper>