<?xml version="1.0" standalone="yes"?>
<Paper uid="A97-1026">
  <Title>An Automatic Scoring System For Advanced Placement Biology Essays</Title>
  <Section position="7" start_page="177" end_page="178" type="evalu">
    <SectionTitle>
RESULTS
</SectionTitle>
    <Paragraph position="0"> Table 1 shows the results of using the automatic scoring prototype to score 85 Excellent test essays and 20 Poor test essays. Coverage (Cov) indicates how many essays were assigned a score. Accuracy (Acc) indicates the percentage of agreement between the computer-based score and the human rater score. Accuracy within 1 (w/i 1) or 2 points (w/i 2) shows the amount of agreement between the computer scores and the human rater scores when computer scores fall within 1 or 2 points of the human rater scores, respectively. For Excellent essays, computer-based scores would be 1 or 2 points below the 9 point minimum, and for Poor essays, they would be 1 or 2 points above the</Paragraph>
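    <Paragraph position="1"> The measures defined above can be illustrated with a minimal sketch. It assumes that each essay has a single numeric human score and that an unscored essay is recorded as None; neither detail is confirmed by the text. It also assumes that accuracy is computed over the scored essays only. The function name agreement_metrics and the sample scores are hypothetical and are not taken from Table 1.

```python
def agreement_metrics(computer_scores, human_scores):
    """Return coverage and agreement percentages for two parallel score lists.

    computer_scores may contain None for essays that received no score
    (an assumption made for this sketch).
    """
    if len(computer_scores) != len(human_scores):
        raise ValueError("score lists must have the same length")

    # Keep only the essays that the system actually scored.
    pairs = [(c, h) for c, h in zip(computer_scores, human_scores) if c is not None]

    def within(tolerance):
        # Percentage of scored essays whose absolute difference from the
        # human score is at most `tolerance` points (integer scores assumed).
        if not pairs:
            return 0.0
        hits = sum(1 for c, h in pairs if abs(c - h) in range(tolerance + 1))
        return 100.0 * hits / len(pairs)

    coverage = 100.0 * len(pairs) / len(human_scores) if human_scores else 0.0
    return {"cov": coverage, "acc": within(0), "w/i 1": within(1), "w/i 2": within(2)}


# Made-up scores for five essays; the third essay received no score.
example = agreement_metrics([9, 8, None, 7, 10], [9, 9, 9, 9, 10])
print(example)
```

Widening the tolerance can only keep or increase the agreement figure, which is why the w/i 1 and w/i 2 values are read alongside exact accuracy.</Paragraph>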
  </Section>
  <Section position="8" start_page="178" end_page="178" type="evalu">
    <SectionTitle>
ERROR ANALYSIS
</SectionTitle>
    <Paragraph position="0"> An error analysis of the data identified two error categories that reflect a methodological problem: a) Lexical Deficiency and b) Concept Grammar Rule Deficiency. These error categories are discussed briefly below. Both error types could be resolved in future research.</Paragraph>
    <Paragraph position="1"> Scoring errors can be linked to data entry errors, morphological stripping errors, parser errors, and erroneous rules generated due to misinterpretations of the scoring guide. These errors, however, are peripheral to the underlying methods applied in this study.</Paragraph>
    <Section position="1" start_page="178" end_page="178" type="sub_section">
      <SectionTitle>
Lexical Deficiency
</SectionTitle>
      <Paragraph position="0"> Recall that the lexicon in this study was built from relevant vocabulary in the set of 100 training essays. Therefore, vocabulary that occurs in the test data but not in the training data was ignored during concept extraction. This yielded incomplete CSRs and degraded scoring. For instance, while the core concept of the commonly occurring phrase one band is more often than not expressed as one band or one fragment, other equivalent expressions existed in the test data, some of which did not occur in the training data. From our 185 essays we extracted possible substitutions of the term one fragment.</Paragraph>
      <Paragraph position="1"> These are: one spot, one band, one inclusive line, one probe, one group, one bond, one segment, one length of nucleotides, one marking, one strand, one solid clump, in one piece, one bar, one mass, one stripe, and one blot. An even larger sample of essays could contain more alternate word or phrase substitutions than those listed here.</Paragraph>
      <Paragraph position="2"> Perhaps increased coverage for the test data can be achieved if additional standard dictionary sources are used to create a lexicon, in conjunction with the example-based method used in this study (Richardson et al., 1993). Corpus-based techniques using domain-specific texts (e.g., Biology textbooks) might also be helpful (Church and Hanks, 1990).</Paragraph>
      <Paragraph position="3"> Concept Grammar Rule Deficiency: In our error analysis, we found cases in which information in a test essay was expressed in a novel way that is not represented in the set of concept grammar rules. In these cases, essay scores were degraded. For example, the sentence &quot;The action of this mutation would nullify the effect of the site, so the enzyme Y would not affect the site of the mutation.&quot; is expressed uniquely, as compared to its paraphrases in the training set.</Paragraph>
      <Paragraph position="4"> This response says in a somewhat roundabout way that due to the mutation, the enzyme will not recognize the site and will not cut the DNA at this point. No rule was found to match the CSR generated for this test response.</Paragraph>
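      <Paragraph position="5"> Both failure modes can be illustrated with a toy sketch. It is not the system described in this paper: the lexicon entries, the concept labels, the single rule, and the function names extract_concepts and rule_matches are all hypothetical, and plain substring lookup stands in for the parsing and concept-extraction steps. The sketch only shows how a phrase missing from a training-derived lexicon leaves the extracted concepts incomplete, so that no rule matches.

```python
# Toy lexicon built from "training" phrases: surface phrase -> concept label.
# All entries and labels are illustrative assumptions.
TRAINING_LEXICON = {
    "one band": "ONE_FRAGMENT",
    "one fragment": "ONE_FRAGMENT",
    "enzyme": "ENZYME",
}

# Toy rule: credit is given only if every listed concept is present.
RULE = {"ENZYME", "ONE_FRAGMENT"}


def extract_concepts(text, lexicon):
    """Return the set of concepts whose surface phrase occurs in the text.

    Phrases that are not in the lexicon are silently ignored, which is the
    behaviour that leads to incomplete representations.
    """
    lowered = text.lower()
    return {concept for phrase, concept in lexicon.items() if phrase in lowered}


def rule_matches(concepts, rule):
    """A rule matches when all of its required concepts were extracted."""
    return rule.issubset(concepts)


seen = extract_concepts("The enzyme leaves one band.", TRAINING_LEXICON)
unseen = extract_concepts("The enzyme leaves one stripe.", TRAINING_LEXICON)
print(rule_matches(seen, RULE))    # phrase known from training
print(rule_matches(unseen, RULE))  # "one stripe" is missing from the lexicon

# Adding the missing substitution to the lexicon restores the match.
extended = dict(TRAINING_LEXICON)
extended["one stripe"] = "ONE_FRAGMENT"
print(rule_matches(extract_concepts("The enzyme leaves one stripe.", extended), RULE))
```

Extending the lexicon addresses only the first error category. A response that uses known vocabulary in an unanticipated arrangement would still need a new rule.</Paragraph>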
    </Section>
  </Section>
</Paper>