<?xml version="1.0" standalone="yes"?> <Paper uid="C02-2023"> <Title>A reliable approach to automatic assessment of short answer free responses</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Most assessments for placement, diagnosis, progress and achievement in our language programs are presently administered in paper and pencil (P&P) format. This format carries a number of administrative costs and inefficiencies. It requires new hard-copy forms of assessments for each course and class, incurring the costs of copying, handling, distributing test booklets and answer sheets to test takers, and collecting them afterwards. Although some of the assessments can be scored by machine, teachers must score those with free responses, such as open-ended questions and cloze (gap-filling) tests.</Paragraph> <Paragraph position="1"> WebLAS addresses the problems of the P&P format. It provides an integrated approach to assessing language ability for the purpose of making decisions about placement, diagnosis, progress and achievement in the East Asian language programs: the content specifications of the assessment system for these languages are based directly on the course content, as specified in scope and sequence charts, and utilize tasks similar to those used in classroom instruction. WebLAS is thus being designed with the following expected advantages as objectives: 1. Greater administrative efficiency 2. More authentic, interactive and valid assessments of language ability, such as the integration of course and assessment content and the incorporation of cutting-edge multimedia technology for assessment Nested within these objectives is the ability to automatically assess limited-production free responses. Existing systems such as e-Rater (Burstein et al.) focus on holistic essay scoring. 
Moreover, systems such as PEG (Page 1966) disregard content and simply perform surface-feature analysis, such as a tabulation of syntactic usage. Others, like LSA (Foltz et al. 1998), require large corpora as a basis for comparison. Lately, there has been more interest in the short answer scoring problem. The few existing approaches, such as MITRE (Hirschman et al., 2000) and ATM (Callear et al., 2001), are extraordinarily programming-intensive, however, and incomprehensible to educators. Additionally, they do not permit a partial-credit scoring system, thereby introducing subjectivity into the scoring (Bachman 1990). None is truly suited to short answer scoring in an educational context, since the scores produced are neither easily explainable nor justifiable to test takers.</Paragraph> <Paragraph position="2"> WebLAS is developed in response to the needs of language assessors. Current methods for scoring P&P tests require the test creators to construct a scoring rubric, which human scorers reference as they score student responses. WebLAS imitates this process by prompting the test creator for the scoring rubric. It tags and parses the model answer, extracts relevant elements from within the model answer, and proposes possible alternatives interactively with the educator. It also tags, parses, and extracts the same from the student responses.</Paragraph> <Paragraph position="3"> Elements are then pattern matched and scored.</Paragraph> </Section> </Paper>