File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/01/p01-1051_intro.xml
Size: 11,351 bytes
Last Modified: 2025-10-06 14:01:10
<?xml version="1.0" standalone="yes"?> <Paper uid="P01-1051"> <Title>Error Profiling: Toward a Model of English Acquisition for Deaf Learners</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> There have been many theories of language acquisition proposing a stereotypical order of acquisition of language elements followed by most learners, and there has been empirical evidence of such an order among morphological elements of language (cf. (Bailey et al., 1974; Dulay and Burt, 1975; Larsen-Freeman, 1976)) and some syntactic structures (cf. (Brown and Hanlon, 1970; Gass, 1979)). There is indication that these results may be applied to any L1 group acquiring English (Dulay and Burt, 1974; Dulay and Burt, 1975), and some research has focused on developing a general account of acquisition across a broad range of morphosyntactic structures (cf. (Pienemann and Håkansson, 1999)). In this work, we explore how our second language instruction system, ICICLE, has generated the need for modeling such an account, and we discuss the results of a corpus analysis we have undertaken to fulfill that need.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.1 ICICLE: an Overview </SectionTitle> <Paragraph position="0"> ICICLE (Interactive Computer Identification and Correction of Language Errors) is an intelligent tutoring system currently under development (Michaud and McCoy, 1999; Michaud et al., 2000; Michaud et al., 2001). Its primary function is to tutor deaf students on their written English.</Paragraph> <Paragraph position="1"> Essential to performing that function is the ability to correctly analyze user-generated language errors and produce tutorial feedback on student performance that is both correct and tailored to the student's language competence.
Our target learners are native or near-native users of American Sign Language (ASL), a distinct language from English (cf. (Baker and Cokely, 1980)), so we view the acquisition of skills in written English as the acquisition of a second language for this population (Michaud et al., 2000).</Paragraph> <Paragraph position="2"> Our system uses a cycle of user input and system response, beginning when a user submits a piece of writing to be reviewed by the system.</Paragraph> <Paragraph position="3"> The system determines the grammatical errors in the writing, and responds with tutorial feedback aimed at enabling the student to perform corrections. When the student has revised the piece, it is re-submitted for analysis and the cycle begins again. As ICICLE is intended to be used by an individual over time and across many pieces of writing, the cycle will be repeated with the same individual many times.</Paragraph> <Paragraph position="4"> Figure 1 contains a diagram of ICICLE's overall architecture and the cycle we have described. It executes between the User Interface, the Error Identification Module (which performs the syntactic analysis of user writing), and the Response Generation Module (which builds the feedback to the user based on the errors the user has committed). The work described in this paper focuses on the development of one of the sources of knowledge used by both of these processes, a component of the User Model representing the user's grammatical competence in written English.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.2 A Need for Modeling L2A Status </SectionTitle> <Paragraph position="0"> What currently exists of the ICICLE system is a prototype application implemented in a graphical interface connected to a text parser that uses a wide-coverage English grammar augmented by &quot;mal-rules&quot; capturing typical errors made by our learner population.
It can recognize and label many grammatical errors, delivering &quot;canned&quot; one- or two-sentence explanations of each error on request. The user can then make changes and resubmit the piece for additional analysis.</Paragraph> <Paragraph position="1"> We have discussed in (Schneider and McCoy, 1998) the performance of our parser and mal-rule-augmented grammar and the unique challenges faced when attempting to cover non-grammatical input from this population.</Paragraph> <Paragraph position="2"> [Figure 2, caption ending &quot;grammatical user text&quot;: readings of the input &quot;She is teach piano on Tuesdays.&quot; by proficiency level.]</Paragraph> <Paragraph position="3"> Beginner: Inappropriate use of auxiliary and verb morphology problems. Intended: &quot;She teaches piano on Tuesdays.&quot;</Paragraph> <Paragraph position="4"> Intermediate: Missing appropriate +ing morphology. Intended: &quot;She is teaching piano on Tuesdays.&quot;</Paragraph> <Paragraph position="5"> Advanced: Botched attempt at passive formation. Intended: &quot;She is taught piano on Tuesdays.&quot;</Paragraph> <Paragraph position="6"> In its current form, when the parser obtains more than one possible parse of a user's sentence, the interface chooses arbitrarily which one it will assume to be representative of which structures the user was attempting. This is undesirable, as one challenge that we face with this particular population is that there is quite a lot of variability in the level of written English acquisition. A large percentage of the deaf population has reading/writing proficiency levels significantly below those of their hearing peers, and yet the population represents a broad range of ability. Among deaf 18-year-olds, about half read at or below a fourth-grade level, while about 10% read above the eighth-grade level (Strong, 1988). Thus, even when focused on a subset of the deaf population (e.g., deaf high school or college students), there is significant variability in writing proficiency.
The impact of this variability is that a particular string of words may have multiple interpretations and the most likely one may depend upon the proficiency level of the student, as illustrated in Figure 2. We are therefore currently developing a user model to address the system's need to make these parse selections intelligently and to adapt tutoring choices to the individual (Michaud and McCoy, 2000; Michaud et al., 2001).</Paragraph> <Paragraph position="7"> The model we are developing is called SLALOM. It is a representation of the user's ability to correctly use each of the grammatical &quot;features&quot; of English, which we define as incorporating both morphological rules such as pluralizing a noun with +S and syntactic rules such as the construction of prepositional phrases and S V O sentence patterns. Intuitively, each unit in SLALOM corresponds to a set of grammar rules and mal-rules which realize the feature. The information stored in each of these units represents observations based on the student's performance over the submission of multiple pieces of writing. These observations will be abstracted into three tags, representing performance that is consistently good (acquired), consistently flawed (unacquired), or variable (ZPD; see footnote 1) to record the user's ability to correctly execute each structure in his or her written text.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.3 An Incomplete Model </SectionTitle> <Paragraph position="0"> A significant problem that we must face in generating the tags for SLALOM elements is that we would like to infer tags on performance elements not yet seen in a writer's production, basing those tags on what performance we have been able to observe so far.
We have proposed (McCoy et al., 1996; Michaud and McCoy, 2000) that SLALOM be structured in such a way as to capture these expectations by explicitly representing the relationships between grammatical structures in terms of when they are acquired; namely, indicating which features are typically acquired before other features, and which are typically acquired at the same time. With this information available in the model, SLALOM will be able to suggest that a feature typically acquired before one marked &quot;acquired&quot; is most likely also acquired, or that a feature co-acquired with one marked &quot;ZPD&quot; may also be something currently being mastered by the student. The corpus analysis we have undertaken is meant to provide this structure by indicating a partial ordering on the acquisition of grammatical features by this population of learners.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.4 Applications </SectionTitle> <Paragraph position="0"> Having the SLALOM model marked with grammatical features as being acquired, unacquired, or ZPD will be very useful in at least two different ways. [Footnote 1, beginning lost in extraction: ...Coy, 2000) for discussion. These are presumably the features the learner is currently in the process of acquiring and thus we expect to see variation in the user's ability to execute them.]</Paragraph> <Paragraph position="1"> The first is when deciding which possible parse of the input best describes a particular sentence produced by a learner. When there are multiple parses of an input text, some may place the &quot;blame&quot; for detected errors on different constituents. In order for ICICLE to deliver relevant instruction, it needs to determine which of these possibilities most likely reflects the actual performance of the student. We intend for the parse selection process to proceed on the premise that future user performance can be predicted based on the patterns of the past.
The system can generally prefer parses which use rules representing well-formed constituents associated with &quot;acquired&quot; features, mal-rules from the &quot;unacquired&quot; area, and either correct rules or mal-rules for those features marked &quot;ZPD.&quot; A second place SLALOM will be consulted is in deciding which errors will then become the subjects of tutorial explanations. This decision is important if the instruction is to be effective.</Paragraph> <Paragraph position="2"> It is our wish for ICICLE to ignore &quot;mistakes&quot; which are slip-ups and not indicative of a gap in language knowledge (Corder, 1967) and to avoid instruction on material beyond the user's current grasp. It therefore will focus on features marked &quot;ZPD&quot;--those in that &quot;narrow shifting zone dividing the already-learned skills from the not-yet-learned ones&quot; (Linton et al., 1996), or the frontier of the learning process. ICICLE will select those errors which involve features from this learner's learning frontier and use them as the topics of its tutorial feedback.</Paragraph> <Paragraph position="3"> With the partial order of acquisition represented in the SLALOM model as we have described, these two processes can draw on both the data contained in the previous utterances supplied by a given learner and the &quot;intuitions&quot; granted by information on typical learners, supplementing empirical data on the specific user's mastery of grammatical forms with inferences about other forms related to them through the order of acquisition.</Paragraph> </Section> </Section> </Paper>
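<!--
Illustrative sketch (an editorial addition, not ICICLE code). The two uses of SLALOM described in Section 1.4, together with the tag inference of Section 1.3, might be prototyped roughly as below. Every name here (Tag, Slalom, parse_score, tutorial_topics), the feature labels, the example acquisition order, and the one-point-per-rule scoring are assumptions made for illustration; the paper specifies none of them. Inference follows only direct ordering links, with no transitive closure.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, FrozenSet, List, Optional, Set, Tuple


class Tag(Enum):
    ACQUIRED = "acquired"
    ZPD = "ZPD"
    UNACQUIRED = "unacquired"


# One rule use in a parse: (feature name, True if a mal-rule was used).
RuleUse = Tuple[str, bool]


@dataclass
class Slalom:
    """Hypothetical stand-in for the SLALOM user model."""
    observed: Dict[str, Tag] = field(default_factory=dict)
    # (earlier, later): `earlier` is typically acquired before `later`.
    before: Set[Tuple[str, str]] = field(default_factory=set)
    # Groups of features typically acquired at the same time.
    together: Set[FrozenSet[str]] = field(default_factory=set)

    def tag(self, feature: str) -> Optional[Tag]:
        """Return the observed tag, else one inferred from direct order links."""
        if feature in self.observed:
            return self.observed[feature]
        for earlier, later in self.before:
            # Acquired before something already acquired: likely acquired.
            if earlier == feature and self.observed.get(later) is Tag.ACQUIRED:
                return Tag.ACQUIRED
        for group in self.together:
            if feature in group:
                for other in group - {feature}:
                    # Co-acquired with a ZPD feature: likely ZPD as well.
                    if self.observed.get(other) is Tag.ZPD:
                        return Tag.ZPD
        return None  # nothing observed and nothing inferable


def parse_score(model: Slalom, parse: List[RuleUse]) -> int:
    """Count rule uses that match what the model expects of this learner."""
    score = 0
    for feature, is_mal_rule in parse:
        tag = model.tag(feature)
        if tag is Tag.ACQUIRED and not is_mal_rule:
            score += 1  # correct rule on an acquired feature
        elif tag is Tag.UNACQUIRED and is_mal_rule:
            score += 1  # mal-rule on an unacquired feature
        elif tag is Tag.ZPD:
            score += 1  # either kind of rule is plausible in the ZPD
    return score


def tutorial_topics(model: Slalom, parse: List[RuleUse]) -> List[str]:
    """Select errors (mal-rule uses) whose feature lies in the learner's ZPD."""
    return [f for f, is_mal in parse if is_mal and model.tag(f) is Tag.ZPD]


# Invented learner state and ordering; not results from the corpus analysis.
model = Slalom(
    observed={"auxiliary_be": Tag.ACQUIRED, "progressive_ing": Tag.ZPD},
    before={("present_tense_agreement", "auxiliary_be")},
)
# Three candidate readings of "She is teach piano on Tuesdays." (Figure 2).
parses = {
    "She teaches piano": [("auxiliary_be", True), ("present_tense_agreement", True)],
    "She is teaching piano": [("auxiliary_be", False), ("progressive_ing", True)],
    "She is taught piano": [("auxiliary_be", False), ("passive", True)],
}
best = max(parses, key=lambda reading: parse_score(model, parses[reading]))
topics = tutorial_topics(model, parses[best])
```

Under these invented settings the progressive reading scores highest, and its missing +ing error is the only tutoring topic selected, since it is the only error on a feature tagged ZPD. A fuller model would need weighted scores and transitive inference over the partial order.
-->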