<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0210">
  <Title>A Hybrid Text Classification Approach for Analysis of Student Essays</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Student Essay Analysis
</SectionTitle>
    <Paragraph position="0"> We cast the Student Essay Analysis problem as a text classification problem in which we classify each sentence in the student's essay as an expression of one of a set of correct answer aspects, or as nothing in the case where no correct answer aspect was expressed.</Paragraph>
    <Paragraph position="1"> After a student attempts an initial answer to the question, the system analyzes the student's essay to assess which key points are missing from the student's argument. The system then uses its analysis of the student's essay to determine which help to offer that student. In order to do an effective job of selecting appropriate interventions for helping students improve their explanations, the system must perform a highly accurate analysis of the student's essay. Identifying key points as present in essays when they are not (i.e., false alarms) causes the system to miss opportunities to help students improve their essays. On the other hand, failing to identify key points that are indeed present in student essays causes the system to offer help where it is not needed, which can frustrate and even confuse students. A highly accurate inventory of the content of student essays is required in order to avoid missing opportunities to offer needed instruction and to avoid offering inappropriate feedback, especially as the completeness of student essays increases (Rosé et al., 2002a; Rosé et al., 2002c).</Paragraph>
    <Paragraph position="2"> In order to compute which set of key points, i.e., correct answer aspects, are included in a student essay, we first segment the essay at sentence boundaries. Note that run-on sentences are broken up. Once an essay is segmented, each segment is classified as corresponding to one of the set of key points, or nothing if it does not include any key point. We then take an inventory of the classifications other than nothing that were assigned to at least one segment. Thus, our approach is similar in spirit to that taken in the AUTO-TUTOR system (Wiemer-Hastings et al., 1998), where Latent Semantic Analysis (LSA) (Landauer et al., 1998; Laham, 1997) was used to tally which subset of correct answer aspects students included in their natural language responses to short essay questions about computer literacy.</Paragraph>
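The segment-classify-inventory procedure described above can be sketched as follows. This is an illustrative sketch only: the naive sentence splitter and the classify_segment callback are placeholders, not the system's actual components.

```python
import re

def analyze_essay(essay, classify_segment):
    # Segment the essay at sentence boundaries (a naive splitter; the
    # actual system also breaks up run-on sentences).
    segments = [s.strip() for s in re.split(r"[.!?]+\s+", essay) if s.strip()]
    # Classify each segment as one of the key-point classes or "nothing".
    labels = [classify_segment(seg) for seg in segments]
    # The inventory is the set of non-"nothing" labels assigned to at
    # least one segment.
    return set(labels) - {"nothing"}
```

The returned set is then compared against the full set of correct answer aspects to decide which key points are missing and hence which help to offer.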
    <Paragraph position="3"> We performed our evaluation over essays collected from students interacting with our tutoring system in response to the question "Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land? Explain.", which we refer to as the Pumpkin Problem. Thus, there are a total of six alternative classifications for each segment: Class 1 Sentence expresses the idea that after the release the only force acting on the pumpkin is the downward force of gravity.</Paragraph>
    <Paragraph position="4"> Class 2 Sentence expresses the idea that the pumpkin continues to have a constant horizontal velocity after it is released.</Paragraph>
    <Paragraph position="5"> Class 3 Sentence expresses the idea that the horizontal velocity of the pumpkin continues to be equal to the horizontal velocity of the man.</Paragraph>
    <Paragraph position="6"> Class 4 Sentence expresses the idea that the pumpkin and runner cover the same distance over the same time.</Paragraph>
    <Paragraph position="7"> Class 5 Sentence expresses the idea that the pumpkin will land on the runner.</Paragraph>
    <Paragraph position="8"> Class 6 Sentence does not adequately express any of the above specified key points.</Paragraph>
    <Paragraph position="9"> Note that this classification task is strikingly different from those typically used for evaluating text classification systems. First, these classifications represent specific whole propositions rather than general topics, such as those used for classifying web pages (Craven et al., 1998), namely "student", "faculty", "staff", etc. Secondly, the texts are much shorter, i.e., one sentence in comparison with a whole web page, which is a disadvantage for bag of words approaches.</Paragraph>
    <Paragraph position="10"> In some cases, what distinguishes sentences of one class from sentences of another class is very subtle.</Paragraph>
    <Paragraph position="11"> For example, "Thus, the pumpkin's horizontal velocity, which is equal to that of the man when he released it, will remain constant." belongs to Class 2, although it could easily be mistaken for Class 3. Similarly, "So long as no other horizontal force acts upon the pumpkin while it is in the air, this velocity will stay the same." belongs to Class 2, although it looks similar on the surface to either Class 1 or Class 3. A related problem is that sentences that should be classified as nothing may look very similar on the surface to sentences belonging to one or more of the other classes. For example, "It will land on the ground where the runner threw it up." contains all of the words required to correctly express the idea corresponding to Class 5, although it does not express this idea, and in fact expresses a wrong idea. These very subtle distinctions also pose problems for bag of words approaches, since they base their decisions only on which words are present, regardless of their order or the functional relationships between them. That might suggest that a symbolic approach involving syntactic and semantic interpretation would be more successful. However, while symbolic approaches can be more precise than bag of words approaches, they are also more brittle. And approaches that rely on both syntactic and semantic interpretation require a larger knowledge engineering effort as well.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 CarmelTC
</SectionTitle>
    <Paragraph position="0"> [Figure 1: features extracted from the deep syntactic analysis of a sentence.]</Paragraph>
    <Paragraph position="1"> Sentence: The pumpkin moves slower because the man is not exerting a force on it.</Paragraph>
    <Paragraph position="3"> The hybrid CarmelTC approach induces decision trees using features from both a deep syntactic functional analysis of an input text as well as a prediction from the Rainbow Naive Bayes text classifier (McCallum, 1996; McCallum and Nigam, 1998) to make a prediction about the correct classification of a sentence. In addition, it uses features that indicate the presence or absence of words found in the training examples. Since the Naive Bayes classification of a sentence is more informative than any single one of the other features provided, CarmelTC can be conceptualized as using the other features to decide whether or not to believe the Naive Bayes classification, and if not, what to believe instead.</Paragraph>
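A rough sketch of this combined representation, with hypothetical feature names (the actual feature inventory comes from the deep syntactic analysis and the training vocabulary):

```python
def build_feature_vector(syntactic_features, words, nb_label,
                         feature_index, vocab, classes):
    # One binary position per syntactic feature, one per vocabulary word,
    # plus a one-hot encoding of the Rainbow Naive Bayes prediction.
    vec = [1 if f in syntactic_features else 0 for f in feature_index]
    vec += [1 if w in words else 0 for w in vocab]
    vec += [1 if nb_label == c else 0 for c in classes]
    return vec
```

Vectors of this shape are what the decision tree learner is trained on; the tree can then learn when to trust the Naive Bayes position and when to override it based on the other positions.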
    <Paragraph position="4"> [Figure 2: features extracted from the deep syntactic parse of a sentence.]</Paragraph>
    <Paragraph position="5"> Sentence: The pumpkin moves slower because the man is not exerting a force on it.</Paragraph>
    <Paragraph position="7"> From the deep syntactic analysis of a sentence, we extract features representing functional relationships between syntactic heads (e.g., (subj-throw man)), tense information (e.g., (tense-throw past)), and information about passivization and negation (e.g., (negation-throw +) or (passive-throw -)). See Figures 1 and 2. Rainbow has been used for a wide range of text classification tasks. With Rainbow, P(doc, Class), i.e., the probability of a document belonging to class Class, is estimated by multiplying P(Class), i.e., the prior probability of the class, by the product over all of the words w_i found in the text of P(w_i | Class), i.e., the probability of each word given that class. This product is normalized over the prior probability of all words. Using the individual features extracted from the deep syntactic analysis of the input as well as the bag of words Naive Bayes classification of the input sentence, CarmelTC builds a vector representation of each input sentence, with each vector position corresponding to one of these features. We then use the ID3 decision tree learning algorithm (Mitchell, 1997; Quinlan, 1993) to induce rules for identifying sentence classes based on these feature vectors.</Paragraph>
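The Rainbow-style estimate, P(Class) times the product over document words of P(w_i | Class), can be sketched as follows. This is an illustrative sketch, not Rainbow itself: scores are computed in log space for numerical stability, the smoothing floor for unseen words is an assumption, and the final normalization over all classes is omitted since it does not change the argmax.

```python
import math

def naive_bayes_log_score(doc_words, class_prior, word_probs):
    # log P(Class) + sum over words of log P(w | Class).
    score = math.log(class_prior)
    for w in doc_words:
        # Unseen words get a small floor probability (an assumption here;
        # real systems use a principled smoothing scheme).
        score += math.log(word_probs.get(w, 1e-6))
    return score

def classify(doc_words, priors, cond_probs):
    # Argmax over classes of the unnormalized log posterior.
    return max(priors, key=lambda c: naive_bayes_log_score(
        doc_words, priors[c], cond_probs[c]))
```

The winning label from this classifier is then fed into the decision tree as one feature among the syntactic and word features.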
    <Paragraph position="8"> The symbolic features used for the CarmelTC approach are extracted from a deep syntactic functional analysis constructed using the CARMEL broad coverage English syntactic parsing grammar (Rosé, 2000) and the large scale COMLEX lexicon (Grishman et al., 1994), containing 40,000 lexical items. For parsing we use an incremental version of the LCFLEX robust parser (Rosé et al., 2002b; Rosé and Lavie, 2001), which was designed for efficient, robust interpretation. While computing a deep syntactic analysis is more computationally expensive than computing a shallow syntactic analysis, we can do so very efficiently using the incrementalized version of LCFLEX, because it takes advantage of student typing time to reduce the delay between when students submit their essays and when the system is prepared to respond.</Paragraph>
    <Paragraph position="9"> Syntactic feature structures produced by the CARMEL grammar factor out those aspects of syntax that modify the surface realization of a sentence but do not change its deep functional analysis. These aspects include tense, negation, mood, modality, and syntactic transformations such as passivization and extraction. In order to do this reliably, the component of the grammar that performs the deep syntactic analysis of verb argument functional relationships was generated automatically from a feature representation for each of COMLEX's verb subcategorization tags. It was verified that the 91 verb subcategorization tags documented in the COMLEX manual were covered by the encodings, and thus by the resulting grammar rules. These tags cover a wide range of patterns of syntactic control and predication relationships. Each tag corresponds to one or more case frames. Each case frame corresponds to a number of different surface realizations due to passivization, relative clause extraction, and wh-movement. Altogether there are 519 syntactic patterns covered by the 91 subcategorization tags, all of which are covered by the grammar.</Paragraph>
    <Paragraph position="10"> There are nine syntactic functional roles assigned by the grammar. These roles include subj (subject), causesubj (causative subject), obj (object), iobj (indirect object), pred (a descriptive predicate, like an adjectival phrase or an adverb phrase), comp (a clausal complement), modifier, and possessor. The roles pertaining to the relationship between a verb and its arguments are assigned based on the subcat tags associated with verbs in COMLEX. However, in some cases, arguments that COMLEX assigns the role of subject get redefined as causesubj (causative subject). For example, the subject in "the pumpkin moved" is just a subject, but in "the man moved the pumpkin", the subject would get the role causesubj instead, since 'move' is a causative-inchoative verb and the obj role is filled in in the second case. (The causative-inchoative verb feature is one that we added to verb entries in COMLEX, not one of the features provided by the lexicon originally.) The modifier role is used to specify the relationship between any syntactic head and its adjunct modifiers. Possessor is used to describe the relationship between a head noun and its genitive specifier, as in "man" in either "the man's pumpkin" or "the pumpkin of the man".</Paragraph>
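The causesubj distinction above can be made concrete with a small illustrative example. The feature tuples below are a hypothetical rendering for exposition, not the grammar's actual data structures:

```python
# Deep syntactic features for "the man moved the pumpkin": COMLEX's
# subject role is reassigned to causesubj because 'move' is a
# causative-inchoative verb, and the obj role is filled.
features_causative = {
    ("causesubj-move", "man"),
    ("obj-move", "pumpkin"),
    ("tense-move", "past"),
}

# Versus the inchoative "the pumpkin moved": a plain subj and no obj.
features_inchoative = {
    ("subj-move", "pumpkin"),
    ("tense-move", "past"),
}
```

Features of this form are exactly what the decision tree learner sees, so the causative/inchoative contrast is visible to the classifier even though the surface word set is nearly identical.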
    <Paragraph position="11"> With the hybrid CarmelTC approach, our goal has been to keep as many of the advantages of both symbolic analysis and bag of words classification approaches as possible, while avoiding some of the pitfalls of each.</Paragraph>
    <Paragraph position="12"> Since the CarmelTC approach does not use the syntactic analysis as a whole, it does not require that the system be able to construct a totally complete and correct syntactic analysis of the student's text input. It can very effectively make use of partial parses.</Paragraph>
    <Paragraph position="13"> Thus, it is more robust than purely symbolic approaches, where decisions are based on complete analyses of texts. And since it makes use only of the syntactic analysis of a sentence, rather than also making use of a semantic interpretation, it does not require any sort of domain specific knowledge engineering.</Paragraph>
    <Paragraph position="14"> And yet the syntactic features provide information normally not available to bag of words approaches, such as functional relationships between syntactic heads and the scope of negation and other types of modifiers.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Related Work: Combining Symbolic and Bag of Words Approaches
</SectionTitle>
    <Paragraph position="0"> CarmelTC is most similar to the text classification approach described in (Furnkranz et al., 1998). In that approach, features that note the presence or absence of a word in a text, as well as extraction patterns from AUTOSLOG-TS (Riloff, 1996), form the feature set that is input to RIPPER (Cohen, 1995), which learns rules for classifying texts based on these features. CarmelTC is similar in spirit in terms of both the sorts of features used and the general sort of learning approach. However, CarmelTC differs from (Furnkranz et al., 1998) in several respects. Where (Furnkranz et al., 1998) make use of AUTOSLOG-TS extraction patterns, CarmelTC makes use of features extracted from a deep syntactic analysis of the text. Since AUTOSLOG-TS performs a surface syntactic analysis, it would assign a different representation to all aspects of these texts where there is variation in the surface syntax. Thus, the syntactic features extracted from our syntactic analyses are more general. For example, for the sentence "The force was applied by the man to the object", our grammar assigns the same functional roles as for "The man applied the force to the object" and also for the noun phrase "the man that applied the force to the object". This would not be the case for AUTOSLOG-TS. Like (Furnkranz et al., 1998), we also extract word features that indicate the presence or absence of a root form of a word in the text. However, in contrast, for CarmelTC one of the features for each training text that is made available to the rule learning algorithm is the classification obtained using the Rainbow Naive Bayes classifier (McCallum, 1996; McCallum and Nigam, 1998).</Paragraph>
    <Paragraph position="1"> Because the texts classified with CarmelTC are so much shorter than those of (Furnkranz et al., 1998), the feature set provided to the learning algorithm was small enough that it was not necessary to use a learning algorithm as sophisticated as RIPPER (Cohen, 1995). Thus, we used ID3 (Mitchell, 1997; Quinlan, 1993) instead, with excellent results. Note that, in contrast to CarmelTC, the (Furnkranz et al., 1998) approach is purely symbolic.</Paragraph>
    <Paragraph position="2"> Thus, all of its features are either word level features or surface syntactic features.</Paragraph>
    <Paragraph position="3"> Recent work has demonstrated that combining multiple predictors yields combined predictors that are superior to the individual predictors in cases where the individual predictors have complementary strengths and weaknesses (Larkey and Croft, 1996; Larkey and Croft, 1995). We have argued that this is the case with symbolic and bag of words approaches. Thus, we have reason to expect that a hybrid approach that makes a prediction based on a combination of these single approaches would yield better results than either approach alone. Our results presented in Section 5 demonstrate that this is true. Other recent work has demonstrated that symbolic and bag of words approaches can be productively combined. For example, syntactic information can be used to modify the LSA space of a verb in order to make LSA sensitive to different word senses (Kintsch, 2002). However, this approach has only been applied to the analysis of mono-transitive verbs. Furthermore, it has never been demonstrated to improve LSA's effectiveness at classifying texts.</Paragraph>
    <Paragraph position="4"> In the alternative Structured Latent Semantic Analysis (SLSA) approach, hand-coded subject-predicate information was used to improve the results obtained by LSA for text classi cation (Wiemer-Hastings and Zipitria, 2001), but no fully automated evaluation of this approach has been published.</Paragraph>
    <Paragraph position="5"> In contrast to these two approaches, CarmelTC is both fully automatic, in that the symbolic features it uses are obtained without any hand coding whatsoever, and fully general, in that it applies to the full range of verb subcategorization frames covered by the COMLEX lexicon, not only mono-transitive verbs. In Section 5 we demonstrate that CarmelTC outperforms both LSA and Rainbow, two alternative bag of words approaches, on the task of student essay analysis.</Paragraph>
  </Section>
</Paper>