<?xml version="1.0" standalone="yes"?> <Paper uid="P05-1065"> <Title>Reading Level Assessment Using Support Vector Machines and Statistical Language Models</Title> <Section position="7" start_page="528" end_page="529" type="concl"> <SectionTitle> 6 Conclusions and Future Work </SectionTitle> <Paragraph position="0"> Statistical LMs were used to classify texts based on reading level, with trigram models being noticeably more accurate than bigrams and unigrams.</Paragraph> <Paragraph position="1"> Combining information from statistical LMs with other features using support vector machines provided the best results. Future work includes testing additional classifier features, e.g. parser likelihood scores and features obtained using a syntax-based language model such as Chelba and Jelinek (2000) or Roark (2001). Further experiments are planned on the generalizability of our classifier to text from other sources (e.g. newspaper articles, web pages); to accomplish this we will add higher-level text as negative training data. We also plan to test these techniques on languages other than English, and to integrate them with an information retrieval system to create a tool that teachers can use to help select reading material for their students.</Paragraph> </Section> </Paper>