<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1139">
  <Title>Linguistic profiling of texts for the purpose of language verification</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> The results show that language verification is indeed possible, as long as we accept that near-native texts produced by non-natives will not be filtered out.</Paragraph>
    <Paragraph position="1"> Furthermore, whenever a verification filter is needed, it will be necessary to create a new filter, based on a seed corpus which contains both native and non-native texts as similar as possible in type to the texts which are to be filtered.</Paragraph>
    <Paragraph position="2"> There are now two avenues open for future research. First of all, we would like to explore the classification procedure linguistically, by a) examining the distinguishing features in more detail and comparing our findings with those in the literature, and b) examining the correlation between the nativeness scores of the various texts and extra-linguistic variables such as mother tongue and learner level.</Paragraph>
    <Paragraph position="3"> Secondly, once more insight is gained into the linguistic workings of the procedure, the classification process can be refined. At this point, we would also like to examine the effects of domain shift in more detail, and attempt to estimate a minimum size for seed corpora for use in filtering internet material.</Paragraph>
  </Section>
</Paper>