<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1308">
  <Title>Modelling syntactic development in a cross-linguistic context</Title>
  <Section position="6" start_page="59" end_page="59" type="concl">
    <SectionTitle>
8 Conclusion
</SectionTitle>
    <Paragraph position="0"> In this paper, we have shown that the same simple model accounts for the data in three languages that differ substantially in their underlying structure. To our knowledge, this is the only model of language acquisition which simultaneously (1) learns from naturalistic input (actual child-directed utterances), where the statistics and frequency distribution of the input are similar to that experienced by children; (2) produces actual utterances, which can be directly compared to those of children; (3) has a developmental component; (4) accounts for speech generativity and increasing MLU; (5) makes quantitative predictions; and (6) has simulated phenomena from more than one language.</Paragraph>
    <Paragraph position="1"> An essential feature of our approach is to limit the number of degrees of freedom in the simulations. We have used an identical model for simulating the same class of phenomena in three languages. The method of data analysis was also the same, and, in all cases, the model's and the child's output were coded automatically and identically.</Paragraph>
    <Paragraph position="2"> The use of realistic input was also crucial in that it guaranteed that cross-linguistic differences were reflected in the input.</Paragraph>
    <Paragraph position="3"> The simulations showed that simple mechanisms were sufficient for obtaining a good fit to the data in three different languages, in spite of obvious syntactic differences and very different proportions of optional-infinitive errors. The interaction between a sentence-final processing bias and increasing MLU enabled us to capture the reason why English, Dutch and Spanish show different patterns of optional-infinitive errors: the difference in the relative position of finite and non-finite forms is larger in Dutch than in English, and Spanish verbs are predominantly finite. We suggest that any model that learns to produce progressively longer utterances from realistic input, and in which learning is biased towards the end of utterances, will simulate these results.</Paragraph>
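The mechanism described above can be illustrated with a minimal sketch. This is not the authors' actual model; it only shows, under simplified assumptions, how end-biased learning plus a growing output window yields utterance-final (often non-finite) fragments early on and finite forms later. The Dutch-like example utterances are hypothetical.

```python
# Minimal sketch (NOT the paper's implementation) of an utterance-final
# learning bias interacting with a growing output window (rising MLU).

def learn_suffix(utterance, window):
    """End-biased learning: retain only the last `window` words."""
    words = utterance.split()
    return " ".join(words[-window:])

# Toy Dutch-like child-directed input (hypothetical): finite verb in
# second position, non-finite verb at the utterance end.
input_utterances = [
    "hij wil een koekje eten",    # "he wants a cookie eat-INF"
    "zij kan heel goed zingen",   # "she can very well sing-INF"
    "wij gaan nu buiten spelen",  # "we go now outside play-INF"
]

# Early stage: a short window reproduces only utterance-final fragments,
# which in this word order end in non-finite verb forms.
early = [learn_suffix(u, 2) for u in input_utterances]

# Later stage: as the window grows with MLU, the finite verb
# (earlier in the utterance) enters the output.
late = [learn_suffix(u, 5) for u in input_utterances]

print(early)  # fragments ending in infinitives, e.g. "koekje eten"
print(late)   # full clauses including the finite verb, e.g. "wil"
```

Because finite and non-finite forms sit further apart in Dutch than in English, and Spanish verbs are predominantly finite, the same end-biased mechanism produces different error rates across the three languages.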
    <Paragraph position="4"> The production of actual utterances (as opposed to abstract output) by the model makes it possible to analyse the output with respect to several (seemingly) unrelated phenomena, so that the nontrivial predictions of the learning mechanisms can be assessed. Thus, the same output can be utilized to study phenomena such as optional-infinitive errors (as in this paper), evidence for verb-islands (Jones et al., 2000), negation errors (Croker et al., 2003), and subject omission (Freudenthal et al., 2002b). It also makes it possible to assess the relative importance of factors such as increasing MLU that are implicitly assumed by many current theorists but not explicitly factored into their models.</Paragraph>
    <Paragraph position="5"> An important contribution of Wexler's (1994, 1998) nativist theory of the optional-infinitive stage has been to provide an integrated account of the different patterns of results observed across languages, of the fact that children use both correct finite forms and incorrect (optional) infinitives, and of the scarcity of other types of errors (e.g.</Paragraph>
    <Paragraph position="6"> verb placement errors). His approach, however, requires a complex theoretical apparatus to explain the data, and does not provide any quantitative predictions. Here, we have shown how a simple model with few mechanisms and no free parameters can account for the same phenomena not only qualitatively, but also quantitatively.</Paragraph>
    <Paragraph position="7"> The simplicity of the model inevitably means that some aspects of the data are ignored. Children learning a language have access to a range of sources of information (e.g. phonology, semantics), which the model does not take into consideration. Also, generating output from the model means producing everything the model can output.</Paragraph>
    <Paragraph position="8"> Clearly, children produce only a subset of what they can say. Furthermore, any rote-learned utterance that the model produces early on in its development will continue to be produced during the later stages. This inability to unlearn is clearly a weakness of the model, but one that we hope to correct in subsequent research.</Paragraph>
    <Paragraph position="9"> The results clearly show that the interaction between a simple distributional analyser and the statistical properties of naturalistic child-directed speech can explain a considerable amount of the developmental data, without the need to appeal to innate linguistic knowledge. The fact that such a relatively simple model provides such a good fit to the developmental data in three languages suggests that (1) aspects of children's multi-word speech such as the optional-infinitive phenomenon do not necessarily require a nativist interpretation, and (2) nativist theories of syntax acquisition need to pay more attention to the role of input statistics and increasing MLU as determinants of the shape of the developmental data.</Paragraph>
  </Section>
</Paper>