<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1304">
  <Title>Grammatical Inference and First Language Acquisition</Title>
  <Section position="2" start_page="25" end_page="26" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> For some years, the relevance of formal results in grammatical inference to the empirical question of first language acquisition by infant children has been recognised (Wexler and Culicover, 1980). Unfortunately, for many researchers, with a few notable exceptions (Abe, 1988), this begins and ends with Gold's famous negative results in the identification in the limit paradigm. This paradigm, though still widely used in the grammatical inference community, is clearly of limited relevance to the issue at hand, since it requires the model to be able to exactly identify the target language even when an adversary can pick arbitrarily misleading sequences of examples to provide. Moreover, the paradigm as stated has no bounds on the amount of data or computation required for the learner. In spite of the inapplicability of this particular paradigm, in a suitable analysis there are quite strong arguments that bear directly on this problem.</Paragraph>
    <Paragraph position="1"> Grammatical inference is the study of machine learning of formal languages. It has a vast formal vocabulary and has been applied to a wide selection of different problems, where the &amp;quot;languages&amp;quot; under study can be (representations of) parts of natural languages, sequences of nucleotides, moves of a robot, or some other sequence data. For any conclusions that we draw from formal discussions to have any applicability to the real world, we must be sure to select, or construct, from the rich set of formal devices available an appropriate formalisation. Even then, we should be very cautious about making inferences about how the infant child must or cannot learn language: subsequent developments in GI might allow a more nuanced description in which these conclusions are not valid. The situation is complicated by the fact that the field of grammtical inference, much like the wider field of machine learning in general, is in a state of rapid change.</Paragraph>
    <Paragraph position="2"> In this paper we hope to address this problem by justifying the selection of the appropriate learning framework starting by looking at the actual situation the child is in, rather than from an a priori decision about the right framework. We will not attempt a survey of grammatical inference techniques; nor shall we provide proofs of the theorems we use here.</Paragraph>
    <Paragraph position="3"> Arguments based on formal learnability have been used to support the idea of parameter based theories of language (Chomsky, 1986). As we shall see below, under our analysis of the problem these arguments are weak. Indeed, they are more pertinent to questions about the autonomy and modularity of language learning: the question whether learning of some level of linguistic knowledge - morphology or syntax, for example - can take place in isolation from other forms of learning, such as the acquisition of word meaning, and without interaction, grounding and so on.</Paragraph>
    <Paragraph position="4">  Positive results can help us to understand how humans might learn languages by outlining the class of algorithms that might be used by humans, considered as computational systems at a suitable abstract level. Conversely, negative results might be helpful if they could demonstrate that no algorithms of a certain class could perform the task - in this case we could know that the human child learns his language in some other way.</Paragraph>
    <Paragraph position="5"> We shall proceed as follows: after briefly describing FLA, we describe the various elements of a model of learning, or framework. We then make a series of decisions based on the empirical facts about FLA, to construct an appropriate model or models, avoiding unnecessary idealisation wherever possible. We proceed to some strong negative results, well-known in the GI community that bear on the questions at hand. The most powerful of these (Kearns et al., 1994) appears to apply quite directly to our chosen model. We then discuss an interesting algorithm (Ron et al., 1995) which shows that this can be circumvented, at least for a subclass of regular languages. Finally, after discussing the possibilities for extending this result to all regular languages, and beyond, we conclude with a discussion of the implications of the results presented for the distinction between parametric and non-parametric models.</Paragraph>
  </Section>
class="xml-element"></Paper>