<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1311">
  <Title>tic postprocessor using Variable Memory Length</Title>
  <Section position="1" start_page="77" end_page="114" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> We outline an unsupervised language acquisition algorithm and offer some psycholinguistic support for a model based on it. Our approach resembles the Construction Grammar in its general philosophy, and the Tree Adjoining Grammar in its computational characteristics. The model is trained on a corpus of transcribed child-directed speech (CHILDES). The model's ability to process novel inputs makes it capable of taking various standard tests of English that rely on forced-choice judgment and on magnitude estimation of linguistic acceptability. We report encouraging results from several such tests, and discuss the limitations revealed by other tests in our present method of dealing with novel stimuli.</Paragraph>
    <Paragraph position="1"> 1 The empirical problem of language acquisition The largely unsupervised, amazingly fast and almost invariably successful learning stint that is language acquisition by children has long been the envy of computer scientists (Bod, 1998; Clark, 2001; Roberts and Atwell, 2002) and a daunting enigma for linguists (Chomsky, 1986; Elman et al., 1996). Computational models of language acquisition or &amp;quot; grammar induction&amp;quot; are usually divided into two categories, depending on whether they subscribe to the classical generative theory of syntax, or invoke &amp;quot; general-purpose&amp;quot; statistical learning mechanisms. We believe that polarization between classical and statistical approaches to syntax hampers the integration of the stronger aspects of each method into a common powerful framework. On the one hand, the statistical approach is geared to take advantage of the considerable progress made to date in the areas of distributed representation and probabilistic learning, yet generic &amp;quot; connectionist&amp;quot; architectures are ill-suited to the abstraction and processing of symbolic information. On the other hand, classical rule-based systems excel in just those tasks, yet are brittle and difficult to train. We are developing an approach to the acquisition of distributional information from raw input (e.g., transcribed speech corpora) that also supports the distillation of structural regularities comparable to those captured by Context Sensitive Grammars out of the accrued statistical knowledge. In thinking about such regularities, we adopt Langacker's notion of grammar as &amp;quot; simply an inventory of linguistic units&amp;quot; ((Langacker, 1987), p.63). To detect potentially useful units, we identify and process partially redundant sentences that share the same word sequences. We note that the detection of paradigmatic variation within a slot in a set of otherwise identical aligned sequences (syntagms) is the basis for the classical distributional theory of language (Harris, 1954), as well as for some modern work (van Zaanen, 2000). Likewise, the pattern -- the syntagm and the equivalence class of complementary-distribution symbols that may appear in its open slot -- is the main representational building block of our system, ADIOS (for Automatic DIstillation Of Structure).</Paragraph>
    <Paragraph position="2"> Our goal in the present short paper is to illustrate some of the capabilities of the representations learned by our method vis a vis standard tests used by developmental psychologists, by second-language instructors, and by linguists. Thus, the main computational principles behind the ADIOS model are outlined here only briefl y. The algorithmic details of our approach and accounts of its learning from CHILDES corpora appear elsewhere (Solan et al., 2003a; Solan et al., 2003b; Solan et al., 2004; Edelman et al., 2004).</Paragraph>
    <Paragraph position="3"> 2 The principles behind the ADIOS algorithm The representational power of ADIOS and its capacity for unsupervised learning rest on three principles: (1) probabilistic inference of pattern significance, (2) context-sensitive generalization, and (3) recursive construction of complex patterns. Each of these is described briefl y below.</Paragraph>
    <Paragraph position="5"/>
    <Paragraph position="7"/>
    <Paragraph position="9"/>
    <Paragraph position="11"/>
    <Paragraph position="13"/>
    <Paragraph position="15"/>
    <Paragraph position="17"> Joe thinks that George thinks that Cindy believes that George thinks that Pam thinks that ...</Paragraph>
    <Paragraph position="18"> that the bird jumps disturbes Jim who adores the cat, doesn't it?</Paragraph>
    <Paragraph position="20"/>
    <Paragraph position="22"/>
    <Paragraph position="24"> Joe thinks that George thinks that Cindy believes that George thinks that Pam thinks that ...</Paragraph>
    <Paragraph position="25"> that the bird jumps disturbes Jim who adores the cat, doesn't it?  labels are underscored). This and other examples here were distilled from a 400-sentence corpus generated by a 40-rule Context Free Grammar. Right: the same pattern recast as a set of rewriting rules that can be seen as a Context Free Grammar fragment.</Paragraph>
    <Paragraph position="27"/>
    <Paragraph position="29"> thinks that Cindy believes that Jim who adores a cat meows and the bird flies , don't they? that Pam laughs worries a dog , doesn't it? that a cow jumps disturbs Jim who loves a horse , doesn't it? Joe and Beth think that Jim believes that the rabb it meows and Pam who scolds the dog laughs , don't they? that Joe is eager to please disturbs the bird. Cindy thinks that Jim believes that to read is tough . Beth thinks that Jim believes that Beth who loves a horse meows and the horse jumps , doesn't sh e? that Pam is tough to please worries the cat.</Paragraph>
    <Paragraph position="31"/>
    <Paragraph position="33"/>
    <Paragraph position="35"/>
    <Paragraph position="37"/>
    <Paragraph position="39"> thinks that Cindy believes that Jim who adores a cat meows and the bird flies , don't they? that Pam laughs worries a dog , doesn't it? that a cow jumps disturbs Jim who loves a horse , doesn't it? Joe and Beth think that Jim believes that the rabb it meows and Pam who scolds the dog laughs , don't they? that Joe is eager to please disturbs the bird. Cindy thinks that Jim believes that to read is tough . Beth thinks that Jim believes that Beth who loves a horse meows and the horse jumps , doesn't sh e? that Pam is tough to please worries the cat.</Paragraph>
    <Paragraph position="41"> share the same context, its power is comparable to that of Context Sensitive Grammars. In this example, equivalence class #75 is not extended to subsume the subject position, because that position appears in a different context (e.g., immediately to the right of the symbol BEGIN). Thus, long-range agreement is enforced and over-generalization prevented. Right: the context-sensitive &amp;quot; rules&amp;quot; corresponding to pattern #210.</Paragraph>
    <Paragraph position="42"> Probabilistic inference of pattern significance.</Paragraph>
    <Paragraph position="43"> ADIOS represents a corpus of sentences as an initially highly redundant directed graph, which can be informally visualized as a tangle of strands that are partially segregated into bundles. Each of these consists of some strands clumped together; a bundle is formed when two or more strands join together and run in parallel and is dissolved when more strands leave the bundle than stay in. In a given corpus, there will be many bundles, with each strand (sentence) possibly participating in several. Our algorithm, described in detail in (Solan et al., 2004), identifies significant bundles that balance high compression (small size of the bundle &amp;quot; lexicon&amp;quot; ) against good generalization (the ability to generate new grammatical sentences by splicing together various strand fragments each of which belongs to a different bundle).</Paragraph>
    <Paragraph position="44"> Context sensitivity of patterns. A pattern is an abstraction of a bundle of sentences that are identical up to variation in one place, where one of several symbols -- the members of the equivalence class associated with the pattern -- may appear (Figure 1). Because this variation is only allowed in the context specified by the pattern, the generalization afforded by a set of patterns is inherently safer than in approaches that posit globally valid categories (&amp;quot; parts of speech&amp;quot; ) and rules (&amp;quot; grammar&amp;quot; ). The reliance of ADIOS on many context-sensitive patterns rather than on traditional rules can be compared both to the Construction Grammar (discussed later) and to Langacker's concept of the grammar as a collection of &amp;quot; patterns of all intermediate degrees of generality&amp;quot; ((Langacker, 1987), p.46).</Paragraph>
    <Paragraph position="45"> Hierarchical structure of patterns. The ADIOS graph is rewired every time a new pattern is detected, so that a bundle of strings subsumed by it is represented by a single new edge. Following the rewiring, which is context-specific, potentially farapart symbols that used to straddle the newly abstracted pattern become close neighbors. Patterns thus become hierarchically structured in that their elements may be either terminals (i.e., fully specified strings) or other patterns. Moreover, patterns may refer to themselves, which opens the door for recursion.</Paragraph>
  </Section>
class="xml-element"></Paper>