File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/04/w04-1311_metho.xml
Size: 20,522 bytes
Last Modified: 2025-10-06 14:09:17
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1311"> <Title>tic postprocessor using Variable Memory Length</Title> <Section position="2" start_page="114" end_page="114" type="metho"> <SectionTitle> 3 Related computational and linguistic formalisms and psycholinguistic findings </SectionTitle> <Paragraph position="0"> Unlike ADIOS, very few existing algorithms for unsupervised language acquisition use raw, unannotated corpus data (as opposed, say, to sentences converted into sequences of POS tags). The only work described in a recent review (Roberts and Atwell, 2002) as completely unsupervised -- the GraSp model (Henrichsen, 2002) -- does attempt to induce syntax from raw transcribed speech, yet it is not completely data-driven in that it makes a prior commitment to a particular theory of syntax (Categorial Grammar, complete with a pre-specified set of allowed categories). Because of the unique nature of our chosen challenge -- finding structure in language rather than imposing it -- the following brief survey of grammar induction focuses on contrasts and comparisons to approaches that generally stop short of attempting to do what our algorithm does. We distinguish between approaches that are motivated computationally (Local Grammar and Variable Order Markov models, and Tree Adjoining Grammar, discussed elsewhere (Edelman et al., 2004)), and those whose main motivation is linguistic and cognitive psychological (Cognitive and Construction grammars, discussed below).</Paragraph> <Paragraph position="1"> Local Grammar and Markov models. In capturing the regularities inherent in multiple crisscrossing paths through a corpus, ADIOS superficially resembles finite-state Local Grammars (Gross, 1997) and Variable Order Markov (VOM) models (Guyon and Pereira, 1995). The VOM approach starts by postulating a maximum-n structure, which is then fitted to the data by maximizing the likelihood of the training corpus.
The ADIOS philosophy differs from the VOM approach in several key respects. First, rather than fitting a model to the data, we use the data to construct a (recursively structured) graph. Thus, our algorithm naturally addresses the inference of the graph's structure, a task that is more difficult than the estimation of parameters for a given configuration. Second, because ADIOS works from the bottom up in a recursive, data-driven fashion, it is less susceptible to complexity issues. It can be used on huge graphs, and may yield very large patterns, which in a VOM model would correspond to an unmanageably high order n. Third, ADIOS transcends the idea of VOM structure, in the following sense. Consider a set of patterns of the form b1[c1]b2[c2]b3, etc. The equivalence classes [ ] may include vertices of the graph (both words and word patterns turned into nodes), wild cards (i.e., any node), as well as ambivalent cards (any node or no node). This means that the terminal-level length of the string represented by a pattern does not have to be of a fixed length.</Paragraph> <Paragraph position="2"> This goes conceptually beyond the variable order Markov structure: b2[c2]b3 do not have to appear in a Markov chain of a finite order ||b2|| + ||c2|| + ||b3|| because the size of [c2] is ill-defined, as explained above. Fourth, as we showed earlier (Figure 2), ADIOS incorporates both context-sensitive substitution and recursion.</Paragraph> <Paragraph position="3"> Tree Adjoining Grammar. The proper place in the Chomsky hierarchy for the class of strings accepted by our model is between Context Free and Context Sensitive Languages. The pattern-based representations employed by ADIOS have counterparts for each of the two composition operations, substitution and adjoining, that characterize a Tree Adjoining Grammar, or TAG, developed by Joshi and others (Joshi and Schabes, 1997).
Specifically, both substitution and adjoining are subsumed in the relationships that hold among ADIOS patterns, such as the membership of one pattern in another. Consider a pattern Pi and its equivalence class E(Pi); any other pattern Pj ∈ E(Pi) can be seen as substitutable in Pi. Likewise, if Pj ∈ E(Pi), Pk ∈ E(Pi) and Pk ∈ E(Pj), then the pattern Pj can be seen as adjoinable to Pi. Because of this correspondence between the TAG operations and the ADIOS patterns, we believe that the latter represent regularities that are best described by a Mildly Context-Sensitive Language formalism (Joshi and Schabes, 1997). Importantly, because the ADIOS patterns are learned from data, they already incorporate the constraints on substitution and adjoining that in the original TAG framework must be specified manually. Psychological and linguistic evidence for pattern-based representations. Recent advances in understanding the psychological role of representations based on what we call patterns, or constructions (Goldberg, 2003), focus on the use of statistical cues such as conditional probabilities in pattern learning (Saffran et al., 1996; Gómez, 2002), and on the importance of exemplars and constructions in children's language acquisition (Cameron-Faulkner et al., 2003). Converging evidence for the centrality of pattern-like structures is provided by corpus-based studies of prefabs -- sequences, continuous or discontinuous, of words that appear to be prefabricated, that is, stored and retrieved as a whole, rather than being subject to syntactic processing (Wray, 2002).
Similar ideas concerning the ubiquity in syntax of structural peculiarities hitherto marginalized as &quot;exceptions&quot; are now being voiced by linguists (Culicover, 1999; Croft, 2001).</Paragraph> <Paragraph position="4"> Cognitive Grammar; Construction Grammar.</Paragraph> <Paragraph position="5"> The main methodological tenets of ADIOS -- populating the lexicon with &quot;units&quot; of varying complexity and degree of entrenchment, and using cognition-general mechanisms for learning and representation -- fit the spirit of the foundations of Cognitive Grammar (Langacker, 1987). At the same time, whereas the cognitive grammarians typically face the chore of hand-crafting structures that would reflect the logic of language as they perceive it, ADIOS discovers the primitives of grammar empirically and autonomously. The same is true also for the comparison between ADIOS and the various Construction Grammars (Goldberg, 2003; Croft, 2001), which are all hand-crafted. A construction grammar consists of elements that differ in their complexity and in the degree to which they are specified: an idiom such as &quot;big deal&quot; is a fully specified, immutable construction, whereas the expression &quot;the X, the Y&quot; -- as in &quot;the more, the better&quot; (Kay and Fillmore, 1999) -- is a partially specified template. The patterns learned by ADIOS likewise vary along the dimensions of complexity and specificity (e.g., not every pattern has an equivalence class).</Paragraph> </Section> <Section position="3" start_page="114" end_page="114" type="metho"> <SectionTitle> 4 ADIOS: a psycholinguistic evaluation </SectionTitle> <Paragraph position="0"> To illustrate the applicability of our method to real data, we first describe briefly the outcome of running it on a subset of the CHILDES collection (MacWhinney and Snow, 1985), consisting of transcribed speech directed at children.
The corpus we selected contained 300,000 sentences (1.3 million tokens) produced by parents. After 14 real-time days, the algorithm (version 7.3) identified 3400 patterns and 3200 equivalence classes. The outcome was encouraging: the algorithm found intuitively significant patterns and produced semantically adequate corresponding equivalence sets. The algorithm's ability to recombine and reuse the acquired patterns is exemplified in the legend of Figure 3, which lists some of the novel sentences it generated.</Paragraph> <Paragraph position="1"> The input module. The ADIOS system's input module allows it to process a novel sentence by forming its distributed representation in terms of activities of existing patterns. We stress that this module plays a crucial role in the tests described below, all of which require dealing with novel inputs. Figure 4 shows the activation of two patterns (#141 and #120) by a phrase that contains a word in a novel context (stay), as well as another word never before encountered in any context (5pm).</Paragraph> <Paragraph position="2"> Figure 3 (caption, beginning missing): ... 1985). Hundreds of such patterns and equivalence classes (underscored) together constitute a concise representation of the raw data. Some of the phrases that can be described/generated by these patterns are: let's change her...; I thought you were gonna change her...; I was going to change your...; none of these appear in the training data, illustrating the ability of ADIOS to generalize. The generation process operates as a depth-first search of the tree corresponding to a pattern. For details see (Solan et al., 2003a; Solan et al., 2004).</Paragraph> <Paragraph position="3"> Acceptability of correct and perturbed novel sentences. To test the quality of the representations (patterns and their associated equivalence classes) acquired by ADIOS, we have examined their ability to support various kinds of grammaticality judgments.
The first experiment we report sought to make a distinction between a set of (presumably grammatical) CHILDES sentences not seen by the algorithm during training, and the same sentences in which the word order has been perturbed. We first trained the model on 10,000 sentences from CHILDES, then compared its performance on (1) 1000 previously unseen sentences and (2) the same sentences in each of which a single random word order switch has been carried out. The results, shown in Figure 5, indicate a substantial sensitivity of the ADIOS input module to simple deviations from grammaticality in novel data, even after a very brief training.</Paragraph> <Paragraph position="4"> Learnability of nonadjacent dependencies. Within the ADIOS framework, the &quot;nonadjacent dependencies&quot; that characterize the artificial languages used by Gómez (2002) translate, simply, into patterns with embedded equivalence classes.</Paragraph> <Paragraph position="5"> Figure 4 (caption, beginning missing): ... until 5pm. Leaf activation, which is proportional to the mutual information between input words and various members of the equivalence classes, is propagated upward by taking the average at each junction (Solan et al., 2003a).</Paragraph> <Paragraph position="6"> Figure 5 (caption, beginning missing): ... of the input module output values for two kinds of stimuli: novel grammatical sentences (dark/blue), and sentences obtained from these by a single word-order permutation (light/red).</Paragraph> <Paragraph position="7"> Gómez showed that the ability of subjects to learn a language L1 of the form {aXd, bXe, cXf}[1], as measured by their ability to distinguish it implicitly from L2 = {aXe, bXf, cXd}, depends on the amount of variation introduced at X. We replicated this experiment by training ADIOS on 432 strings from L1, with |X| = 2, 6, 12, 24. The stimuli were the same strings as in the original experiment, with the individual letters serving as the basic symbols.
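As a minimal sketch of the two artificial languages just described (an illustration, not the code used in the experiment), the following Python enumerates the distinct strings of L1 and L2 for each set size |X|. The frame symbols a-f follow the text; the filler names x1, x2, and so on are hypothetical placeholders for the nonce words that can fill the X slot, and the sketch does not model how the 432 training strings were sampled.

```python
# Illustrative sketch only; filler names such as "x1" are hypothetical
# placeholders for the nonce words that can occupy the X slot.
L1_FRAMES = [("a", "d"), ("b", "e"), ("c", "f")]  # aXd, bXe, cXf
L2_FRAMES = [("a", "e"), ("b", "f"), ("c", "d")]  # aXe, bXf, cXd


def make_language(frames, set_size):
    """Return every distinct (first, X, last) string of one language."""
    fillers = ["x%d" % i for i in range(1, set_size + 1)]
    return [(first, x, last) for first, last in frames for x in fillers]


for set_size in (2, 6, 12, 24):
    l1 = make_language(L1_FRAMES, set_size)
    l2 = make_language(L2_FRAMES, set_size)
    # The languages share first elements and fillers but never a whole string,
    # so only the nonadjacent first-last dependency tells them apart.
    shared = set(l1).intersection(l2)
    print(set_size, len(l1), len(l2), len(shared))
```

Each printed row gives the set size, the number of distinct strings in each language, and the number of strings the two languages have in common.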
A subsequent test resulted in a perfect acceptance of L1 and a perfect rejection of L2. [Footnote 1: Symbols a-f here stand for nonce words such as pel, vot, or dak, whereas X denotes a slot in which a subset of 24 other nonce words may appear.]</Paragraph> <Paragraph position="8"> Training with the original words (rather than letters) as the basic symbols resulted in L2 rejection rates of 0%, 55%, 100%, and 100%, respectively, for |X| = 2, 6, 12, 24. Thus, the ADIOS performance both mirrors that of the human subjects and suggests a potentially interesting new effect (of the granularity of the input stimuli) that may be explored in further psycholinguistic studies.</Paragraph> <Paragraph position="9"> A developmental test. The CASL test (Comprehensive Assessment of Spoken Language) is widely used in the USA to assess language comprehension in children (Carrow-Woolfolk, 1999). One of its many components is a grammaticality judgment test, which consists of 57 sentences and is administered as follows: a sentence is read to the child, who then has to decide whether or not it is correct.</Paragraph> <Paragraph position="10"> If not, the child has to suggest a correct version of the sentence. For every incorrect sentence, the test lists 2-3 acceptable correct ones. The present version of the ADIOS algorithm can compare sentences but cannot score single sentences. We therefore ignored 11 out of the 57 sentences, which were correct to begin with. The remaining 46 incorrect sentences and their corrected versions were scored by ADIOS (which for this test had been trained on a 300,000-sentence corpus from the CHILDES database); the highest scoring sentence in each trial was interpreted as the model's choice. The model labeled 17 of the test sentences correctly, yielding a score of 108 (100 = norm) for the age interval 7-0 through 7-2.
This score is the norm for the age interval 8-3 through 8-5.[2] [Footnote 2: ADIOS was undecided about the majority of the other sentences on which it did not score correctly.]</Paragraph> <Paragraph position="11"> ESL test (forced choice). We next used a standard test developed for English as Second Language (ESL) classes, which has been administered in Göteborg (Sweden) to more than 10,000 upper-secondary-level students (that is, children who typically had 9 years of school, but only 6-7 years of English). The test consists of 100 three-choice questions, such as She asked me ___ at once (choices: come, to come, coming) and The tickets have been paid for, so you ___ not worry (choices: may, dare, need); the average score for the population mentioned is 65%. As before, the choice given the highest score by the algorithm won; if two choices received the same top score, the answer was &quot;don't know&quot;. The algorithm's performance in this and several other tests is summarized in Figure 6 (these tests were conducted with an earlier version of the algorithm (Solan et al., 2003a)). In the ESL test, ADIOS scored at just under 60%; compare this to the 45% precision (with 20% recall) achieved by a straightforward bi-gram benchmark.[3] ESL test (magnitude estimation). In this experiment, six subjects were asked to provide magnitude estimates of linguistic acceptability (Gurman-Bard et al., 1996) for all the 3 × 100 sentences in the Göteborg ESL test. The test was paper based and included the instructions from (Keller, 2000).</Paragraph> <Paragraph position="12"> No measures were taken to randomize the order of the sentences or otherwise control the experiment.</Paragraph> <Paragraph position="13"> The same 300 sentences were processed by ADIOS, whose responses were normalized by dividing the output by the sum of each triplet's score. The results indicate a significant correlation (R² = 6.3%, p &lt; 0.001) between the scores produced by the subjects and by ADIOS.
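The forced-choice rule and the per-triplet normalization described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ADIOS implementation: the function names are hypothetical, the raw scores are invented for the example, and the handling of an all-zero triplet is an assumption that the text does not address.

```python
def normalize_triplet(scores):
    """Divide each raw score by the sum of the triplet's scores."""
    total = sum(scores)
    if total == 0:
        # Assumption: an all-zero triplet is left unnormalized.
        return list(scores)
    return [s / total for s in scores]


def forced_choice(choices, scores):
    """Return the top-scoring choice, or "don't know" on a tie for first place."""
    best = max(scores)
    winners = [c for c, s in zip(choices, scores) if s == best]
    return winners[0] if len(winners) == 1 else "don't know"


# Hypothetical raw scores for one three-choice item (not data from the paper).
choices = ["come", "to come", "coming"]
raw = [0.2, 0.6, 0.2]
print(normalize_triplet(raw))
print(forced_choice(choices, raw))
```

Normalizing within each triplet puts the three alternatives of one question on a common scale, so that scores can be compared across questions of different overall familiarity.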
In some cases the scores of ADIOS are equal, which usually indicates that there are too many unfamiliar words. Omitting these sentences yields a significant R² = 9.7%, p &lt; 0.001; removing sentences for which the choices score almost equally (within 10%) results in R² = 12.7%, p &lt; 0.001.[4] [Footnote 3: Chance performance in this test is 33%. We note that the corpus used here was too small to train an n-gram model for n > 2; thus, our algorithm effectively overcomes the problem of sparse data by putting the available data to a better use.] Figure 7 (caption, beginning missing): ... plotted against the ADIOS score on the same sentences (R² = 0.53, p &lt; 0.05). The sentences (ranked by increasing score) are: How many men did you destroy the picture of? How many men did you destroy a picture of? How many men did you take the picture of? How many men did you take a picture of? Which man did you destroy the picture of? Which man did you destroy a picture of? Which man did you take the picture of? Which man did you take a picture of? Modeling Keller's data. A manuscript by Frank Keller lists magnitude estimation data for eight sentences.[5] We compared these to the scores produced by ADIOS, and obtained a significant correlation (Figure 7). The input module seems capable of dealing with the substitution of a with the or of take with destroy, and it does reasonably well on the substitution of How many men with Which man. We conjecture that this performance can be improved by a more sophisticated normalization of the score produced by the input module, which should do a better job quantifying the cover (Edelman, 2004) of a novel sentence by the stored patterns. The limitations of the present version of the model became apparent when we tested it on the 52 sentences from Keller's dissertation, using his magnitude estimation method (Keller, 2000).[6] For these sentences, no correlation was found between the human and the model scores.
One of the more challenging aspects of this set is the central role of pronoun binding in many of the sentences, e.g., The woman/Each woman saw Peter's photograph of her/herself/him/himself. Moreover, this test set contains examples of context effects, where information in an earlier sentence can help resolve a later ambiguity. Thus, many of the grammatical contrasts that appear in Keller's test sentences are too subtle for the present version of the ADIOS input module to handle.</Paragraph> <Paragraph position="14"> Acceptability of correct and perturbed artificial sentences. In this experiment 64 random sentences were produced with a CFG. For uniformity the sentence length was kept within 15-20 words. Sixteen of the sentences had two adjacent words switched and another 16 had two random words switched. The 64 sentences were presented to 17 subjects, who placed each on a computer screen at a lateral position reflecting the perceived acceptability. As expected, the perturbed sentences were rated as less acceptable than the non-perturbed ones (R² = 50.3% with p &lt; 0.01). We controlled for sentence number, for how high on the screen the sentence was placed, for the reaction time and for sentence length; only the latter had a significant contribution to the correlation. The random permutations scored significantly (p &lt; 0.01) lower than the adjacent permutations.</Paragraph> <Paragraph position="15"> Furthermore, the variance in the scores of the randomly permuted sentences was significantly larger (p &lt; 0.005), suggesting that this kind of permutation violates the sentence structure more severely, but may also sometimes create acceptable sentences by chance.
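The two kinds of perturbation used in this experiment can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used to build the stimuli: the function names and the example sentence are hypothetical, and the random switch here simply picks two distinct positions, which may happen to be adjacent.

```python
import random


def swap_adjacent(words, rng):
    """Return a copy of words with one randomly chosen adjacent pair switched."""
    i = rng.randrange(len(words) - 1)
    out = list(words)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def swap_random(words, rng):
    """Return a copy of words with two randomly chosen distinct positions switched."""
    i, j = rng.sample(range(len(words)), 2)
    out = list(words)
    out[i], out[j] = out[j], out[i]
    return out


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sentence = "a small dog chased the red ball".split()  # made-up example
print(" ".join(swap_adjacent(sentence, rng)))
print(" ".join(swap_random(sentence, rng)))
```

Because a random switch can move words across a long distance, it tends to disturb more of the sentence than an adjacent switch, which is in line with the lower ratings reported for the random permutations.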
Previous tests showed that ADIOS is very good at recognizing perturbed CFG-generated sentences as such, but it remains to be seen whether or not ADIOS also exhibits differential behavior on the adjacent and non-adjacent permutations.</Paragraph> <Paragraph position="16"> Acceptability of ADIOS-generated sentences.</Paragraph> <Paragraph position="17"> ADIOS was trained on 12,700 sentences (out of a total of 12,966 sentences) in the ATIS (Air Travel Information System) corpus; the remaining 226 sentences were used for precision/recall tests. Because ADIOS is sensitive to the presentation order of the training sentences, 30 instances were trained on randomized versions of the training set. Eight human subjects were then asked to estimate acceptability of 20 sentences from the original corpus, intermixed randomly with 20 sentences generated by the trained versions of ADIOS. The precision, calculated as the average number of sentences accepted by the subjects divided by the total number of sentences in the set (20), was 0.73 ± 0.2 for sentences from the original corpus and 0.67 ± 0.07 for the sentences generated by ADIOS. Thus, the ADIOS-generated sentences are, on average, as acceptable to human subjects as the original ones. [Footnote 6: We remark that this methodology is not without its problems. As one of our linguistically naive subjects remarked, &quot;The instructions were (purposefully?) vague about what I was supposed to judge -- understandability, grammar, correct use of language, or getting the point through...&quot;. Indeed, the scores in a magnitude experiment must be composites of several factors -- at the very least, well-formedness and meaningfulness. We are presently exploring various means of acquiring and dealing with such multidimensional &quot;acceptability&quot; data.]</Paragraph> </Section> </Paper>