<?xml version="1.0" standalone="yes"?> <Paper uid="P06-3005"> <Title>Modeling Human Sentence Processing Data with a Statistical Parts-of-Speech Tagger</Title> <Section position="4" start_page="0" end_page="26" type="intro"> <SectionTitle> 2 Experiments </SectionTitle> <Paragraph position="0"> A Hidden Markov Model POS tagger based on bigrams was used. We wrote our own implementation to stay as close as possible to the design of Corley and Crocker (2000). Given a word string, w0, w1, ..., wn, the tagger calculates the probability of every possible tag path, t0, ..., tn. Under the Markov assumption, the joint probability of the given word sequence and each possible POS sequence can be approximated as a product of conditional probabilities and transition probabilities, as shown in (1).</Paragraph> <Paragraph position="2"> Using the Viterbi algorithm (Viterbi, 1967), the tagger finds the most likely POS sequence for a given word string, as shown in (2).</Paragraph> <Paragraph position="3"> (2) argmax P(t0, t1, ..., tn | w0, w1, ..., wn, μ).</Paragraph> <Paragraph position="4"> This is known technology (see Manning and Schütze, 1999), but the particular use we make of it is unusual. The tagger takes a word string as input and outputs the most likely POS sequence together with its final probability. Additionally, it reports the accumulated probability at each word break and any probability re-ranking. Probability re-ranking occurs when a previously less preferred POS sequence becomes the favored one later in the sentence. Note that the running probability at the beginning of a sentence is 1 and keeps decreasing at each word break, since it is a product of conditional probabilities. We tested the model's predictive power on empirical reading data using two measures: the probability decrease and the presence or absence of probability re-ranking. 
Adopting the standard experimental design used in human sentence processing studies, where word-by-word reading time or eye-fixation time is compared between an experimental sentence and its control sentence, this study compares the probability at each word break between the two sentences of a pair. A comparatively faster drop in probability is expected to be a good indicator of comparative processing difficulty. Probability re-ranking, which is a simplified model of the reanalysis process assumed in many human studies, is also tested as another indicator of a garden-path effect. Probability re-ranking occurs when an initially dispreferred POS sub-sequence becomes the preferred candidate later in the parse because it fits better with later words. The model parameters, P(wi|ti) and P(ti|ti−1), are estimated from a small section (970,995 tokens, 47,831 distinct words) of the British National Corpus (BNC), a 100-million-word collection of written and spoken British English developed by Oxford University Press (Burnard, 1995). The BNC was chosen for training the model because it is a POS-annotated corpus, which allows supervised training. In the implementation we use log probabilities to avoid underflow, and we report log probabilities in what follows.</Paragraph> <Section position="1" start_page="25" end_page="25" type="sub_section"> <SectionTitle> 2.1 Hypotheses </SectionTitle> <Paragraph position="0"> If the human sentence processing mechanism (HSPM) is affected by frequency information, we can assume that events with higher frequency or probability are easier to process than those with lower frequency or probability.</Paragraph> <Paragraph position="1"> Under this general assumption, the overall difficulty of a sentence is expected to be measured or predicted by the mean size of the probability decrease.</Paragraph> <Paragraph position="2"> That is, probability will drop faster in garden-path sentences than in control sentences (e.g. 
unambiguous sentences or ambiguous but non-garden-path sentences).</Paragraph> <Paragraph position="3"> More importantly, the pattern of probability decrease at disambiguating regions will predict the trends in the reading time data. All other things being equal, we might expect a reading time penalty for a garden-path region when the probability decrease at the disambiguating region of a garden-path sentence is greater than that of its control sentence. This is a simple and intuitive assumption that can be easily tested. We could have summed over all possible POS sequences associated with the word string, but for the present study we simply used the Viterbi path, on the grounds that it is the best single-path approximation to the joint probability.</Paragraph> <Paragraph position="4"> Lastly, re-ranking of POS sequences is expected to predict reanalysis of lexical categories. This is because re-ranking in the tagger parallels reanalysis in human subjects, which is known to be cognitively costly.</Paragraph> </Section> <Section position="2" start_page="25" end_page="26" type="sub_section"> <SectionTitle> 2.2 Materials </SectionTitle> <Paragraph position="0"> In this study, five types of ambiguity were tested: Lexical Category ambiguity, Reduced-relative ambiguity (RR ambiguity), Prepositional-phrase attachment ambiguity (PP ambiguity), Direct-object/Sentential-complement ambiguity (DO/SC ambiguity), and Clausal Boundary ambiguity. The following are example sentences for each ambiguity type, shown with the ambiguous region italicized and the disambiguating region bolded. 
All of the example sentences are garden-path sentences.</Paragraph> <Paragraph position="1"> (3) Lexical Category ambiguity: The foreman knows that the warehouse prices the beer very modestly.</Paragraph> <Paragraph position="2"> (4) RR ambiguity: The horse raced past the barn fell.</Paragraph> <Paragraph position="3"> (5) PP ambiguity: Katie laid the dress on the floor onto the bed. (6) DO/SC ambiguity: He forgot Pam needed a ride with him.</Paragraph> <Paragraph position="4"> (7) Clausal Boundary ambiguity: Though George kept on reading the story really bothered him.</Paragraph> <Paragraph position="5"> The test materials are constructed such that a garden-path sentence and its control sentence share exactly the same word sequence except for the disambiguating word, so that extraneous variables such as word frequency effects are controlled. We inherit this careful design. In this study, a total of 76 sentences were tested: 10 for lexical category ambiguity, 12 for RR ambiguity, 20 for PP attachment ambiguity, 16 for DO/SC ambiguity, and 18 for clausal boundary ambiguity. This set of materials is, to our knowledge, the most comprehensive yet subjected to this type of study. The sentences are directly adopted from various psycholinguistic studies (Frazier, 1978; Trueswell, 1996; Ferreira and Henderson, 1986).</Paragraph> <Paragraph position="6"> As a baseline test case of the tagger, the well-established asymmetry between subject- and object-relative clauses was tested, as shown in (8). (8) a. The editor who kicked the writer fired the entire staff. (Subject-relative) b. The editor who the writer kicked fired the entire staff. (Object-relative) The reading time advantage of subject-relative clauses over object-relative clauses is robust in English (Traxler et al., 2002) as well as in other languages (Mak et al., 2002; Homes et al., 1981). For this test, materials from Traxler et al. 
(2002) (96 sentences) are used.</Paragraph> </Section> </Section> </Paper>