File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/06/p06-3005_concl.xml
Size: 2,497 bytes
Last Modified: 2025-10-06 13:55:23
<?xml version="1.0" standalone="yes"?> <Paper uid="P06-3005"> <Title>Modeling Human Sentence Processing Data with a Statistical Parts-of-Speech Tagger</Title> <Section position="7" start_page="28" end_page="29" type="concl"> <SectionTitle> 5 Conclusion </SectionTitle> <Paragraph position="0"> Our studies show that, at least for the sample of test materials that we culled from the standard literature, a statistical POS tagging system can predict processing difficulty in structurally ambiguous garden-path sentences. The statistical POS tagger was surprisingly effective in modeling sentence processing data, given the locality of the probability distribution. The findings in this study provide an alternative account for the garden-path effect observed in empirical studies, specifically, that the slower processing times associated with garden-path sentences are due in part to their relatively unlikely POS sequences in comparison with those of non-garden-path sentences and in part to differences in the emission probabilities that the tagger learns. One attractive future direction is to carry out simulations that compare the evolution of probabilities in the tagger with that in a theoretically more powerful model trained on the same data, such as an incremental statistical parser (Wang et al., 2004; Roark, 2001). In so doing we can find the places where the prediction problem faced both by the HSPM and the machines that aspire to emulate it actually warrants the greater power of structurally sensitive models, using this knowledge to mine large corpora for future experiments with human subjects.</Paragraph> <Paragraph position="1"> We have not necessarily cast doubt on the hypothesis that the HSPM makes crucial use of structural information, but we have demonstrated that much of the relevant behavior can be captured in a simple model. The 'structural' regularities that we observe are reasonably well encoded into this model. For purposes of initial real-time processing it could be that the HSPM is using a similar encoding of structural regularities into convenient probabilistic or neural form. It is as yet unclear what the final form of a cognitively accurate model along these lines would be, but it is clear from our study that it is worthwhile, for the sake of clarity and explicit testability, to consider models that are simpler and more precisely specified than those assumed by dominant theories of human sentence processing.</Paragraph> </Section> class="xml-element"></Paper>