<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2177"> <Title>Statistical Models for Unsupervised Prepositional Phrase Attachment</Title> <Section position="4" start_page="1079" end_page="1079" type="intro"> <SectionTitle> 2 Previous Work </SectionTitle> <Paragraph position="0"> Most of the previous successful approaches to this problem have been statistical or corpus-based, and they consider only prepositions whose attachment is ambiguous between a preceding noun phrase and verb phrase. Previous work has framed the problem as a classification task, in which the goal is to predict N or V, corresponding to noun or verb attachment, given the head verb v, the head noun n, the preposition p, and, optionally, the object of the preposition n2. For example, the (v, n, p, n2) tuples corresponding to the example sentences are: 1. bought shirt with pockets; 2. washed shirt with soap. The correct classifications of tuples 1 and 2 are N and V, respectively.</Paragraph> <Paragraph position="1"> (Hindle and Rooth, 1993) describes a partially supervised approach in which the FIDDITCH partial parser was used to extract (v, n, p) tuples from raw text, where p is a preposition whose attachment is ambiguous between the head verb v and the head noun n. The extracted tuples are then used to construct a classifier, which resolves unseen ambiguities with around 80% accuracy. Later work, such as (Ratnaparkhi et al., 1994; Brill and Resnik, 1994; Collins and Brooks, 1995; Merlo et al., 1997; Zavrel and Daelemans, 1997; Franz, 1997), trains and tests on quintuples of the form (v, n, p, n2, a) extracted from the Penn Treebank (Marcus et al., 1994), and has gradually improved on this accuracy with other kinds of statistical learning methods, yielding up to 84.5% accuracy (Collins and Brooks, 1995). 
Recently, (Stetina and Nagao, 1997) have reported 88% accuracy by using a corpus-based model in conjunction with a semantic dictionary.</Paragraph> <Paragraph position="2"> While previous corpus-based methods are highly accurate for this task, they are difficult to port to other languages because they require resources that are expensive to construct or simply nonexistent for those languages. We present an unsupervised algorithm for prepositional phrase attachment in English that requires only a part-of-speech tagger and a morphology database, and is therefore less resource-intensive and more portable than previous approaches, which have all required either treebanks or partial parsers.</Paragraph> </Section> </Paper>