<?xml version="1.0" standalone="yes"?> <Paper uid="N03-1027"> <Title>Supervised and unsupervised PCFG adaptation to novel domains</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper investigates adapting a lexicalized probabilistic context-free grammar (PCFG) to a novel domain, using maximum a posteriori (MAP) estimation. The MAP framework is general enough to include some previous model adaptation approaches, such as the corpus mixing of Gildea (2001). Other approaches that fall within this framework are more effective. In contrast to the results in Gildea (2001), we show F-measure parsing accuracy gains of as much as 2.5% for high-accuracy lexicalized parsing through the use of out-of-domain treebanks, with the largest gains when the amount of in-domain data is small. MAP adaptation can be based on either supervised or unsupervised adaptation data. Even when no in-domain treebank is available, unsupervised techniques provide a substantial accuracy gain over unadapted grammars, with F-measure improvements of up to nearly 5%.</Paragraph> </Section> </Paper>