File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/w04-0212_abstr.xml
Size: 852 bytes
Last Modified: 2025-10-06 13:43:44
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0212"> <Title>Annotation and Data Mining of the Penn Discourse TreeBank</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> The Penn Discourse TreeBank (PDTB) is a new resource built on top of the Penn Wall Street Journal corpus, in which discourse connectives are annotated along with their arguments. Its use of stand-off annotation allows integration with a stand-off version of the Penn TreeBank (syntactic structure) and PropBank (verbs and their arguments), which adds value for both linguistic discovery and discourse modeling. Here we describe the PDTB and some experiments in linguistic discovery based on the PDTB alone, as well as on the linked PTB and PDTB corpora.</Paragraph> </Section> </Paper>