File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/w04-1211_abstr.xml
Size: 858 bytes
Last Modified: 2025-10-06 13:43:56
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1211"> <Title>Creating a Test Corpus of Clinical Notes Manually Tagged for Part-of-Speech Information</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper presents a project whose main goal is to construct a corpus of clinical text manually annotated for part-of-speech information. We describe and discuss the process of training three domain experts to perform linguistic annotation. We list some of the challenges as well as encouraging results pertaining to inter-rater agreement and consistency of annotation. We also present preliminary experimental results indicating the necessity for adapting state-of-the-art POS taggers to the sublanguage domain of medical text.</Paragraph> </Section> </Paper>