<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1211">
  <Title>Creating a Test Corpus of Clinical Notes Manually Tagged for Part-of-Speech Information</Title>
  <Section position="6" start_page="64" end_page="64" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> Several questions remain unresolved. First, it is unclear how much domain-specific data is needed to achieve state-of-the-art performance on POS tagging. Second, given that it is somewhat easier to develop lexicons for POS tagging than to annotate corpora, we need to determine how important corpus statistics are relative to a domain-specific lexicon. In other words, can we achieve state-of-the-art performance in a specialized domain simply by adding the domain's vocabulary to the POS tagger's lexicon? We intend to address both questions with further experimentation.</Paragraph>
  </Section>
</Paper>