File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/w04-3101_abstr.xml

Size: 1,111 bytes

Last Modified: 2025-10-06 13:44:08

<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3101">
  <Title>A resource for constructing customized test suites for molecular biology entity identification systems</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper describes a data source and methodology for producing customized test suites for molecular biology entity identification systems. The data consists of: (a) a set of gene names and symbols classified by a taxonomy of features that are relevant to the performance of entity identification systems, and (b) a set of sentential environments into which names and symbols are inserted to create test data and the associated gold standard. We illustrate the utility of test sets producible by this methodology by applying it to five entity identification systems and describing the error patterns uncovered by it, and investigate relationships between performance on a customized test suite generated from this data and the performance of a system on two corpora.</Paragraph>
  </Section>
</Paper>