<?xml version="1.0" standalone="yes"?> <Paper uid="W04-3104"> <Title>A Study of Text Categorization for Model Organism Databases</Title> <Section position="6" start_page="3" end_page="3" type="concl"> <SectionTitle> 5 Conclusion </SectionTitle> <Paragraph position="0"> In this paper, we designed a study using existing reference information available at four well-known model organism databases and investigated the problem of identifying relevant articles for these organisms using MEDLINE. We compared the results obtained using keyword searching with those obtained using supervised machine learning techniques. We found that keyword searching retrieved about 80% of the citations. With supervised machine learning techniques, the overall F-measure of the best classifier was around 94.1%. In future work, we plan to apply the supervised machine learning technique to the whole MEDLINE citation to retrieve relevant articles. We also plan to apply text clustering or text categorization techniques to the routing problem inside a specific model organism database (such as routing articles to curators in a specific area).</Paragraph> </Section> </Paper>