<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1307"> <Title>Using Biomedical Literature Mining to Consolidate the Set of Known Human Protein-Protein Interactions</Title> <Section position="8" start_page="51" end_page="51" type="concl"> <SectionTitle> 8 Conclusion </SectionTitle> <Paragraph position="0"> Through a combination of automatic text mining and consolidation of existing databases, we have constructed a large database of known human protein interactions containing 31,609 interactions among 7,748 proteins. By mining 753,459 human-related abstracts from Medline with a combination of a CRF-based protein tagger, co-citation analysis, and automatic text classification, we extracted a set of 6,580 interactions between 3,737 proteins. Using information in existing knowledge bases, we found that this automatically extracted data has an accuracy comparable to that of manually developed data sets. More details on our interaction database have been published in the biological literature (Ramani et al., 2005), and the database is available on the web at http://bioinformatics.icmb.utexas.edu/idserve. We are currently exploring improvements to this database by more accurately identifying assertions of interactions in the text using an SVM that exploits a relational string kernel.</Paragraph> </Section> </Paper>