Quality issues with public domain chemogenomics data

Kalliokoski, Tuomo, Kramer, Christian and Vulpetti, Anna (2013) Quality issues with public domain chemogenomics data. Molecular Informatics, 32 (11-12). pp. 898-905. ISSN 1868-1743


The key concept in chemogenomics is the similarity principle, which states that similar ligands should bind similar targets. Chemogenomic analysis requires large amounts of data as well as powerful algorithms and computers. Data used for chemogenomic analysis can either be compiled from open sources or be produced in-house, as is often done in the pharmaceutical industry. The chemogenomic modeller often has to resort to mixing activity values from different laboratories and even assay types to facilitate chemogenomic analysis. The amount of chemogenomics data available in the public domain has dramatically increased in recent years, allowing fully traceable analysis on a continuously increasing scale. However, some warning flags about data quality have been raised, and because the primary data determine the accuracy of chemogenomic analysis, the quality of the data is one of the key questions in chemogenomics. This minireview discusses some of the most common issues with public domain biological data related to chemogenomic analysis. Errors in the data can originate from problems with the experiments themselves and their interpretation, or from more mundane issues such as data extraction and annotation. These issues are not unique to any one database but are shared by all the public domain databases, and can plague commercial and in-house bioactivity databases as well. © 2013 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

Item Type: Article
Additional Information: Review paper
Keywords: Chemogenomics; Data accuracy; Databases; Experimental uncertainty
Date Deposited: 22 Nov 2017 00:45
Last Modified: 25 Jan 2019 00:46