Extraction and validation of substructure profiles for enriching compound libraries

Yeo, Wee Kiang, Nilar, Shahul Hameed and Go, Mei Lin (2012) Extraction and validation of substructure profiles for enriching compound libraries. Journal of Computer-Aided Molecular Design, 26 (10). pp. 1127-1141. ISSN 0920-654X


Compounds known to be potent against a specific protein target may contain a signature profile of common substructures that is highly correlated with their potency. Such substructure profiles may be useful for enriching compound libraries or for prioritizing compounds against a specific protein target. With this objective in mind, a set of compounds with known potency against six selected kinases (two from each of three kinase families) was used to generate binary molecular fingerprints. Each fingerprint key represents a substructure found within a compound, and the frequency with which each key occurs was tabulated. The concept of Correlation Rules was then applied to uncover substructures that are not only well represented among known potent inhibitors but also unrepresented among known inactive compounds, and vice versa. Substructure profiles representative of potent inhibitors against each of the three kinase families were thus extracted. In our validation, these substructure profiles demonstrated significant enrichment for highly potent compounds against their respective kinase targets. The advantages of using Correlation Rules over Association Rules in analyzing such datasets, and the methodology used to mine enriching substructures, are presented.
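
The contrast between Association Rules and Correlation Rules described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a correlation rule is scored with a chi-squared test on a 2x2 contingency table (fingerprint key present or absent versus potent or inactive), and the toy fingerprints, the key indices, and all function names are hypothetical.

```python
from typing import List, Tuple

def contingency(fps: List[List[int]], potent: List[bool], key: int) -> Tuple[int, int, int, int]:
    """Count compounds for one fingerprint key.

    Returns (a, b, c, d):
      a = potent with key,   b = potent without key,
      c = inactive with key, d = inactive without key.
    """
    a = b = c = d = 0
    for fp, is_potent in zip(fps, potent):
        present = fp[key] == 1
        if is_potent and present:
            a += 1
        elif is_potent:
            b += 1
        elif present:
            c += 1
        else:
            d += 1
    return a, b, c, d

def support_confidence(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Association-rule measures for the rule 'key present -> potent'."""
    n = a + b + c + d
    support = a / n
    confidence = a / (a + c) if (a + c) else 0.0
    return support, confidence

def chi_squared(a: int, b: int, c: int, d: int) -> float:
    """Chi-squared statistic of the 2x2 table (1 degree of freedom)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so the key carries no information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

# Hypothetical toy data: 6 potent and 2 inactive compounds, 3 fingerprint keys.
# key 0 occurs only in potent compounds, key 1 occurs in every compound,
# key 2 occurs only in inactive compounds.
fingerprints = [[1, 1, 0]] * 6 + [[0, 1, 1]] * 2
is_potent = [True] * 6 + [False] * 2

results = {}
for key in range(3):
    table = contingency(fingerprints, is_potent, key)
    results[key] = (support_confidence(*table), chi_squared(*table))
    print(key, table, results[key])
```

In this toy set, key 1 yields a rule with support and confidence of 0.75 simply because it occurs in every compound, yet its chi-squared statistic is 0, so it says nothing about potency. Keys 0 and 2 both exceed the 3.84 cutoff (95% level, one degree of freedom), marking a positive and a negative correlation with potency respectively. Counts this small are for illustration only; the chi-squared approximation is not reliable at such sample sizes.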

Item Type: Article
Additional Information: The first author is a PhD student and the research forms part of a PhD thesis project.
Keywords: Correlation rules, substructure profiling, kinase, co-occurrence
