Machine learning and proteochemometric models for Cereblon glue activity predictions
Prael III, Francis J., Cox, Jiayi, Noe, Sturm, Blank, Jutta, Shen, Lingling, Rodriguez Perez, Raquel, Kutchukian, Peter, Forrester, William and Michaud, Gregory (2024) Machine learning and proteochemometric models for Cereblon glue activity predictions. Scientific Reports.
Abstract
Targeted protein degradation (TPD) is a rapidly developing drug discovery methodology with unique efficacy and target scope stemming from its degradation-based activity. Molecular glue degraders are a promising arm of TPD, as evidenced by the FDA-approved therapeutics within this class, the increasing number of degraders in clinical development, and their predisposition to drug-likeness. Cereblon (CRBN) glue degraders mediate target degradation by generating a neomorphic interface between CRBN and a protein of interest. While promising, the complicated nature of this CRBN-glue-target ternary complex makes the rational design of molecular glue degraders challenging. For other drug modalities, predictive modeling has been established to help leverage existing activity data and generate quantitative structure-activity relationships (QSAR). However, the applicability of machine learning-based QSAR strategies for glues remains unclear. Herein, machine learning methodologies have been benchmarked for CRBN glue activity predictions with promising performance. Generated models include single-task and multi-task classifiers which leverage more than a hundred internal screening campaigns across thousands of CRBN glues to predict glue-mediated recruitment of targets to CRBN. Our results show that the activity of CRBN glue degraders can be modeled well by both classical single-task and multi-task approaches, with 89% of models producing an area under the receiver operating characteristic curve (ROC AUC) > 0.8 and 70% of models producing a Matthews correlation coefficient (MCC) > 0.2 for these primary screening data. Importantly, our findings also indicated that combining compound information with simple protein descriptors in so-called proteochemometric (PCM) models improves performance, with >80% of the models exhibiting higher ROC AUC and MCC scores than their single-task counterparts.
Taken together, our investigations show that PCM modeling is a successful approach for molecular glue degraders. The proposed machine learning approaches can aid compound prioritization based on recruitment efficacy and target selectivity, and thus have the potential to facilitate the design and discovery of CRBN molecular glues with therapeutic potential.
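To make the two reported metrics and the proteochemometric idea concrete, a minimal sketch follows. It is not the authors' implementation: the function names (`pcm_features`, `roc_auc`, `mcc`), the toy labels and scores, and the descriptor vectors are all assumptions chosen for illustration. It shows how a compound descriptor and a protein descriptor might be joined into one model input, and how ROC AUC and MCC are computed for a binary recruitment readout.

```python
from math import sqrt


def pcm_features(compound_descriptor, protein_descriptor):
    """Join a compound vector and a protein vector into one input row.

    Hypothetical helper: a proteochemometric model sees both parts, so one
    model can cover many targets instead of one model per target.
    """
    return list(compound_descriptor) + list(protein_descriptor)


def roc_auc(y_true, scores):
    """ROC AUC as the share of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mcc(y_true, y_pred):
    """Matthews correlation coefficient from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (assumed): 1 = target recruited to CRBN, 0 = not recruited.
y_true = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.6, 0.7, 0.2, 0.1, 0.8]          # model output per compound
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

auc_value = roc_auc(y_true, scores)
mcc_value = mcc(y_true, y_pred)

# Assumed 3-bit compound fingerprint and 2-value protein descriptor.
row = pcm_features([1, 0, 1], [0.2, 0.5])
```

On this toy data the sketch gives a ROC AUC of 8/9 and an MCC of about 0.71. ROC AUC depends only on how the scores rank actives against inactives, whereas MCC depends on the chosen cutoff, which is one reason the two metrics can cross their respective thresholds at different rates.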
Item Type: | Article |
---|---|
Keywords: | Machine learning, glues, targeted protein degradation, cereblon, proteochemometric models, chemogenomics |
Date Deposited: | 04 Aug 2024 00:46 |
Last Modified: | 04 Aug 2024 00:46 |
URI: | https://oak.novartis.com/id/eprint/52584 |