The Development of Profile-QSAR
Liu, Xin (2018) The Development of Profile-QSAR. Undergraduate thesis, Fudan University, Shanghai, China.
Abstract
Profile-QSAR (Profile Quantitative Structure-Activity Relationship) is a machine-learning-based virtual screening method first published by the Novartis Institutes for BioMedical Research in 2011. It trains machine learning models using experimental pIC50 values (the negative logarithm of the half-maximal inhibitory concentration) as labels and compound fingerprints as features. However, the method still has some problems and leaves considerable room for development. This paper, as a development and extension of the profile-QSAR method, mainly completes the migration of profile-QSAR from its multi-platform implementation to Python. We have also tried to optimize profile-QSAR at both the algorithm and the method level, including tuning the parameters of the random forest model, selecting and optimizing the clustering algorithm, and filtering the profiles. After porting to Python, profile-QSAR is easier and faster to run and can readily be submitted to large distributed clusters for large-scale computing. From an algorithmic point of view, the random forests are optimized as far as possible, and the clustering algorithm that best simulates a real project is selected. From the perspective of method development, filtering the profiles is only one part of the work, but it has already significantly improved the overall prediction results.
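To make the label definition in the abstract concrete, the following is a minimal sketch of how pIC50 labels could be derived from measured IC50 values and paired with fingerprint features. The function name `pic50_from_ic50_nm`, the toy bit vectors, and the IC50 values are all hypothetical illustrations, not data or code from the thesis; the only fact relied on is the standard definition pIC50 = -log10(IC50 in mol/L).

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> mol/L
    return -math.log10(ic50_molar)


# Hypothetical toy records: (fingerprint bit vector, measured IC50 in nM).
# Real fingerprints are far longer; these are for illustration only.
records = [
    ([1, 0, 1, 1, 0, 0, 1, 0], 10.0),
    ([0, 1, 1, 0, 0, 1, 0, 0], 1000.0),
    ([1, 1, 0, 0, 1, 0, 0, 1], 50000.0),
]

# Features (fingerprints) and labels (pIC50) in the shape a regression
# model such as a random forest would expect.
features = [bits for bits, _ in records]
labels = [pic50_from_ic50_nm(ic50) for _, ic50 in records]

for bits, label in zip(features, labels):
    print(bits, round(label, 2))
```

A more potent compound has a lower IC50 and therefore a higher pIC50, which is why the label is convenient as a regression target: the scale is logarithmic and larger values mean stronger inhibition.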
Item Type: | Article |
---|---|
Keywords: | Profile-QSAR, Virtual Screening, Machine learning, Python, Optimization |
Date Deposited: | 17 Oct 2018 00:45 |
Last Modified: | 17 Oct 2018 00:45 |
URI: | https://oak.novartis.com/id/eprint/36588 |