
In silico generation of novel, drug-like chemical matter using the LSTM neural network

Ertl, Peter and Lewis, Richard and Martin, Eric and Polyakov, Valery (2017) In silico generation of novel, drug-like chemical matter using the LSTM neural network. arXiv / talk at CCG User Group Meeting.


Background: The exploration of novel chemical spaces is one of the most important tasks of cheminformatics when supporting the drug discovery process. Properly designed and trained deep neural networks can provide a viable alternative to brute-force de novo approaches or various other machine-learning techniques for generating novel drug-like molecules. In this article we present a method to generate molecules using a long short-term memory (LSTM) neural network and provide an analysis of the results, including a virtual screening test.
Results: A computational procedure to generate novel molecules based on an LSTM deep neural network has been developed. Using this protocol, one million drug-like molecules were generated in 2 hours. The molecules are novel, diverse (they contain numerous novel chemotypes), and have good physicochemical properties and good synthetic accessibility, even though these qualities were not specific constraints. Although novel, their structural features and functional groups remain closely within the drug-like space defined by the bioactive molecules from ChEMBL. Virtual screening using the profile QSAR approach confirms that the potential of these novel molecules to show bioactivity is comparable to that of the ChEMBL set from which they were derived.
Conclusions: A procedure to generate novel drug-like molecules using an LSTM neural network is described. The generated molecules have good properties, are synthetically accessible, exhibit good diversity and are structurally different from the ChEMBL structures that were used to train the network. This study shows that the LSTM deep neural network provides a good option for generating large numbers (hundreds of millions, even billions) of realistic molecules that may be used for virtual screening or for identification of novel, interesting areas of chemical space. The molecule generator written in Python used in this study is provided for download as an additional file.
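
To illustrate the general idea of generating molecules with a sequence model, the following is a minimal sketch of a character-by-character sampling loop. It is not the generator distributed with the article. It assumes molecules are written as text strings (for example SMILES) and that a trained network returns a score for each possible next character. Here the trained LSTM is replaced by a hypothetical stand-in function, `toy_next_char_logits`, so the output strings are not valid molecules. The vocabulary, the temperature parameter and all function names are assumptions made for illustration.

```python
import math
import random

# Hypothetical vocabulary: a few SMILES-like characters plus an end marker.
END = "\n"
VOCAB = ["C", "c", "N", "O", "(", ")", "1", "=", END]


def toy_next_char_logits(prefix):
    """Stand-in for a trained LSTM: one score per vocabulary entry.

    A real model would derive these scores from its hidden state. Here they
    depend only on the prefix length, so the result is NOT chemically valid.
    """
    logits = [1.0] * len(VOCAB)
    # Make the end marker more likely as the string grows.
    logits[VOCAB.index(END)] = 0.2 * len(prefix) - 2.0
    return logits


def softmax(logits, temperature=1.0):
    """Convert scores to probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_string(next_logits, rng, temperature=1.0, max_len=100):
    """Sample one string, one character at a time, until END or max_len."""
    chars = []
    while len(chars) < max_len:
        probs = softmax(next_logits("".join(chars)), temperature)
        ch = rng.choices(VOCAB, weights=probs, k=1)[0]
        if ch == END:
            break
        chars.append(ch)
    return "".join(chars)


rng = random.Random(0)  # fixed seed so repeated runs give the same strings
samples = [sample_string(toy_next_char_logits, rng) for _ in range(5)]
for s in samples:
    print(repr(s))
```

In a real pipeline the stand-in function would be replaced by a forward pass of the trained network, and each sampled string would then be checked for chemical validity and for novelty against the training set with a cheminformatics toolkit.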

Item Type: Article
Date Deposited: 24 Mar 2018 00:45
Last Modified: 24 Mar 2018 00:45

