
A publicly available crystallisation data set and its application in machine learning

Pillong, Max, Marx, Corinne, Piechon, Philippe, Wicker, Jerome GP, Cooper, Richard I and Wagner, Beatrix (2017) A publicly available crystallisation data set and its application in machine learning. CrystEngComm, 19 (27). pp. 3737-3745.

Abstract

We present here the crystallisation outcomes for 319 publicly available compounds in up to 18 different solvents, spread over 5710 individual single-solvent evaporation trials. The recorded data are part of a much larger in-house database and include both positive and negative crystallisation outcomes. Such data can be used for statistical analyses of solvent performance, for machine learning approaches, or for investigating the crystallisation behaviour of structurally similar compound classes. The presented data suggest that crystallisation behaviour in different solvents is not correlated with chemical similarity among clusters of highly similar compounds. Further, our machine learning models can be used to guide the choice of solvent when crystallising a compound. In a retrospective evaluation, these models reduced the workload to a third of that of our initial protocol while maintaining crystallisation success rates >92%.
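
As an illustration of the kind of statistical analysis of solvent performance mentioned in the abstract, the following is a minimal sketch that computes a per-solvent success rate from a list of trial outcomes. The record layout (compound identifier, solvent, boolean outcome), the function name solvent_success_rates, and the toy records are all assumptions made for illustration; they are not the format or the contents of the published data set, and this is not the authors' method.

```python
from collections import defaultdict

# Hypothetical toy records: (compound_id, solvent, crystallised).
# These values are made up for illustration and are NOT from the data set.
trials = [
    ("cpd1", "ethanol", True),
    ("cpd1", "acetone", False),
    ("cpd2", "ethanol", False),
    ("cpd2", "acetone", True),
    ("cpd3", "ethanol", True),
    ("cpd3", "acetone", False),
]


def solvent_success_rates(records):
    """Return {solvent: fraction of trials with a positive outcome}."""
    counts = defaultdict(lambda: [0, 0])  # solvent -> [successes, trials]
    for _compound, solvent, crystallised in records:
        counts[solvent][0] += int(crystallised)
        counts[solvent][1] += 1
    return {s: succ / n for s, (succ, n) in counts.items()}


rates = solvent_success_rates(trials)

# Rank solvents from most to least successful in the toy data.
for solvent, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{solvent}: {rate:.2f}")
```

Because both positive and negative outcomes are recorded, the denominator counts every trial for a solvent, not only the successful ones.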

Item Type: Article
Date Deposited: 25 Jul 2017 00:45
Last Modified: 25 Jul 2017 00:45
URI: https://oak.novartis.com/id/eprint/32776
