Leveraging genetic algorithms to maximise the predictive capabilities of the SOAP descriptor. Barnard, T., Steng, S., Darby, J., Bartók, A. P., Broo, A., & Sosso, G. C. Molecular Systems Design & Engineering, November, 2022. Publisher: The Royal Society of Chemistry
The Smooth Overlap of Atomic Positions (SOAP) descriptor represents an increasingly common approach to encode local atomic environments in a form readily digestible to machine learning algorithms. The SOAP descriptor is obtained by using a local expansion of a Gaussian-smeared atomic density with orthonormal functions based on spherical harmonics and radial basis functions. To construct this representation, one has to choose a number of parameters. Whilst knowledge of the dataset of interest can and should guide this choice, more often than not some optimisation method is required to pinpoint the most effective combinations of SOAP parameters in terms of both accuracy and computational cost. In this work, we present SOAP_GAS, a simple, freely available computational tool that leverages genetic algorithms to optimise the parameters relative to any given SOAP descriptor. To explore the capabilities of the algorithm, we have applied SOAP_GAS to a prototypical molecular dataset of relevance for drug design. In this process, we have realised that a diverse portfolio of different combinations of SOAP parameters can result in equally substantial improvements in terms of the accuracy of the SOAP descriptor. This is especially true when dealing with the concurrent optimisation of the SOAP parameters for multiple SOAP descriptors, which we found often leads to further accuracy gains. Overall, we show that SOAP_GAS offers an often superior alternative to e.g. randomised grid search approaches to enhance the predictive capabilities of SOAP descriptors in a largely automatised fashion.
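The abstract describes using a genetic algorithm to search over SOAP parameters. A minimal sketch of that general idea is given below. It is not the SOAP_GAS implementation or its API: the parameter names (`cutoff`, `n_max`, `l_max`, `sigma`), their candidate values, and the `toy_score` function are all assumptions made for illustration. In a real workflow the score would come from building the descriptor, fitting a model, and measuring its validation accuracy.

```python
import random

# Hypothetical search space: names and values are illustrative assumptions,
# not the defaults of SOAP_GAS or of any SOAP library.
SEARCH_SPACE = {
    "cutoff": [3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "n_max": [2, 4, 6, 8, 10, 12],
    "l_max": [2, 4, 6, 8, 10, 12],
    "sigma": [0.1, 0.3, 0.5, 0.7, 0.9],
}
KEYS = list(SEARCH_SPACE)


def toy_score(params):
    """Stand-in for a validation score (higher is better).

    Assumption: a real fitness function would compute the descriptor with
    `params`, train a regression model, and return its accuracy.
    """
    error = (
        (params["cutoff"] - 5.0) ** 2
        + (params["sigma"] - 0.5) ** 2
        + 0.01 * (params["n_max"] - 8) ** 2
        + 0.01 * (params["l_max"] - 6) ** 2
    )
    return -error


def random_individual(rng):
    # One individual is one complete choice of parameters.
    return {k: rng.choice(SEARCH_SPACE[k]) for k in KEYS}


def crossover(a, b, rng):
    # Uniform crossover: each parameter is inherited from either parent.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in KEYS}


def mutate(ind, rng, rate=0.2):
    # With probability `rate`, resample a parameter from its allowed values.
    return {
        k: (rng.choice(SEARCH_SPACE[k]) if rng.random() < rate else v)
        for k, v in ind.items()
    }


def tournament(pop, rng, size=3):
    # Pick the best of a small random subset of the population.
    return max(rng.sample(pop, size), key=toy_score)


def run_ga(pop_size=20, generations=30, n_elite=2, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best score at the start of each generation
    for _ in range(generations):
        pop.sort(key=toy_score, reverse=True)
        history.append(toy_score(pop[0]))
        children = pop[:n_elite]  # elitism: carry the best over unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    return max(pop, key=toy_score), history


if __name__ == "__main__":
    best, history = run_ga()
    print("best parameters:", best)
    print("best score:", toy_score(best))
```

Because the best individuals are copied into each new generation, the best score recorded in `history` never decreases. The sketch scores accuracy only; the trade-off with computational cost mentioned in the abstract could be included by penalising large `n_max` and `l_max` in the fitness function.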
@article{barnard_leveraging_2022,
	title = {Leveraging genetic algorithms to maximise the predictive capabilities of the {SOAP} descriptor},
	issn = {2058-9689},
	url = {https://pubs.rsc.org/en/content/articlelanding/2022/me/d2me00149g},
	doi = {10.1039/D2ME00149G},
	abstract = {The Smooth Overlap of Atomic Positions (SOAP) descriptor represents an increasingly common approach to encode local atomic environments in a form readily digestible to machine learning algorithms. The SOAP descriptor is obtained by using a local expansion of a Gaussian-smeared atomic density with orthonormal functions based on spherical harmonics and radial basis functions. To construct this representation, one has to choose a number of parameters. Whilst knowledge of the dataset of interest can and should guide this choice, more often than not some optimisation method is required to pinpoint the most effective combinations of SOAP parameters in terms of both accuracy and computational cost. In this work, we present SOAP\_GAS, a simple, freely available computational tool that leverages genetic algorithms to optimise the parameters relative to any given SOAP descriptor. To explore the capabilities of the algorithm, we have applied SOAP\_GAS to a prototypical molecular dataset of relevance for drug design. In this process, we have realised that a diverse portfolio of different combinations of SOAP parameters can result in equally substantial improvements in terms of the accuracy of the SOAP descriptor. This is especially true when dealing with the concurrent optimisation of the SOAP parameters for multiple SOAP descriptors, which we found often leads to further accuracy gains. Overall, we show that SOAP\_GAS offers an often superior alternative to e.g. randomised grid search approaches to enhance the predictive capabilities of SOAP descriptors in a largely automatised fashion.},
	language = {en},
	urldate = {2022-11-09},
	journal = {Molecular Systems Design \& Engineering},
	author = {Barnard, Trent and Steng, Steven and Darby, James and Bartók, Albert P. and Broo, Anders and Sosso, Gabriele Cesare},
	month = nov,
	year = {2022},
	note = {Publisher: The Royal Society of Chemistry},
}
