SPICE, A Dataset of Drug-like Molecules and Peptides for Training Machine Learning Potentials

09/21/2022
by Peter Eastman, et al.

Machine learning potentials are an important tool for molecular simulation, but their development is held back by a shortage of high-quality training datasets. We describe the SPICE dataset, a new quantum chemistry dataset for training potentials relevant to simulating drug-like small molecules interacting with proteins. It contains over 1.1 million conformations for a diverse set of small molecules, dimers, dipeptides, and solvated amino acids. It includes 15 elements, charged and uncharged molecules, and a wide range of covalent and non-covalent interactions. It provides both forces and energies calculated at the ωB97M-D3(BJ)/def2-TZVPPD level of theory, along with other useful quantities such as multipole moments and bond orders. We train a set of machine learning potentials on the dataset and demonstrate that they can achieve chemical accuracy across a broad region of chemical space. The dataset can serve as a valuable resource for creating transferable, ready-to-use potential functions for molecular simulations.

research
12/22/2020

TorchMD: A deep learning framework for molecular simulations

Molecular dynamics simulations provide a mechanistic description of mole...
research
01/28/2018

Less is more: sampling chemical space with active learning

The development of accurate and transferable machine learning (ML) poten...
research
12/15/2017

WACSF - Weighted Atom-Centered Symmetry Functions as Descriptors in Machine Learning Potentials

We introduce weighted atom-centered symmetry functions (wACSFs) as descr...
research
08/13/2021

Efficient force field and energy emulation through partition of permutationally equivalent atoms

Kernel ridge regression (KRR) that satisfies energy conservation is a po...
research
01/08/2021

SE(3)-Equivariant Graph Neural Networks for Data-Efficient and Accurate Interatomic Potentials

This work presents Neural Equivariant Interatomic Potentials (NequIP), a...
research
01/27/2016

Predicting Drug Interactions and Mutagenicity with Ensemble Classifiers on Subgraphs of Molecules

In this study, we intend to solve a mutual information problem in intera...
