Transition1x – a Dataset for Building Generalizable Reactive Machine Learning Potentials

07/25/2022
by Mathias Schreiner, et al.

Machine Learning (ML) models have, in contrast to their usefulness in molecular dynamics studies, had limited success as surrogate potentials for reaction barrier search. This is due to the scarcity of training data in the relevant transition-state regions of chemical space. Currently available datasets for training ML models on small molecular systems almost exclusively contain configurations at or near equilibrium. In this work, we present Transition1x, a dataset containing 9.6 million Density Functional Theory (DFT) calculations of forces and energies of molecular configurations on and around reaction pathways at the ωB97x/6-31G(d) level of theory. The data was generated by running Nudged Elastic Band (NEB) calculations with DFT on 10k reactions while saving intermediate calculations. We train state-of-the-art equivariant graph message-passing neural network models on Transition1x and cross-validate on the popular ANI1x and QM9 datasets. We show that ML models cannot learn features in transition-state regions solely by training on hitherto popular benchmark datasets. Transition1x is a new, challenging benchmark that will provide an important step towards developing next-generation ML force fields that also work far from equilibrium configurations and in reactive systems.
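The data-generation idea in the abstract, running NEB and keeping every intermediate energy and force evaluation instead of only the converged path, can be illustrated on a toy problem. The following is a minimal sketch and not the authors' workflow. It assumes an analytic 2D surface in place of DFT, a simple central-difference tangent, and plain steepest-descent updates. The names `energy_and_force` and `run_neb` are hypothetical.

```python
import math


def energy_and_force(x, y):
    """Toy 2D surface standing in for a DFT call (an assumption).

    Minima at (-1, 0) and (1, 0); saddle point at (0, 0.5) with energy 1.
    """
    u = y - 0.5 * (1.0 - x * x)  # offset from the curved valley floor
    e = (x * x - 1.0) ** 2 + 5.0 * u * u
    fx = -(4.0 * x * (x * x - 1.0) + 10.0 * u * x)
    fy = -10.0 * u
    return e, (fx, fy)


def run_neb(start, end, n_images=7, n_steps=1000, step=0.01, k=1.0):
    """Simplified NEB that records every intermediate evaluation."""
    # Initial guess: linear interpolation, endpoints included and held fixed.
    path = [
        (
            start[0] + (end[0] - start[0]) * i / (n_images + 1),
            start[1] + (end[1] - start[1]) * i / (n_images + 1),
        )
        for i in range(n_images + 2)
    ]
    records = []  # one entry per energy/force evaluation
    for it in range(n_steps):
        new_path = list(path)
        for i in range(1, n_images + 1):
            (xp, yp), (x, y), (xn, yn) = path[i - 1], path[i], path[i + 1]
            e, (fx, fy) = energy_and_force(x, y)
            records.append(
                {"step": it, "image": i, "pos": (x, y),
                 "energy": e, "force": (fx, fy)}
            )
            # Unit tangent estimated from the two neighbouring images.
            tx, ty = xn - xp, yn - yp
            norm = math.hypot(tx, ty)
            tx, ty = tx / norm, ty / norm
            # Keep the true force perpendicular to the path only.
            f_par = fx * tx + fy * ty
            # Spring force acts along the tangent to keep images spread out.
            spring = k * (math.hypot(xn - x, yn - y) - math.hypot(x - xp, y - yp))
            gx = fx - f_par * tx + spring * tx
            gy = fy - f_par * ty + spring * ty
            new_path[i] = (x + step * gx, y + step * gy)
        path = new_path
    return path, records


path, records = run_neb((-1.0, 0.0), (1.0, 0.0))
start_energy = energy_and_force(*path[0])[0]
barrier = max(energy_and_force(x, y)[0] for x, y in path) - start_energy
print(f"estimated barrier: {barrier:.3f}")
print(f"stored evaluations: {len(records)}")
```

In this sketch the `records` list plays the role of the saved intermediate calculations: it contains configurations from the unrelaxed initial path as well as those near the final path, which is the kind of off-equilibrium coverage the abstract describes.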

research
07/20/2022

NeuralNEB – Neural Networks can find Reaction Paths Fast

Quantum mechanical methods like Density Functional Theory (DFT) are used...
research
08/22/2023

xxMD: Benchmarking Neural Force Fields Using Extended Dynamics beyond Equilibrium

Neural force fields (NFFs) have gained prominence in computational chemi...
research
04/20/2023

A 2D Graph-Based Generative Approach For Exploring Transition States Using Diffusion Model

The exploration of transition state (TS) geometries is crucial for eluci...
research
03/02/2022

Machine learning models predict calculation outcomes with the transferability necessary for computational catalysis

Virtual high throughput screening (VHTS) and machine learning (ML) have ...
research
10/09/2022

Hyperactive Learning (HAL) for Data-Driven Interatomic Potentials

Data-driven interatomic potentials have emerged as a powerful class of s...
research
08/16/2017

Ultra-Fast Reactive Transport Simulations When Chemical Reactions Meet Machine Learning: Chemical Equilibrium

During reactive transport modeling, the computational cost associated wi...
research
03/05/2022

Low-cost prediction of molecular and transition state partition functions via machine learning

We have generated an open-source dataset of over 30000 organic chemistry...
