The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysis

06/17/2022
by   Richard Tran, et al.

The computational catalysis and machine learning communities have made considerable progress in developing machine learning models for catalyst discovery and design. Yet a general machine learning potential that spans the chemical space of catalysis is still out of reach. A significant hurdle is obtaining training data across a wide range of materials. One important class of materials for which data is lacking is oxides, which prevents models from studying the Oxygen Evolution Reaction and oxide electrocatalysis more generally. To address this, we developed the Open Catalyst 2022 (OC22) dataset, consisting of 62,521 Density Functional Theory (DFT) relaxations (9,884,504 single point calculations) across a range of oxide materials, coverages, and adsorbates (*H, *O, *N, *C, *OOH, *OH, *OH2, *O2, *CO). We define generalized tasks to predict the total system energy that are applicable across catalysis, establish baseline performance for several graph neural networks (SchNet, DimeNet++, ForceNet, SpinConv, PaiNN, GemNet-dT, GemNet-OC), and provide pre-defined dataset splits to establish clear benchmarks for future efforts. For all tasks, we study whether combining datasets leads to better results, even if they contain different materials or adsorbates. Specifically, we jointly train models on the Open Catalyst 2020 (OC20) Dataset and OC22, or fine-tune pretrained OC20 models on OC22. In the most general task, GemNet-OC sees a 32% improvement […] in force predictions via joint training. Surprisingly, joint training on both the OC20 and much smaller OC22 datasets also improves total energy predictions on OC20 by 19%. The dataset and baseline models are open sourced, and a public leaderboard will follow to encourage continued community developments on the total energy tasks and data.
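As a quick sanity check on the dataset size quoted above, the two counts can be combined into an average trajectory length. The sketch below is illustrative only: it assumes every single point calculation belongs to exactly one relaxation, which the abstract does not state, and the function name is hypothetical.

```python
# Counts quoted in the abstract above.
N_RELAXATIONS = 62_521
N_SINGLE_POINTS = 9_884_504


def mean_frames_per_relaxation(n_single_points: int, n_relaxations: int) -> float:
    """Average number of single point DFT calculations per relaxation.

    Assumption (not stated in the abstract): each single point calculation
    is one frame of exactly one relaxation trajectory.
    """
    if n_relaxations <= 0:
        raise ValueError("n_relaxations must be positive")
    return n_single_points / n_relaxations


avg_frames = mean_frames_per_relaxation(N_SINGLE_POINTS, N_RELAXATIONS)
print(f"about {avg_frames:.0f} single point calculations per relaxation on average")
```

Under that assumption, a typical relaxation contributes on the order of a hundred and fifty frames, which is why the single point count is so much larger than the relaxation count.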


