Band gap prediction for large organic crystal structures with machine learning

10/30/2018
by Bart Olsthoorn, et al.

Machine learning models can capture the structure-property relationship from a dataset of computationally demanding ab initio calculations. In fact, machine learning models have reached chemical accuracy on the small organic molecules contained in the popular QM9 dataset. At the same time, the domain of large crystal structures remains rather unexplored. Over the past two years, the Organic Materials Database (OMDB) has hosted a growing number of electronic properties of previously synthesized organic crystal structures. The complexity of the organic crystals contained within the OMDB, which have on average 85 atoms per unit cell, makes this database a challenging platform for machine learning applications. In this paper, we focus on predicting the band gap, which is one of the basic properties of a crystalline material. To this end, we release a consistent dataset of 12500 crystal structures and their corresponding DFT band gaps, freely available for download at https://omdb.diracmaterials.org/dataset. We run two recent machine learning models, kernel ridge regression with the Smooth Overlap of Atomic Positions (SOAP) kernel and the deep learning model SchNet, on this new dataset and find that an ensemble of these two models reaches a mean absolute error (MAE) of 0.361 eV, which corresponds to a percentage error of 12% for an average band gap of 3.03 eV. The models also provide chemical insights into the data. For example, by visualizing the SOAP kernel similarity between the crystals, different clusters of materials can be identified, such as organic metals or semiconductors. Finally, the trained models are employed to predict the band gap for 260092 materials contained within the Crystallography Open Database (COD), and they are made available online so that predictions can be obtained for any crystal structure uploaded by a user.
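To make the reported metrics concrete, the following minimal sketch shows how an MAE and the corresponding percentage error might be computed for an ensemble of two regressors. It is an illustration under stated assumptions, not the authors' implementation: the unweighted average used to combine the two models, the function names, and the toy band gap values are all hypothetical. Only the figures 0.361 eV and 3.03 eV come from the abstract.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between reference and predicted values (same units)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def ensemble_average(pred_a, pred_b, weight_a=0.5):
    """Combine two models' predictions by a weighted average.

    Assumption: a simple linear blend; the abstract does not say how the
    ensemble of the two models is actually formed.
    """
    pred_a = np.asarray(pred_a, dtype=float)
    pred_b = np.asarray(pred_b, dtype=float)
    return weight_a * pred_a + (1.0 - weight_a) * pred_b


def percentage_error(mae, reference_value):
    """Express an MAE as a percentage of a reference value."""
    return 100.0 * mae / reference_value


# Toy band gaps in eV (hypothetical values, not taken from the OMDB dataset).
gaps_true = np.array([2.0, 3.0, 4.0])
pred_model_a = np.array([2.4, 2.8, 4.6])  # stand-in for one model's output
pred_model_b = np.array([1.8, 3.4, 3.8])  # stand-in for the other model's output

pred_ensemble = ensemble_average(pred_model_a, pred_model_b)

mae_a = mean_absolute_error(gaps_true, pred_model_a)
mae_b = mean_absolute_error(gaps_true, pred_model_b)
mae_ensemble = mean_absolute_error(gaps_true, pred_ensemble)

# Percentage error implied by the figures quoted in the abstract.
reported_pct = percentage_error(0.361, 3.03)

print(f"MAE model A:  {mae_a:.3f} eV")
print(f"MAE model B:  {mae_b:.3f} eV")
print(f"MAE ensemble: {mae_ensemble:.3f} eV")
print(f"Reported MAE as a percentage: {reported_pct:.1f}%")
```

In this toy example the errors of the two stand-in models partly cancel, so the averaged prediction has a lower MAE than either model alone. That is the usual motivation for ensembling, although the gain on real data depends on how correlated the models' errors are.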


