Is Self-Supervised Pretraining Good for Extrapolation in Molecular Property Prediction?

08/16/2023
by Shun Takashige, et al.

The prediction of material properties plays a crucial role in the development and discovery of materials for diverse applications, such as batteries, semiconductors, catalysts, and pharmaceuticals. Recently, there has been growing interest in data-driven approaches that combine machine learning with conventional theoretical calculations. In materials science, the prediction of unobserved values, commonly referred to as extrapolation, is particularly critical for property prediction because it enables researchers to gain insight into materials beyond the limits of available data. However, even with recent advances in powerful machine learning models, accurate extrapolation is still widely recognized as a significant challenge. Self-supervised pretraining, by contrast, is a machine learning technique in which a model is first trained on unlabeled data using relatively simple pretext tasks before being trained on labeled data for target tasks. Because self-supervised pretraining can make effective use of material data without observed property values, it has the potential to improve a model's extrapolation ability. In this paper, we clarify how such self-supervised pretraining can enhance extrapolation performance. We propose an experimental framework for this demonstration and show empirically that, although models were unable to accurately extrapolate absolute property values, self-supervised pretraining enables them to learn the relative tendencies of unobserved property values and thereby improves extrapolation performance.
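The distinction between absolute values and relative tendencies can be made concrete with a small sketch. The code below is not the authors' experimental framework; it is a toy illustration under stated assumptions: the split rule (train on the lowest property values, test on the highest), the function names, and the hard-coded predictions are all hypothetical. It shows how predictions that saturate near the training range can have a large mean absolute error while still ranking the unseen samples correctly, as measured by Spearman rank correlation.

```python
from statistics import mean


def extrapolation_split(values, train_fraction=0.5):
    """Hypothetical split: lowest-valued samples train, highest-valued test."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    cut = int(len(values) * train_fraction)
    return order[:cut], order[cut:]


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def _ranks(xs):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(y_true, y_pred):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rt, rp = _ranks(y_true), _ranks(y_pred)
    mt, mp = mean(rt), mean(rp)
    cov = sum((a - mt) * (b - mp) for a, b in zip(rt, rp))
    var_t = sum((a - mt) ** 2 for a in rt)
    var_p = sum((b - mp) ** 2 for b in rp)
    return cov / (var_t * var_p) ** 0.5


# Toy property values (made up for illustration, not data from the paper).
values = [0.1, 0.5, 0.9, 1.3, 1.7, 2.1, 2.5, 2.9, 3.3, 3.7]
train_idx, test_idx = extrapolation_split(values, train_fraction=0.5)
y_true = [values[i] for i in test_idx]

# Hypothetical model outputs: they stay close to the training maximum (1.7),
# so absolute values are off, but the ordering of the test samples is kept.
y_pred = [2.0, 2.05, 2.1, 2.15, 2.2]

mae = mean_absolute_error(y_true, y_pred)
rho = spearman(y_true, y_pred)
print(f"MAE on extrapolation set: {mae:.2f}")
print(f"Spearman rank correlation: {rho:.2f}")
```

In this toy case the absolute error is large relative to the spacing of the data, yet the rank correlation is perfect, which is the kind of gap between the two measures that the abstract describes.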


