Size doesn't matter: predicting physico- or biochemical properties based on dozens of molecules

07/22/2021
by Kirill Karpov, et al.

Machine learning has become common practice in chemistry. Despite the success of modern machine learning methods, however, a lack of data often limits their use. Transfer learning can help address this problem. The approach assumes that a model trained on a sufficiently large dataset captures general features of the chemical structures it was trained on, and that reusing these features on a data-poor task will substantially improve the quality of the new model. In this paper, we develop this approach for small organic molecules, implementing transfer learning with graph convolutional neural networks. We show a significant improvement in model performance for target properties with scarce data. We also examine how dataset composition affects model quality and the applicability domain of the resulting models.
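The feature-reuse idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, a single graph-convolution layer with mean pooling, randomly generated graphs standing in for molecules, and synthetic target properties. The function names (`random_molecule`, `embed`, `pretrain`, `fit_head`) are hypothetical. The convolution weights are trained on a large source dataset, then frozen, and only a new linear head is fitted on a small target dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_adjacency(adj):
    # Add self-loops, then apply symmetric degree normalization.
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def random_molecule(n_atoms, n_feats):
    # Hypothetical stand-in for a molecular graph: random bonds and atom features.
    upper = np.triu(rng.random((n_atoms, n_atoms)) < 0.3, k=1).astype(float)
    adj = upper + upper.T
    return normalize_adjacency(adj), rng.normal(size=(n_atoms, n_feats))

def embed(graph, w):
    # One graph convolution, ReLU, then mean pooling over atoms.
    a_hat, x = graph
    return np.maximum(a_hat @ x @ w, 0.0).mean(axis=0)

def pretrain(graphs, targets, n_hidden, lr=0.01, epochs=50):
    # Train convolution weights w and a source-task head v with plain SGD
    # on squared error; only w is returned for reuse.
    n_feats = graphs[0][1].shape[1]
    w = rng.normal(scale=0.1, size=(n_feats, n_hidden))
    v = rng.normal(scale=0.1, size=n_hidden)
    for _ in range(epochs):
        for (a_hat, x), y in zip(graphs, targets):
            ax = a_hat @ x
            z = ax @ w
            g = np.maximum(z, 0.0).mean(axis=0)
            err = g @ v - y
            grad_v = err * g
            # Backpropagate through mean pooling and ReLU.
            grad_z = (z > 0) * (err * v)[None, :] / z.shape[0]
            grad_w = ax.T @ grad_z
            v -= lr * grad_v
            w -= lr * grad_w
    return w

def fit_head(graphs, targets, w_frozen, ridge=1e-3):
    # Reuse the frozen w; fit only a new ridge-regression head.
    g = np.stack([embed(gr, w_frozen) for gr in graphs])
    return np.linalg.solve(g.T @ g + ridge * np.eye(g.shape[1]), g.T @ targets)

# Large synthetic source task, small synthetic target task (both assumptions).
source = [random_molecule(8, 4) for _ in range(200)]
source_y = np.array([x[:, 0].mean() for _, x in source])
w = pretrain(source, source_y, n_hidden=16)

small = [random_molecule(8, 4) for _ in range(20)]
small_y = np.array([x[:, 1].mean() for _, x in small])
head = fit_head(small, small_y, w)
preds = np.array([embed(gr, w) @ head for gr in small])
```

Freezing the pretrained weights and fitting only a head is the simplest form of transfer. Fine-tuning some or all layers on the small dataset is a common alternative, and this sketch takes no position on which variant the paper uses.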


