Fractional Denoising for 3D Molecular Pre-training

07/20/2023
by   Shikun Feng, et al.

Coordinate denoising is a promising 3D molecular pre-training method that has achieved remarkable performance on various downstream drug discovery tasks. Theoretically, its objective is equivalent to learning the force field, which has been shown to help downstream tasks. Nevertheless, coordinate denoising faces two challenges in learning an effective force field: low-coverage samples and an isotropic force field. The underlying reason is that the molecular distributions assumed by existing denoising methods fail to capture the anisotropic characteristics of molecules. To tackle these challenges, we propose a novel hybrid noise strategy that applies noise to both dihedral angles and coordinates. However, denoising such hybrid noise in the traditional way is no longer equivalent to learning the force field. Through theoretical derivation, we find that the problem is caused by the dependence of the covariance on the input conformation. To this end, we propose to decouple the two types of noise and design a novel fractional denoising method (Frad), which denoises only the latter, coordinate part. In this way, Frad enjoys both merits: sampling more low-energy structures and retaining the force field equivalence. Extensive experiments show the effectiveness of Frad in molecular representation, with a new state of the art on 9 out of 12 tasks of QM9 and on 7 out of 8 targets of MD17.
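The hybrid noise and fractional denoising target described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`rotate_about_axis`, `hybrid_noise`, `fractional_denoising_loss`), the single hand-picked rotatable bond, the toy four-atom geometry, and the noise scales are all assumptions made for illustration. The idea shown is that a dihedral angle is perturbed first, Gaussian coordinate noise is added second, and only the coordinate noise serves as the regression target.

```python
import math
import random


def rotate_about_axis(point, origin, axis, angle):
    """Rotate one 3D point about the line through `origin` along `axis`
    using Rodrigues' rotation formula."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    p = [point[d] - origin[d] for d in range(3)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cross = [
        k[1] * p[2] - k[2] * p[1],
        k[2] * p[0] - k[0] * p[2],
        k[0] * p[1] - k[1] * p[0],
    ]
    dot = sum(k[d] * p[d] for d in range(3))
    return [
        origin[d] + p[d] * cos_a + cross[d] * sin_a + k[d] * dot * (1.0 - cos_a)
        for d in range(3)
    ]


def hybrid_noise(coords, bond, moving, sigma_dihedral, sigma_coord, rng):
    """Apply dihedral noise, then coordinate noise (hypothetical helper).

    bond:   (i, j) atom indices of one rotatable bond (assumed to be given).
    moving: indices of the atoms on the j side that rotate with the bond.
    Returns the noisy coordinates and the coordinate noise only.
    """
    i, j = bond
    axis = [coords[j][d] - coords[i][d] for d in range(3)]
    # Step 1: perturb the dihedral angle by rotating one side about the bond.
    angle = rng.gauss(0.0, sigma_dihedral)
    intermediate = [list(atom) for atom in coords]
    for m in moving:
        intermediate[m] = rotate_about_axis(coords[m], coords[j], axis, angle)
    # Step 2: add isotropic Gaussian noise to every coordinate.
    coord_noise = [[rng.gauss(0.0, sigma_coord) for _ in range(3)] for _ in coords]
    noisy = [
        [intermediate[a][d] + coord_noise[a][d] for d in range(3)]
        for a in range(len(coords))
    ]
    # Only the coordinate noise is returned as the target; the dihedral
    # perturbation is deliberately not part of what the model must predict.
    return noisy, coord_noise


def fractional_denoising_loss(predicted_noise, coord_noise):
    """Mean squared error against the coordinate noise alone."""
    n = len(coord_noise)
    total = sum(
        (predicted_noise[a][d] - coord_noise[a][d]) ** 2
        for a in range(n)
        for d in range(3)
    )
    return total / (3 * n)


# Toy four-atom chain; the values are illustrative, not taken from any dataset.
rng = random.Random(0)
coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.0], [3.5, 1.4, 0.5]]
noisy, target = hybrid_noise(coords, bond=(1, 2), moving=[3],
                             sigma_dihedral=0.5, sigma_coord=0.04, rng=rng)
```

In this sketch a model would take `noisy` as input and be trained to predict `target`. Because the rotation preserves bond lengths and bond angles, the dihedral step moves the molecule along a torsional degree of freedom rather than stretching bonds, while the small coordinate noise supplies the part that is actually denoised. A real pipeline would need to detect rotatable bonds and the atoms on each side from the molecular graph, which this example sidesteps by taking them as inputs.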


research
05/31/2022

Pre-training via Denoising for Molecular Property Prediction

Many important problems involving molecular property prediction from 3D ...
research
06/13/2023

Automated 3D Pre-Training for Molecular Property Prediction

Molecular property prediction is an important problem in drug discovery ...
research
11/12/2019

SMILES Transformer: Pre-trained Molecular Fingerprint for Low Data Drug Discovery

In drug-discovery-related tasks such as virtual screening, machine learn...
research
11/26/2020

Molecular representation learning with language models and domain-relevant auxiliary tasks

We apply a Transformer architecture, specifically BERT, to learn flexibl...
research
12/24/2019

TF3P: Three-dimensional Force Fields Fingerprint Learned by Deep Capsular Network

Molecular fingerprints are the workhorse in ligand-based drug discovery....
research
05/28/2020

Pattern Denoising in Molecular Associative Memory using Pairwise Markov Random Field Models

We propose an in silico molecular associative memory model for pattern l...
