Learn molecular representations from large-scale unlabeled molecules for drug discovery

12/21/2020
by Pengyong Li, et al.

Producing expressive molecular representations is a fundamental challenge in AI-driven drug discovery. Graph neural networks (GNNs) have emerged as a powerful technique for modeling molecular data. However, previous supervised approaches usually suffer from the scarcity of labeled data and generalize poorly. Here, we propose a novel Molecular Pre-training Graph-based deep learning framework, named MPG, that learns molecular representations from large-scale unlabeled molecules. In MPG, we propose a powerful MolGNet model and an effective self-supervised strategy for pre-training the model at both the node and graph levels. After pre-training on 11 million unlabeled molecules, we show that MolGNet captures valuable chemical insights and produces interpretable representations. The pre-trained MolGNet can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of drug discovery tasks, including molecular property prediction, drug-drug interaction, and drug-target interaction, involving 13 benchmark datasets. Our work demonstrates that MPG has the potential to become a novel approach in the drug discovery pipeline.
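
The idea of reusing a pre-trained graph encoder and adding a single task-specific output layer can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `encode` is a parameter-free stand-in for a pre-trained encoder (it is not MolGNet), `OutputLayer` is a hypothetical logistic head, and the two toy molecules and their labels are invented. Because the stand-in encoder has no trainable weights, only the head is updated here, which is closer to linear probing than to the full fine-tuning described in the abstract.

```python
import math
from typing import Dict, List, Tuple

# A molecule as (atom features, bonds). Features are a hypothetical
# one-hot over the elements [C, O]; bonds are undirected index pairs.
Graph = Tuple[List[List[float]], List[Tuple[int, int]]]


def encode(graph: Graph, rounds: int = 2) -> List[float]:
    """Stand-in for a pre-trained GNN encoder (NOT MolGNet).

    Each round replaces a node's vector with the mean of itself and its
    neighbors; a mean readout then yields one graph-level vector.
    """
    feats, edges = graph
    neighbors: Dict[int, List[int]] = {i: [] for i in range(len(feats))}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    h = [list(f) for f in feats]
    for _ in range(rounds):
        updated = []
        for i, h_i in enumerate(h):
            group = [h_i] + [h[j] for j in neighbors[i]]
            updated.append([sum(col) / len(group) for col in zip(*group)])
        h = updated
    return [sum(col) / len(h) for col in zip(*h)]


class OutputLayer:
    """The single additional layer: a logistic head on the graph vector."""

    def __init__(self, dim: int) -> None:
        self.w = [0.0] * dim
        self.b = 0.0

    def predict(self, z: List[float]) -> float:
        logit = sum(w * x for w, x in zip(self.w, z)) + self.b
        return 1.0 / (1.0 + math.exp(-logit))

    def step(self, z: List[float], y: float, lr: float = 0.5) -> None:
        # Gradient of binary cross-entropy with respect to the logit.
        err = self.predict(z) - y
        self.w = [w - lr * err * x for w, x in zip(self.w, z)]
        self.b -= lr * err


# Toy data (invented labels): heavy atoms of ethanol (C-C-O) vs propane (C-C-C).
ethanol: Graph = ([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [(0, 1), (1, 2)])
propane: Graph = ([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]], [(0, 1), (1, 2)])
dataset = [(encode(ethanol), 1.0), (encode(propane), 0.0)]

head = OutputLayer(dim=2)
for _ in range(2000):
    for z, y in dataset:
        head.step(z, y)

p_ethanol = head.predict(encode(ethanol))
p_propane = head.predict(encode(propane))
```

In a real setting the encoder would carry learned weights from self-supervised pre-training, and those weights would typically be updated together with the head during fine-tuning.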

