SMT-DTA: Improving Drug-Target Affinity Prediction with Semi-supervised Multi-task Training

by Qizhi Pei, et al.

Drug-Target Affinity (DTA) prediction is an essential task in drug discovery and pharmaceutical research. Accurate DTA predictions can greatly benefit the design of new drugs. Because wet-lab experiments are costly and time-consuming, supervised data for DTA prediction is extremely limited. This seriously hinders the application of deep-learning-based methods, which require large amounts of supervised data. To address this challenge and improve DTA prediction accuracy, we propose a framework with several simple yet effective strategies: (1) a multi-task training strategy, which jointly trains on DTA prediction and masked language modeling (MLM) using the paired drug-target dataset; (2) a semi-supervised training method that strengthens drug and target representation learning by leveraging large-scale unpaired molecules and proteins during training, in contrast to previous pre-training and fine-tuning methods, which use unpaired molecules or proteins only in pre-training; and (3) a cross-attention module to enhance the interaction between the drug and target representations. Extensive experiments are conducted on three real-world benchmark datasets: BindingDB, DAVIS and KIBA. The results show that our framework significantly outperforms existing methods and achieves state-of-the-art performance, e.g., 0.712 RMSE on the BindingDB IC_50 measurement, an improvement of more than 5% over the previous best work. In addition, case studies on specific drug-target binding activities, drug feature visualizations, and real-world applications demonstrate the great potential of our work. The code and data are released at
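To make strategy (1) concrete, here is a minimal Python sketch of how a DTA regression loss and an MLM loss might be combined into one training objective. It is an illustration under stated assumptions, not the authors' implementation. The function names, the use of mean squared error for the affinity term, and the `mlm_weight` coefficient are all hypothetical, because the abstract does not say how the two tasks are weighted.

```python
import math


def mse_loss(pred, target):
    """Mean squared error between predicted and measured affinities."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def masked_cross_entropy(logits, labels, mask):
    """Mean cross-entropy over masked token positions only.

    logits: one list of per-vocabulary scores for each token position
    labels: the true token id at each position
    mask:   True where the token was masked and must be reconstructed
    """
    total, count = 0.0, 0
    for row, label, is_masked in zip(logits, labels, mask):
        if not is_masked:
            continue  # unmasked positions do not contribute to the MLM loss
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[label]
        count += 1
    return total / count if count else 0.0


def multitask_loss(aff_pred, aff_true, logits, labels, mask, mlm_weight=1.0):
    """Affinity regression loss plus a weighted MLM loss.

    mlm_weight is an assumed hyperparameter, not a value from the paper.
    """
    dta = mse_loss(aff_pred, aff_true)
    mlm = masked_cross_entropy(logits, labels, mask)
    return dta + mlm_weight * mlm


# Toy batch: one affinity label and two token positions, the first masked.
loss = multitask_loss(
    aff_pred=[1.0], aff_true=[3.0],
    logits=[[0.0, 0.0], [5.0, 0.0]], labels=[0, 1], mask=[True, False],
    mlm_weight=0.5,
)
```

In a real model both terms would be computed from shared encoder outputs, so that gradients from the MLM task also shape the drug and target representations used for affinity prediction.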




Learn molecular representations from large-scale unlabeled molecules for drug discovery

How to produce expressive molecular representations is a fundamental cha...

Domain-Adversarial Multi-Task Framework for Novel Therapeutic Property Prediction of Compounds

With the rapid development of high-throughput technologies, parallel acq...

MLT-LE: predicting drug-target binding affinity with multi-task residual neural networks

Assessing drug-target affinity is a critical step in the drug discovery ...

Antibody Representation Learning for Drug Discovery

Therapeutic antibody development has become an increasingly popular appr...

Multi-task Bioassay Pre-training for Protein-ligand Binding Affinity Prediction

Protein-ligand binding affinity (PLBA) prediction is the fundamental tas...

Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning

Drug combination therapy is a well-established strategy for disease trea...
