Activity Cliff Prediction: Dataset and Benchmark

02/15/2023
by Ziqiao Zhang, et al.

Activity cliffs (ACs), generally defined as pairs of structurally similar molecules that are active against the same bio-target but differ significantly in binding potency, are of great importance to drug discovery. To date, the AC prediction problem, i.e., predicting whether a pair of molecules exhibits the AC relationship, has not been fully explored. In this paper, we first introduce ACNet, a large-scale dataset for AC prediction. ACNet curates over 400K Matched Molecular Pairs (MMPs) against 190 targets, including over 20K MMP-cliffs and 380K non-AC MMPs, and provides five subsets for model development and evaluation. We then propose a baseline framework to benchmark the predictive performance of molecular representations encoded by deep neural networks for AC prediction, and evaluate 16 models in experiments. Our experimental results show that deep learning models can achieve good performance when trained on tasks with an adequate amount of data, while the imbalanced, low-data and out-of-distribution characteristics of the ACNet dataset remain challenging for deep neural networks. In addition, the traditional ECFP method shows a natural advantage on MMP-cliff prediction and outperforms the deep learning models on most of the data subsets. To the best of our knowledge, our work constructs the first large-scale dataset for AC prediction, which may stimulate the study of AC prediction models and prompt further breakthroughs in AI-aided drug discovery. The code and dataset are available at https://drugai.github.io/ACNet/.
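To make the pair-level task concrete, the following is a minimal sketch of how an MMP might be labeled as a cliff or non-cliff from the potencies of its two molecules. It is an illustration only, not the ACNet labeling pipeline: the function name `label_pair`, the use of pKi-style log potencies, and both thresholds (a gap of at least 2 log units for a cliff, at most 1 log unit for a non-cliff) are assumptions not stated in the abstract.

```python
from typing import Optional


def label_pair(potency_a: float, potency_b: float,
               cliff_gap: float = 2.0,
               non_cliff_gap: float = 1.0) -> Optional[int]:
    """Label a matched molecular pair from two log-scale potencies.

    Returns 1 for a cliff, 0 for a non-cliff, and None when the gap
    falls between the two thresholds. Both thresholds are assumed
    values chosen for illustration, not taken from the abstract.
    """
    gap = abs(potency_a - potency_b)
    if gap >= cliff_gap:
        return 1          # large potency difference: activity cliff
    if gap <= non_cliff_gap:
        return 0          # small potency difference: non-cliff pair
    return None           # ambiguous gap: left unlabeled in this sketch


# Hypothetical pairs of log potencies for molecules in the same MMP.
pairs = [(8.5, 6.0), (7.2, 7.0), (7.5, 6.0)]
labels = [label_pair(a, b) for a, b in pairs]
print(labels)
```

Leaving the intermediate band unlabeled is one design choice for keeping the two classes clearly separated; with counts like those in the abstract (over 20K cliffs versus 380K non-cliffs), any such scheme yields a heavily imbalanced binary classification task.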


