Exploring QSAR Models for Activity-Cliff Prediction

01/31/2023
by Markus Dablander, et al.

Pairs of similar compounds that differ only by a small structural modification but exhibit a large difference in their binding affinity for a given target are known as activity cliffs (ACs). It has been hypothesised that quantitative structure-activity relationship (QSAR) models struggle to predict ACs and that ACs thus form a major source of prediction error. However, a study exploring the AC-prediction power of modern QSAR methods and its relationship to general QSAR-prediction performance is lacking. We systematically construct nine distinct QSAR models by combining three molecular representation methods (extended-connectivity fingerprints, physicochemical-descriptor vectors and graph isomorphism networks) with three regression techniques (random forests, k-nearest neighbours and multilayer perceptrons). We then use each resulting model to classify pairs of similar compounds as ACs or non-ACs and to predict the activities of individual molecules in three case studies: dopamine receptor D2, factor Xa, and SARS-CoV-2 main protease. We observe low AC-sensitivity amongst the tested models when the activities of both compounds are unknown, but a substantial increase in AC-sensitivity when the actual activity of one of the compounds is given. Graph isomorphism features are found to be competitive with or superior to classical molecular representations for AC-classification and can thus be employed as baseline AC-prediction models or simple compound-optimisation tools. For general QSAR-prediction, however, extended-connectivity fingerprints still consistently deliver the best performance. Our results provide strong support for the hypothesis that QSAR methods do indeed frequently fail to predict ACs. We propose twin-network training for deep learning models as a potential future pathway to increase AC-sensitivity and thus overall QSAR performance.
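
The following is a minimal sketch of how a QSAR regression model might be repurposed to classify a compound pair as an AC or non-AC, in the two settings the abstract contrasts: both activities unknown, and one activity given. It is illustrative only and not the authors' implementation. The assumptions are: fingerprints are toy sets of "on" bit indices standing in for extended-connectivity fingerprints; the regressor is a hand-written k-nearest-neighbours model using Tanimoto similarity; a pair is labelled an AC when the absolute activity difference exceeds a hypothetical threshold of 2.0 (the abstract does not state the actual criterion); and all names (`tanimoto`, `knn_predict`, `classify_pair`) and activity values are invented for the example.

```python
from typing import Callable, FrozenSet, Optional, Sequence

# Toy fingerprint: the set of "on" bit indices (an assumed stand-in for an ECFP).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit sets."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_predict(query: Fingerprint,
                train_fps: Sequence[Fingerprint],
                train_acts: Sequence[float],
                k: int = 2) -> float:
    """Predict activity as the mean activity of the k most similar training compounds."""
    ranked = sorted(range(len(train_fps)),
                    key=lambda i: tanimoto(query, train_fps[i]),
                    reverse=True)
    nearest = ranked[:k]
    return sum(train_acts[i] for i in nearest) / len(nearest)


def classify_pair(predict: Callable[[Fingerprint], float],
                  fp_a: Fingerprint,
                  fp_b: Fingerprint,
                  known_act_a: Optional[float] = None,
                  threshold: float = 2.0) -> bool:
    """Return True if the pair is classified as an activity cliff.

    If known_act_a is None, both activities are predicted; otherwise the
    measured activity of compound A is used and only B is predicted.
    The threshold value is a hypothetical choice for this sketch.
    """
    act_a = predict(fp_a) if known_act_a is None else known_act_a
    act_b = predict(fp_b)
    return abs(act_a - act_b) > threshold


# Invented training data: two low-activity and two high-activity compounds.
TRAIN_FPS = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5}),
             frozenset({6, 7, 8, 9}), frozenset({6, 7, 8, 10})]
TRAIN_ACTS = [5.0, 5.2, 8.0, 8.2]


def predict(fp: Fingerprint) -> float:
    return knn_predict(fp, TRAIN_FPS, TRAIN_ACTS, k=2)


# A structurally similar pair; suppose compound A truly has activity 8.5,
# so the pair is a real cliff that the similarity-based model cannot see.
FP_A = frozenset({1, 2, 3, 6})
FP_B = frozenset({1, 2, 3, 7})

both_unknown = classify_pair(predict, FP_A, FP_B)                 # False
one_known = classify_pair(predict, FP_A, FP_B, known_act_a=8.5)   # True
```

In this toy case the model assigns both compounds the same predicted activity because they share the same nearest neighbours, so the cliff is missed when both activities are unknown. Supplying the measured activity of one compound exposes the gap, which mirrors the qualitative pattern described above without reproducing any of the reported results.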
