DOCKSTRING: easy molecular docking yields better benchmarks for ligand design

10/29/2021
by Miguel Garcia-Ortegon et al.

The field of machine learning for drug discovery is witnessing an explosion of novel methods. These methods are often benchmarked on simple physicochemical properties, such as solubility or general druglikeness, which can be readily computed. However, these properties are poor representatives of objective functions in drug design, mainly because they do not depend on the candidate's interaction with the target. By contrast, molecular docking is a widely successful method for estimating binding affinities in drug discovery. Docking simulations, however, require a significant amount of domain knowledge to set up correctly, which hampers adoption. To address this, we present DOCKSTRING, a bundle for meaningful and robust comparison of ML models consisting of three components: (1) an open-source Python package for straightforward computation of docking scores; (2) an extensive dataset of docking scores and poses of more than 260K ligands for 58 medically-relevant targets; and (3) a set of pharmaceutically-relevant benchmark tasks including regression, virtual screening, and de novo design. The Python package implements a robust ligand and target preparation protocol that allows non-experts to obtain meaningful docking scores. Our dataset is the first to include docking poses, as well as the first of its size that is a full matrix (every ligand is docked against every target), thus facilitating experiments in multiobjective optimization and transfer learning. Overall, our results indicate that docking scores are a more appropriate evaluation objective than simple physicochemical properties, yielding more realistic benchmark tasks and molecular candidates.
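
To make the "full matrix" property concrete, the following is a minimal sketch of why it helps multiobjective experiments: when every ligand has a score for every target, a per-ligand objective that combines two targets can be computed without missing entries. The ligand names, target names, and score values are made-up placeholders, not values from the DOCKSTRING dataset, and the helper functions are hypothetical rather than part of the DOCKSTRING package. The sketch also assumes the common docking convention that a more negative score indicates stronger predicted binding.

```python
from typing import Dict, List

# Placeholder data for illustration only; these are NOT DOCKSTRING values.
# Outer key: ligand identifier. Inner key: target identifier. Value: docking score.
ScoreTable = Dict[str, Dict[str, float]]

scores: ScoreTable = {
    "ligand_a": {"target_1": -9.1, "target_2": -6.0},
    "ligand_b": {"target_1": -8.0, "target_2": -8.5},
    "ligand_c": {"target_1": -7.2, "target_2": -5.1},
}


def is_full_matrix(table: ScoreTable) -> bool:
    """Return True if every ligand has a score for exactly the same set of targets."""
    target_sets = [set(row) for row in table.values()]
    return bool(target_sets) and all(t == target_sets[0] for t in target_sets)


def selectivity_gap(table: ScoreTable, on_target: str, off_target: str) -> Dict[str, float]:
    """Compute off-target score minus on-target score for each ligand.

    Assumes more negative scores mean stronger predicted binding, so a larger
    positive gap suggests the ligand prefers the on-target.
    """
    return {lig: row[off_target] - row[on_target] for lig, row in table.items()}


def rank_by_selectivity(table: ScoreTable, on_target: str, off_target: str) -> List[str]:
    """Rank ligands from most to least selective for the on-target."""
    gaps = selectivity_gap(table, on_target, off_target)
    return sorted(gaps, key=lambda lig: gaps[lig], reverse=True)


if __name__ == "__main__":
    print(is_full_matrix(scores))
    print(rank_by_selectivity(scores, "target_1", "target_2"))
```

With a sparse table, the dictionary lookups in `selectivity_gap` would fail for any ligand missing one of the two targets, which is the practical obstacle a full matrix removes.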


