ImDrug: A Benchmark for Deep Imbalanced Learning in AI-aided Drug Discovery

09/16/2022
by Lanqing Li, et al.

The last decade has seen rapid progress in computational methods and dataset curation for AI-aided drug discovery (AIDD). However, real-world pharmaceutical datasets often exhibit highly imbalanced distributions, an issue that the current literature largely overlooks but that may severely compromise the fairness and generalization of machine learning applications. Motivated by this observation, we introduce ImDrug, a comprehensive benchmark with an open-source Python library, which consists of 4 imbalance settings, 11 AI-ready datasets, 54 learning tasks and 16 baseline algorithms tailored for imbalanced learning. It provides an accessible and customizable testbed for problems and solutions spanning a broad spectrum of the drug discovery pipeline, such as molecular modeling, drug-target interaction and retrosynthesis. We conduct extensive empirical studies with novel evaluation metrics to demonstrate that existing algorithms fall short of solving medicinal and pharmaceutical challenges in the data imbalance scenario. We believe that ImDrug opens up avenues for future research and development on real-world challenges at the intersection of AIDD and deep imbalanced learning.
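To make the concern about imbalance concrete, here is a minimal sketch in plain Python. It does not use the ImDrug library or its API, and it does not reproduce the paper's novel evaluation metrics. The toy labels (95 inactive, 5 active compounds) and the helper names `imbalance_ratio`, `accuracy` and `balanced_accuracy` are assumptions made for illustration only. The sketch shows how a classifier that always predicts the majority class can look strong under plain accuracy while failing entirely on the minority class, which the standard balanced accuracy (mean per-class recall) exposes.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy screen: 95 inactive compounds (0) and 5 active ones (1).
y_true = [0] * 95 + [1] * 5
# A trivial model that always predicts the majority class.
y_pred = [0] * 100

print("imbalance ratio:  ", imbalance_ratio(y_true))
print("accuracy:         ", accuracy(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

On this toy data the imbalance ratio is 19, plain accuracy is 0.95, and balanced accuracy is only 0.5, since the model recovers none of the active compounds.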


