CASTELO: Clustered Atom Subtypes aidEd Lead Optimization – a combined machine learning and molecular modeling method

by Leili Zhang, et al.

Drug discovery is a multi-stage process comprising two costly major steps: pre-clinical research and clinical trials. Among its stages, lead optimization easily consumes more than half of the pre-clinical budget. We propose a combined machine learning and molecular modeling approach that automates the lead optimization workflow in silico. The initial data are collected with physics-based molecular dynamics (MD) simulations. Contact matrices are calculated as the preliminary features extracted from the simulations. To take advantage of the temporal information in the simulations, we enhance the contact matrix data with a temporal dynamism representation, which is then modeled with an unsupervised convolutional variational autoencoder (CVAE). Finally, a conventional clustering method and a CVAE-based clustering method are compared using metrics that rank the submolecular structures and propose potential candidates for lead optimization. Without requiring an extensive structure-activity relationship database, our method provides new hints for drug modification hotspots that can be used to improve drug efficacy. Our workflow can potentially reduce the lead optimization turnaround time from months or years to days compared with the conventional labor-intensive process, and thus could become a valuable tool for medical researchers.
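To make the contact-matrix feature concrete, here is a minimal sketch in plain Python of how a binary ligand-protein contact matrix could be computed for each simulation frame. It is an illustration only, not the authors' implementation: the function names, the 4.5 Å cutoff, and the per-atom contact frequency summary are all assumptions, and the abstract does not specify how the temporal dynamism representation is built.

```python
import math

def contact_matrix(ligand_xyz, protein_xyz, cutoff=4.5):
    """Binary contact matrix for one frame.

    Rows are ligand atoms, columns are protein atoms. An entry is 1 when
    the two atoms are closer than `cutoff` (assumed here to be in angstroms;
    the value 4.5 is an illustrative choice, not taken from the paper).
    """
    return [
        [1 if math.dist(a, b) < cutoff else 0 for b in protein_xyz]
        for a in ligand_xyz
    ]

def contact_frequency(ligand_traj, protein_traj, cutoff=4.5):
    """Fraction of frames in which each ligand atom touches any protein atom.

    This is a hypothetical summary statistic used only to show how
    per-frame matrices can be aggregated over time.
    """
    n_frames = len(ligand_traj)
    n_atoms = len(ligand_traj[0])
    counts = [0] * n_atoms
    for lig, prot in zip(ligand_traj, protein_traj):
        matrix = contact_matrix(lig, prot, cutoff)
        for i, row in enumerate(matrix):
            if any(row):
                counts[i] += 1
    return [c / n_frames for c in counts]

# Toy two-frame "trajectory": two ligand atoms, three protein atoms.
protein = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
frame1 = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
frame2 = [(0.0, 0.0, 0.0), (18.0, 0.0, 0.0)]

print(contact_matrix(frame1, protein))
print(contact_frequency([frame1, frame2], [protein, protein]))
```

In practice the coordinates would come from an MD trajectory reader, and the stack of per-frame matrices, rather than a single frequency vector, would be the input to a model such as a CVAE.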




