HD-Bind: Encoding of Molecular Structure with Low Precision, Hyperdimensional Binary Representations

03/27/2023
by Derek Jones, et al.

Publicly available collections of drug-like molecules have recently grown to comprise tens of billions of possibilities, owing to advances in chemical synthesis. Traditional methods for identifying “hit” molecules from a large collection of potential drug-like candidates have relied on biophysical theory to approximate the Gibbs free energy of the binding interaction between a drug and its protein target. A major drawback of these approaches is that they require exceptional computing capabilities to evaluate even relatively small collections of molecules. Hyperdimensional Computing (HDC) is a recently proposed learning paradigm that leverages low-precision binary vector arithmetic to build efficient representations of data without the gradient-based optimization required by many conventional machine learning and deep learning approaches. This algorithmic simplicity allows for hardware acceleration, which has previously been demonstrated for a range of application areas. We consider existing HDC approaches for molecular property classification and introduce two novel encoding algorithms that leverage the extended connectivity fingerprint (ECFP) algorithm. We show that HDC-based inference methods are as much as 90 times more efficient than more complex representative machine learning methods and achieve an acceleration of nearly 9 orders of magnitude compared to inference with molecular docking. We demonstrate multiple approaches for encoding molecular data for HDC and examine their relative performance on a range of challenging molecular property prediction and drug-protein binding classification tasks. Our work thus motivates further investigation into molecular representation learning to develop ultra-efficient pre-screening tools.
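To make the idea of gradient-free learning with binary vectors concrete, the following is a minimal sketch of a generic HDC classifier. It is not the encoding proposed in the paper. It assumes a toy setup: each molecule is given as a set of "on" fingerprint positions (a hypothetical stand-in for ECFP bits), each position is assigned a fixed random binary hypervector, and vectors are combined by element-wise majority vote. The dimensionality `D`, the fingerprint length `N_BITS`, the toy data, and all function names are illustrative assumptions.

```python
import random

D = 2048     # hypervector dimensionality (assumed for illustration)
N_BITS = 64  # toy fingerprint length (assumed stand-in for ECFP bits)

rng = random.Random(0)
# Item memory: one fixed random binary hypervector per fingerprint position.
ITEM_MEMORY = [[rng.randint(0, 1) for _ in range(D)] for _ in range(N_BITS)]


def bundle(vectors):
    """Element-wise majority vote over binary vectors (ties resolve to 0)."""
    n = len(vectors)
    return [1 if 2 * sum(column) > n else 0 for column in zip(*vectors)]


def encode(on_bits):
    """Encode a set of 'on' fingerprint positions as one binary hypervector."""
    return bundle([ITEM_MEMORY[i] for i in sorted(on_bits)])


def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def train(examples):
    """Build one prototype per class by bundling its encoded examples."""
    by_label = {}
    for on_bits, label in examples:
        by_label.setdefault(label, []).append(encode(on_bits))
    return {label: bundle(vectors) for label, vectors in by_label.items()}


def predict(prototypes, on_bits):
    """Return the label whose prototype is nearest in Hamming distance."""
    query = encode(on_bits)
    return min(prototypes, key=lambda label: hamming(prototypes[label], query))


# Toy data (fabricated for illustration only, not from any real dataset).
toy_examples = [
    ({0, 1, 2, 3, 4}, "active"),
    ({1, 2, 3, 5, 6}, "active"),
    ({0, 2, 4, 6, 7}, "active"),
    ({40, 41, 42, 43, 44}, "inactive"),
    ({41, 42, 43, 45, 46}, "inactive"),
    ({40, 42, 44, 46, 47}, "inactive"),
]
prototypes = train(toy_examples)
print(predict(prototypes, {1, 2, 3, 4, 6}))
```

Training here is a single pass of counting and thresholding, and inference is a Hamming-distance comparison against one prototype per class. Both reduce to bitwise operations and population counts, which is the property that makes this family of methods attractive for hardware acceleration.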


