DEL-Dock: Molecular Docking-Enabled Modeling of DNA-Encoded Libraries

11/30/2022
by Kirill Shmilovich, et al.

DNA-Encoded Library (DEL) technology has enabled significant advances in hit identification by allowing efficient testing of combinatorially generated molecular libraries. DEL screens measure protein binding affinity through sequencing reads of molecules tagged with unique DNA barcodes that survive a series of selection experiments. Computational models have been deployed to learn the latent binding affinities that are correlated with the sequenced count data; however, this correlation is often obfuscated by various sources of noise introduced in the complicated data-generation process. To denoise DEL count data and screen for molecules with good binding affinity, computational models need modeling assumptions that capture the true signals underlying the data. Recent advances in DEL models have focused on probabilistic formulations of count data, but existing approaches have thus far been limited to 2-D molecule-level representations. We introduce a new paradigm, DEL-Dock, that combines ligand-based descriptors with 3-D spatial information from docked protein-ligand complexes. The 3-D spatial information allows our model to learn over the actual binding modality rather than using only structure-based information of the ligand. We show that our model effectively denoises DEL count data and predicts molecule enrichment scores that correlate better with experimental binding affinity measurements than those of prior works. Moreover, by learning over a collection of docked poses, we demonstrate that our model, trained only on DEL data, implicitly learns to perform good docking pose selection without requiring external supervision from expensive-to-source protein crystal structures.
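
To make the notion of an enrichment score from count data concrete, here is a minimal sketch of a naive baseline: a log-ratio of depth-normalized read counts between a target selection and a no-protein control. This is not the DEL-Dock model and is not taken from the paper; the function name `naive_enrichment`, the pseudocount, and the example counts are all assumptions made for illustration.

```python
import math


def naive_enrichment(target_counts, control_counts, pseudocount=1.0):
    """Return one log-enrichment score per molecule.

    Hypothetical baseline (an assumption, not the DEL-Dock method):
    each molecule's read count is converted to a smoothed frequency
    within its experiment, and the score is log(target / control).
    """
    if len(target_counts) != len(control_counts):
        raise ValueError("count lists must have the same length")

    n = len(target_counts)
    # Smoothed sequencing depth of each experiment.
    total_target = sum(target_counts) + pseudocount * n
    total_control = sum(control_counts) + pseudocount * n

    scores = []
    for t, c in zip(target_counts, control_counts):
        freq_target = (t + pseudocount) / total_target
        freq_control = (c + pseudocount) / total_control
        scores.append(math.log(freq_target / freq_control))
    return scores


# Made-up counts for three molecules: the first is over-represented
# in the target selection relative to the control.
scores = naive_enrichment([30, 5, 5], [10, 10, 10])
```

A positive score means the molecule is over-represented in the target selection relative to the control. A simple ratio like this is sensitive to the sources of noise the abstract describes, which is the motivation for probabilistic count models.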

research
05/16/2022

Partial Product Aware Machine Learning on DNA-Encoded Libraries

DNA encoded libraries (DELs) are used for rapid large-scale screening of...
research
08/27/2021

Machine learning on DNA-encoded library count data using an uncertainty-aware probabilistic loss function

DNA-encoded library (DEL) screening and quantitative structure-activity ...
research
09/16/2020

PANDA: Predicting the change in proteins binding affinity upon mutations using sequence information

Accurately determining a change in protein binding affinity upon mutatio...
research
09/05/2018

Latent Molecular Optimization for Targeted Therapeutic Design

We devise an approach for targeted molecular design, a problem of intere...
research
01/25/2023

Consensus Algorithm For Calculation Of Protein Binding Affinity Using Multiple Models

The major histocompatibility complex (MHC) molecules, which bind peptide...
research
05/21/2018

A Spatially Correlated Auto-regressive Model for Count Data

The statistical modeling of multivariate count data observed on a space-...
research
06/26/2017

Hierarchy and assortativity as new tools for affinity investigation: the case of the TBA aptamer-ligand complex

Aptamers are single stranded DNA, RNA or peptide sequences having the ab...
