Learning Geometrically Disentangled Representations of Protein Folding Simulations

05/20/2022
by N. Joseph Tatro, et al.

Massive molecular simulations of drug-target proteins have been used as a tool to understand disease mechanisms and develop therapeutics. This work focuses on training a generative neural network on a structural ensemble of a drug-target protein, e.g., the SARS-CoV-2 Spike protein, obtained from computationally expensive molecular simulations. Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules, as well as efficient generation of protein conformations that can serve as a complement to a molecular simulation engine. Specifically, we present a geometric autoencoder framework to learn separate latent space encodings of the intrinsic and extrinsic geometries of the protein structure. For this purpose, the proposed Protein Geometric AutoEncoder (ProGAE) model is trained on the protein contact map and the orientation of the backbone bonds of the protein. Using ProGAE latent embeddings, we reconstruct and generate the conformational ensemble of a protein at or near the experimental resolution, while gaining better interpretability and controllability in terms of protein structure generation from the learned latent space. Additionally, ProGAE models are transferable to a different state of the same protein or to a new protein of different size, where only the dense layer decoding from the latent representation needs to be retrained. Results show that our geometric learning-based method is both accurate and efficient at generating complex structural variations, charting the path toward scalable and improved approaches for analyzing and enhancing high-cost simulations of drug-target proteins.
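
To make the intrinsic/extrinsic split concrete, the following is a minimal sketch of the two input features the abstract names: a contact map and backbone bond orientations. It is not the ProGAE implementation. The function names, the toy four-residue backbone, the one-point-per-residue representation, and the 8 angstrom contact cutoff are all assumptions chosen for illustration. The sketch shows that rotating the whole structure leaves the contact map unchanged (an intrinsic quantity) while the bond orientation vectors change (an extrinsic quantity).

```python
import math

def contact_map(coords, cutoff=8.0):
    """Binary contact map: 1 where two residues lie within `cutoff` (assumed 8 A)."""
    n = len(coords)
    return [[1 if math.dist(coords[i], coords[j]) <= cutoff else 0
             for j in range(n)] for i in range(n)]

def bond_orientations(coords):
    """Unit vectors pointing along each consecutive backbone bond."""
    units = []
    for a, b in zip(coords, coords[1:]):
        length = math.dist(a, b)
        units.append(tuple((q - p) / length for p, q in zip(a, b)))
    return units

def rotate_z(coords, angle):
    """Rigidly rotate every point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

# Hypothetical toy backbone: four residues spaced 3.8 A apart along the x-axis.
backbone = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
rotated = rotate_z(backbone, math.pi / 2)

# Intrinsic feature: identical before and after the rigid rotation.
same_contacts = contact_map(backbone) == contact_map(rotated)

# Extrinsic feature: the first bond points along x before and along y after.
first_bond_before = bond_orientations(backbone)[0]
first_bond_after = bond_orientations(rotated)[0]
```

Here `same_contacts` is true because pairwise distances are preserved by rotation, whereas `first_bond_before` and `first_bond_after` differ. This is one way to see why encoding the two kinds of feature separately could yield latent spaces with distinct roles, as the abstract describes.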

research
05/06/2023

A Latent Diffusion Model for Protein Structure Generation

Proteins are complex biomolecules that perform a variety of crucial func...
research
01/25/2021

Deep learning based mixed-dimensional GMM for characterizing variability in CryoEM

The function of most protein molecules involves structural flexibility a...
research
08/18/2022

Learned Indexing in Proteins: Extended Work on Substituting Complex Distance Calculations with Embedding and Clustering Techniques

Despite the constant evolution of similarity searching research, it cont...
research
11/28/2018

Variational Selection of Features for Molecular Kinetics

The modeling of atomistic biomolecular simulations using kinetic models ...
research
04/05/2022

In-Pocket 3D Graphs Enhance Ligand-Target Compatibility in Generative Small-Molecule Creation

Proteins in complex with small molecule ligands represent the core of st...
research
10/10/2019

Learning protein conformational space by enforcing physics with convolutions and latent interpolations

Determining the different conformational states of a protein and the tra...
research
01/03/2023

Protein-Ligand Complex Generator Drug Screening via Tiered Tensor Transform

Accurate determination of a small molecule candidate (ligand) binding po...
