DiAMoNDBack: Diffusion-denoising Autoregressive Model for Non-Deterministic Backmapping of Cα Protein Traces

07/23/2023
by Michael S. Jones, et al.

Coarse-grained molecular models of proteins permit access to length and time scales unattainable by all-atom models, and the simulation of processes that occur on long time scales, such as aggregation and folding. The reduced resolution yields computational acceleration, but an atomistic representation can be vital for a complete understanding of mechanistic details. Backmapping is the process of restoring all-atom resolution to coarse-grained molecular models. In this work, we report DiAMoNDBack (Diffusion-denoising Autoregressive Model for Non-Deterministic Backmapping), an autoregressive denoising diffusion probabilistic model that restores all-atom details to coarse-grained protein representations retaining only Cα coordinates. The autoregressive generation process proceeds from the protein N-terminus to the C-terminus in a residue-by-residue fashion, conditioned on the Cα trace and on previously backmapped backbone and side chain atoms within the local neighborhood. The local and autoregressive nature of our model makes it transferable between proteins. The stochastic nature of the denoising diffusion process means that the model generates a realistic ensemble of backbone and side chain all-atom configurations consistent with the coarse-grained Cα trace. We train DiAMoNDBack on more than 65k structures from the Protein Data Bank (PDB) and validate it in applications to a hold-out PDB test set, intrinsically disordered protein structures from the Protein Ensemble Database (PED), molecular dynamics simulations of fast-folding mini-proteins from D. E. Shaw Research, and coarse-grained simulation data. We achieve state-of-the-art reconstruction performance in terms of correct bond formation, avoidance of side chain clashes, and diversity of the generated side chain configurational states. We make the DiAMoNDBack model publicly available as a free and open source Python package.
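The residue-by-residue generation described in the abstract can be illustrated with a minimal Python sketch. This is not the DiAMoNDBack implementation or its API: the function names (`toy_denoise_step`, `backmap`), the fixed number of atoms per residue, the distance cutoff, and the step count are all hypothetical, and the learned denoising network is replaced by a toy update that simply pulls atoms toward the residue's Cα. The sketch only shows the control flow: walk from the N-terminus to the C-terminus, gather a local context of nearby Cα positions and previously built atoms, and run a stochastic reverse-diffusion loop for each residue.

```python
import numpy as np


def toy_denoise_step(x, t, n_steps, context, rng):
    """Hypothetical stand-in for a learned denoiser (NOT the paper's network).

    Moves the atoms halfway toward the residue's C-alpha position, then adds
    Gaussian noise whose scale shrinks linearly to zero at the final step.
    """
    x = x + 0.5 * (context["ca"] - x)
    if t > 0:
        x = x + 0.1 * (t / n_steps) * rng.standard_normal(x.shape)
    return x


def backmap(ca_trace, atoms_per_residue=4, n_steps=10, cutoff=10.0, seed=0):
    """Autoregressively generate toy atom coordinates for each residue.

    ca_trace: (n_residues, 3) array of C-alpha coordinates, N- to C-terminus.
    Returns an array of shape (n_residues, atoms_per_residue, 3).
    The fixed atoms_per_residue is a simplifying assumption; real residues
    have different numbers of atoms depending on amino acid type.
    """
    ca_trace = np.asarray(ca_trace, dtype=float)
    rng = np.random.default_rng(seed)
    built = []  # atoms of residues that have already been backmapped

    for i, ca in enumerate(ca_trace):
        # Local neighborhood: residues whose C-alpha lies within the cutoff.
        dists = np.linalg.norm(ca_trace - ca, axis=1)
        neighbors = np.flatnonzero(dists < cutoff)
        context = {
            "ca": ca,
            "neighbor_ca": ca_trace[neighbors],
            # Only residues before i exist yet (autoregressive conditioning).
            "prev_atoms": [built[j] for j in neighbors if j < i],
        }

        # Start from noise around the C-alpha and denoise step by step.
        x = ca + rng.standard_normal((atoms_per_residue, 3))
        for t in reversed(range(n_steps)):
            x = toy_denoise_step(x, t, n_steps, context, rng)
        built.append(x)

    return np.stack(built)
```

Because each residue starts from random noise and the intermediate steps inject further noise, calling `backmap` with different seeds on the same Cα trace yields different atom placements, which mirrors the non-deterministic, ensemble-generating behavior the abstract describes. In a real model the context (neighboring Cα positions and previously placed atoms) would be fed to the trained denoiser; the toy step here uses only the residue's own Cα.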

