CREMP: Conformer-Rotamer Ensembles of Macrocyclic Peptides for Machine Learning

05/14/2023
by Colin A. Grambow, et al.

Computational and machine learning approaches to model the conformational landscape of macrocyclic peptides have the potential to enable rational design and optimization. However, accurate, fast, and scalable methods for modeling macrocycle geometries remain elusive. Recent deep learning approaches have significantly accelerated protein structure prediction and the generation of small-molecule conformational ensembles, yet similar progress has not been made for macrocyclic peptides due to their unique properties. Here, we introduce CREMP, a resource generated for the rapid development and evaluation of machine learning models for macrocyclic peptides. CREMP contains 36,198 unique macrocyclic peptides and their high-quality structural ensembles generated using the Conformer-Rotamer Ensemble Sampling Tool (CREST). Altogether, this new dataset contains nearly 31.3 million unique macrocycle geometries, each annotated with energies derived from semi-empirical extended tight-binding (xTB) DFT calculations. We anticipate that this dataset will enable the development of machine learning models that can improve peptide design and optimization for novel therapeutics.


