Guided Generative Protein Design using Regularized Transformers

01/24/2022
by Egbert Castro, et al.

The development of powerful natural language models has increased the ability to learn meaningful representations of protein sequences. In addition, advances in high-throughput mutagenesis, directed evolution, and next-generation sequencing have allowed for the accumulation of large amounts of labeled fitness data. Leveraging these two trends, we introduce Regularized Latent Space Optimization (ReLSO), a deep transformer-based autoencoder that is trained to jointly generate sequences and predict fitness. Using ReLSO, we explicitly model the underlying sequence-function landscape of large labeled datasets and optimize within latent space using gradient-based methods. Through regularized prediction heads, ReLSO introduces a powerful protein sequence encoder and a novel approach for efficient fitness landscape traversal.
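
The idea of optimizing within latent space using gradient-based methods can be illustrated with a minimal sketch. This is not the ReLSO implementation: the trained transformer encoder, decoder, and fitness prediction head are replaced here by a hypothetical toy fitness function (`toy_fitness`) with a known analytic gradient, and all names, the step size, and the step count are assumptions chosen for illustration. The sketch only shows the core loop, in which a latent point is moved repeatedly in the direction that increases predicted fitness.

```python
from typing import Callable, List

Vector = List[float]


def toy_fitness(z: Vector, peak: Vector) -> float:
    """Hypothetical stand-in for a trained fitness head.

    A concave bowl whose single maximum (value 0.0) sits at `peak`.
    """
    return -sum((zi - pi) ** 2 for zi, pi in zip(z, peak))


def toy_fitness_grad(z: Vector, peak: Vector) -> Vector:
    """Analytic gradient of toy_fitness with respect to z."""
    return [-2.0 * (zi - pi) for zi, pi in zip(z, peak)]


def latent_gradient_ascent(
    z0: Vector,
    grad_fn: Callable[[Vector], Vector],
    step_size: float = 0.1,
    n_steps: int = 100,
) -> List[Vector]:
    """Move a latent point uphill on predicted fitness.

    Returns the full trajectory, including the starting point.
    """
    z = list(z0)
    trajectory = [list(z)]
    for _ in range(n_steps):
        g = grad_fn(z)
        # Ascent step: add the gradient to increase predicted fitness.
        z = [zi + step_size * gi for zi, gi in zip(z, g)]
        trajectory.append(list(z))
    return trajectory


# Assumed toy setup: a 2-D latent space with the fitness peak at (1, -2).
peak = [1.0, -2.0]
start = [0.0, 0.0]
path = latent_gradient_ascent(start, lambda z: toy_fitness_grad(z, peak))
z_final = path[-1]
```

In a full pipeline, the gradient would typically come from automatic differentiation through the learned fitness head, and the final latent point would be passed through the decoder to obtain a candidate sequence. Both steps are omitted here.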


