Benchmarking deep generative models for diverse antibody sequence design

11/12/2021
by Igor Melnyk, et al.

Computational protein design, i.e., inferring novel and diverse protein sequences consistent with a given structure, remains a major unsolved challenge. Recently, deep generative models that learn from sequences alone or from sequences and structures jointly have shown impressive performance on this task. However, those models appear limited in modeling structural constraints, in capturing enough sequence diversity, or in both. Here we consider three recently proposed deep generative frameworks for protein design: a sequence-based autoregressive generative model (AR), a precise structure-based graph neural network (GVP), and Fold2Seq, which leverages a fuzzy and scale-free representation of a three-dimensional fold while enforcing structure-to-sequence (and vice versa) consistency. We benchmark these models on the computational design of antibody sequences, a task that demands high sequence diversity because of its functional implications. The Fold2Seq framework outperforms the two baselines in the diversity of the designed sequences while maintaining the typical fold.

research
06/24/2021

Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design

Designing novel protein sequences for a desired 3D topological fold is a...
research
10/05/2022

AlphaFold Distillation for Improved Inverse Protein Folding

Inverse protein folding, i.e., designing sequences that fold into a give...
research
10/09/2021

Iterative Refinement Graph Neural Network for Antibody Sequence-Structure Co-design

Antibodies are versatile proteins that bind to pathogens like viruses an...
research
05/09/2022

Multi-segment preserving sampling for deep manifold sampler

Deep generative modeling for biological sequences presents a unique chal...
research
11/23/2020

Sparse generative modeling of protein-sequence families

Pairwise Potts models (PM) provide accurate statistical models of famili...
research
04/18/2018

Deep Generative Networks For Sequence Prediction

This thesis investigates unsupervised time series representation learnin...
research
10/31/2017

Designing RNA Secondary Structures is Hard

An RNA sequence is a word over an alphabet on four elements {A,C,G,U} ca...
