Mathematics-assisted directed evolution and protein engineering

06/06/2023
by Yuchi Qiu, et al.

Directed evolution is a molecular biology technique that is transforming protein engineering by creating proteins with desirable properties and functions. However, it is experimentally impossible to perform deep mutational scanning of an entire protein library because the mutational space is enormous: it scales as 20^N, where N is the number of amino acids in the protein. This has led to the rapid growth of AI-assisted directed evolution (AIDE) and AI-assisted protein engineering (AIPE) as an emerging research field. Aided by advanced natural language processing (NLP) techniques, including long short-term memory networks, autoencoders, and transformers, sequence-based embeddings have been the dominant approaches in AIDE and AIPE. Persistent Laplacians, an emerging technique in topological data analysis (TDA), have made structure-based embeddings a superb option in AIDE and AIPE. We argue that a class of persistent topological Laplacians (PTLs), including persistent Laplacians, persistent path Laplacians, persistent sheaf Laplacians, persistent hypergraph Laplacians, persistent hyperdigraph Laplacians, and evolutionary de Rham-Hodge theory, can effectively overcome the limitations of current TDA and offer a new generation of more powerful TDA approaches. Within the general framework of topological deep learning, mathematics-assisted directed evolution (MADE) has great potential for future protein engineering.
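To make the 20^N scaling concrete, the following minimal Python sketch counts the sequences of a given length and the variants at a fixed number of substitutions from a wild type. The function names and the example length of 100 residues are illustrative assumptions, not taken from the paper.

```python
from math import comb

AMINO_ACIDS = 20  # size of the standard amino-acid alphabet


def sequence_space_size(n: int) -> int:
    """Number of distinct sequences of length n, i.e. 20^n."""
    return AMINO_ACIDS ** n


def variants_at_distance(n: int, k: int) -> int:
    """Number of variants that differ from a fixed wild type at exactly k positions.

    Choose which k of the n positions mutate, then pick one of the
    19 other amino acids at each chosen position.
    """
    return comb(n, k) * (AMINO_ACIDS - 1) ** k


if __name__ == "__main__":
    n = 100  # hypothetical protein length, chosen only for illustration
    print("single mutants:", variants_at_distance(n, 1))
    print("double mutants:", variants_at_distance(n, 2))
    print("digits in 20^N:", len(str(sequence_space_size(n))))
```

Even for this modest length, all single mutants could be screened exhaustively, while the full space is far beyond any experimental library, which is the gap that model-guided search aims to close.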


