Exploring the Protein Sequence Space with Global Generative Models

05/03/2023
by Sergio Romero-Romero, et al.

Recent advances in specialized large-scale architectures trained on images and language have profoundly impacted computer vision and natural language processing (NLP). Language models such as the recent ChatGPT and GPT-4 have demonstrated exceptional capabilities in processing, translating, and generating human language. These breakthroughs have carried over into protein research, where numerous new methods with unprecedented performance have been developed in a short time. Language models in particular have seen widespread use in protein research: they have been used to embed proteins, generate novel ones, and predict tertiary structures. In this book chapter, we provide an overview of protein generative models, reviewing 1) language models for the design of novel artificial proteins, 2) works that use non-Transformer architectures, and 3) applications in directed evolution approaches.
