AdaLead: A simple and robust adaptive greedy search algorithm for sequence design

10/05/2020
by Sam Sinai, et al.

Efficient design of biological sequences will have a great impact across many industrial and healthcare domains. However, discovering improved sequences requires solving a difficult optimization problem. Traditionally, biologists have approached this challenge through a model-free method known as "directed evolution", the iterative process of random mutation and selection. As the ability to build models that capture the sequence-to-function map improves, such models can be used as oracles to screen sequences before running experiments. In recent years, interest in better algorithms that use such oracles effectively to outperform model-free approaches has intensified. These range from approaches based on Bayesian optimization to regularized generative models and adaptations of reinforcement learning. In this work, we implement an open-source Fitness Landscape EXploration Sandbox (FLEXS: github.com/samsinai/FLEXS) environment to test and evaluate these algorithms based on their optimality, consistency, and robustness. Using FLEXS, we develop an easy-to-implement, scalable, and robust evolutionary greedy algorithm (AdaLead). Despite its simplicity, we show that AdaLead is a remarkably strong benchmark that out-competes more complex state-of-the-art approaches in a variety of biologically motivated sequence design challenges.
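
To make the "random mutation and selection" loop and the role of a screening oracle concrete, here is a minimal Python sketch. It is not AdaLead and does not use the FLEXS API. The alphabet, the `toy_oracle` scoring function (fraction of G/C letters), and all parameter names and defaults are assumptions chosen purely for illustration.

```python
import random

ALPHABET = "ACGU"  # assumed RNA alphabet, for illustration only


def toy_oracle(seq: str) -> float:
    """Hypothetical stand-in for a learned sequence-to-function model.

    Scores a sequence by its fraction of G/C letters; a real oracle
    would be a trained model or a simulator.
    """
    return sum(ch in "GC" for ch in seq) / len(seq)


def mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Resample each position with probability `rate`.

    The resampled letter may equal the original one.
    """
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in seq
    )


def directed_evolution(start, rounds=10, pool_size=200, batch_size=10,
                       rate=0.1, seed=0):
    """Iterate random mutation and oracle-based selection."""
    rng = random.Random(seed)
    parents = [start]
    for _ in range(rounds):
        # Random mutation: propose candidates from the current parents.
        candidates = [
            mutate(rng.choice(parents), rate, rng) for _ in range(pool_size)
        ]
        # Keep the parents in the pool so the best score never decreases.
        pool = parents + candidates
        # Selection: screen with the oracle and keep the top batch.
        parents = sorted(pool, key=toy_oracle, reverse=True)[:batch_size]
    best = parents[0]
    return best, toy_oracle(best)


if __name__ == "__main__":
    best, score = directed_evolution("A" * 20)
    print(best, score)
```

In a real design campaign the selected batch would go to a wet-lab experiment, and the oracle would be a model fitted to earlier measurements. The sketch collapses both into a single scoring function to keep the loop visible.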

research
03/18/2023

Protein Sequence Design with Batch Bayesian Optimisation

Protein sequence design is a challenging problem in protein engineering,...
research
11/05/2021

Improving RNA Secondary Structure Design using Deep Reinforcement Learning

Rising costs in recent years of developing new drugs and treatments have...
research
10/29/2019

Robust Model-free Reinforcement Learning with Multi-objective Bayesian Optimization

In reinforcement learning (RL), an autonomous agent learns to perform co...
research
07/02/2020

ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning

Resolving the exploration-exploitation trade-off remains a fundamental p...
research
09/13/2022

Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization

The ability to accelerate the design of biological sequences can have a ...
research
10/20/2021

More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences

Incorporating prior knowledge in reinforcement learning algorithms is ma...
research
11/18/2022

Forecasting labels under distribution-shift for machine-guided sequence design

The ability to design and optimize biological sequences with specific fu...
