Dirichlet Diffusion Score Model for Biological Sequence Generation

05/18/2023
by Pavel Avdeyev, et al.

Designing biological sequences is an important challenge that requires satisfying complex constraints, which makes it a natural problem to address with deep generative modeling. Diffusion generative models have achieved considerable success in many applications. The score-based generative stochastic differential equation (SDE) model is a continuous-time diffusion framework that offers many benefits, but the originally proposed SDEs are not naturally designed for modeling discrete data. To develop generative SDE models for discrete data such as biological sequences, here we introduce a diffusion process defined in the probability simplex space whose stationary distribution is the Dirichlet distribution. This makes diffusion in continuous space natural for modeling discrete data. We refer to this approach as the Dirichlet diffusion score model. Using a Sudoku generation task, we demonstrate that this technique can generate samples that satisfy hard constraints. The same generative model can also solve Sudoku, including hard puzzles, without additional training. Finally, we applied this approach to develop the first human promoter DNA sequence design model and showed that the designed sequences share similar properties with natural promoter sequences.
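To make the idea of a simplex-valued diffusion with a Dirichlet stationary distribution concrete, here is a minimal sketch of one standard construction with that property. The abstract does not specify the SDE, so everything below is an assumption for illustration: independent Jacobi (Wright-Fisher type) diffusions on [0, 1], each with a Beta stationary distribution, are combined through a stick-breaking map so that the resulting point on the simplex is asymptotically Dirichlet distributed. The function names, the Euler-Maruyama discretization with clipping, and the parameters `s`, `t_end`, and `n_steps` are hypothetical choices, not the authors' implementation.

```python
import numpy as np


def stick_breaking(v):
    """Map v in [0, 1]^(k-1) to a point on the k-category probability simplex."""
    v = np.asarray(v, dtype=float)
    ones = np.ones(v.shape[:-1] + (1,))
    # Stick left before each break: 1, (1-v1), (1-v1)(1-v2), ...
    remaining = np.concatenate([ones, np.cumprod(1.0 - v, axis=-1)], axis=-1)
    # The last category takes whatever stick is left.
    v_ext = np.concatenate([v, ones], axis=-1)
    return v_ext * remaining


def diffuse_to_dirichlet(v0, alpha, s=1.0, t_end=8.0, n_steps=4000, rng=None):
    """Euler-Maruyama simulation of independent Jacobi diffusions
        dv = (s/2) [a (1 - v) - b v] dt + sqrt(s v (1 - v)) dW,
    each with stationary distribution Beta(a, b). Choosing
    a_i = alpha_i and b_i = sum_{j>i} alpha_j makes the stick-breaking
    image of v asymptotically Dirichlet(alpha). Illustrative sketch only.
    """
    rng = np.random.default_rng() if rng is None else rng
    alpha = np.asarray(alpha, dtype=float)
    a = alpha[:-1]
    b = np.cumsum(alpha[::-1])[::-1][1:]  # tail sums: b_i = sum_{j>i} alpha_j
    v = np.array(v0, dtype=float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        drift = 0.5 * s * (a * (1.0 - v) - b * v)
        diffusion = np.sqrt(s * np.clip(v * (1.0 - v), 0.0, None))
        v = v + drift * dt + diffusion * np.sqrt(dt) * rng.standard_normal(v.shape)
        v = np.clip(v, 0.0, 1.0)  # crude projection back into [0, 1]
    return stick_breaking(v)


def demo(n_chains=2000, seed=0):
    """Start every chain at the one-hot vector for the first of 4 categories
    (for example, one DNA base) and diffuse toward Dirichlet(1, 1, 1, 1)."""
    rng = np.random.default_rng(seed)
    v0 = np.tile([1.0, 0.0, 0.0], (n_chains, 1))  # stick_breaking gives [1, 0, 0, 0]
    return diffuse_to_dirichlet(v0, alpha=[1.0, 1.0, 1.0, 1.0], rng=rng)
```

Calling `demo()` returns an array with one simplex point per chain. Each row sums to one, and the one-hot starting signal is progressively washed out as the chains approach the stationary distribution. A generative model in this setting would learn to reverse such a noising process, which is the part this sketch leaves out.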


