Unite and Conquer: Cross Dataset Multimodal Synthesis using Diffusion Models

Generating photos that satisfy multiple constraints finds broad utility in the content creation industry. A key hurdle to accomplishing this task is the need for paired data consisting of all modalities (i.e., constraints) and their corresponding output. Moreover, existing methods need retraining with paired data across all modalities to introduce a new condition. This paper proposes a solution to this problem based on denoising diffusion probabilistic models (DDPMs). Our motivation for choosing diffusion models over other generative models comes from their flexible internal structure. Since each sampling step in a DDPM follows a Gaussian distribution, we show that there exists a closed-form solution for generating an image given various constraints. Our method can unite multiple diffusion models trained on multiple sub-tasks and conquer the combined task through our proposed sampling strategy. We also introduce a novel reliability parameter that allows different off-the-shelf diffusion models, trained across various datasets, to be used at sampling time alone to guide the generation toward the desired outcome satisfying multiple constraints. We perform experiments on several standard multimodal tasks to demonstrate the effectiveness of our approach. More details can be found at https://nithin-gk.github.io/projectpages/Multidiff/index.html
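The closed-form combination alluded to above follows from a standard fact: a product of Gaussian densities is itself Gaussian, with a precision-weighted mean. The sketch below illustrates this general idea for fusing the per-step Gaussian predictions of several diffusion models, with per-model reliability weights. The function name, the scalar-variance simplification, and the exact weighting scheme are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

def fuse_gaussian_steps(means, variances, reliabilities):
    """Fuse the reverse-step Gaussians of several diffusion models.

    Each model i proposes N(means[i], variances[i] * I) for the next
    latent. Treating the fused step as (a reliability-reweighted)
    product of these Gaussians gives a precision-weighted average:
    higher-precision (lower-variance) and higher-reliability models
    contribute more to the fused mean. This is a generic sketch, not
    the paper's exact sampling rule.
    """
    means = [np.asarray(m, dtype=float) for m in means]
    # Effective precision of each model, scaled by its reliability weight.
    w = np.asarray(reliabilities, dtype=float) / np.asarray(variances, dtype=float)
    fused_mean = sum(wi * mi for wi, mi in zip(w, means)) / w.sum()
    fused_var = 1.0 / w.sum()  # precisions add under a product of Gaussians
    return fused_mean, fused_var
```

For example, fusing two equally reliable unit-variance predictions with means 0 and 2 yields the midpoint mean 1 with variance 0.5; raising one model's reliability pulls the fused mean toward that model.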


