MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation

02/16/2023
by Omer Bar-Tal, et al.

Recent advances in text-to-image generation with diffusion models present transformative capabilities in image quality. However, user controllability of the generated image and fast adaptation to new tasks remain open challenges, currently addressed mostly by costly and lengthy re-training and fine-tuning, or by ad-hoc adaptations to specific image generation tasks. In this work, we present MultiDiffusion, a unified framework that enables versatile and controllable image generation using a pre-trained text-to-image diffusion model, without any further training or fine-tuning. At the center of our approach is a new generation process, based on an optimization task that binds together multiple diffusion generation processes with a shared set of parameters or constraints. We show that MultiDiffusion can be readily applied to generate high-quality and diverse images that adhere to user-provided controls, such as a desired aspect ratio (e.g., panorama) and spatial guiding signals, ranging from tight segmentation masks to bounding boxes. Project webpage: https://multidiffusion.github.io
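To make the abstract's "binding together multiple diffusion paths" concrete, below is a minimal sketch (not the authors' code) of the panorama setting: the same pre-trained denoiser is applied to overlapping crops of a wide latent canvas, and the least-squares objective that couples the per-crop diffusion processes reduces to a per-pixel average of the overlapping predictions. The `denoise_step` callable and the window/stride values are illustrative assumptions, not an API from the paper.

```python
import torch

def multidiffusion_step(z, denoise_step, window=64, stride=48):
    """One fused denoising step over a wide latent canvas (panorama setting).

    z            -- latent canvas of shape (B, C, H, W), wider than the
                    model's native latent width.
    denoise_step -- hypothetical callable: given a native-size latent crop,
                    returns its denoised estimate for the current timestep
                    (e.g. one reverse-diffusion step of a pre-trained model).
    """
    B, C, H, W = z.shape
    assert W >= window, "canvas is assumed at least one window wide"

    fused = torch.zeros_like(z)   # accumulated per-crop predictions
    cover = torch.zeros_like(z)   # how many crops cover each latent pixel

    # Overlapping crop positions along the wide axis, always including the
    # right-most window so the whole canvas is covered.
    xs = list(range(0, W - window + 1, stride))
    if xs[-1] != W - window:
        xs.append(W - window)

    for x0 in xs:
        crop = z[:, :, :, x0:x0 + window]
        pred = denoise_step(crop)                 # independent diffusion path
        fused[:, :, :, x0:x0 + window] += pred
        cover[:, :, :, x0:x0 + window] += 1.0

    # The least-squares fusion of the overlapping paths has a closed-form
    # solution here: a per-pixel average of all crop predictions.
    return fused / cover
```

Running this fused step at every timestep of the reverse process, instead of denoising each crop independently, is what keeps the overlapping regions consistent and yields a seamless wide image under the assumptions above.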


