Controlled and Conditional Text to Image Generation with Diffusion Prior

02/23/2023
by   Pranav Aggarwal, et al.

Denoising diffusion models have shown remarkable performance in generating diverse, high-quality images from text. Numerous techniques have been proposed on top of, or in alignment with, models like Stable Diffusion and Imagen that generate images directly from text. A less explored approach is DALLE-2's two-step process, comprising a Diffusion Prior that generates a CLIP image embedding from text and a Diffusion Decoder that generates an image from that CLIP image embedding. We explore the capabilities of the Diffusion Prior and the advantages of an intermediate CLIP representation. We observe that the Diffusion Prior can be used in a memory- and compute-efficient way to constrain generation to a specific domain without altering the larger Diffusion Decoder. Moreover, we show that the Diffusion Prior can be trained with additional conditional information, such as a color histogram, to further control the generation. We show quantitatively and qualitatively that the proposed approaches perform better than prompt engineering for domain-specific generation and better than existing baselines for color-conditioned generation. We believe that our observations and results will instigate further research into the diffusion prior and uncover more of its capabilities.
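To make the two-stage setup concrete, the sketch below shows a toy diffusion prior that denoises a CLIP image embedding conditioned on a CLIP text embedding and a color-histogram vector; the resulting embedding would then be handed to a frozen diffusion decoder. All class and function names (`ToyDiffusionPrior`, `sample_image_embedding`) are hypothetical stand-ins for illustration, not the paper's released code, and the sampling loop is deliberately simplified.

```python
import torch
import torch.nn as nn


class ToyDiffusionPrior(nn.Module):
    """Hypothetical prior: predicts a denoised CLIP image embedding from a
    noisy one, conditioned on a text embedding, a color histogram, and the
    diffusion timestep."""

    def __init__(self, clip_dim: int = 768, hist_bins: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(clip_dim * 2 + hist_bins + 1, 1024),
            nn.SiLU(),
            nn.Linear(1024, clip_dim),
        )

    def forward(self, noisy_img_emb, text_emb, color_hist, t):
        # Concatenate all conditioning signals with the noisy embedding.
        x = torch.cat([noisy_img_emb, text_emb, color_hist, t], dim=-1)
        return self.net(x)


@torch.no_grad()
def sample_image_embedding(prior, text_emb, color_hist, steps: int = 50):
    """Very crude ancestral-style sampling loop for the prior (illustrative only)."""
    img_emb = torch.randn_like(text_emb)  # start from pure noise
    for i in reversed(range(steps)):
        t = torch.full((text_emb.shape[0], 1), i / steps)
        pred = prior(img_emb, text_emb, color_hist, t)
        # Add a small amount of noise on all but the final step.
        img_emb = pred + 0.1 * torch.randn_like(pred) * (i > 0)
    return img_emb


if __name__ == "__main__":
    clip_dim, hist_bins = 768, 64
    prior = ToyDiffusionPrior(clip_dim, hist_bins)
    text_emb = torch.randn(1, clip_dim)   # stand-in for a CLIP text embedding
    color_hist = torch.rand(1, hist_bins) # target color histogram condition
    img_emb = sample_image_embedding(prior, text_emb, color_hist)
    # `img_emb` would be passed to a frozen diffusion decoder
    # (a DALLE-2-style unCLIP decoder) to produce the final image.
    print(img_emb.shape)  # torch.Size([1, 768])
```

Because only the (comparatively small) prior is retrained or swapped, this illustrates why domain-constrained or color-conditioned generation can be achieved without touching the much larger decoder.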


