Exploring Compositional Visual Generation with Latent Classifier Guidance

04/25/2023
by Changhao Shi, et al.

Diffusion probabilistic models have achieved enormous success in the field of image generation and manipulation. In this paper, we explore a novel paradigm of using the diffusion model and classifier guidance in the latent semantic space for compositional visual tasks. Specifically, we train latent diffusion models and auxiliary latent classifiers to facilitate non-linear navigation of latent representation generation for any pre-trained generative model with a semantic latent space. We demonstrate that such conditional generation achieved by latent classifier guidance provably maximizes a lower bound of the conditional log probability during training. To maintain the original semantics during manipulation, we introduce a new guidance term, which we show is crucial for achieving compositionality. With additional assumptions, we show that the non-linear manipulation reduces to a simple latent arithmetic approach. We show that this paradigm based on latent classifier guidance is agnostic to pre-trained generative models, and present competitive results for both image generation and sequential manipulation of real and synthetic images. Our findings suggest that latent classifier guidance is a promising approach that merits further exploration, even in the presence of other strong competing methods.
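
As a rough illustration of the general idea of classifier guidance in a latent space, the toy sketch below adds the gradients of several attribute classifiers to an unconditional score. It is not the authors' implementation: the standard-normal prior score, the logistic classifiers, and the names `classifier_grad` and `guided_score` are all assumptions chosen so the example has closed-form gradients, whereas the paper trains latent diffusion models and latent classifiers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classifier_grad(z, w, b, y):
    """Gradient of log p(y | z) w.r.t. z for a toy logistic classifier
    with p(y=1 | z) = sigmoid(w . z + b). Hypothetical stand-in for a
    trained latent classifier."""
    p = sigmoid(z @ w + b)
    return (y - p) * w

def guided_score(z, prior_score, classifiers, targets, scale=1.0):
    """Unconditional score plus scaled classifier gradients, one per
    attribute. Summing the gradients composes the attribute conditions."""
    g = prior_score(z)
    for (w, b), y in zip(classifiers, targets):
        g = g + scale * classifier_grad(z, w, b, y)
    return g

# Toy setup (all values are illustrative assumptions).
prior_score = lambda z: -z                      # score of a standard normal
classifiers = [(np.array([1.0, 0.0]), 0.0),     # attribute A
               (np.array([0.0, 1.0]), 0.0)]     # attribute B
targets = [1, 0]                                # want A present, B absent

z = np.zeros(2)
g = guided_score(z, prior_score, classifiers, targets, scale=2.0)
z_next = z + 0.1 * g                            # one plain gradient step
```

In this toy case each classifier is linear, so its gradient always points along its weight vector `w`; a guidance step then amounts to adding a scaled attribute direction to the latent, which loosely mirrors the kind of latent arithmetic the abstract mentions.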

research
03/23/2023

End-to-End Diffusion Latent Optimization Improves Classifier Guidance

Classifier guidance – using the gradients of an image classifier to stee...
research
10/21/2021

Controllable and Compositional Generation with Latent-Space Energy-Based Models

Controllable generation is one of the key requirements for successful ad...
research
02/22/2023

Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC

Since their introduction, diffusion models have quickly become the preva...
research
08/13/2023

Shape-guided Conditional Latent Diffusion Models for Synthesising Brain Vasculature

The Circle of Willis (CoW) is the part of cerebral vasculature responsib...
research
07/07/2023

Language-free Compositional Action Generation via Decoupling Refinement

Composing simple elements into complex concepts is crucial yet challengi...
research
06/30/2023

Stay on topic with Classifier-Free Guidance

Classifier-Free Guidance (CFG) has recently emerged in text-to-image gen...
research
08/18/2022

Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance

Denoising diffusion probabilistic models (DDPMs) are a recent family of ...
