Null-text Guidance in Diffusion Models is Secretly a Cartoon-style Creator

by Jing Zhao, et al.

Classifier-free guidance is a widely adopted sampling technique for diffusion models. The main idea is to extrapolate the model's prediction in the direction of text guidance and away from null-text guidance. In this paper, we demonstrate that null-text guidance in diffusion models is secretly a cartoon-style creator, i.e., generated images can be efficiently transformed into cartoons simply by perturbing the null-text guidance. Specifically, we propose two disturbance methods, Rollback disturbance (Back-D) and Image disturbance (Image-D), that construct a misalignment between the noisy images used for predicting null-text guidance and text guidance (subsequently referred to as the null-text noisy image and the text noisy image, respectively) during sampling. Back-D achieves cartoonization by altering the noise level of the null-text noisy image, replacing x_t with x_{t+Δt}. Image-D, alternatively, produces high-fidelity, diverse cartoons by defining x_t as a clean input image, which further improves the incorporation of finer image details. Through comprehensive experiments, we investigate the principle of noise disturbance for null-text guidance and find that the efficacy of a disturbance depends on the correlation between the null-text noisy image and the source image. Moreover, our proposed techniques, which can both generate cartoon images and cartoonize specific ones, are training-free and easily integrated as a plug-and-play component in any classifier-free guided diffusion model. Project page is available at <>.
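The misalignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The names `cfg_combine`, `guided_eps`, and `toy_model` are hypothetical, `toy_model` is a stand-in for a real noise predictor, and the choice to pass the same timestep `t` to both branches is an assumption, since the abstract does not specify it. The sketch only shows that the null-text branch may receive a different noisy image than the text branch, as in Back-D, where x_t is replaced by x_{t+Δt} for the null-text prediction.

```python
import numpy as np

def cfg_combine(eps_text, eps_null, scale):
    """Classifier-free guidance: extrapolate away from the null-text
    prediction toward the text prediction."""
    return eps_null + scale * (eps_text - eps_null)

def guided_eps(model, x_text, x_null, t, text_emb, null_emb, scale):
    """Guided noise estimate in which the two branches may see different
    noisy images. Passing x_null == x_text gives standard guidance;
    passing an earlier, noisier sample as x_null mimics a rollback-style
    disturbance. Using the same t for both branches is an assumption."""
    eps_text = model(x_text, t, text_emb)
    eps_null = model(x_null, t, null_emb)
    return cfg_combine(eps_text, eps_null, scale)

def toy_model(x, t, emb):
    """Hypothetical stand-in for a noise predictor; not a real network."""
    return 0.1 * x + 0.01 * t + emb

# Illustrative inputs: x_t is the current sample, x_t_plus_dt an earlier
# (noisier) sample kept from a previous sampling step.
x_t = np.ones((2, 2))
x_t_plus_dt = 2.0 * np.ones((2, 2))

standard = guided_eps(toy_model, x_t, x_t, 10, 1.0, 0.0, 7.5)
disturbed = guided_eps(toy_model, x_t, x_t_plus_dt, 10, 1.0, 0.0, 7.5)
```

In a real sampler, the earlier sample would have to be cached from a previous denoising step. For an Image-D style variant, the null-text input would instead be derived from a clean input image.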




GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Diffusion models have recently been shown to generate high-quality synth...

SGDiff: A Style Guided Diffusion Model for Fashion Synthesis

This paper reports on the development of a novel style guided diffusion ...

MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models

The advent of open-source AI communities has produced a cornucopia of po...

Improved Vector Quantized Diffusion Models

Vector quantized diffusion (VQ-Diffusion) is a powerful generative model...

Sketch-Guided Text-to-Image Diffusion Models

Text-to-Image models have introduced a remarkable leap in the evolution ...

Universal Guidance for Diffusion Models

Typical diffusion models are trained to accept a particular form of cond...

Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model

We propose a simple and novel method for generating 3D human motion from...
