Conditional Score Guidance for Text-Driven Image-to-Image Translation

05/29/2023
by Hyunsoo Lee, et al.

We present a novel algorithm for text-driven image-to-image translation based on a pretrained text-to-image diffusion model. Our method generates a target image by selectively editing the regions of interest in a source image, as specified by a modifying text, while preserving the remaining parts. In contrast to existing techniques that rely solely on a target prompt, we introduce a new score function that also conditions on the source prompt and the source image, tailoring it to the specific translation task. To this end, we derive the conditional score function in a principled manner, decomposing it into a standard score and a guiding term for target image generation. To compute the gradient of the guiding term, we approximate the posterior distribution with a Gaussian, estimating its mean and variance without additional training. In addition, to enhance the conditional score guidance, we incorporate a simple yet effective mixup method. It combines two cross-attention maps, one derived from the source latent and one from the target latent, so that the generated target image fuses the original parts of the source image with the edited regions aligned with the target prompt. Through comprehensive experiments, we demonstrate that our approach achieves outstanding image-to-image translation performance on various tasks.
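The mixup of two cross-attention maps mentioned above can be illustrated with a minimal sketch. The abstract does not state the mixing rule, so this assumes a plain element-wise convex combination with a fixed weight; the names `mixup_attention` and `lam` are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def mixup_attention(src_map: Matrix, tgt_map: Matrix, lam: float) -> Matrix:
    """Blend two attention maps element-wise as lam * src + (1 - lam) * tgt.

    Assumption: a fixed scalar weight `lam` in [0, 1]; the paper may use a
    different (e.g. timestep-dependent) schedule.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    same_rows = len(src_map) == len(tgt_map)
    if not same_rows or any(len(s) != len(t) for s, t in zip(src_map, tgt_map)):
        raise ValueError("attention maps must have the same shape")
    return [
        [lam * s + (1.0 - lam) * t for s, t in zip(src_row, tgt_row)]
        for src_row, tgt_row in zip(src_map, tgt_map)
    ]


# Toy maps: each row is one image location's attention over two text tokens.
source_attention = [[0.5, 0.5], [1.0, 0.0]]
target_attention = [[0.0, 1.0], [0.0, 1.0]]
mixed = mixup_attention(source_attention, target_attention, lam=0.5)
```

Because the combination is convex, each row of the mixed map still sums to one whenever the corresponding rows of both inputs do, so the result remains a valid attention distribution.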

research
11/22/2022

Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation

Large-scale text-to-image generative models have been a revolutionary br...
research
08/04/2023

SDDM: Score-Decomposed Diffusion Models on Manifolds for Unpaired Image-to-Image Translation

Recent score-based diffusion models (SBDMs) show promising results in un...
research
06/07/2023

Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance

Diffusion models have shown significant progress in image translation ta...
research
07/14/2022

EGSDE: Unpaired Image-to-Image Translation via Energy-Guided Stochastic Differential Equations

Score-based diffusion generative models (SDGMs) have achieved the SOTA F...
research
11/02/2021

StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN

Recently, StyleGAN has enabled various image manipulation and editing ta...
research
05/27/2020

Network Fusion for Content Creation with Conditional INNs

Artificial Intelligence for Content Creation has the potential to reduce...
research
04/14/2023

Delta Denoising Score

We introduce Delta Denoising Score (DDS), a novel scoring function for t...
