Palette: Image-to-Image Diffusion Models

11/10/2021
by Chitwan Saharia, et al.

We introduce Palette, a simple and general framework for image-to-image translation using conditional diffusion models. On four challenging image-to-image translation tasks (colorization, inpainting, uncropping, and JPEG decompression), Palette outperforms strong GAN and regression baselines and establishes a new state of the art. This is accomplished without task-specific hyper-parameter tuning, architecture customization, or any auxiliary loss, demonstrating a desirable degree of generality and flexibility. We uncover the impact of using L_2 vs. L_1 loss in the denoising diffusion objective on sample diversity, and demonstrate the importance of self-attention through empirical architecture studies. Importantly, we advocate a unified evaluation protocol based on ImageNet, and report several sample quality scores, including FID, Inception Score, Classification Accuracy of a pre-trained ResNet-50, and Perceptual Distance against reference images, for various baselines. We expect this standardized evaluation protocol to play a critical role in advancing image-to-image translation research. Finally, we show that a single generalist Palette model trained on three tasks (colorization, inpainting, and JPEG decompression) performs as well as or better than task-specific specialist counterparts.
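The L_2 vs. L_1 choice mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common noise-prediction form of the denoising objective, in which a network receives the conditioning image x, a noisy version of the target, and the noise level gamma, and is trained to recover the injected noise. The abstract does not spell out this formulation, so the function names, the mixing rule, and the `zero_predictor` stand-in below are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def noisy_target(y0, eps, gamma):
    """Mix the clean target y0 with Gaussian noise eps at noise level gamma in (0, 1]."""
    # Assumed variance-preserving mixing rule; gamma = 1 returns y0 unchanged.
    return np.sqrt(gamma) * y0 + np.sqrt(1.0 - gamma) * eps

def denoising_loss(predict_noise, x, y0, eps, gamma, p=2):
    """Mean |error|**p between predicted and true noise (p=2 for L_2, p=1 for L_1)."""
    if p not in (1, 2):
        raise ValueError("p must be 1 or 2")
    y_noisy = noisy_target(y0, eps, gamma)
    err = predict_noise(x, y_noisy, gamma) - eps
    return float(np.mean(np.abs(err) ** p))

# Hypothetical stand-in for a trained network: always predicts zero noise.
def zero_predictor(x, y_noisy, gamma):
    return np.zeros_like(y_noisy)

rng = np.random.default_rng(0)
x = rng.uniform(size=(8, 8))        # conditioning image, e.g. a grayscale input
y0 = rng.uniform(size=(8, 8))       # clean target image
eps = rng.standard_normal((8, 8))   # noise the model should recover

l2 = denoising_loss(zero_predictor, x, y0, eps, gamma=0.5, p=2)
l1 = denoising_loss(zero_predictor, x, y0, eps, gamma=0.5, p=1)
```

The only difference between the two objectives in this sketch is the exponent applied to the per-pixel error; in a real training loop `zero_predictor` would be replaced by the denoising network.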

research
05/16/2022

VQBB: Image-to-image Translation with Vector Quantized Brownian Bridge

Image-to-image translation is an important and challenging problem in co...
research
01/24/2019

Unsupervised Image-to-Image Translation with Self-Attention Networks

Unsupervised image translation aims to learn the transformation from a s...
research
07/23/2020

TSIT: A Simple and Versatile Framework for Image-to-Image Translation

We introduce a simple and versatile framework for image-to-image transla...
research
02/25/2019

Harmonizing Maximum Likelihood with GANs for Multimodal Conditional Generation

Recent advances in conditional image generation tasks, such as image-to-...
research
05/28/2021

MixerGAN: An MLP-Based Architecture for Unpaired Image-to-Image Translation

While attention-based transformer networks achieve unparalleled success ...
research
08/06/2023

Photorealistic and Identity-Preserving Image-Based Emotion Manipulation with Latent Diffusion Models

In this paper, we investigate the emotion manipulation capabilities of d...
research
08/26/2023

DiffI2I: Efficient Diffusion Model for Image-to-Image Translation

The Diffusion Model (DM) has emerged as the SOTA approach for image synt...
