DeepAI AI Chat
Log In Sign Up

Diffusion Autoencoders: Toward a Meaningful and Decodable Representation

by   Konpat Preechakul, et al.

Diffusion probabilistic models (DPMs) have achieved remarkable quality in image generation that rivals GANs'. But unlike GANs, DPMs use a set of latent variables that lack semantic meaning and cannot serve as a useful representation for other tasks. This paper explores the possibility of using DPMs for representation learning and seeks to extract a meaningful and decodable representation of an input image via autoencoding. Our key idea is to use a learnable encoder for discovering the high-level semantics, and a DPM as the decoder for modeling the remaining stochastic variations. Our method can encode any image into a two-part latent code, where the first part is semantically meaningful and linear, and the second part captures stochastic details, allowing near-exact reconstruction. This capability enables challenging applications that currently foil GAN-based methods, such as attribute manipulation on real images. We also show that this two-level encoding improves denoising efficiency and naturally facilitates various downstream tasks including few-shot conditional sampling. Please visit our project page:


page 5

page 6

page 13

page 16

page 17

page 18

page 19

page 20


Representation Learning with Diffusion Models

Diffusion models (DMs) have achieved state-of-the-art results for image ...

Repurposing GANs for One-shot Semantic Part Segmentation

While GANs have shown success in realistic image generation, the idea of...

Improvements to Self-Supervised Representation Learning for Masked Image Modeling

This paper explores improvements to the masked image modeling (MIM) para...

Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic Models

Diffusion Probabilistic Models (DPMs) have shown a powerful capacity of ...

Semantic Image Synthesis via Diffusion Models

Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkabl...

f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation

Diffusion models (DMs) have recently emerged as SoTA tools for generativ...

Theoretical limitations of Encoder-Decoder GAN architectures

Encoder-decoder GANs architectures (e.g., BiGAN and ALI) seek to add an ...