Binary Latent Diffusion

04/10/2023
by Ze Wang, et al.

In this paper, we show that a binary latent space can be explored for compact yet expressive image representations. We model the bi-directional mappings between an image and the corresponding latent binary representation by training an auto-encoder with a Bernoulli encoding distribution. On the one hand, the binary latent space provides a compact discrete image representation whose distribution can be modeled more efficiently than that of pixels or continuous latent representations. On the other hand, we represent each image patch as a binary vector rather than as an index into a learned codebook, as is done in discrete image representations with vector quantization. In this way, we obtain binary latent representations that allow for better image quality and high-resolution image representations without any multi-stage hierarchy in the latent space. In this binary latent space, images can be generated effectively using a binary latent diffusion model tailored specifically for modeling the prior over the binary image representations. We present both conditional and unconditional image generation experiments on multiple datasets, and show that the proposed method performs comparably to state-of-the-art methods while dramatically improving sampling efficiency, requiring as few as 16 steps without any test-time acceleration. The proposed framework can also be seamlessly scaled to 1024 × 1024 high-resolution image generation without resorting to a latent hierarchy or multi-stage refinements.
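To make the contrast between a Bernoulli binary code and a vector-quantization index concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the 32-channel patch, the random logits standing in for an encoder output, and the codebook size of 1024 are all illustrative assumptions, and the helper names are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_binary_code(logits, rng):
    """Sample one bit per channel from Bernoulli(sigmoid(logit)).

    Hypothetical helper: stands in for the Bernoulli encoding
    distribution described in the abstract.
    """
    return [1 if rng.random() < sigmoid(l) else 0 for l in logits]


def bits_per_patch_vq(codebook_size: int) -> float:
    """Bits carried by a single index into a codebook of the given size."""
    return math.log2(codebook_size)


rng = random.Random(0)

# Assumption: a 32-channel encoder output for one image patch,
# simulated here with random logits.
patch_logits = [rng.gauss(0.0, 2.0) for _ in range(32)]
code = sample_binary_code(patch_logits, rng)

# A binary vector with 32 channels can take 2**32 distinct values,
# while one index into an assumed 1024-entry codebook carries 10 bits.
binary_patterns = 2 ** len(code)
vq_bits = bits_per_patch_vq(1024)

print(code)
print(binary_patterns, vq_bits)
```

Under these assumed sizes, each patch is described by 32 independent bits rather than a single choice among 1024 codebook entries, which is one way to read the abstract's claim that binary vectors are more expressive per patch than codebook indices.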


research
12/01/2022

3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models

Diffusion models have shown great promise for image generation, beating ...
research
11/29/2021

Vector Quantized Diffusion Model for Text-to-Image Synthesis

We present the vector quantized diffusion (VQ-Diffusion) model for text-...
research
09/09/2019

An Acceleration Framework for High Resolution Image Synthesis

Synthesis of high resolution images using Generative Adversarial Network...
research
09/19/2022

MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation

Although two-stage Vector Quantized (VQ) generative models allow for syn...
research
03/23/2023

High Fidelity Image Synthesis With Deep VAEs In Latent Space

We present fast, realistic image generation on high-resolution, multimod...
research
10/10/2022

f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation

Diffusion models (DMs) have recently emerged as SoTA tools for generativ...
research
09/10/2020

Self-Supervised Annotation of Seismic Images using Latent Space Factorization

Annotating seismic data is expensive, laborious and subjective due to th...
