MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation

09/19/2022
by Chuanxia Zheng, et al.

Although two-stage Vector Quantized (VQ) generative models can synthesize high-fidelity, high-resolution images, their quantization operator encodes similar patches within an image into the same index, which produces repeated artifacts in similar adjacent regions when existing decoder architectures are used. To address this issue, we propose incorporating spatially conditional normalization to modulate the quantized vectors, inserting spatially variant information into the embedded index maps and encouraging the decoder to generate more photorealistic images. Moreover, we use multichannel quantization to increase the recombination capability of the discrete codes without increasing the cost of the model or the codebook. Additionally, to generate discrete tokens in the second stage, we adopt a Masked Generative Image Transformer (MaskGIT) to learn an underlying prior distribution in the compressed latent space, which is much faster than conventional autoregressive models. Experiments on two benchmark datasets demonstrate that our proposed modulated VQGAN greatly improves reconstructed image quality and provides high-fidelity image generation.
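The two ideas at the core of the abstract, nearest-codebook quantization and modulation of decoder features by the quantized vectors, can be illustrated with a small sketch. It is not the authors' implementation: the function names (`quantize`, `linear`, `modulated_norm`), the fixed toy weights, and the per-position linear map standing in for learned convolutional layers are all assumptions made for illustration. It only shows the general shape of a spatially conditional normalization, where the scale and shift applied to a normalized feature map are computed at each position from the quantized vector at that position.

```python
import math


def quantize(z, codebook):
    """Replace each latent vector in z (H x W x C) with its nearest codebook entry."""
    indices, z_q = [], []
    for row in z:
        idx_row, q_row = [], []
        for v in row:
            # Squared L2 distance from this latent vector to every codebook entry.
            dists = [sum((a - b) ** 2 for a, b in zip(v, e)) for e in codebook]
            k = dists.index(min(dists))
            idx_row.append(k)
            q_row.append(list(codebook[k]))
        indices.append(idx_row)
        z_q.append(q_row)
    return indices, z_q


def linear(v, weight, bias):
    """Per-position linear map (a stand-in for a learned 1x1 convolution)."""
    return [sum(w * x for w, x in zip(w_row, v)) + b
            for w_row, b in zip(weight, bias)]


def modulated_norm(feat, z_q, w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Normalize feat (H x W x C) per channel, then scale and shift every
    position with gamma and beta predicted from the quantized vector there."""
    h, w, c = len(feat), len(feat[0]), len(feat[0][0])
    n = h * w
    mean = [sum(feat[i][j][k] for i in range(h) for j in range(w)) / n
            for k in range(c)]
    var = [sum((feat[i][j][k] - mean[k]) ** 2
               for i in range(h) for j in range(w)) / n
           for k in range(c)]
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            gamma = linear(z_q[i][j], w_gamma, b_gamma)
            beta = linear(z_q[i][j], w_beta, b_beta)
            row.append([
                gamma[k] * (feat[i][j][k] - mean[k]) / math.sqrt(var[k] + eps)
                + beta[k]
                for k in range(c)
            ])
        out.append(row)
    return out


# Toy data (hypothetical values): a 2-entry codebook and a 2 x 2 latent map.
codebook = [[0.0, 0.0], [1.0, 1.0]]
z = [[[0.1, -0.1], [0.9, 1.2]],
     [[0.2, 0.1], [0.8, 0.7]]]
indices, z_q = quantize(z, codebook)

# A 2 x 2 decoder feature map with one channel, and fixed modulation weights.
feat = [[[1.0], [2.0]],
        [[3.0], [4.0]]]
out = modulated_norm(feat, z_q,
                     w_gamma=[[0.5, 0.5]], b_gamma=[1.0],
                     w_beta=[[1.0, 0.0]], b_beta=[0.0])
print(indices)
print(out)
```

In this toy setup the scale and shift depend only on the code at each position, so the sketch shows the mechanics of the modulation rather than the quality gains reported in the abstract. In a trained model the modulation weights would be learned jointly with the decoder.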


