Autoregressive Image Generation using Residual Quantization

03/03/2022
by Doyup Lee, et al.

For autoregressive (AR) modeling of high-resolution images, vector quantization (VQ) represents an image as a sequence of discrete codes. A short code sequence is important because it reduces the computational cost an AR model incurs when modeling long-range interactions among codes. However, we postulate that previous VQ methods cannot both shorten the code sequence and generate high-fidelity images, because of the rate-distortion trade-off. In this study, we propose a two-stage framework, consisting of a Residual-Quantized VAE (RQ-VAE) and an RQ-Transformer, to effectively generate high-resolution images. Given a fixed codebook size, RQ-VAE can precisely approximate the feature map of an image and represent the image as a stacked map of discrete codes. RQ-Transformer then learns to predict the quantized feature vector at the next position by predicting the next stack of codes. Thanks to the precise approximation of RQ-VAE, we can represent a 256×256 image as a feature map of 8×8 resolution, which lets RQ-Transformer efficiently reduce its computational costs. Consequently, our framework outperforms existing AR models on various benchmarks of unconditional and conditional image generation. Our approach also samples high-quality images significantly faster than previous AR models.
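
The idea of representing each position of a feature map as a stack of codes can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `residual_quantize`, the use of one codebook shared across all depths, the random (untrained) codebook, and the values of `C`, `K`, and `D` are assumptions chosen for illustration. Only the 8×8 feature-map resolution comes from the abstract.

```python
import numpy as np


def residual_quantize(z, codebook, depth):
    """Quantize each row of z (N, C) into `depth` codes from codebook (K, C).

    Assumption: a single codebook is reused at every depth.
    Returns the codes (N, depth) and the reconstruction z_hat (N, C).
    """
    residual = z.copy()
    z_hat = np.zeros_like(z)
    codes = np.empty((z.shape[0], depth), dtype=np.int64)
    for d in range(depth):
        # Squared Euclidean distance from every residual to every code: (N, K).
        dists = ((residual[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        idx = dists.argmin(axis=1)
        codes[:, d] = idx
        selected = codebook[idx]
        z_hat += selected      # the approximation is the sum of selected codes
        residual -= selected   # the next depth quantizes what is left over
    return codes, z_hat


rng = np.random.default_rng(0)
H = W = 8                    # 8x8 feature map, as in the abstract
C, K, D = 16, 64, 4          # hypothetical channel dim, codebook size, depth
feature_map = rng.normal(size=(H * W, C))
codebook = rng.normal(size=(K, C))  # stand-in for a learned codebook

codes, z_hat = residual_quantize(feature_map, codebook, D)
code_map = codes.reshape(H, W, D)   # stacked map of discrete codes
print(code_map.shape)
```

Under these assumptions, an AR model would step through 8×8 = 64 positions and predict a stack of `D` codes at each one, while the codebook size `K` stays fixed regardless of `D`.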

research
05/19/2023

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

Existing vector quantization (VQ) based autoregressive models follow a t...
research
09/19/2022

MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation

Although two-stage Vector Quantized (VQ) generative models allow for syn...
research
06/09/2022

Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer

Although autoregressive models have achieved promising results on image ...
research
05/23/2023

Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation

Existing autoregressive models follow the two-stage generation paradigm ...
research
03/05/2021

Generating Images with Sparse Representations

The high dimensionality of images presents architecture and sampling-eff...
research
12/14/2022

Image Compression with Product Quantized Masked Image Modeling

Recent neural compression methods have been based on the popular hyperpr...
research
10/05/2022

Progressive Denoising Model for Fine-Grained Text-to-Image Generation

Recently, vector quantized autoregressive (VQ-AR) models have shown rema...