Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation

05/23/2023
by Mengqi Huang, et al.

Existing autoregressive models follow a two-stage generation paradigm: they first learn a codebook in the latent space for image reconstruction, and then generate images autoregressively based on the learned codebook. However, existing codebook learning simply models all local region information of images without distinguishing their different perceptual importance. This introduces redundancy into the learned codebook, which not only limits the ability of the second-stage autoregressive model to capture important structure but also results in high training cost and slow generation speed. In this study, we borrow the idea of importance perception from classical image coding theory and propose a novel two-stage framework, consisting of Masked Quantization VAE (MQ-VAE) and Stackformer, to relieve the model from modeling redundancy. Specifically, MQ-VAE incorporates an adaptive mask module that masks redundant region features before quantization, and an adaptive de-mask module that recovers the original grid image feature map after quantization so that the original images can be faithfully reconstructed. Stackformer then learns to predict the combination of the next code and its position in the feature map. Comprehensive experiments on various image generation tasks validate the effectiveness and efficiency of our approach. Code will be released at https://github.com/CrossmodalGroup/MaskedVectorQuantization.
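To make the idea of masking before quantization concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names `masked_quantize` and `demask`, the `keep_ratio` parameter, the externally supplied importance `scores`, and the zero-filling of masked positions are all assumptions made for illustration. In the paper the importance scoring and the de-mask step are learned modules.

```python
import numpy as np

def masked_quantize(features, scores, codebook, keep_ratio=0.25):
    """Keep the most important regions and quantize only those.

    features: (N, D) region features, scores: (N,) importance scores,
    codebook: (K, D) code vectors. All names here are hypothetical.
    """
    n = features.shape[0]
    k = max(1, int(round(n * keep_ratio)))
    # Select the k highest-scoring regions, then restore raster order.
    kept = np.sort(np.argsort(-scores, kind="stable")[:k])
    selected = features[kept]
    # Nearest-neighbour codebook lookup by squared Euclidean distance.
    d2 = ((selected[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = d2.argmin(axis=1)
    return codes, kept

def demask(codes, positions, codebook, n_regions):
    """Scatter quantized vectors back onto the full grid.

    Masked positions are filled with zeros here (an assumption);
    the paper instead uses a learned adaptive de-mask module.
    """
    grid = np.zeros((n_regions, codebook.shape[1]))
    grid[positions] = codebook[codes]
    return grid

# Toy example: 4 regions, 2-dimensional features, 2 codes.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
features = np.array([[0.1, 0.0], [0.9, 1.0], [0.2, 0.1], [1.0, 0.8]])
scores = np.array([0.1, 0.9, 0.2, 0.8])

codes, positions = masked_quantize(features, scores, codebook, keep_ratio=0.5)
grid = demask(codes, positions, codebook, n_regions=4)
```

The returned `(codes, positions)` pairs form a sequence shorter than the full grid, which is the kind of input a second-stage model predicting both the next code and its position would consume.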


