Semantic Image Synthesis with Semantically Coupled VQ-Model

09/06/2022
by Stephan Alaniz, et al.

Semantic image synthesis enables control over unconditional image generation by allowing guidance on what is being generated. We conditionally synthesize the latent space of a vector-quantized model (VQ-model) pre-trained to autoencode images. Instead of training an autoregressive Transformer on separately learned conditioning latents and image latents, we find that jointly learning the conditioning and image latents significantly improves the modeling capabilities of the Transformer model. While our jointly trained VQ-model achieves reconstruction performance similar to that of a vanilla VQ-model for both semantic and image latents, tying the two modalities at the autoencoding stage proves to be an important ingredient for improving autoregressive modeling performance. We show that our model improves semantic image synthesis using autoregressive models on the popular semantic image datasets ADE20k, Cityscapes, and COCO-Stuff.
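
As a rough illustration of the two-stage setup described above, the following minimal sketch quantizes latents against a codebook and then builds the token sequence an autoregressive Transformer would model, with the conditioning (semantic) tokens placed before the image tokens. It is not the authors' implementation: the function names `quantize` and `build_sequence`, the codebook sizes, the random latents standing in for encoder outputs, and the vocabulary offset are all assumptions made for this example.

```python
import numpy as np


def quantize(latents, codebook):
    """Return, for each latent vector, the index of its nearest codebook entry.

    latents:  array of shape (n, d)
    codebook: array of shape (k, d)
    """
    # Squared Euclidean distance between every latent and every codebook entry.
    dist = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dist.argmin(axis=1)


def build_sequence(semantic_tokens, image_tokens, semantic_vocab_size):
    """Concatenate conditioning tokens and image tokens into one sequence.

    Assumption for this sketch: image token ids are shifted by the size of
    the semantic vocabulary so the two sets of ids do not collide.
    """
    return np.concatenate([semantic_tokens, image_tokens + semantic_vocab_size])


# Hypothetical sizes: 8 semantic codes, 16 image codes, 4-dimensional latents.
rng = np.random.default_rng(0)
sem_codebook = rng.normal(size=(8, 4))
img_codebook = rng.normal(size=(16, 4))

# Random stand-ins for the encoder outputs of a semantic map and an image.
sem_latents = rng.normal(size=(6, 4))
img_latents = rng.normal(size=(6, 4))

sem_tokens = quantize(sem_latents, sem_codebook)
img_tokens = quantize(img_latents, img_codebook)

# The Transformer would be trained to predict each token from the ones before
# it, so image tokens are generated conditioned on the semantic tokens.
sequence = build_sequence(sem_tokens, img_tokens, len(sem_codebook))
```

In this sketch the two codebooks are independent, which corresponds to the separately learned baseline; the abstract's proposal is to learn the semantic and image latents jointly at the autoencoding stage, which this example does not attempt to reproduce.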

research
03/01/2023

StraIT: Non-autoregressive Generation with Stratified Image Transformer

We propose Stratified Image Transformer (StraIT), a pure non-autoregressi...
research
11/24/2021

Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences

Autoregressive models have proven to be very powerful in NLP text genera...
research
10/09/2021

Vector-quantized Image Modeling with Improved VQGAN

Pretraining language models with next-token prediction on massive text c...
research
05/23/2023

Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation

Existing autoregressive models follow the two-stage generation paradigm ...
research
05/29/2021

UFC-BERT: Unifying Multi-Modal Controls for Conditional Image Synthesis

Conditional image synthesis aims to create an image according to some mu...
research
08/19/2021

ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis

Autoregressive models and their sequential factorization of the data lik...
research
12/06/2022

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis

Vector-Quantized (VQ-based) generative models usually consist of two bas...
