High-Resolution Complex Scene Synthesis with Transformers

05/13/2021
by Manuel Jahn, et al.

The use of coarse-grained layouts for controllable synthesis of complex scene images via deep generative models has recently gained popularity. However, results of current approaches still fall short of their promise of high-resolution synthesis. We hypothesize that this is mostly due to the highly engineered nature of these approaches, which often rely on auxiliary losses and intermediate steps such as mask generators. In this note, we present an orthogonal approach to this task, where the generative model is based on pure likelihood training without additional objectives. To do so, we first optimize a powerful compression model with adversarial training, which learns to reconstruct its inputs via a discrete latent bottleneck and thereby effectively strips the latent representation of high-frequency details such as texture. Subsequently, we train an autoregressive transformer model to learn the distribution of the discrete image representations conditioned on a tokenized version of the layouts. Our experiments show that the resulting system is able to synthesize high-quality images consistent with the given layouts. In particular, we improve the state-of-the-art FID score on COCO-Stuff and on Visual Genome by up to 19, and synthesize images of up to 512 x 512 px on COCO and Open Images.
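The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how layouts are tokenized, so the `tokenize_layout` scheme (class id plus grid cells of two box corners), the `grid` parameter, and all function names are assumptions made for illustration. `quantize` shows the general idea of a discrete latent bottleneck as a nearest-codebook lookup, and `build_sequence` shows how layout tokens could precede the image tokens that an autoregressive model would then predict.

```python
from typing import List, Sequence, Tuple

# Bounding box as (x0, y0, x1, y1), normalized to [0, 1]. Assumed format.
Box = Tuple[float, float, float, float]


def quantize(vectors: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each latent vector to the index of its nearest codebook entry."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
            for v in vectors]


def tokenize_layout(objects: Sequence[Tuple[int, Box]],
                    grid: int = 16) -> List[int]:
    """Hypothetical scheme: one (class, top-left cell, bottom-right cell)
    triple per object. A real system would give class ids and position
    ids separate vocabulary ranges; this sketch does not."""
    def cell(x: float, y: float) -> int:
        col = min(int(x * grid), grid - 1)
        row = min(int(y * grid), grid - 1)
        return row * grid + col

    tokens: List[int] = []
    for class_id, (x0, y0, x1, y1) in objects:
        tokens += [class_id, cell(x0, y0), cell(x1, y1)]
    return tokens


def build_sequence(layout_tokens: List[int],
                   image_tokens: List[int]) -> List[int]:
    """Conditioning by prefixing: layout tokens first, image tokens after."""
    return layout_tokens + image_tokens


codebook = [[0.0, 0.0], [1.0, 1.0]]
image_tokens = quantize([[0.1, 0.2], [0.9, 0.8]], codebook)
layout_tokens = tokenize_layout([(3, (0.0, 0.0, 0.5, 0.5))], grid=4)
sequence = build_sequence(layout_tokens, image_tokens)
```

In this toy setup the transformer would be trained to predict each image token from the layout prefix and the image tokens before it, which is what likelihood training without auxiliary objectives amounts to at the sequence level.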


