Generative Adversarial Transformers

03/01/2021
by Drew A. Hudson, et al.

We introduce the GANsformer, a novel and efficient type of transformer, and explore it for the task of visual generative modeling. The network employs a bipartite structure that enables long-range interactions across the image while maintaining linear computational efficiency, so it can readily scale to high-resolution synthesis. It iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, to support the refinement of each in light of the other and to encourage the emergence of compositional representations of objects and scenes. In contrast to the classic transformer architecture, it utilizes multiplicative integration that allows flexible region-based modulation, and can thus be seen as a generalization of the successful StyleGAN network. We demonstrate the model's strength and robustness through a careful evaluation over a range of datasets, from simulated multi-object environments to rich real-world indoor and outdoor scenes, showing that it achieves state-of-the-art results in terms of image quality and diversity, while enjoying fast learning and better data-efficiency. Further qualitative and quantitative experiments offer insight into the model's inner workings, revealing improved interpretability and stronger disentanglement, and illustrating the benefits and efficacy of our approach. An implementation of the model is available at https://github.com/dorarad/gansformer.
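The abstract's key efficiency claim can be illustrated with a small sketch: instead of all-pairs self-attention over the n image features (quadratic in n), attention flows only between a small set of k latents and the features, and the information routed back to the features modulates them multiplicatively (StyleGAN-style gain), giving O(k·n) cost. This is a minimal illustrative sketch, not the authors' implementation; the function name and the simple additive/multiplicative update rules are assumptions for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bipartite_attention(latents, features, d_key):
    """One bipartite (duplex) attention step, a hypothetical sketch.

    latents:  (k, d) array, k small (object-like latent variables)
    features: (n, d) array, n large (image grid features)
    Cost is O(k * n), i.e. linear in the number of image features.
    """
    # Latents -> features: each latent aggregates evidence from the image.
    att = softmax(latents @ features.T / np.sqrt(d_key))   # (k, n)
    latents = latents + att @ features

    # Features -> latents: each feature gathers from the few latents,
    # and the result modulates the feature multiplicatively
    # (region-based, StyleGAN-like modulation rather than addition).
    att_back = softmax(features @ latents.T / np.sqrt(d_key))  # (n, k)
    gain = 1.0 + att_back @ latents                            # (n, d)
    features = features * gain
    return latents, features

# Example: 4 latents attending over a 64-position feature grid.
rng = np.random.default_rng(0)
L, F = bipartite_attention(rng.standard_normal((4, 8)),
                           rng.standard_normal((64, 8)), d_key=8)
```

Because k stays fixed (e.g. a handful of latents) while n grows with resolution, stacking such layers scales linearly with image size, in contrast to the quadratic cost of dense self-attention.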


research · 11/17/2021
Compositional Transformers for Scene Generation
We introduce the GANformer2 model, an iterative object-oriented transfor...

research · 12/17/2020
Taming Transformers for High-Resolution Image Synthesis
Designed to learn long-range interactions on sequential data, transforme...

research · 12/20/2021
StyleSwin: Transformer-based GAN for High-resolution Image Generation
Despite the tantalizing success in a broad range of vision tasks, transformers...

research · 06/16/2022
IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes
Indoor scenes exhibit significant appearance variations due to myriad in...

research · 01/26/2022
Class-Aware Generative Adversarial Transformers for Medical Image Segmentation
Transformers have made remarkable progress towards modeling long-range d...

research · 06/09/2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
In video transformers, the time dimension is often treated in the same w...

research · 08/10/2019
DeblurGAN-v2: Deblurring (Orders-of-Magnitude) Faster and Better
We present a new end-to-end generative adversarial network (GAN) for sin...
