Image Transformer

02/15/2018
by Niki Parmar, et al.

Image generation has been successfully cast as an autoregressive sequence generation or transformation problem. Recent work has shown that self-attention is an effective way of modeling textual sequences. In this work, we generalize a recently proposed model architecture based on self-attention, the Transformer, to a sequence modeling formulation of image generation with a tractable likelihood. By restricting the self-attention mechanism to attend to local neighborhoods we significantly increase the size of images the model can process in practice, despite maintaining significantly larger receptive fields per layer than typical convolutional neural networks. We propose another extension of self-attention allowing it to efficiently take advantage of the two-dimensional nature of images. While conceptually simple, our generative models significantly outperform the current state of the art in image generation on ImageNet, improving the best published negative log-likelihood on ImageNet from 3.83 to 3.77. We also present results on image super-resolution with a large magnification ratio, applying an encoder-decoder configuration of our architecture. In a human evaluation study, we show that our super-resolution models improve significantly over previously published super-resolution models. Images generated by the model fool human observers three times more often than the previous state of the art.
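
To make the local attention concrete, here is a minimal single-head sketch in NumPy. It is illustrative only, not the paper's implementation: the function name, the block_len and mem_len parameters, and the identity query/key/value projections are assumptions made for brevity, while the real model uses learned projections, multiple heads, and the two-dimensional variant mentioned above. The image is flattened into a pixel sequence whose likelihood factorizes autoregressively as p(x) = prod_t p(x_t | x_<t); the sequence is then split into query blocks, and each block attends causally to itself plus a fixed window of preceding positions, keeping cost linear in sequence length.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_self_attention(x, block_len, mem_len):
    """Single-head local self-attention over a flattened pixel sequence.

    x:         (seq_len, d) pixel embeddings; seq_len divisible by block_len.
    block_len: number of query positions processed together.
    mem_len:   extra positions to the left each block may attend to.
    Identity projections stand in for learned Q/K/V weights (an assumption).
    """
    seq_len, d = x.shape
    out = np.zeros_like(x)
    for start in range(0, seq_len, block_len):
        end = start + block_len
        q = x[start:end]                       # queries for this block
        mem_start = max(0, start - mem_len)
        kv = x[mem_start:end]                  # local memory: left context + block
        scores = q @ kv.T / np.sqrt(d)         # (block_len, memory)
        # Causal mask: a query at absolute position i sees only positions <= i.
        q_pos = np.arange(start, end)[:, None]
        k_pos = np.arange(mem_start, end)[None, :]
        scores = np.where(k_pos <= q_pos, scores, -1e9)
        out[start:end] = softmax(scores) @ kv
    return out

# Usage: a 16x16 image flattened to a length-256 sequence of 8-dim embeddings.
x = np.random.randn(256, 8)
y = local_self_attention(x, block_len=32, mem_len=64)
print(y.shape)  # (256, 8)
```

With these toy settings each query can see up to block_len + mem_len = 96 positions in a single layer, versus 9 for a 3x3 convolution kernel; that gap is the per-layer receptive-field advantage the abstract refers to.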


Related research

Blockwise Parallel Decoding for Deep Autoregressive Models (11/07/2018)
TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image Generation (02/16/2023)
SRFormer: Permuted Self-Attention for Single Image Super-Resolution (03/17/2023)
RDRN: Recursively Defined Residual Network for Image Super-Resolution (11/17/2022)
Efficient Super Resolution For Large-Scale Image Using Attentional GAN (12/12/2018)
Improved Transformer for High-Resolution GANs (06/14/2021)
