High-Fidelity Pluralistic Image Completion with Transformers

03/25/2021
by Ziyu Wan, et al.
Image completion has made tremendous progress with convolutional neural networks (CNNs), owing to their powerful texture modeling capacity. However, because of some inherent properties (e.g., local inductive priors and spatially invariant kernels), CNNs neither perform well at understanding global structures nor naturally support pluralistic completion. Recently, transformers have demonstrated their power in modeling long-range relationships and generating diverse results, but their computational complexity is quadratic in the input length, which hampers their application to high-resolution images. This paper brings the best of both worlds to pluralistic image completion: appearance prior reconstruction with a transformer and texture replenishment with a CNN. The transformer recovers pluralistic, coherent structures together with some coarse textures, while the CNN enhances the local texture details of the coarse priors, guided by the high-resolution masked images. The proposed method vastly outperforms state-of-the-art methods in three respects: 1) a large boost in image fidelity, even compared to deterministic completion methods; 2) better diversity and higher fidelity for pluralistic completion; 3) exceptional generalization to large masks and to generic datasets such as ImageNet.
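To make the quadratic-cost remark concrete, the following minimal sketch counts how many pairwise attention scores full self-attention would compute if every pixel were treated as one token. The helper name `attention_cost`, the `patch` parameter, and the image sizes are illustrative assumptions and are not taken from the paper.

```python
def attention_cost(height, width, patch=1):
    """Return (token count, pairwise attention scores) for one full self-attention layer.

    Assumes one token per patch x patch block; this is an illustrative
    simplification, not the tokenization used by the authors.
    """
    tokens = (height // patch) * (width // patch)
    return tokens, tokens * tokens


if __name__ == "__main__":
    # Hypothetical square resolutions, chosen only to show the growth rate.
    for side in (32, 64, 256):
        tokens, pairs = attention_cost(side, side)
        print(f"{side}x{side}: {tokens} tokens, {pairs} attention scores")
```

Doubling the side length quadruples the token count and multiplies the number of attention scores by 16, which is consistent with the abstract's motivation for running the transformer on a coarse representation and leaving fine texture to a CNN.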

research
04/26/2021

Diverse Image Inpainting with Bidirectional and Autoregressive Transformers

Image inpainting is an underdetermined inverse problem, it naturally all...
research
03/28/2022

Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation

We address the problem of generating a 360-degree image from a single im...
research
04/25/2023

CompletionFormer: Depth Completion with Convolutions and Vision Transformers

Given sparse depths and the corresponding RGB images, depth completion a...
research
12/17/2020

Taming Transformers for High-Resolution Image Synthesis

Designed to learn long-range interactions on sequential data, transforme...
research
09/06/2021

3D Human Texture Estimation from a Single Image with Transformers

We propose a Transformer-based framework for 3D human texture estimation...
research
04/22/2020

Spectrally Consistent UNet for High Fidelity Image Transformations

Convolutional Neural Networks (CNNs) are the current de-facto approach u...
research
05/18/2022

Pluralistic Image Completion with Probabilistic Mixture-of-Experts

Pluralistic image completion focuses on generating both visually realist...