High-Quality Pluralistic Image Completion via Code Shared VQGAN

04/05/2022
by   Chuanxia Zheng, et al.

PICNet pioneered the generation of multiple and diverse results for the image completion task, but it required a careful balance between the KL loss (diversity) and the reconstruction loss (quality), which limited both diversity and quality. Separately, iGPT-based architectures have been employed to infer distributions in a discrete space derived from a pixel-level pre-clustered palette, but these cannot generate high-quality results directly. In this work, we present a novel framework for pluralistic image completion that achieves both high quality and diversity at a much faster inference speed. The core of our design is a simple yet effective code sharing mechanism that leads to a very compact yet expressive image representation in a discrete latent domain. The compactness and richness of this representation facilitate the subsequent deployment of a transformer that effectively learns how to composite and complete a masked image in the discrete code domain. Based on the global context captured by the transformer and the available visual regions, we are able to sample all tokens simultaneously. This differs from the prevailing autoregressive approach of iGPT-based works and leads to more than 100× faster inference. Experiments show that our framework learns semantically rich discrete codes efficiently and robustly, resulting in much better image reconstruction quality. Our diverse image completion framework significantly outperforms the state of the art both quantitatively and qualitatively on multiple benchmark datasets.
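To make the contrast between simultaneous and autoregressive sampling concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: `toy_logits` is a hypothetical stand-in for the transformer, the token grid is a short 1-D list, and `None` marks a masked position. The sketch only counts how many model calls each strategy needs; it does not reproduce the reported speed-up.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(tokens, codebook_size):
    """Hypothetical stand-in for the transformer (an assumption, not the paper's model).

    One call returns a row of logits for every position, favouring
    codes close to the mean of the currently known tokens.
    """
    known = [t for t in tokens if t is not None]
    mean = sum(known) / len(known) if known else 0.0
    return [[-abs(k - mean) for k in range(codebook_size)] for _ in tokens]


def complete_parallel(tokens, codebook_size, rng):
    """Sample every masked token from a single model call."""
    logits = toy_logits(tokens, codebook_size)  # one forward pass for all positions
    out = []
    for t, row in zip(tokens, logits):
        if t is not None:
            out.append(t)  # visible codes are kept unchanged
        else:
            out.append(rng.choices(range(codebook_size), weights=softmax(row))[0])
    return out, 1


def complete_autoregressive(tokens, codebook_size, rng):
    """Sample masked tokens one at a time, calling the model once per token."""
    out = list(tokens)
    calls = 0
    for i, t in enumerate(out):
        if t is None:
            row = toy_logits(out, codebook_size)[i]  # one forward pass per masked token
            calls += 1
            out[i] = rng.choices(range(codebook_size), weights=softmax(row))[0]
    return out, calls


if __name__ == "__main__":
    masked = [3, None, None, 5, None, 4, None, None]
    rng = random.Random(0)
    _, parallel_calls = complete_parallel(masked, 8, rng)
    _, ar_calls = complete_autoregressive(masked, 8, rng)
    print("parallel model calls:", parallel_calls)
    print("autoregressive model calls:", ar_calls)
```

In this toy setting the parallel path always makes one model call, while the autoregressive path makes one call per masked token, so the gap grows with the size of the masked region. Calling either function again with a different seed yields a different completion, which is the sense in which the sampling is pluralistic.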


