NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis

07/20/2022
by Chenfei Wu, et al.

In this paper, we present NUWA-Infinity, a generative model for infinite visual synthesis, defined as the task of generating arbitrarily sized high-resolution images or long-duration videos. We propose an autoregressive over autoregressive generation mechanism to handle this variable-size generation task: a global patch-level autoregressive model captures the dependencies between patches, and a local token-level autoregressive model captures the dependencies between visual tokens within each patch. A Nearby Context Pool (NCP) caches related patches that have already been generated as context for the patch currently being generated, which significantly reduces computation cost without sacrificing patch-level dependency modeling. An Arbitrary Direction Controller (ADC) decides suitable generation orders for different visual synthesis tasks and learns order-aware positional embeddings. Compared to DALL-E, Imagen, and Parti, NUWA-Infinity can generate high-resolution images of arbitrary size and additionally supports long-duration video generation. Compared to NUWA, which also covers images and videos, NUWA-Infinity has superior visual synthesis capabilities in terms of resolution and variable-size generation. The GitHub link is https://github.com/microsoft/NUWA. The homepage link is https://nuwa-infinity.microsoft.com.
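
The nested generation loop and the context cache described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The raster generation order, the `extent` neighbourhood, the eviction rule, and the names `toy_next_token` and `generate` are all assumptions made for illustration, and a simple arithmetic function stands in for the learned token-level model. The sketch only shows that a bounded pool of nearby patches can yield the same result as keeping every generated patch.

```python
from typing import Dict, List, Tuple

Patch = Tuple[int, int]  # (row, col) position of a patch in the patch grid


def toy_next_token(context: List[int], prefix: List[int], vocab: int = 16) -> int:
    """Hypothetical stand-in for a learned token-level autoregressive model."""
    return (sum(context) + sum(prefix) + 3 * len(prefix) + 1) % vocab


def generate(rows: int, cols: int, tokens_per_patch: int = 4,
             extent: int = 1, evict: bool = True):
    """Generate a rows x cols grid of patches in raster order (an assumption).

    Returns the generated tokens per patch and the largest pool size seen.
    """
    pool: Dict[Patch, List[int]] = {}    # cache of already generated patches
    output: Dict[Patch, List[int]] = {}
    max_pool = 0
    for r in range(rows):                # global, patch-level autoregression
        for c in range(cols):
            if evict:
                # Drop cached patches that cannot be within `extent` of the
                # current patch or of any patch still to come in raster order.
                stale = [p for p in pool
                         if p[0] < r - extent
                         or (p[0] == r - extent and p[1] < c - extent)]
                for p in stale:
                    del pool[p]
            max_pool = max(max_pool, len(pool))
            # Context: tokens of cached patches near the current patch.
            context = [t for p in sorted(pool)
                       if abs(p[0] - r) <= extent and abs(p[1] - c) <= extent
                       for t in pool[p]]
            tokens: List[int] = []
            for _ in range(tokens_per_patch):   # local, token-level autoregression
                tokens.append(toy_next_token(context, tokens))
            pool[(r, c)] = tokens
            output[(r, c)] = tokens
    return output, max_pool


if __name__ == "__main__":
    cached, cached_size = generate(4, 5, evict=True)
    full, full_size = generate(4, 5, evict=False)
    print("same output:", cached == full)
    print("max pool size with eviction:", cached_size, "without:", full_size)
```

In this toy setting the pool size grows with the grid width rather than with the total number of patches, which is the kind of saving the abstract attributes to the NCP. The paper's controller for choosing generation orders (ADC) is not modelled here.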


research
04/28/2021

InfinityGAN: Towards Infinite-Resolution Image Synthesis

We present InfinityGAN, a method to generate arbitrary-resolution images...
research
10/07/2022

GOLLIC: Learning Global Context beyond Patches for Lossless High-Resolution Image Compression

Neural-network-based approaches recently emerged in the field of data co...
research
06/02/2022

Modeling Image Composition for Complex Scene Generation

We present a method that achieves state-of-the-art results on challengin...
research
08/22/2022

Patient-level Microsatellite Stability Assessment from Whole Slide Images By Combining Momentum Contrast Learning and Group Patch Embeddings

Assessing microsatellite stability status of a patient's colorectal canc...
research
11/24/2021

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion

This paper presents a unified multimodal pre-trained model called NÜWA t...
research
03/01/2023

StraIT: Non-autoregressive Generation with Stratified Image Transformer

We propose Stratified Image Transformer(StraIT), a pure non-autoregressi...
research
09/23/2016

Example-Based Image Synthesis via Randomized Patch-Matching

Image and texture synthesis is a challenging task that has long been dra...
