MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

07/03/2023
by   Shitao Tang, et al.

This paper introduces MVDiffusion, a simple yet effective multi-view image generation method for scenarios where pixel-to-pixel correspondences are available, such as perspective crops from a panorama or multi-view images given geometry (depth maps and poses). Unlike prior models that rely on iterative image warping and inpainting, MVDiffusion generates all images concurrently with global awareness, producing high-resolution, content-rich images and addressing the error accumulation prevalent in those models. MVDiffusion incorporates a correspondence-aware attention mechanism that enables effective cross-view interaction. This mechanism underpins three pivotal modules: 1) a generation module that produces low-resolution images while maintaining global correspondence, 2) an interpolation module that densifies spatial coverage between images, and 3) a super-resolution module that upscales them into high-resolution outputs. For panoramic imagery, MVDiffusion can generate high-resolution photorealistic images up to 1024×1024 pixels. For geometry-conditioned multi-view image generation, MVDiffusion is the first method capable of generating a textured map of a scene mesh. The project page is at https://mvdiffusion.github.io.
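
To make "pixel-to-pixel correspondences" concrete for the panorama case, the sketch below maps a pixel in one perspective crop to the matching pixel in a second crop that shares the same camera center. It is an illustration under stated assumptions, not the authors' implementation: it assumes square pinhole views with identical field of view, a pure yaw rotation between them, and an x-right, y-down, z-forward camera frame. The function name `rotation_correspondence` and its parameters are hypothetical.

```python
import math
from typing import Optional, Tuple


def rotation_correspondence(
    u: float, v: float, yaw_deg: float, fov_deg: float = 90.0, size: int = 256
) -> Optional[Tuple[float, float]]:
    """Map pixel (u, v) in view A to view B, where B is view A yawed right by yaw_deg.

    Assumptions (illustrative only): both views are square pinhole images of
    side `size` with the same field of view and the same camera center.
    Returns None when the ray points behind view B's image plane.
    """
    c = size / 2.0                                   # principal point
    f = c / math.tan(math.radians(fov_deg) / 2.0)    # focal length in pixels

    # Back-project the pixel to a ray in view A's camera frame.
    x, y, z = (u - c) / f, (v - c) / f, 1.0

    # Express the ray in view B's frame (B's axes rotated about the vertical axis).
    t = math.radians(yaw_deg)
    xb = x * math.cos(t) - z * math.sin(t)
    zb = x * math.sin(t) + z * math.cos(t)
    yb = y

    if zb <= 0.0:
        return None                                  # ray is not in front of view B

    # Project the ray onto view B's image plane.
    return (f * xb / zb + c, f * yb / zb + c)
```

With a 90° field of view and a 45° yaw, the right edge of view A lands on the vertical center line of view B, which is the kind of overlap a cross-view attention layer could exploit. The caller is expected to check that the returned coordinates fall inside the target image before using them.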

research
07/18/2022

Geometry-Aware Reference Synthesis for Multi-View Image Super-Resolution

Recent multi-view multimedia applications struggle between high-resoluti...
research
01/14/2020

Learned Multi-View Texture Super-Resolution

We present a super-resolution method capable of creating a high-resoluti...
research
12/21/2021

StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation

We introduce a high resolution, 3D-consistent image and shape generation...
research
11/01/2022

VIINTER: View Interpolation with Implicit Neural Representations of Images

We present VIINTER, a method for view interpolation by interpolating the...
research
09/07/2020

Improved Modeling of 3D Shapes with Multi-view Depth Maps

We present a simple yet effective general-purpose framework for modeling...
research
07/08/2021

Grid Partitioned Attention: Efficient Transformer Approximation with Inductive Bias for High Resolution Detail Generation

Attention is a general reasoning mechanism that can flexibly deal with i...
research
04/14/2021

Aligning Latent and Image Spaces to Connect the Unconnectable

In this work, we develop a method to generate infinite high-resolution i...
