Viewpoint Textual Inversion: Unleashing Novel View Synthesis with Pretrained 2D Diffusion Models

09/14/2023
by James Burgess, et al.

Text-to-image diffusion models understand spatial relationships between objects, but do they represent the true 3D structure of the world from only 2D supervision? We demonstrate that yes, 3D knowledge is encoded in 2D image diffusion models like Stable Diffusion, and we show that this structure can be exploited for 3D vision tasks. Our method, Viewpoint Neural Textual Inversion (ViewNeTI), controls the 3D viewpoint of objects in images generated by frozen diffusion models. We train a small neural mapper that takes camera viewpoint parameters and predicts text encoder latents; the latents then condition the diffusion generation process to produce images with the desired camera viewpoint. ViewNeTI naturally addresses Novel View Synthesis (NVS). By leveraging the frozen diffusion model as a prior, we can solve NVS with very few input views; we can even do single-view novel view synthesis. Compared to prior methods, our single-view NVS predictions have good semantic details and photorealism. Our approach is well suited to modeling the uncertainty inherent in sparse 3D vision problems because it can efficiently generate diverse samples. Our view-control mechanism is general, and can even change the camera viewpoint in images generated by user-defined prompts.
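The core mechanism described above can be sketched in a few lines: a small neural mapper takes camera viewpoint parameters and predicts a text-encoder latent (a pseudo-token embedding) that conditions the frozen diffusion model. The sketch below is a minimal NumPy illustration, not the paper's implementation; the input dimensionality, hidden size, and initialization are assumptions, while the 768-dimensional output matches the CLIP text-encoder token embedding size used by Stable Diffusion.

```python
import numpy as np

CAMERA_DIM = 6    # assumed: e.g. encoded azimuth/elevation/radius pose
HIDDEN_DIM = 64   # assumed hidden width for the small mapper
TOKEN_DIM = 768   # CLIP text-encoder token embedding size (Stable Diffusion)

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.02, size=(CAMERA_DIM, HIDDEN_DIM))
b1 = np.zeros(HIDDEN_DIM)
W2 = rng.normal(0.0, 0.02, size=(HIDDEN_DIM, TOKEN_DIM))
b2 = np.zeros(TOKEN_DIM)

def view_mapper(camera_params: np.ndarray) -> np.ndarray:
    """Map camera viewpoint parameters to a pseudo-token text latent.

    In the real method this mapper is trained (the diffusion model stays
    frozen) so that the predicted latent, spliced into the text
    conditioning, steers generation toward the requested viewpoint.
    """
    h = np.maximum(0.0, camera_params @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2                            # predicted text latent

camera_pose = rng.normal(size=CAMERA_DIM)  # one example pose encoding
token_latent = view_mapper(camera_pose)
print(token_latent.shape)  # (768,)
```

In practice, the predicted latent would replace or augment a token embedding in the prompt before it passes through the frozen text encoder and U-Net, which is what allows viewpoint control without fine-tuning the diffusion model itself.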


research
03/20/2023

Zero-1-to-3: Zero-shot One Image to 3D Object

We introduce Zero-1-to-3, a framework for changing the camera viewpoint ...
research
09/07/2023

SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

In this paper, we present a novel diffusion model called SyncDreamer that generates ...
research
12/06/2022

NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors

2D-to-3D reconstruction is an ill-posed problem, yet humans are good at ...
research
04/20/2023

Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion

We present Farm3D, a method to learn category-specific 3D reconstructors...
research
03/04/2023

Diffusion Models Generate Images Like Painters: an Analytical Theory of Outline First, Details Later

How do diffusion generative models convert pure noise into meaningful im...
research
03/01/2017

Change Detection under Global Viewpoint Uncertainty

This paper addresses the problem of change detection from a novel perspe...
research
04/13/2023

Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction

3D-aware image synthesis encompasses a variety of tasks, such as scene g...
