Learning 3D Photography Videos via Self-supervised Diffusion on Single Images

02/21/2023
by Xiaodong Wang, et al.
3D photography renders a static image into a video with appealing 3D visual effects. Existing approaches typically first conduct monocular depth estimation, then render the input frame to subsequent frames with various viewpoints, and finally use an inpainting model to fill the missing or occluded regions. The inpainting model plays a crucial role in rendering quality, but it is normally trained on out-of-domain data. To reduce the gap between training and inference, we propose a novel self-supervised diffusion model as the inpainting module. Given a single input image, we automatically construct a training pair of the masked occluded image and the ground-truth image with random cycle-rendering. The constructed training samples are closely aligned with the testing instances, without the need for data annotation. To make full use of the masked images, we design a Masked Enhanced Block (MEB), which can be easily plugged into the UNet and enhances the semantic conditions. Towards real-world animation, we present a novel task, out-animation, which extends the space and time of input objects. Extensive experiments on real datasets show that our method achieves results competitive with existing state-of-the-art (SOTA) methods.
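
The cycle-rendering idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a per-pixel disparity map is already available (for example from a monocular depth estimator), and it replaces full 3D camera motion with a purely horizontal shift proportional to disparity. The names `cycle_render`, `make_training_pair`, and `max_shift` are hypothetical. The image is warped to a random nearby view with a z-buffer, then warped back; pixels that were occluded or left the frame in the intermediate view become holes, which gives a (masked image, ground-truth image) pair without any annotation.

```python
import numpy as np


def cycle_render(image, disparity, shift_scale):
    """Warp `image` to a horizontally shifted view and back again.

    Assumption: each pixel moves by round(shift_scale * disparity) columns,
    a toy stand-in for rendering from a new camera pose.
    Returns the cycle-rendered image (holes set to 0) and a boolean mask
    that is True where the original pixel survived the round trip.
    """
    h, w = disparity.shape
    ys, xs = np.mgrid[0:h, 0:w]
    shift = np.rint(shift_scale * disparity).astype(np.int64)
    xt = xs + shift                      # target column in the novel view
    inside = (xt >= 0) & (xt < w)        # pixels that stay inside the frame

    # Z-buffer: at each target location keep the largest disparity (nearest pixel).
    zbuf = np.full((h, w), -np.inf)
    np.maximum.at(zbuf, (ys[inside], xt[inside]), disparity[inside])
    visible = np.zeros((h, w), dtype=bool)
    visible[inside] = disparity[inside] == zbuf[ys[inside], xt[inside]]

    # Forward pass: splat only the visible pixels into the novel view.
    novel = np.zeros_like(image)
    novel_shift = np.zeros((h, w), dtype=np.int64)
    novel_valid = np.zeros((h, w), dtype=bool)
    novel[ys[visible], xt[visible]] = image[visible]
    novel_shift[ys[visible], xt[visible]] = shift[visible]
    novel_valid[ys[visible], xt[visible]] = True

    # Backward pass: send every valid novel-view pixel back to its source column.
    by, bx = np.nonzero(novel_valid)
    cycled = np.zeros_like(image)
    cycled[by, bx - novel_shift[by, bx]] = novel[by, bx]
    return cycled, visible


def make_training_pair(image, disparity, rng, max_shift=8.0):
    """Sample a random shift and return (masked image, mask, ground truth)."""
    shift_scale = rng.uniform(-max_shift, max_shift)
    masked, mask = cycle_render(image, disparity, shift_scale)
    return masked, mask, image
```

In this toy setting the holes appear exactly where nearer content covers farther content in the shifted view, which is the kind of region an inpainting module has to fill at test time. A real pipeline would use proper camera projection and sub-pixel splatting rather than integer column shifts.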
