Learning the Depths of Moving People by Watching Frozen People

04/25/2019
by Zhengqi Li, et al.

We present a method for predicting dense depth in scenarios where both a monocular camera and people in the scene are freely moving. Existing methods for recovering depth for dynamic, non-rigid objects from monocular video impose strong assumptions on the objects' motion and may only recover sparse depth. In this paper, we take a data-driven approach and learn human depth priors from a new source of data: thousands of Internet videos of people imitating mannequins, i.e., freezing in diverse, natural poses, while a hand-held camera tours the scene. Because people are stationary, training data can be generated using multi-view stereo reconstruction. At inference time, our method uses motion parallax cues from the static areas of the scenes to guide the depth prediction. We demonstrate our method on real-world sequences of complex human actions captured by a moving hand-held camera, show improvement over state-of-the-art monocular depth prediction methods, and show various 3D effects produced using our predicted depth.
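
The inference pipeline described above, in which motion parallax from the static parts of the scene supplies an initial depth signal while human regions are masked out and completed by a learned prior, can be illustrated with the minimal sketch below. This is not the authors' released code: the pure-translation depth-from-parallax formula, the 5-channel input layout, and the helper names (parallax_depth_from_flow, build_network_input, TinyDepthNet) are assumptions made purely for illustration.

```python
# Sketch of assembling the network input suggested by the abstract:
# depth from motion parallax is computed for static regions only,
# human regions are masked out, and the result is stacked with the
# RGB frame as input to a dense depth-prediction network.
import numpy as np
import torch
import torch.nn as nn


def parallax_depth_from_flow(flow, baseline, focal_length, eps=1e-6):
    """Toy depth-from-parallax for a purely sideways camera translation:
    depth = f * B / disparity. Real footage needs full relative poses and
    triangulation or plane sweep; this is only a stand-in."""
    disparity = np.abs(flow[..., 0])           # horizontal flow magnitude
    return focal_length * baseline / (disparity + eps)


def build_network_input(rgb, flow, human_mask, baseline=0.1, focal_length=500.0):
    """Stack RGB, the human mask, and masked parallax depth into one tensor.

    rgb:        (H, W, 3) float array in [0, 1]
    flow:       (H, W, 2) optical flow to a nearby frame
    human_mask: (H, W) binary mask, 1 where a person is detected
    """
    depth = parallax_depth_from_flow(flow, baseline, focal_length)
    depth[human_mask > 0] = 0.0                # parallax is unreliable on moving people
    x = np.concatenate(
        [rgb, human_mask[..., None], depth[..., None]], axis=-1
    )                                          # (H, W, 5)
    return torch.from_numpy(x).float().permute(2, 0, 1).unsqueeze(0)  # (1, 5, H, W)


class TinyDepthNet(nn.Module):
    """Stand-in for a full hourglass-style depth network."""
    def __init__(self, in_ch=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)                     # dense (1, 1, H, W) depth prediction


if __name__ == "__main__":
    H, W = 96, 128
    rgb = np.random.rand(H, W, 3)
    flow = np.random.randn(H, W, 2)
    mask = (np.random.rand(H, W) > 0.9).astype(np.float32)
    inp = build_network_input(rgb, flow, mask)
    print(TinyDepthNet()(inp).shape)           # torch.Size([1, 1, 96, 128])
```

The toy depth-from-parallax function stands in for the full geometric computation; the point of the sketch is only the division of labor the abstract describes: geometry supplies depth where the scene is static, and the learned network fills in the masked, moving people.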

Related research

Consistent Depth of Moving Objects in Video (08/02/2021)
We present a method to estimate depth of a dynamic scene, containing arb...

Web Stereo Video Supervision for Depth Prediction from Dynamic Scenes (04/25/2019)
We present a fully data-driven method to compute depth from diverse mono...

Sampling Based Scene-Space Video Processing (02/05/2021)
Many compelling video processing effects can be achieved if per-pixel de...

Unsupervised Monocular Depth Learning in Dynamic Scenes (10/30/2020)
We present a method for jointly training the estimation of depth, ego-mo...

3D scene reconstruction from monocular spherical video with motion parallax (06/14/2022)
In this paper, we describe a method to capture nearly entirely spherical...

Depth Reconstruction of Translucent Objects from a Single Time-of-Flight Camera using Deep Residual Networks (09/28/2018)
We propose a novel approach to recovering the translucent objects from a...

AutoDecoding Latent 3D Diffusion Models (07/07/2023)
We present a novel approach to the generation of static and articulated ...
