The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth

04/29/2021
by Jamie Watson, et al.

Self-supervised monocular depth estimation networks are trained to predict scene depth using nearby frames as a supervision signal. However, in many applications, sequence information in the form of video frames is also available at test time. The vast majority of monocular networks do not use this extra signal and so ignore valuable information that could improve the predicted depth. Those that do rely either on computationally expensive test-time refinement techniques or on off-the-shelf recurrent networks, which make only indirect use of the geometric information that is inherently available. We propose ManyDepth, an adaptive approach to dense depth estimation that can make use of sequence information at test time when it is available. Taking inspiration from multi-view stereo, we propose a deep, end-to-end, cost-volume-based approach that is trained using self-supervision only. We present a novel consistency loss that encourages the network to ignore the cost volume when it is deemed unreliable, e.g. in the case of moving objects, and an augmentation scheme to cope with static cameras. Our detailed experiments on both KITTI and Cityscapes show that we outperform all published self-supervised baselines, including those that use single or multiple frames at test time.
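To make the idea of ignoring an unreliable cost volume more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify how unreliability is detected or how the loss is formed. The sketch assumes a toy cost volume over a flat list of pixels, a hypothetical reference depth from some other source (for example a single-frame estimate), a ratio threshold `max_ratio` chosen arbitrarily, and an L1 penalty. All function and parameter names are illustrative.

```python
def argmin_depth(cost_volume, depth_bins):
    """Pick, per pixel, the depth hypothesis with the lowest matching cost.

    cost_volume[d][p] is the matching cost of pixel p at depth_bins[d].
    """
    num_pixels = len(cost_volume[0])
    depths = []
    for p in range(num_pixels):
        best = min(range(len(depth_bins)), key=lambda d: cost_volume[d][p])
        depths.append(depth_bins[best])
    return depths


def unreliable_mask(cv_depth, ref_depth, max_ratio=2.0):
    """Flag pixels where cost-volume depth and reference depth disagree.

    Assumption: a pixel is 'unreliable' when the two depths differ by more
    than a factor of max_ratio in either direction.
    """
    return [max(c / r, r / c) > max_ratio for c, r in zip(cv_depth, ref_depth)]


def consistency_loss(pred_depth, ref_depth, mask):
    """Mean L1 distance between prediction and reference over flagged pixels."""
    flagged = [abs(p - r) for p, r, m in zip(pred_depth, ref_depth, mask) if m]
    return sum(flagged) / len(flagged) if flagged else 0.0


# Toy example: 4 depth hypotheses, 3 pixels (all values are made up).
depth_bins = [1.0, 2.0, 4.0, 8.0]
cost_volume = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.6],
    [0.7, 0.9, 0.1],
    [0.8, 0.7, 0.9],
]
ref_depth = [2.2, 6.0, 4.0]   # hypothetical reference estimate
pred_depth = [2.1, 5.0, 3.9]  # hypothetical final network output

cv_depth = argmin_depth(cost_volume, depth_bins)
mask = unreliable_mask(cv_depth, ref_depth)
loss = consistency_loss(pred_depth, ref_depth, mask)
```

In this toy setup only the second pixel is flagged, because its cost-volume depth is far from the reference; the loss then pulls the prediction toward the reference at that pixel alone, which is one plausible way a network could be encouraged to distrust the cost volume around moving objects.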


research
05/10/2023

FusionDepth: Complement Self-Supervised Monocular Depth Estimation with Cost Volume

Multi-view stereo depth estimation based on cost volume usually works be...
research
05/30/2022

SMUDLP: Self-Teaching Multi-Frame Unsupervised Endoscopic Depth Estimation with Learnable Patchmatch

Unsupervised monocular trained depth estimation models make use of adjac...
research
11/24/2022

SfM-TTR: Using Structure from Motion for Test-Time Refinement of Single-View Depth Networks

Estimating a dense depth map from a single view is geometrically ill-pos...
research
08/14/2023

DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume

Self-supervised monocular depth estimation methods typically rely on the...
research
12/22/2022

Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography

Modern mobile burst photography pipelines capture and merge a short sequ...
research
02/21/2023

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images

3D photography renders a static image into a video with appealing 3D vis...
research
03/24/2021

SaccadeCam: Adaptive Visual Attention for Monocular Depth Sensing

Most monocular depth sensing methods use conventionally captured images ...
