SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation

04/07/2022
by   Yi Wei, et al.

Depth estimation from images serves as the fundamental step of 3D perception for autonomous driving and is an economical alternative to expensive depth sensors like LiDAR. Temporal photometric consistency enables self-supervised depth estimation without labels, further facilitating its application. However, most existing methods predict depth solely from each monocular image and ignore the correlations among the multiple surrounding cameras that are typically available on modern self-driving vehicles. In this paper, we propose SurroundDepth, a method that incorporates information from multiple surrounding views to predict depth maps across cameras. Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse information from multiple views. We apply cross-view self-attention to efficiently enable global interactions between multi-camera feature maps. Unlike self-supervised monocular depth estimation, we are able to predict real-world scale given the multi-camera extrinsic matrices. To achieve this, we adopt structure-from-motion to extract scale-aware pseudo depths for pretraining the models. Further, instead of predicting the ego-motion of each individual camera, we estimate a universal ego-motion of the vehicle and transfer it to each view to achieve multi-view consistency. In experiments, our method achieves state-of-the-art performance on the challenging multi-camera depth estimation datasets DDAD and nuScenes.
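The cross-view self-attention described above can be illustrated with a minimal sketch. This is not the authors' implementation: the module name, the pooling factor, and the residual fusion are assumptions made for illustration. The idea shown is the one the abstract states: feature maps from all surrounding cameras are flattened into one token sequence so that self-attention can exchange information globally across views, with downsampling to keep the attention cost manageable.

```python
import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    """Hypothetical sketch of cross-view self-attention: features from N
    surrounding cameras are pooled, joined into a single token sequence,
    and fused with multi-head self-attention so every view can attend to
    every other view."""

    def __init__(self, channels: int, num_heads: int = 4, pool: int = 4):
        super().__init__()
        self.pool_factor = pool
        self.pool = nn.AvgPool2d(pool)  # downsample to keep attention affordable
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.up = nn.Upsample(scale_factor=pool, mode="bilinear",
                              align_corners=False)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, N, C, H, W) -- feature maps from N cameras
        b, n, c, h, w = feats.shape
        x = self.pool(feats.flatten(0, 1))            # (B*N, C, h', w')
        hp, wp = x.shape[-2:]
        tokens = x.flatten(2).transpose(1, 2)         # (B*N, h'*w', C)
        tokens = tokens.reshape(b, n * hp * wp, c)    # all views in one sequence
        fused, _ = self.attn(tokens, tokens, tokens)  # global cross-view attention
        fused = fused.reshape(b * n, hp, wp, c).permute(0, 3, 1, 2)
        out = self.up(fused).reshape(b, n, c, h, w)   # back to per-view maps
        return feats + out                            # residual fusion
```

A usage example: with six surround cameras and 32-channel feature maps, `CrossViewAttention(32)` maps a `(B, 6, 32, H, W)` tensor to the same shape, with each view's features enriched by the other five.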


Related research

03/14/2023  A Simple Baseline for Supervised Surround-view Depth Estimation
  Depth estimation has been widely studied and serves as the fundamental s...

04/06/2023  EGA-Depth: Efficient Guided Attention for Self-Supervised Multi-Camera Depth Estimation
  The ubiquitous multi-camera setup on modern autonomous vehicles provides...

03/31/2021  Full Surround Monodepth from Multiple Cameras
  Self-supervised monocular depth and ego-motion estimation is a promising...

04/02/2020  Novel View Synthesis of Dynamic Scenes with Globally Coherent Depths from a Monocular Camera
  This paper presents a new method to synthesize an image from arbitrary v...

11/13/2018  Self-Supervised Learning of Depth and Camera Motion from 360° Videos
  As 360 cameras become prevalent in many autonomous systems (e.g., self-d...

10/31/2022  Multi-Camera Calibration Free BEV Representation for 3D Object Detection
  In advanced paradigms of autonomous driving, learning Bird's Eye View (B...

04/09/2021  SVDistNet: Self-Supervised Near-Field Distance Estimation on Surround View Fisheye Cameras
  A 360 perception of scene geometry is essential for automated driving, n...
