MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones

07/26/2022
by   Tai Wang, et al.

In this technical report, we present our solution, dubbed MV-FCOS3D++, for the Camera-Only 3D Detection track of the Waymo Open Dataset Challenge 2022. For multi-view camera-only 3D detection, methods based on bird's-eye-view or 3D geometric representations can leverage stereo cues from the overlapping regions between adjacent views and perform 3D detection directly, without hand-crafted post-processing. However, this paradigm lacks direct semantic supervision for its 2D backbones, a gap that can be filled by pretraining simple monocular detectors. Our solution is a multi-view framework for 4D detection that follows this paradigm. It is built upon a simple monocular detector, FCOS3D++, pretrained using only the object annotations of Waymo, and it converts multi-view features to a 3D grid space in which 3D objects are detected. A dual-path neck for single-frame understanding and temporal stereo matching is devised to incorporate multi-frame information. Our method achieves 49.75% mAPL with a single model and wins 2nd place in the WOD challenge, without any LiDAR-based depth supervision during training. The code will be released at https://github.com/Tai-Wang/Depth-from-Motion.
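To make the step of converting multi-view features to a 3D grid space more concrete, the following is a minimal NumPy sketch of one common way to do it: project each voxel center into every camera view, pick the nearest feature-map cell, and average over the views that see the voxel. It is an illustration under stated assumptions, not the authors' implementation. The function name `lift_to_voxels`, the array layouts, and the convention that each projection matrix maps world points directly to feature-map pixel coordinates are all hypothetical choices made for this example.

```python
import numpy as np


def lift_to_voxels(feats, projs, voxel_centers):
    """Average per-view image features at the projection of each voxel center.

    Hypothetical helper for illustration only.
    feats:         (V, C, H, W) feature maps, one per camera view
    projs:         (V, 3, 4) projection matrices, assumed to map world points
                   to feature-map pixel coordinates (u, v, depth)
    voxel_centers: (N, 3) voxel centers in world coordinates
    returns:       (N, C) voxel features and (N,) count of views seeing each voxel
    """
    num_views, channels, height, width = feats.shape
    num_voxels = voxel_centers.shape[0]

    # Homogeneous coordinates so a 3x4 matrix can be applied directly.
    homo = np.concatenate([voxel_centers, np.ones((num_voxels, 1))], axis=1)

    acc = np.zeros((num_voxels, channels))
    hits = np.zeros(num_voxels)

    for view in range(num_views):
        uvw = homo @ projs[view].T            # (N, 3): u*d, v*d, d
        depth = uvw[:, 2]
        in_front = depth > 1e-6               # drop points behind the camera
        safe_depth = np.where(in_front, depth, 1.0)

        # Nearest-neighbor lookup; bilinear sampling would be a natural upgrade.
        col = np.floor(uvw[:, 0] / safe_depth).astype(int)
        row = np.floor(uvw[:, 1] / safe_depth).astype(int)

        valid = (
            in_front
            & (col >= 0) & (col < width)
            & (row >= 0) & (row < height)
        )
        # feats[view][:, rows, cols] has shape (C, K); transpose to (K, C).
        acc[valid] += feats[view][:, row[valid], col[valid]].T
        hits[valid] += 1

    # Voxels seen by no view keep a zero feature vector.
    voxel_feats = acc / np.maximum(hits, 1)[:, None]
    return voxel_feats, hits
```

Voxels that fall in the overlap between adjacent views receive features from more than one camera, which is where a 3D head operating on the grid can exploit the stereo cues mentioned above. Averaging is only one possible fusion rule and is chosen here for simplicity.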

research
06/02/2021

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

In this paper, we introduce the task of multi-view RGB-based 3D object d...
research
04/03/2023

VoxelFormer: Bird's-Eye-View Feature Generation based on Dual-view Attention for Multi-view 3D Object Detection

In recent years, transformer-based detectors have demonstrated remarkabl...
research
08/22/2022

STS: Surround-view Temporal Stereo for Multi-view 3D Detection

Learning accurate depth is essential to multi-view 3D object detection. ...
research
07/02/2022

ORA3D: Overlap Region Aware Multi-view 3D Object Detection

In multi-view 3D object detection tasks, disparity supervision over over...
research
03/29/2023

DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking

Recent multi-camera 3D object detectors usually leverage temporal inform...
research
01/26/2022

DIREG3D: DIrectly REGress 3D Hands from Multiple Cameras

In this paper, we present DIREG3D, a holistic framework for 3D Hand Trac...
research
10/05/2022

Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection

While recent camera-only 3D detection methods leverage multiple timestep...
