ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

06/02/2021
by Danila Rukhovich, et al.

In this paper, we introduce the task of multi-view RGB-based 3D object detection as an end-to-end optimization problem. To address this problem, we propose ImVoxelNet, a novel fully convolutional method for 3D object detection based on monocular or multi-view RGB images. The number of monocular images in each multi-view input can vary during both training and inference; in fact, it may differ for every multi-view input. ImVoxelNet successfully handles both indoor and outdoor scenes, which makes it general-purpose. Specifically, it achieves state-of-the-art results in car detection on the KITTI (monocular) and nuScenes (multi-view) benchmarks among all methods that accept RGB images. Moreover, it surpasses existing RGB-based 3D object detection methods on the SUN RGB-D dataset. On ScanNet, ImVoxelNet sets a new benchmark for multi-view 3D object detection. The source code and the trained models are available at <https://github.com/saic-vul/imvoxelnet>.
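To make the idea of an image-to-voxels projection with a variable number of views concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera given as a 3x4 projection matrix, nearest-pixel feature lookup, and plain averaging over the views in which a voxel is visible. The names `project` and `voxel_features` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point3 = Tuple[float, float, float]
Matrix3x4 = Sequence[Sequence[float]]   # assumed projection matrix P = K [R | t]
FeatureMap = List[List[List[float]]]    # indexed as [row][col][channel]
View = Tuple[Matrix3x4, FeatureMap]


def project(P: Matrix3x4, point: Point3) -> Optional[Tuple[float, float]]:
    """Project a 3D point to pixel coordinates (u, v); None if behind the camera."""
    x, y, z = point
    hom = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    depth = hom[2]
    if depth <= 0:
        return None
    return hom[0] / depth, hom[1] / depth


def voxel_features(
    voxel_centers: Sequence[Point3], views: Sequence[View], channels: int
) -> List[List[float]]:
    """Average image features per voxel over however many views see that voxel."""
    volume = []
    for center in voxel_centers:
        acc = [0.0] * channels
        hits = 0
        for P, fmap in views:
            uv = project(P, center)
            if uv is None:
                continue
            col, row = int(round(uv[0])), int(round(uv[1]))  # nearest pixel
            if 0 <= row < len(fmap) and 0 <= col < len(fmap[0]):
                for c in range(channels):
                    acc[c] += fmap[row][col][c]
                hits += 1
        # Voxels that no view observes keep an all-zero feature (an assumption).
        volume.append([a / hits for a in acc] if hits else [0.0] * channels)
    return volume
```

Because the loop simply iterates over `views`, the same function accepts one image (the monocular case) or any number of images, which is the property the abstract highlights. A 3D detection network would then consume the resulting voxel volume; that part is outside this sketch.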


