Exploring the Capabilities and Limits of 3D Monocular Object Detection – A Study on Simulation and Real World Data

05/15/2020
by Felix Nobis, et al.

3D object detection based on monocular camera data is a key enabler for autonomous driving. The task, however, is ill-posed due to the lack of depth information in 2D images. Recent deep learning methods show promising results in recovering depth information from single images by learning priors about the environment. Several competing strategies tackle this problem. In addition to the network design, the major difference between these competing approaches lies in the use of a supervised or a self-supervised optimization loss function, each of which requires different data and ground truth information. In this paper, we evaluate the performance of a 3D object detection pipeline that is parameterizable with different depth estimation configurations. We implement three depth estimation approaches: a simple distance calculation based on camera intrinsics and 2D bounding box size, a self-supervised learning approach, and a supervised learning approach. Ground truth depth information cannot be recorded reliably in real-world scenarios. This shifts our training focus to simulation data, where labeling and ground truth generation can be automated. We evaluate the detection pipeline on simulator data and on a real-world sequence from an autonomous vehicle on a race track. The benefit of simulation training for real-world application is investigated. Advantages and drawbacks of the different depth estimation strategies are discussed.
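
The simple distance calculation mentioned above can be illustrated with the standard pinhole camera relation Z = f * H / h. The abstract does not give the authors' exact formula, so the following is only a minimal sketch under stated assumptions: the function name `estimate_distance`, the assumption of a known real-world object height, and all numeric values are hypothetical and chosen for illustration.

```python
def estimate_distance(focal_length_px: float,
                      real_height_m: float,
                      bbox_height_px: float) -> float:
    """Estimate object distance with the pinhole camera model.

    Z = f * H / h, where f is the focal length in pixels (a camera
    intrinsic), H is the assumed real-world object height in metres,
    and h is the height of the 2D bounding box in pixels.
    """
    if bbox_height_px <= 0:
        raise ValueError("bounding box height must be positive")
    return focal_length_px * real_height_m / bbox_height_px


# Hypothetical values: 1000 px focal length, 1.0 m tall object,
# detected with a 50 px tall bounding box.
distance_m = estimate_distance(1000.0, 1.0, 50.0)
print(distance_m)  # 20.0
```

Such an estimate depends on a prior for the object's real size and on a tight 2D bounding box, which is one reason learned depth estimation is considered as an alternative.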
