MonoNext: A 3D Monocular Object Detection with ConvNext

08/01/2023
by Marcelo Eduardo Pederiva, et al.

Autonomous driving perception tasks rely heavily on cameras as the primary sensor for Object Detection, Semantic Segmentation, Instance Segmentation, and Object Tracking. However, RGB images captured by cameras lack depth information, which poses a significant challenge for 3D detection. To supply this missing data, ranging sensors such as LiDAR and radar are used for accurate 3D Object Detection. Although accurate, multi-sensor models are expensive and computationally demanding. In contrast, Monocular 3D Object Detection models are becoming increasingly popular because they offer a faster, cheaper, and easier-to-implement solution for 3D detection. This paper introduces a Multi-Task Learning approach called MonoNext that uses a spatial grid to map the objects in the scene. MonoNext is a straightforward design based on the ConvNext network and requires only data annotated with 3D bounding boxes. In our experiments on the KITTI dataset, MonoNext achieved high precision and performance competitive with state-of-the-art approaches. Furthermore, training with additional data improved MonoNext's accuracy beyond its own KITTI-only results.
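To make the idea of a spatial grid that maps objects in the scene more concrete, the following is a minimal sketch of how 3D bounding box centers could be assigned to cells of a ground-plane grid to build training targets. It is not the MonoNext implementation: the function name `build_grid_targets`, the grid extents, the cell size, and the choice of a lateral/depth (x, z) grid are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Tuple


def build_grid_targets(
    centers: List[Tuple[float, float]],
    x_range: Tuple[float, float] = (-40.0, 40.0),  # assumed lateral extent in meters
    z_range: Tuple[float, float] = (0.0, 80.0),    # assumed depth extent in meters
    cell: float = 4.0,                             # assumed cell size in meters
) -> Tuple[List[List[int]], Dict[Tuple[int, int], Tuple[float, float]]]:
    """Map (x, z) box centers to an occupancy grid plus per-cell offsets.

    Hypothetical helper for illustration; the real target encoding used
    by MonoNext may differ.
    """
    cols = int((x_range[1] - x_range[0]) / cell)
    rows = int((z_range[1] - z_range[0]) / cell)
    grid = [[0] * cols for _ in range(rows)]
    offsets: Dict[Tuple[int, int], Tuple[float, float]] = {}

    for x, z in centers:
        # Skip objects that fall outside the mapped area.
        if not (x_range[0] <= x < x_range[1] and z_range[0] <= z < z_range[1]):
            continue
        col = int((x - x_range[0]) // cell)
        row = int((z - z_range[0]) // cell)
        grid[row][col] = 1  # mark the cell as containing an object
        # Store the center's offset from the cell's lower corner, so a
        # network could regress a precise position inside the cell.
        offsets[(row, col)] = (
            x - (x_range[0] + col * cell),
            z - (z_range[0] + row * cell),
        )
    return grid, offsets


# Two annotated centers: one inside the grid, one beyond the lateral range.
grid, offsets = build_grid_targets([(0.0, 10.0), (100.0, 5.0)])
```

In a setup like this, a network would predict, for each cell, whether an object is present and the remaining box parameters. Only 3D bounding box annotations are needed to build such targets, which is consistent with the data requirement stated in the abstract.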
