
Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image

by Florian Chabot, et al.

In this paper, we present Deep MANTA (Deep Many-Tasks), a novel approach for many-task vehicle analysis from a single monocular image. A robust convolutional network is introduced that simultaneously performs vehicle detection, part localization, visibility characterization, and 3D dimension estimation. Its architecture is based on a new coarse-to-fine object proposal scheme that improves vehicle detection. Moreover, the Deep MANTA network can localize vehicle parts even when they are not visible. At inference time, the network's outputs are used by a real-time robust pose estimation algorithm for fine orientation estimation and 3D vehicle localization. Experiments show that our method outperforms state-of-the-art monocular approaches on the vehicle detection, orientation, and 3D localization tasks of the very challenging KITTI benchmark.
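The abstract does not spell out how the pose estimation step works. As a purely illustrative sketch, assume the localized 2D parts are compared against known 3D part positions through a pinhole camera, and that orientation is chosen to minimize the reprojection error. The part coordinates, camera intrinsics, helper names, and the brute-force yaw search below are all hypothetical and are not taken from the paper; the translation is assumed known only to keep the example short.

```python
import math

def project_part(point, yaw, translation, focal, center):
    """Project one 3D part (vehicle frame) into the image with a pinhole camera."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    # Rotate about the vertical (Y) axis, then translate into the camera frame.
    xc = c * x + s * z + translation[0]
    yc = y + translation[1]
    zc = -s * x + c * z + translation[2]
    return (focal * xc / zc + center[0], focal * yc / zc + center[1])

def reprojection_error(parts_3d, parts_2d, yaw, translation, focal, center):
    """Mean pixel distance between projected 3D parts and observed 2D parts."""
    total = 0.0
    for p3, p2 in zip(parts_3d, parts_2d):
        u, v = project_part(p3, yaw, translation, focal, center)
        total += math.hypot(u - p2[0], v - p2[1])
    return total / len(parts_3d)

def best_yaw(parts_3d, parts_2d, translation, focal, center, steps=360):
    """Brute-force yaw search (a stand-in technique, not the paper's algorithm)."""
    candidates = [-math.pi + 2.0 * math.pi * k / steps for k in range(steps)]
    return min(
        candidates,
        key=lambda a: reprojection_error(parts_3d, parts_2d, a, translation, focal, center),
    )

# Hypothetical 3D part positions in metres (four wheel contacts, two roof points).
PARTS_3D = [
    (-2.0, 0.0, -0.8), (2.0, 0.0, -0.8), (2.0, 0.0, 0.8), (-2.0, 0.0, 0.8),
    (-1.0, -1.4, -0.7), (1.0, -1.4, 0.7),
]
FOCAL, CENTER = 700.0, (620.0, 190.0)      # hypothetical intrinsics
TRUE_YAW, TRUE_T = 0.3, (1.0, 1.5, 12.0)   # hypothetical ground-truth pose

# Simulate the 2D part locations a network might output for this pose.
observed = [project_part(p, TRUE_YAW, TRUE_T, FOCAL, CENTER) for p in PARTS_3D]
estimated_yaw = best_yaw(PARTS_3D, observed, TRUE_T, FOCAL, CENTER)
```

In practice, a full 2D/3D pose solver would estimate rotation and translation jointly and handle noisy or occluded parts; the grid search here only illustrates the reprojection objective.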
