
Driving among Flatmobiles: Bird-Eye-View occupancy grids from a monocular camera for holistic trajectory planning

by   Abdelhak Loukkal, et al.

Camera-based end-to-end driving neural networks bring the promise of a low-cost system that maps camera images directly to driving control commands. These networks are appealing because they replace laborious hand-engineered building blocks, but their black-box nature makes them difficult to analyze when they fail. Recent works have shown the importance of using an explicit intermediate representation, which increases both the interpretability and the accuracy of the network's decisions. Nonetheless, these camera-based networks reason in camera view, where scale is not homogeneous and hence not directly suitable for motion forecasting. In this paper, we introduce a novel monocular camera-only holistic end-to-end trajectory planning network with a Bird-Eye-View (BEV) intermediate representation in the form of binary Occupancy Grid Maps (OGMs). To ease the prediction of OGMs in BEV from camera images, we introduce a novel scheme where the OGMs are first predicted as semantic masks in camera view and then warped into BEV using the homography between the two planes. The key element allowing this transformation to be applied to 3D objects such as vehicles is to predict solely their footprint in camera view, hence respecting the flat-world hypothesis implied by the homography.
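The camera-to-BEV warp described above can be sketched with a plain homography resampling step. The function below is a minimal illustrative implementation, not the authors' code: it assumes a known 3x3 ground-plane homography `H` mapping camera-image pixels to BEV grid coordinates, and warps a binary footprint mask by inverse mapping each BEV cell back into the image with nearest-neighbour sampling.

```python
import numpy as np

def warp_mask_to_bev(mask, H, bev_shape):
    """Warp a binary camera-view footprint mask into a BEV occupancy grid.

    mask:      (h, w) binary array predicted in camera view
    H:         3x3 homography mapping camera pixels -> BEV cells (assumed known)
    bev_shape: (h_bev, w_bev) size of the output occupancy grid
    """
    H_inv = np.linalg.inv(H)
    h_bev, w_bev = bev_shape
    # Homogeneous coordinates of every BEV cell centre.
    ys, xs = np.mgrid[0:h_bev, 0:w_bev]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    # Inverse mapping: send BEV cells back into the camera image.
    src = H_inv @ pts
    src = src / src[2]  # perspective divide
    u = np.round(src[0]).astype(int)
    v = np.round(src[1]).astype(int)
    inside = (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])
    # Nearest-neighbour sampling of the footprint mask.
    bev = np.zeros(bev_shape, dtype=mask.dtype)
    bev.ravel()[np.flatnonzero(inside)] = mask[v[inside], u[inside]]
    return bev

# Toy check: the identity homography copies the mask unchanged.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[2, 1] = 1
bev = warp_mask_to_bev(mask, np.eye(3), (4, 4))
```

Note that this resampling is only geometrically valid for pixels lying on the ground plane, which is exactly why the paper predicts vehicle footprints rather than full vehicle masks: points above the ground violate the flat-world assumption behind `H`.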


Related papers:

- FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras
- Monocular Semantic Occupancy Grid Mapping with Convolutional Variational Auto-Encoders
- Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D
- LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation
- BirdSLAM: Monocular Multibody SLAM in Bird's-Eye View
- V2HDM-Mono: A Framework of Building a Marking-Level HD Map with One or More Monocular Cameras
- Monocular Plan View Networks for Autonomous Driving