A Dual-Cycled Cross-View Transformer Network for Unified Road Layout Estimation and 3D Object Detection in the Bird's-Eye-View

09/19/2022
by Curie Kim, et al.

The bird's-eye-view (BEV) representation allows robust learning of multiple tasks for autonomous driving, including road layout estimation and 3D object detection. However, contemporary methods for unified road layout estimation and 3D object detection rarely address the class imbalance of the training dataset, and rarely use multi-class learning to reduce the total number of networks required. To overcome these limitations, we propose a unified model for road layout estimation and 3D object detection inspired by the transformer architecture and the CycleGAN learning framework. The proposed model counters the performance degradation caused by class imbalance in the dataset by using the focal loss and the proposed dual cycle loss. Moreover, we set up extensive learning scenarios to study the effect of multi-class learning for road layout estimation in various situations. To verify the effectiveness of the proposed model and the learning scheme, we conduct a thorough ablation study and a comparative study. The experimental results attest to the effectiveness of our model; we achieve state-of-the-art performance in both the road layout estimation and 3D object detection tasks.
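
The abstract names the focal loss as one of the tools used against class imbalance. As a rough illustration only, the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), can be sketched as below. The function name, the per-cell list inputs, and the values gamma=2.0 and alpha=0.25 are assumptions made for this sketch; the abstract does not state the paper's exact formulation or hyperparameters, and the proposed dual cycle loss is not shown here.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over BEV cells (illustrative sketch).

    probs   -- predicted foreground probabilities, one per cell
    targets -- ground-truth labels (1 = foreground, 0 = background)
    gamma, alpha -- assumed defaults, not values taken from the paper
    """
    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp so that log() never sees exactly 0.
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of well-classified cells,
        # so rare, hard cells contribute relatively more.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confidently correct cell contributes far less than a misclassified one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
print(easy < hard)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to an alpha-weighted cross-entropy, which is one way to sanity-check an implementation.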


