Focal-PETR: Embracing Foreground for Efficient Multi-Camera 3D Object Detection

12/11/2022
by   Shihao Wang, et al.

The dominant multi-camera 3D detection paradigm is based on explicit 3D feature construction, which requires complicated indexing of local image-view features via 3D-to-2D projection. Other methods, such as PETR, implicitly introduce geometric positional encoding and perform global attention to build the relationship between image tokens and 3D objects. The 3D-to-2D perspective inconsistency and global attention lead to a weak correlation between foreground tokens and queries, resulting in slow convergence. We propose Focal-PETR, which uses instance-guided supervision and a spatial alignment module to adaptively focus object queries on discriminative foreground regions. Focal-PETR additionally introduces a down-sampling strategy to reduce the computational cost of global attention. Owing to its highly parallelized implementation and down-sampling strategy, our model, without depth supervision, achieves leading performance on the large-scale nuScenes benchmark and a superior speed of 30 FPS on a single RTX3090 GPU. Extensive experiments show that our method outperforms PETR while requiring 3x fewer training hours. The code will be made publicly available.
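
To make the cost argument concrete, here is a minimal sketch of how the number of query-token pairs in one global cross-attention layer scales with the number of image tokens. It is not the authors' implementation: the function name, the 900 queries, the 20x50 feature map, and the uniform stride-2 down-sampling are all illustrative assumptions, and the abstract does not specify how Focal-PETR actually down-samples. The six cameras match the nuScenes camera rig.

```python
def cross_attention_pairs(num_queries, num_cameras, feat_h, feat_w, stride=1):
    """Count query-token pairs scored by one global cross-attention layer.

    `stride` is a hypothetical down-sampling factor applied to each side of
    the per-camera feature map; it is an assumption made for illustration.
    """
    tokens_per_camera = (feat_h // stride) * (feat_w // stride)
    num_tokens = num_cameras * tokens_per_camera
    # Global attention scores every query against every image token.
    return num_queries * num_tokens


# Assumed sizes: 900 object queries, 6 cameras, 20x50 feature map per camera.
full = cross_attention_pairs(900, 6, 20, 50)
reduced = cross_attention_pairs(900, 6, 20, 50, stride=2)
print(full, reduced, full / reduced)
```

Under these assumptions, halving each side of the feature map cuts the token count, and therefore the attention cost, by a factor of four.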

research
04/30/2020

Bilateral Attention Network for RGB-D Salient Object Detection

Most existing RGB-D salient object detection (SOD) methods focus on the ...
research
01/13/2023

OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection

The recent trend for multi-camera 3D object detection is through the uni...
research
03/21/2022

Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds

We study the problem of efficient object detection of 3D LiDAR point clo...
research
03/05/2021

IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a Single Image

3D object detection from a single image is an important task in Autonomo...
research
04/14/2023

DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection

This paper presents a DETR-based method for cross-domain weakly supervis...
research
08/18/2023

SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos

Camera-based 3D object detection in BEV (Bird's Eye View) space has draw...
research
05/31/2021

Training Domain-invariant Object Detector Faster with Feature Replay and Slow Learner

In deep learning-based object detection on remote sensing domain, nuisan...
