DETR4D: Direct Multi-View 3D Object Detection with Sparse Attention

12/15/2022
by Zhipeng Luo, et al.

3D object detection with surround-view images is an essential task for autonomous driving. In this work, we propose DETR4D, a Transformer-based framework that explores sparse attention and direct feature query for 3D object detection in multi-view images. We design a novel projective cross-attention mechanism for query-image interaction to address the limitations of existing methods in terms of geometric cue exploitation and information loss for cross-view objects. In addition, we introduce a heatmap generation technique that bridges 3D and 2D spaces efficiently via query initialization. Furthermore, unlike the common practice of fusing intermediate spatial features for temporal aggregation, we provide a new perspective by introducing a novel hybrid approach that performs cross-frame fusion over past object queries and image features, enabling efficient and robust modeling of temporal information. Extensive experiments on the nuScenes dataset demonstrate the effectiveness and efficiency of the proposed DETR4D.
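The abstract does not spell out how the projective cross-attention is computed. As a rough illustration of the geometric step that any query-to-image projection relies on, the sketch below projects 3D reference points into several camera views with a pinhole model and reports which views can see each point. The function name `project_points`, the `lidar2img` matrix layout (intrinsics multiplied by extrinsics, one 4x4 matrix per camera), and the visibility rule are assumptions made for this example, not details taken from DETR4D.

```python
import numpy as np


def project_points(points_3d, lidar2img, img_hw):
    """Project N 3D points into V camera views (hypothetical helper).

    points_3d: (N, 3) array in a shared ego/LiDAR frame.
    lidar2img: (V, 4, 4) projection matrices, assumed to be
               intrinsics @ extrinsics for each camera.
    img_hw:    (height, width) shared by all images, in pixels.
    Returns pixel coordinates of shape (V, N, 2) and a boolean
    visibility mask of shape (V, N).
    """
    n = points_3d.shape[0]
    # Homogeneous coordinates: (N, 4).
    homo = np.concatenate([points_3d, np.ones((n, 1))], axis=1)
    # Apply every camera matrix to every point: (V, N, 4).
    cam = np.einsum("vij,nj->vni", lidar2img, homo)
    depth = cam[..., 2]
    # Avoid division by zero for points lying on the camera plane.
    safe_depth = np.where(np.abs(depth) > 1e-5, depth, 1e-5)
    uv = cam[..., :2] / safe_depth[..., None]
    h, w = img_hw
    # A point is visible if it is in front of the camera and inside the image.
    mask = (
        (depth > 1e-5)
        & (uv[..., 0] >= 0) & (uv[..., 0] < w)
        & (uv[..., 1] >= 0) & (uv[..., 1] < h)
    )
    return uv, mask
```

A point near the boundary between two cameras can be marked visible in more than one view; an attention module could then sample image features at each valid pixel location instead of committing to a single view, which is one way to read the abstract's concern about cross-view objects.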

research
01/06/2023

Object as Query: Equipping Any 2D Object Detector with 3D Detection Ability

3D object detection from multi-view images has drawn much attention over...
research
06/02/2023

OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection

Multi-view 3D object detection is becoming popular in autonomous driving...
research
02/16/2023

3M3D: Multi-view, Multi-path, Multi-representation for 3D Object Detection

3D visual perception tasks based on multi-camera images are essential fo...
research
11/19/2022

Sparse4D: Multi-view 3D Object Detection with Sparse Spatial-Temporal Fusion

Bird-eye-view (BEV) based methods have made great progress recently in m...
research
03/30/2021

3D-MAN: 3D Multi-frame Attention Network for Object Detection

3D object detection is an important module in autonomous driving and rob...
research
08/09/2023

Multi-View Fusion and Distillation for Subgrade Distresses Detection based on 3D-GPR

The application of 3D ground-penetrating radar (3D-GPR) for subgrade dis...
research
06/02/2022

PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

In this paper, we propose PETRv2, a unified framework for 3D perception ...
