Self-Attention Based Context-Aware 3D Object Detection

01/07/2021
by Prarthana Bhattacharyya, et al.

Most existing point-cloud-based 3D object detectors use convolution-like operators to process information in a local neighbourhood with fixed-weight kernels and aggregate global context hierarchically. However, recent work on non-local neural networks and self-attention for 2D vision has shown that explicitly modeling global context and long-range interactions between positions can lead to more robust and competitive models. In this paper, we explore two variants of self-attention for contextual modeling in 3D object detection by augmenting convolutional features with self-attention features. We first incorporate the pairwise self-attention mechanism into current state-of-the-art bird's-eye-view (BEV), voxel and point-based detectors and show consistent improvement over strong baseline models, while also significantly reducing their parameter footprint and computational cost. We also propose a self-attention variant that samples a subset of the most representative features by learning deformations over randomly sampled locations. This not only allows us to scale explicit global contextual modeling to larger point clouds, but also leads to more discriminative and informative feature descriptors. Our method can be flexibly applied to most state-of-the-art detectors, improving accuracy as well as parameter and compute efficiency. We achieve new state-of-the-art detection performance on the KITTI and nuScenes datasets. Code is available at <https://github.com/AutoVision-cloud/SA-Det3D>.
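
To make the first variant concrete, the following is a minimal NumPy sketch of pairwise self-attention used to augment convolutional features with global context. It is not the authors' implementation (see the repository linked above for that). The function names, the scaled dot-product form, the random projection matrices, and the choice to concatenate the context features onto the convolutional features are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def pairwise_self_attention(feats, w_q, w_k, w_v):
    """Scaled dot-product self-attention over N feature vectors.

    feats: (N, C) array, one row per location (e.g. a voxel or BEV cell).
    w_q, w_k, w_v: (C, D) projection matrices (hypothetical; learned in practice).
    Returns the (N, D) context features and the (N, N) attention weights.
    """
    q = feats @ w_q
    k = feats @ w_k
    v = feats @ w_v
    # Every location attends to every other location: an N x N score matrix.
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    return attn @ v, attn


def augment_with_context(conv_feats, w_q, w_k, w_v):
    """Concatenate global context features onto local convolutional features.

    Concatenation is an assumption of this sketch; other fusion schemes exist.
    """
    context, _ = pairwise_self_attention(conv_feats, w_q, w_k, w_v)
    return np.concatenate([conv_feats, context], axis=-1)


# Toy example: 6 locations, 8 convolutional channels, 4 attention channels.
rng = np.random.default_rng(0)
num_locations, conv_channels, attn_channels = 6, 8, 4
conv_feats = rng.standard_normal((num_locations, conv_channels))
w_q, w_k, w_v = (
    rng.standard_normal((conv_channels, attn_channels)) * 0.1 for _ in range(3)
)

context, attn = pairwise_self_attention(conv_feats, w_q, w_k, w_v)
augmented = augment_with_context(conv_feats, w_q, w_k, w_v)
print(augmented.shape)  # (6, 12)
```

The attention matrix in this sketch has N x N entries, so its cost grows quadratically with the number of locations. That is the scaling pressure the second variant addresses by attending over a sampled subset of representative features; the learned deformations it relies on are not modeled here.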


