SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds

10/13/2022
by Pei Sun, et al.

3D object detection in point clouds is a core component of modern robotics and autonomous driving systems. A key challenge in 3D object detection comes from the inherently sparse point occupancy within the 3D scene. In this paper, we propose the Sparse Window Transformer (SWFormer), a scalable and accurate model for 3D object detection that takes full advantage of the sparsity of point clouds. Built upon the idea of window-based Transformers, SWFormer converts 3D points into sparse voxels and windows, and then processes these variable-length sparse windows efficiently using a bucketing scheme. In addition to self-attention within each spatial window, SWFormer also captures cross-window correlation with multi-scale feature fusion and window shifting operations. To further address the unique challenge of detecting 3D objects accurately from sparse features, we propose a new voxel diffusion technique. Experimental results on the Waymo Open Dataset show that SWFormer achieves a state-of-the-art 73.36 L2 mAPH on vehicle and pedestrian for 3D object detection on the official test set, outperforming all previous single-stage and two-stage models while being much more efficient.
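
To make the windowing and bucketing idea more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract gives no details of the bucketing scheme, so the 2D (bird's-eye-view) grid, the function names `voxelize`, `group_into_windows`, and `bucket_windows`, the bucket capacities, and the `None` padding are all assumptions made for illustration. The sketch only shows how occupied voxels might be grouped into windows (optionally shifted) and how windows holding different numbers of voxels could be padded to a few fixed lengths so that each bucket can be processed as one batch.

```python
from collections import defaultdict


def voxelize(points, voxel_size):
    """Map (x, y) points to integer voxel coordinates, keeping only occupied voxels."""
    occupied = {(int(x // voxel_size), int(y // voxel_size)) for x, y in points}
    return sorted(occupied)


def group_into_windows(voxels, window_size, shift=0):
    """Group occupied voxels by the window they fall in.

    `shift` offsets the window grid (in voxels), a simple stand-in for
    window shifting. Empty windows never appear in the result.
    """
    windows = defaultdict(list)
    for vx, vy in voxels:
        key = ((vx + shift) // window_size, (vy + shift) // window_size)
        windows[key].append((vx, vy))
    return dict(windows)


def bucket_windows(windows, bucket_sizes):
    """Place each window in the smallest bucket that fits it, padding with None.

    `bucket_sizes` are hypothetical capacities (maximum voxels per window).
    Returns {capacity: [(window_key, padded_voxel_list), ...]}.
    """
    capacities = sorted(bucket_sizes)
    buckets = {cap: [] for cap in capacities}
    for key, voxels in windows.items():
        cap = next((c for c in capacities if len(voxels) <= c), None)
        if cap is None:
            # Assumption: overflow is treated as an error in this sketch.
            raise ValueError(f"window {key} has {len(voxels)} voxels, "
                             f"more than the largest bucket {capacities[-1]}")
        padded = voxels + [None] * (cap - len(voxels))
        buckets[cap].append((key, padded))
    return buckets
```

For example, calling `voxelize` on a handful of points, then `group_into_windows(voxels, window_size=4)`, then `bucket_windows(windows, (1, 2, 4))` yields a small number of fixed-length groups; every window in a bucket has the same padded length, which is what would let an attention layer treat a bucket as a dense batch while skipping empty space.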

