SVT-Net: A Super Light-Weight Network for Large Scale Place Recognition using Sparse Voxel Transformers

05/01/2021
by Zhaoxin Fan, et al.

Point cloud-based large scale place recognition is fundamental for many applications such as Simultaneous Localization and Mapping (SLAM). Although previous methods have achieved good performance by learning short-range local features, long-range contextual properties have long been neglected, and model size has become a bottleneck for wider adoption. In this paper, we propose SVT-Net, a super light-weight network for large scale place recognition. Building on top of the high-efficiency 3D Sparse Convolution (SP-Conv), we propose an Atom-based Sparse Voxel Transformer (ASVT) and a Cluster-based Sparse Voxel Transformer (CSVT) to learn both short-range local features and long-range contextual features. Consisting of ASVT and CSVT, SVT-Net achieves state-of-the-art performance in terms of both accuracy and speed with a super-light model size (0.9M). Two simplified versions of SVT-Net, named ASVT-Net and CSVT-Net, are also introduced; they also achieve state-of-the-art performance while further reducing the model size to 0.8M and 0.4M, respectively.
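
To make the idea of attention over sparse voxels more concrete, here is a minimal sketch in plain Python. It is not the paper's ASVT or CSVT: the function names `voxelize` and `self_attention`, the voxel size, and the use of voxel centroids as features are all assumptions made for illustration, and the attention has no learned projections. It only shows how occupied voxels can serve as tokens so that each one mixes information from every other, including distant ones.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group 3D points into occupied voxels (hypothetical helper).

    Returns {integer voxel index: centroid of the points in that voxel}.
    Only occupied voxels are stored, which is what makes the grid sparse.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        buckets[key].append((x, y, z))
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in buckets.items()
    }


def self_attention(tokens):
    """Scaled dot-product self-attention without learned weights.

    Each token acts as its own query, key, and value (an assumption made
    to keep the sketch short). Every output is a softmax-weighted average
    of all tokens, so far-apart voxels can influence each other.
    """
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        mixed.append(
            tuple(sum(w * value[i] for w, value in zip(weights, tokens))
                  for i in range(dim))
        )
    return mixed


# Toy point cloud: two nearby points and two far-away points.
points = [(0.1, 0.1, 0.1), (0.2, 0.1, 0.3), (5.2, 5.1, 0.4), (9.9, 0.2, 0.1)]
voxels = voxelize(points, voxel_size=1.0)
tokens = list(voxels.values())
mixed = self_attention(tokens)
```

In this toy example the four points fall into three occupied voxels, and each attended token is a weighted average of all three centroids. A real network would use learned features and projections in place of raw centroids.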


research
08/29/2021

Attentive Rotation Invariant Convolution for Point Cloud-based Large Scale Place Recognition

Autonomous Driving and Simultaneous Localization and Mapping(SLAM) are b...
research
04/12/2022

HiTPR: Hierarchical Transformer for Place Recognition in Point Cloud

Place recognition or loop closure detection is one of the core component...
research
03/02/2022

Contextual Attention Network: Transformer Meets U-Net

Currently, convolutional neural networks (CNN) (e.g., U-Net) have become...
research
05/25/2021

TransLoc3D : Point Cloud based Large-scale Place Recognition using Adaptive Receptive Fields

Place recognition plays an essential role in the field of autonomous dri...
research
09/23/2022

GIDP: Learning a Good Initialization and Inducing Descriptor Post-enhancing for Large-scale Place Recognition

Large-scale place recognition is a fundamental but challenging task, whi...
research
01/15/2023

DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets

Designing an efficient yet deployment-friendly 3D backbone to handle spa...
research
01/16/2022

Sparse Cross-scale Attention Network for Efficient LiDAR Panoptic Segmentation

Two major challenges of 3D LiDAR Panoptic Segmentation (PS) are that poi...
