GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

12/06/2022
by Honghui Yang, et al.

Despite the tremendous progress of Masked Autoencoders (MAE) in vision tasks on images and video, applying MAE to large-scale 3D point clouds remains challenging due to their inherent irregularity. In contrast to previous 3D MAE frameworks, which either design a complex decoder to infer masked information from the retained regions or adopt sophisticated masking strategies, we instead propose a much simpler paradigm. The core idea is to apply a Generative Decoder for MAE (GD-MAE) that automatically merges the surrounding context to restore the masked geometric knowledge in a hierarchical fusion manner. In doing so, our approach avoids heuristic decoder designs and enjoys the flexibility of exploring various masking strategies. The corresponding part costs less than 12% of the latency of conventional methods, while achieving better performance. We demonstrate the efficacy of the proposed method on several large-scale benchmarks: Waymo, KITTI, and ONCE. Consistent improvement on downstream detection tasks illustrates strong robustness and generalization capability. Our method not only achieves state-of-the-art results but, remarkably, also reaches comparable accuracy with only 20% of the labeled data on the Waymo dataset. The code will be released at <https://github.com/Nightmare-n/GD-MAE>.
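
The abstract does not spell out the masking procedure, so the following is only a generic illustration of the mask-then-reconstruct setup that MAE-style pre-training shares, not the GD-MAE implementation. It is a minimal sketch that splits a set of token indices (for example, occupied voxels or pillars) into a visible subset fed to the encoder and a masked subset to be reconstructed. The function name `random_mask`, the `mask_ratio` value, and the token count are assumptions chosen for illustration.

```python
import random


def random_mask(num_tokens, mask_ratio, seed=0):
    """Split token indices into (visible, masked) lists, MAE-style.

    Hypothetical helper for illustration; uniform random masking is
    only one of many possible masking strategies.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)          # seeded for reproducibility
    indices = list(range(num_tokens))
    rng.shuffle(indices)               # random permutation of token ids
    num_masked = int(num_tokens * mask_ratio)
    masked = sorted(indices[:num_masked])    # targets to reconstruct
    visible = sorted(indices[num_masked:])   # input to the encoder
    return visible, masked


# Assumed example: 20 tokens with a 75% mask ratio.
visible, masked = random_mask(num_tokens=20, mask_ratio=0.75)
print(len(visible), "visible tokens;", len(masked), "masked tokens")
```

Because the split is just a partition of indices, swapping in a different masking strategy only requires changing how `masked` is chosen, which is the kind of flexibility the abstract refers to.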

research
06/20/2022

Voxel-MAE: Masked Autoencoders for Pre-training Large-scale Point Clouds

Mask-based pre-training has achieved great success for self-supervised l...
research
05/28/2022

Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training

Masked Autoencoders (MAE) have shown great potentials in self-supervised...
research
12/14/2022

MAELi – Masked Autoencoder for Large-Scale LiDAR Point Clouds

We show how the inherent, but often neglected, properties of large-scale...
research
12/12/2022

BEV-MAE: Bird's Eye View Masked Autoencoders for Outdoor Point Cloud Pre-training

Current outdoor LiDAR-based 3D object detection methods mainly adopt the...
research
05/31/2023

Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast

Geometry and color information provided by the point clouds are both cru...
research
01/09/2023

Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling

We identify and overcome two key obstacles in extending the success of B...
research
05/14/2020

Taskology: Utilizing Task Relations at Scale

It has been recognized that the joint training of computer vision tasks ...