Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling

11/29/2021
by Xumin Yu, et al.

We present Point-BERT, a new paradigm for learning Transformers that generalizes the concept of BERT to 3D point clouds. Inspired by BERT, we devise a Masked Point Modeling (MPM) task to pre-train point cloud Transformers. Specifically, we first divide a point cloud into several local point patches, and a point cloud Tokenizer with a discrete Variational AutoEncoder (dVAE) is designed to generate discrete point tokens containing meaningful local information. Then, we randomly mask out some patches of the input point cloud and feed them into the backbone Transformer. The pre-training objective is to recover the original point tokens at the masked locations under the supervision of point tokens obtained by the Tokenizer. Extensive experiments demonstrate that the proposed BERT-style pre-training strategy significantly improves the performance of standard point cloud Transformers. Equipped with our pre-training strategy, we show that a pure Transformer architecture attains 93.8% accuracy on ModelNet40 and 83.1% accuracy on the hardest setting of ScanObjectNN, surpassing carefully designed point cloud models with far fewer hand-crafted designs. We also demonstrate that the representations learned by Point-BERT transfer well to new tasks and domains, where our models largely advance the state of the art on few-shot point cloud classification. The code and pre-trained models are available at https://github.com/lulutang0608/Point-BERT
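To make the MPM objective concrete, the following is a minimal PyTorch sketch of the pipeline described above: local patches are mapped to discrete point tokens by a frozen tokenizer, a random subset of patch embeddings is replaced with a learnable mask token, and the Transformer is trained to predict the tokenizer's tokens at the masked positions. This is an illustrative sketch, not the authors' implementation: the patch grouping is simplified to a reshape (the paper uses farthest point sampling with kNN grouping), the ToyTokenizer is a stand-in for the pre-trained dVAE, and all module names and hyperparameters (embed dim, vocabulary size, mask ratio) are assumptions made for this example.

import torch
import torch.nn as nn

class ToyTokenizer(nn.Module):
    """Stand-in for the pre-trained dVAE point tokenizer: maps each local
    patch of points to a discrete token id from a fixed vocabulary."""
    def __init__(self, patch_points=32, vocab_size=8192, dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(patch_points * 3, dim), nn.ReLU(),
                                     nn.Linear(dim, vocab_size))

    @torch.no_grad()  # tokenizer is frozen during MPM pre-training
    def forward(self, patches):                      # patches: (B, G, P, 3)
        logits = self.encoder(patches.flatten(2))    # (B, G, vocab_size)
        return logits.argmax(-1)                     # discrete point tokens (B, G)

class MPMPretrainer(nn.Module):
    """Standard Transformer encoder trained with the Masked Point Modeling loss."""
    def __init__(self, patch_points=32, vocab_size=8192, dim=256, depth=4):
        super().__init__()
        self.embed = nn.Linear(patch_points * 3, dim)           # patch embedding
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))  # learnable [MASK]
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, vocab_size)                  # token prediction head

    def forward(self, patches, target_tokens, mask_ratio=0.4):
        B, G, P, _ = patches.shape
        x = self.embed(patches.flatten(2))                       # (B, G, dim)
        mask = torch.rand(B, G, device=x.device) < mask_ratio    # patches to mask
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand(B, G, -1), x)
        logits = self.head(self.backbone(x))                     # (B, G, vocab_size)
        # recover the tokenizer's discrete tokens at the masked locations
        return nn.functional.cross_entropy(logits[mask], target_tokens[mask])

# Usage: group a 2048-point cloud into 64 patches of 32 points each, tokenize
# with the frozen stand-in dVAE, then optimize the MPM loss.
points = torch.randn(2, 2048, 3)
patches = points.view(2, 64, 32, 3)   # the paper groups with FPS + kNN instead
tokenizer, model = ToyTokenizer(), MPMPretrainer()
loss = model(patches, tokenizer(patches))
loss.backward()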


Related research

POS-BERT: Point Cloud One-Stage BERT Pre-Training (04/03/2022)
Recently, the pre-training paradigm combining Transformer and masked lan...

Point-McBert: A Multi-choice Self-supervised Framework for Point Cloud Pre-training (07/27/2022)
Masked language modeling (MLM) has become one of the most successful sel...

ExpPoint-MAE: Better interpretability and performance for self-supervised point cloud transformers (06/19/2023)
In this paper we delve into the properties of transformers, attained thr...

PointGPT: Auto-regressively Generative Pre-training from Point Clouds (05/19/2023)
Large language models (LLMs) based on the generative pre-training transf...

Efficient pre-training objectives for Transformers (04/20/2021)
The Transformer architecture deeply changed the natural language process...

Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling (01/09/2023)
We identify and overcome two key obstacles in extending the success of B...

Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer? (09/15/2022)
Vision Transformers (ViTs) have proven to be effective in solving 2D im...
