Point-McBert: A Multi-choice Self-supervised Framework for Point Cloud Pre-training

by Kexue Fu, et al.
Fudan University

Masked language modeling (MLM) has become one of the most successful self-supervised pre-training tasks. Inspired by its success, Point-Bert, a pioneering work on point clouds, proposed masked point modeling (MPM) to pre-train point transformers on large-scale unannotated datasets. Despite its great performance, we find that the inherent difference between language and point clouds tends to cause ambiguous tokenization for point clouds. Unlike language, there is no gold standard for point cloud tokenization. Although Point-Bert introduces a discrete Variational AutoEncoder (dVAE) as a tokenizer to allocate token ids to local patches, this imperfect tokenizer tends to generate ambiguous token ids: it might generate different token ids for semantically-similar patches and the same token ids for semantically-dissimilar patches. To tackle the above problem, we propose Point-McBert, a pre-training framework with eased and refined supervision signals. Specifically, we ease the previous single-choice constraint on patches and provide multi-choice token ids for each patch as supervision. Moreover, we utilize the high-level semantics learned by the transformer to further refine our supervision signals. Extensive experiments on point cloud classification, few-shot classification and part segmentation tasks demonstrate the superiority of our method; e.g., the pre-trained transformer achieves 94.1% accuracy on ModelNet40, 84.28% accuracy on the hardest setting of ScanObjectNN, and new state-of-the-art performance on few-shot learning. We also demonstrate that our method not only improves the performance of Point-Bert on all downstream tasks, but also incurs almost no extra computational overhead.





Self-supervised pre-training Grill et al. (2020); Chen and He (2021); He et al. (2021); Yang et al. (2022); Ohri and Kumar (2021); Zhao and Dong (2020); Larsson et al. (2017); He et al. (2020) is attracting growing attention as it can transfer knowledge learned from large-scale unlabeled datasets to boost performance on downstream tasks. Most self-supervised pre-training methods are based on specific proxy tasks such as permutation prediction Zhao and Dong (2020), image colorization Larsson et al. (2017), and instance-level discrimination Grill et al. (2020); Chen and He (2021); He et al. (2020). Among them, the masked language modeling (MLM) task proposed in BERT Devlin et al. (2018) is currently one of the most successful proxy tasks and has been migrated to many other domains Bao et al. (2021); Li et al. (2022); Yu et al. (2021b). Point-Bert Yu et al. (2021b), a pioneering work in point cloud learning, proposed a variant of MLM called masked point modeling (MPM) to pre-train point cloud transformers. Specifically, it first divides a point cloud into several local point patches and assigns a token id to each patch, converting the point cloud into multiple discrete tokens. It then masks a proportion of the tokens and pre-trains the model by recovering the masked tokens from the transformer's encoding of the corrupted point cloud. Since there is no well-defined vocabulary for generating token ids for local point cloud patches, Point-Bert utilizes a pre-trained discrete Variational AutoEncoder (dVAE) Rolfe (2016) as the tokenizer. However, we find that such discrete tokenization in MPM hinders the framework from achieving better performance due to the difference between point clouds and language.

Languages are naturally composed of discrete words, which are strictly bijective to token ids. However, there is no such gold standard for point cloud discretization as there is for language, making it inevitable to introduce noise into MPM. Specifically, dVAE-based tokenization tends to cause the following two kinds of ambiguity. (1) Semantically-dissimilar patches have the same token ids. Due to the lack of a well-defined vocabulary, MPM adopts a pre-trained dVAE as the tokenizer to generate token ids for local patches. However, the tokenizer focuses on the local patches' geometry and barely considers their semantics, resulting in some wrong token ids. For example, the two patches shown in red in Figure 1 have similar geometric structures but different semantics (landing gear and aero-engine). Nevertheless, they are allocated the same token id (#3776) because of their similar geometric structure. (2) Semantically-similar patches have different token ids. As shown in Figure 1, semantically-similar patches of the airplane's aero-engine are allocated many different token ids (#599, #1274). It would be reasonable for them to share the same token ids given their similar semantics and geometry. However, the tokenizer neglects these relations and allocates them different token ids due to the interference of imperfect discretization and acquisition noise.

Inspired by Mc-Beit Li et al. (2022), we propose an improved Bert-style point cloud pre-training framework called Point-McBert with eased and refined masked prediction targets to tackle the above problems. Specifically, since semantically-similar patches might have different token ids, we ease the previous strict single-choice constraint on patches. For each local patch, we use a probability distribution vector over token ids as the supervision signal rather than a single token id, which means each patch has multi-choice token ids. Moreover, we believe the high-level perceptions produced by the point cloud transformer can provide extra semantic supervision signals that benefit our pre-training. Therefore, we refine the above supervision signals, i.e., the probability distribution vectors, using inter-patch semantic similarities computed from our point cloud transformer. By considering high-level similarities, the ambiguities caused by local geometric similarity can be mitigated.

To verify the effectiveness of our framework, we pre-train a point cloud transformer on ShapeNet Chang et al. (2015) and conduct extensive fine-tuning experiments on downstream tasks including point cloud classification, few-shot classification and part segmentation. Our framework not only improves the performance of the previous Point-Bert on all downstream tasks, but also incurs almost no extra computational overhead during pre-training. Our Point-McBert achieves 94.1% accuracy on ModelNet40 Wu et al. (2015) and 84.28% accuracy on the most complicated setting of ScanObjectNN Uy et al. (2019), outperforming a series of state-of-the-art methods. Our method also achieves a new state-of-the-art on point cloud few-shot learning, indicating the powerful generalization ability of our Point-McBert.

Figure 1: Visualization of improper token ids. To better visualize the improper tokenization, we provide two different views of the same point cloud. Different colors represent different token ids, and the circles indicate the extent of the local patches. As shown in the two views, semantically-dissimilar patches (landing gear and aero-engine, colored in red) have the same token id (#3776), while semantically-similar adjacent patches (aero-engine, colored in purple and brown) have different token ids (#1274, #599).

Related Works

Point cloud learning

Point cloud is an important type of geometric data structure, widely used in applications such as remote sensing Liu et al. (2022); Han and Sánchez-Azofeifa (2022), autonomous driving Lu et al. (2019), robotics Yang et al. (2020) and medical image analysis Shen et al. (2021). Since point clouds are irregular, researchers in the past focused on converting them into regular data such as 3D voxels or projected images. Many voxel-based Maturana and Scherer (2015); Wu et al. (2015) and view-based Qi et al. (2016); Su et al. (2015) methods were proposed, but they did not perform well. A pioneering work, PointNet Qi et al. (2017a), utilized shared multi-layer perceptrons with pooling to achieve permutation-invariant learning on point clouds, strongly outperforming previous works. Motivated by PointNet, a series of PointNet-style works Qi et al. (2017b); Wang et al. (2019); Thomas et al. (2019) were proposed subsequently. Recently, inspired by the success of the vision transformer (ViT) Dosovitskiy et al. (2020), many works Zhao et al. (2021); Park et al. (2021); Guo et al. (2021) investigated the application of the transformer Vaswani et al. (2017) to point cloud learning. Exploiting the permutation invariance of self-attention, PointTransformer Zhao et al. (2021) designed a self-attention-based point transformer layer for 3D scene understanding. Yu et al. Yu et al. (2021b) argued that the above transformer-based point cloud models deviate from the mainstream of standard transformers. Therefore, they used a standard transformer with minimal inductive bias in their Point-Bert Yu et al. (2021b) and achieved new state-of-the-art results on many point cloud analysis tasks. In this work, we follow the setting of Point-Bert and also utilize a standard transformer as our backbone. The proposed method differs from Point-Bert in applying a new tokenization mechanism.

Self-supervised pre-training

Pre-training on large-scale data and then fine-tuning on target tasks has proved to be an effective paradigm for boosting model performance on downstream tasks He et al. (2019). However, although many efficient annotation tools Wu et al. (2021); Girardeau-Montaut (2016) have been proposed, labeling a large-scale dataset is still costly, especially for point cloud data. Therefore, self-supervised pre-training, which pre-trains models without annotations, has more potential. In the past few years, many self-supervised pre-training works have been proposed in both natural language processing Devlin et al. (2018) and computer vision Grill et al. (2020); Chen and He (2021); He et al. (2021); Yang et al. (2022); Ohri and Kumar (2021); Zhao and Dong (2020); Larsson et al. (2017); He et al. (2020), which motivated works on point clouds Yu et al. (2021b); Poursaeed et al. (2020); Huang et al. (2021); Xie et al. (2020); Wang et al. (2021). The core of self-supervised methods is designing a proxy task. For example, Poursaeed et al. Poursaeed et al. (2020) pre-trained networks by predicting a point cloud's rotation. Inspired by the success of a series of contrastive-learning-based self-supervised methods Grill et al. (2020); Chen and He (2021); He et al. (2020), PointContrast Xie et al. (2020) and STRL Huang et al. (2021) implemented a contrastive learning paradigm to learn deep representations from depth scans. OcCo Wang et al. (2021) pre-trained its encoder by reconstructing occluded point clouds. Recently, Point-Bert Yu et al. (2021b), which designed a proxy task called masked point modeling (MPM) to pre-train point cloud transformers, performed better than previous self-supervised pre-training methods for point clouds. However, we find that the improper tokenization in Point-Bert tends to result in ambiguous supervision, which hinders it from achieving better performance. In this work, we address the problem of the improper tokenizer in Point-Bert and devise a new framework to achieve better performance.


Similar to Point-Bert Yu et al. (2021b), we also adopt the masked point modeling (MPM) paradigm as the proxy task to pre-train our model. Specifically, given a point cloud, we first sample $n$ center points using farthest point sampling Lang et al. (2020) and then select their $k$-nearest neighbor points to build $n$ local patches $\{p_i\}_{i=1}^{n}$. The points in each local patch are normalized by subtracting the center point to further mitigate their coordinate bias. Then a mini-PointNet Qi et al. (2017a) is used to map the normalized patches to a sequence of point embeddings $\{f_i\}_{i=1}^{n}$. After that, a tokenizer takes the embeddings as inputs and generates token ids $\{z_i\}_{i=1}^{n}$ for the patches, where each token id indexes the vocabulary $\mathcal{V}$ of length $N$. As done in other Bert-style works, we randomly mask a proportion of the tokens and then feed the corrupted point embeddings into a backbone implemented as a transformer encoder. The backbone learns $\ell_2$-normalized representations $\{h_i\}_{i=1}^{n}$ for both masked and unmasked tokens. Finally, we pre-train the backbone by predicting the masked tokens based on these representations. Writing the token id of patch $p_i$ as a vector $z_i$ over the vocabulary, the objective of MPM can be formulated as follows:

$$\mathcal{L}_{\mathrm{MPM}} = -\sum_{i \in \mathcal{M}} z_i^{\top} \log g(h_i),$$

where $\mathcal{M}$ denotes the set of masked patches and $g$ denotes an MLP head that predicts the token id distribution of a masked patch based on its representation $h_i$. For Point-Bert, the token id $z_i$ is a one-hot vector, which is derived from the latent feature in the dVAE:

$$z_i = \operatorname{one\text{-}hot}\Big(\operatorname*{arg\,max}_{j} \tilde{f}_{ij}\Big),$$

where $\tilde{f}_i$ denotes the latent feature in the dVAE's encoder. However, as mentioned above, this imperfect tokenizer makes some semantically-similar patches have different tokens and some semantically-dissimilar patches have the same tokens. To tackle these problems, we ease the token id to a soft vector $z_i \in [0,1]^{N}$ satisfying $\sum_{j} z_{ij} = 1$, allowing each patch to correspond to multiple token ids. Moreover, we re-build $z_i$ based on high-level semantic relationships for more accurate supervision. The details are presented in the next section.
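As a concrete illustration of the patch construction described above, the following NumPy sketch samples center points by farthest point sampling, groups their k-nearest neighbors, and subtracts the centers. The function names, the greedy FPS variant, and the toy input are our own illustrative choices, not code from the paper.

```python
import numpy as np

def farthest_point_sample(points, n_centers, seed=0):
    """Greedy farthest point sampling: returns indices of n_centers points."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = [rng.integers(n)]
    dist = np.full(n, np.inf)
    for _ in range(n_centers - 1):
        # distance of every point to the most recently chosen center
        d = np.linalg.norm(points - points[chosen[-1]], axis=1)
        dist = np.minimum(dist, d)
        chosen.append(int(dist.argmax()))
    return np.array(chosen)

def build_patches(points, n_centers=64, k=32):
    """Group k-nearest neighbors around each FPS center and normalize
    each patch by subtracting its center (removing coordinate bias)."""
    centers_idx = farthest_point_sample(points, n_centers)
    centers = points[centers_idx]                       # (n_centers, 3)
    # pairwise distances from every center to every point
    d = np.linalg.norm(centers[:, None, :] - points[None, :, :], axis=-1)
    knn_idx = np.argsort(d, axis=1)[:, :k]              # (n_centers, k)
    patches = points[knn_idx] - centers[:, None, :]     # centered local patches
    return centers, patches

pts = np.random.default_rng(0).standard_normal((1024, 3))
centers, patches = build_patches(pts)   # shapes: (64, 3) and (64, 32, 3)
```

The defaults (1024 points, 64 patches of 32 points) follow the pre-training setup reported later in the paper.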

Figure 2: Overview of our proposed method, Point-McBert.

We improve Point-Bert by incorporating eased and refined masked supervision signals. We utilize a softmax layer to ease the hard token ids into soft probability distribution vectors over token ids, and use the encoded features generated by the transformer encoder to refine these soft probability distribution vectors. During pre-training, a proportion of tokens are randomly masked and the corrupted sequence is fed into the transformer encoder. The transformer encoder is optimized to predict the eased and refined token ids of the masked patches via a soft-label cross-entropy loss.
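The soft-label cross-entropy loss mentioned above can be sketched minimally in NumPy, under the assumption that the prediction head outputs raw logits over the vocabulary (helper names are ours):

```python
import numpy as np

def log_softmax(x):
    # numerically stable log-softmax over the vocabulary axis
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def soft_label_cross_entropy(logits, soft_targets):
    """Cross-entropy between predicted token distributions and soft targets.
    logits: (M, N) head outputs for M masked patches over an N-id vocabulary.
    soft_targets: (M, N), each row sums to 1. With one-hot target rows this
    reduces to the ordinary cross-entropy used by Point-Bert."""
    return float(-(soft_targets * log_softmax(logits)).sum(axis=1).mean())
```

Because the targets are distributions rather than single ids, semantically plausible alternative token ids still receive credit during masked prediction.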


Our framework is shown in Figure 2, and we will introduce it in detail in this section. To verify the effectiveness of our proposed idea, we set Point-Bert as our baseline and modify it as little as possible.

Before pre-training, we train a dVAE Rolfe (2016) as our tokenizer in the same way as Point-Bert. During pre-training, the transformer encoder Vaswani et al. (2017) in our framework is trained on unlabeled data via MPM. For fine-tuning, the transformer encoder is first initialized with the pre-trained parameters, and all of the parameters are then fine-tuned using labeled data from the downstream tasks. We first briefly introduce the implementations of the tokenizer and the transformer encoder in sections Tokenizer and Point transformer, respectively. Then we provide details about our multi-choice strategy in section Multi-choice tokenization, which plays an important role in improving the tokenization.


Tokenizer

Our tokenizer is trained through dVAE-based point cloud reconstruction. As shown in Figure 2, the dVAE adopts an encoder-decoder architecture. The encoder is our tokenizer, which consists of a DGCNN Wang et al. (2019) followed by a Gumbel-softmax Jang et al. (2016). During dVAE training, the DGCNN takes a sequence of point embeddings as input and outputs latent features, which are then discretized into one-hot vectors through the Gumbel-softmax. The decoder consists of a DGCNN followed by a FoldingNet Yang et al. (2018). The decoder's DGCNN takes the one-hot vectors as input and exploits their neighborhood relationships in the feature space to enhance the representation of these discrete tokens. The following FoldingNet reconstructs the original point cloud from the DGCNN's outputs. As in other VAE-style works Kingma and Welling (2013), we pursue the reconstruction objective by optimizing the evidence lower bound (ELBO) of the log-likelihood:

$$\sum_{p \in \mathcal{D}} \Big( \mathbb{E}_{z \sim q_{\phi}(z \mid p)} \big[ \log p_{\psi}(\tilde{p} \mid z) \big] - D_{\mathrm{KL}}\big[ q_{\phi}(z \mid p) \,\|\, u(z) \big] \Big),$$

where $\tilde{p}$ denotes the reconstructed patches, $\mathcal{D}$ denotes the training corpus, $q_{\phi}$ and $p_{\psi}$ denote the encoder and decoder, and the prior $u(z)$ follows a uniform distribution. The former term of the ELBO represents the reconstruction loss and the latter represents the distribution loss. We follow Yu et al. (2021a) to calculate the reconstruction loss and follow Ramesh et al. (2021) to optimize the difference between the distribution of the one-hot vectors and the prior. Our tokenizer has the same architecture as in Point-Bert Yu et al. (2021b) but outputs the latent features rather than the one-hot vectors during pre-training. For more details about the tokenizer's architecture, please refer to Point-Bert.
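To make the discretization step concrete, here is a minimal NumPy sketch of Gumbel-softmax sampling (forward pass only; a real dVAE uses a differentiable straight-through version inside an autodiff framework, and the function name is ours):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, hard=True, seed=0):
    """Sample (approximately) one-hot vectors from categorical logits.
    Adding Gumbel noise and taking a temperature softmax relaxes the
    non-differentiable argmax used for discrete token selection."""
    rng = np.random.default_rng(seed)
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
    g = -np.log(-np.log(rng.uniform(1e-9, 1.0, size=logits.shape)))
    y = logits + g
    y = np.exp((y - y.max(axis=-1, keepdims=True)) / tau)
    y = y / y.sum(axis=-1, keepdims=True)       # relaxed (soft) sample
    if hard:
        one_hot = np.zeros_like(y)
        one_hot[np.arange(y.shape[0]), y.argmax(axis=-1)] = 1.0
        return one_hot                          # discrete token selection
    return y

z = gumbel_softmax(np.zeros((4, 8192)))  # 4 patches, vocabulary size 8192
```

The vocabulary size of 8192 matches the dVAE configuration stated in the experimental setup.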

Point transformer

We utilize a standard transformer Vaswani et al. (2017) following Point-Bert's settings as our backbone. The transformer has 12 blocks, each of which consists of a multi-head self-attention layer and a feed-forward layer. As shown in Figure 2, the transformer takes the point embeddings and center points as inputs, with a learnable classification token appended to the sequence of point embeddings produced by the mini-PointNet Qi et al. (2017a). The center points are used to generate positional encodings through an MLP; the positional encoding for the classification token is a learnable parameter. During pre-training, we mask a proportion of local point patches and replace their corresponding point embeddings with the same learnable mask embedding while keeping their positional embeddings unchanged. The corrupted embedding sequence is fed into the transformer, which outputs representations for both masked and unmasked patches; the representation of the classification token serves as part of the input to the classification head in downstream tasks.
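The masking step can be sketched as below. For illustration only, we use uniform random masking and a zero vector in place of the learnable mask embedding; the paper's actual pre-training uses block masking and a learned embedding.

```python
import numpy as np

def corrupt_embeddings(embeddings, mask_ratio=0.4, seed=0):
    """Replace a random proportion of patch embeddings with a shared mask
    embedding (a fixed zero vector here, standing in for the learnable one).
    Positional encodings are left untouched by design, so the model knows
    *where* a masked patch is but not *what* it contains."""
    rng = np.random.default_rng(seed)
    n, d = embeddings.shape
    n_mask = int(round(n * mask_ratio))
    masked_idx = rng.choice(n, size=n_mask, replace=False)
    mask_token = np.zeros(d)            # stands in for the learnable embedding
    corrupted = embeddings.copy()
    corrupted[masked_idx] = mask_token
    return corrupted, masked_idx

emb = np.ones((64, 384))                # 64 patches, feature dimension 384
corrupted, idx = corrupt_embeddings(emb, mask_ratio=0.4)
```

The mask ratio of 0.4 is an illustrative value, not a figure taken from the paper.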

Multi-choice tokenization

As mentioned above, there is no gold standard for point cloud tokenization. Therefore, it is inevitable for the tokenizer to produce improper supervision, including generating the same token ids for semantically-dissimilar patches and different token ids for semantically-similar patches. We observe that, given a local patch, there may exist multiple suitable token ids as candidates, as illustrated in Figure 1. Inspired by Mc-Beit Li et al. (2022), we ease the strict single-choice constraint on patches. Given a local patch $p_i$, we no longer use a unique token id as supervision. Instead, we predict a probability distribution vector $P_i$ over the token ids. Specifically, the probability distribution vector is generated by a softmax operation:

$$P_i = \operatorname{softmax}\big(\tilde{f}_i / \tau\big),$$

where $\tilde{f}_i$ denotes the latent feature in the tokenizer and $\tau$ is a temperature coefficient that controls the smoothness of the probability distribution. When $\tau$ is small, $P_i$ tends towards a one-hot vector; when $\tau$ is large, the distribution incorporates more choices.
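A minimal sketch of this easing step, assuming the tokenizer's latent features are raw scores over the vocabulary (the function name is ours):

```python
import numpy as np

def eased_token_distribution(latent, tau=0.005):
    """Soften tokenizer latent features into a distribution over token ids.
    Small tau -> near one-hot (single-choice); larger tau -> more candidate
    token ids are kept as plausible supervision targets."""
    z = latent / tau
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)

f = np.array([[1.0, 0.9, 0.1]])             # toy 3-id vocabulary
p_sharp = eased_token_distribution(f, tau=0.005)   # almost one-hot
p_soft = eased_token_distribution(f, tau=10.0)     # nearly uniform
```

The default tau of 0.005 matches the value reported in the hyperparameter setup.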

Moreover, to further tackle the ambiguity of token ids shown in Figure 1, more high-level semantics should be incorporated. As analyzed in the introduction, the tokenizer focuses on encoding the local geometry of patches while ignoring the associations between them, making semantically-dissimilar patches have the same token ids. To alleviate this problem, we use the representations learned by the transformer to refine the probability distribution vectors, as done in Li et al. (2022). Specifically, we use the cosine similarity between patch features learned by the transformer to re-weight the probability distribution matrix $P$. The similarity matrix $S$ is calculated as follows:

$$S_{ij} = \langle h_i, h_j \rangle,$$

where $h_i$ denotes the $\ell_2$-normalized representation of patch $p_i$ learned by the transformer and $\langle \cdot, \cdot \rangle$ denotes the inner product between two vectors. The re-weighted probability distribution matrix $\tilde{P} = \operatorname{softmax}(S)\,P$, where the softmax normalizes each row of $S$, considers the inter-patch associations, making semantically-similar patches have more similar probability distributions and semantically-dissimilar patches have more discriminable probability distributions.
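The refinement step can be sketched as follows; row-softmax normalization of the similarity matrix is our assumption for how the re-weighting is normalized.

```python
import numpy as np

def refine_targets(reps, P):
    """Re-weight patch token distributions with inter-patch cosine similarity.
    reps: (n, d) transformer patch representations; P: (n, N) eased targets
    whose rows sum to 1. Each refined row mixes the distributions of patches
    that the transformer considers semantically similar."""
    h = reps / np.linalg.norm(reps, axis=1, keepdims=True)   # l2-normalize
    S = h @ h.T                                              # cosine similarities
    W = np.exp(S - S.max(axis=1, keepdims=True))
    W = W / W.sum(axis=1, keepdims=True)                     # row-wise softmax
    return W @ P                                             # refined distributions
```

Because the weighting matrix is row-stochastic, the refined rows remain valid probability distributions, and patches with identical representations receive identical targets.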

The final targets for prediction are a weighted sum of the eased probability distribution matrix $P$ and the refined probability distribution matrix $\tilde{P}$:

$$\hat{P} = \lambda P + (1-\lambda)\,\tilde{P},$$

where $\lambda$ is a coefficient balancing the low-level semantics from the tokenizer and the high-level semantics from the inter-patch similarity. Slightly different from Mc-Beit Li et al. (2022), we adopt a gradually decreasing $\lambda$ rather than a constant one. Specifically, at the beginning of pre-training, we set $\lambda = 1$ due to the inadequate training of the transformer. After 30 epochs, the transformer can well encode both the local geometry of patches and the dependencies between patches, so the coefficient $\lambda$ begins to decrease following a cosine schedule to boost pre-training. We set the lower bound of $\lambda$ to 0.8. Our experiments also show that this gradually decreasing paradigm plays an essential role in pre-training: when we instead fix the coefficient to a constant below 1 from the beginning, the transformer easily collapses due to the noisy semantics from the initial transformer.
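The decreasing coefficient described above can be sketched as a plain-Python schedule; the exact cosine parametrization is our assumption beyond the stated endpoints (held at 1.0 for the first 30 epochs, then cosine decay to a lower bound of 0.8).

```python
import math

def lambda_schedule(epoch, total_epochs=300, warm_epochs=30,
                    lam_max=1.0, lam_min=0.8):
    """Balancing coefficient: rely entirely on the tokenizer early on,
    then gradually trust the transformer's inter-patch semantics more."""
    if epoch < warm_epochs:
        return lam_max
    # cosine decay from lam_max to lam_min over the remaining epochs
    t = (epoch - warm_epochs) / (total_epochs - warm_epochs)
    return lam_min + 0.5 * (lam_max - lam_min) * (1.0 + math.cos(math.pi * t))
```

The 300-epoch total matches the pre-training length reported in the hyperparameter setup.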


In this section, we first introduce the setup of pre-training (section Pre-training setup). Then we conduct experiments on downstream tasks including point cloud classification (section Point cloud classification), few-shot learning (section Few-shot learning) and part segmentation (section Point cloud part segmentation). We also present various ablations to analyze the effectiveness of our method (section Ablation study) and provide visualizations of our learned representations (section Visualization). Furthermore, we also compare the computational overhead of our Point-McBert with Point-Bert (section Computational overhead). Our code will be publicly available.

Pre-training setup

Dataset: For all experiments, we use ShapeNet Chang et al. (2015) as the pre-training dataset, which contains over 50,000 CAD models from 55 common object categories. We randomly sample 1024 points from each CAD model and divide them into 64 local patches, each containing 32 points. For the Bert-style pre-training, we randomly mask patches for MPM in a block masking manner Yu et al. (2021b).

Architecture: The mini-PointNet Qi et al. (2017a) in our framework is an MLP followed by a global pooling layer. Our backbone network is a standard transformer Vaswani et al. (2017) with 12 layers. We set the number of heads in each layer to 6 and the feature dimension to 384. Our tokenizer is the encoder of the dVAE, whose vocabulary size is 8192.

Hyperparameters: We set the temperature coefficient to 0.005 and initialize the balancing coefficient to 1.0. After pre-training for 30 epochs, we gradually decrease the balancing coefficient in a cosine manner down to a lower bound of 0.8. We follow Point-Bert Yu et al. (2021b) to train our dVAE. During pre-training we use an AdamW Loshchilov and Hutter (2017) optimizer with a learning rate of 0.0005 and a weight decay of 0.05. Our transformer is pre-trained for 300 epochs with a batch size of 128.

Method Acc (%)
PointNet Qi et al. (2017a) 89.2
PointNet++ Qi et al. (2017b) 90.5
PointCNN Li et al. (2018) 92.2
DGCNN Wang et al. (2019) 92.2
DensePoint Liu et al. (2019a) 92.8
KPConv Thomas et al. (2019) 92.9
PCT Guo et al. (2021) 93.2
PointTransformer Zhao et al. (2021) 93.7
GLR Rao et al. (2020) 92.9
STRL Huang et al. (2021) 93.1
Baseline 91.4
OcCo Wang et al. (2021) 92.2
Point-Bert Yu et al. (2021b) 93.8
Ours 94.1
Table 1: Comparisons of the classification on ModelNet40 Wu et al. (2015).
Figure 3: Convergence curve. We compare the performance of the transformer trained from scratch (green) and pre-trained with our method (red) in terms of validation accuracy (%) on ModelNet40. The red dotted line denotes the best performance of our method, while the green dotted line denotes the best performance of the baseline. Our method reaches the baseline's best performance in only 19 epochs.

Point cloud classification

We perform our classification experiments on both the synthetic dataset ModelNet40 Wu et al. (2015) and the real-world dataset ScanObjectNN Uy et al. (2019).

Experiment on synthetic dataset

Dataset: ModelNet40 Wu et al. (2015) is the most popular 3D dataset for point cloud classification, containing 12311 CAD models from 40 object categories. We randomly sample 8k points from each CAD model for training and testing. We follow the previous setting Qi et al. (2017a); Wu et al. (2021) to split the dataset into 9843 models for training and 2468 for testing.

Fine-tuning: We follow the setting in Yu et al. (2021b) and employ a two-layer MLP with a dropout of 0.5 as the classification head. Specifically, we fine-tune the pre-trained backbone and the classification head using an AdamW Loshchilov and Hutter (2017) optimizer with a weight decay of 0.05 and a learning rate of 0.0005 under a cosine schedule. We set the batch size to 32.

Competitors: We compare our method with supervised methods, i.e., PointNet Qi et al. (2017a), PointNet++ Qi et al. (2017b), PointCNN Li et al. (2018), DGCNN Wang et al. (2019), DensePoint Liu et al. (2019a), KPConv Thomas et al. (2019), PCT Guo et al. (2021) and PointTransformer Zhao et al. (2021), as well as recently published self-supervised pre-training methods Yu et al. (2021b); Huang et al. (2021); Wang et al. (2021); Rao et al. (2020). Moreover, to illustrate the effectiveness of pre-training, we also set up a baseline model, which uses the same backbone as ours but is trained from scratch.


Results: We adopt classification accuracy as the evaluation metric, and the experimental results are listed in Table 1. As observed from the results, our method obtains 94.1% accuracy on ModelNet40, outperforming the competing methods and achieving new state-of-the-art performance. We also provide a convergence curve in Figure 3. We can see that our method improves the baseline by 2.7% and converges to a high performance faster. Our method also outperforms Point-Bert by 0.3% with negligible extra computational overhead, which indicates the effectiveness of our multi-choice strategy.

Method OBJ-BG OBJ-ONLY PB-T50-RS
PointNet 73.3 79.2 68.0
SpiderCNN 77.1 79.5 73.7
PointNet++ 82.3 84.3 77.9
PointCNN 86.1 85.5 78.5
DGCNN 82.8 86.2 78.1
BGA-DGCNN - - 79.7
BGA-PointNet++ - - 80.2
Baseline 79.86 80.55 77.24
OcCo 84.85 85.54 78.79
Point-Bert 87.43 88.12 83.07
Ours 88.98 90.02 84.28
Table 2: Comparisons of the classification on ScanObjectNN Uy et al. (2019). We report the accuracy (%) of three different settings (OBJ-BG, OBJ-ONLY, PB-T50-RS).

Experiment on real-world dataset

Dataset: To test our method’s generalization to real scenes, we also conduct an experiment on a real-world dataset. ScanObjectNN Uy et al. (2019) is a dataset modified from scene mesh datasets SceneNN Hua et al. (2016) and ScanNet Dai et al. (2017). It contains 2902 point clouds from 15 categories. We follow previous works Yu et al. (2021b); Uy et al. (2019) to carry out experiments on three variants: OBJ-BG, OBJ-ONLY and PB-T50-RS, which denote the version with background, the version without background and the version with random perturbations, respectively. More details can be found in Uy et al. (2019).

Fine-tuning: We use the same settings as on the synthetic dataset.

Competitors: We compare our method with supervised methods, i.e., PointNet Qi et al. (2017a), SpiderCNN Xu et al. (2018), PointNet++ Qi et al. (2017b), PointCNN Li et al. (2018) and DGCNN Wang et al. (2019), models equipped with the background-aware (BGA) module Uy et al. (2019), and some state-of-the-art self-supervised pre-training methods Yu et al. (2021b); Wang et al. (2021). In this experiment, we also set up a baseline as we do on ModelNet40.

Results: The classification accuracy on ScanObjectNN is shown in Table 2. As observed from the results, all methods perform worse on the real-world dataset than on the synthetic dataset ModelNet40, which is caused by less data for fine-tuning and the interference of noise, background, occlusion, etc. However, our method still achieves the best performance on all three variants. It is worth noting that our method significantly improves the baseline by 9.12%, 9.47%, and 7.04% on the three variants, which strongly confirms the generalization of our method. These results also indicate that pre-training can transfer useful knowledge to downstream tasks and plays an important role, especially when the downstream task is challenging.

Method 5-way 10-shot 5-way 20-shot 10-way 10-shot 10-way 20-shot
3D GAN 55.8±3.4 65.8±3.1 40.3±2.1 48.4±1.8
FoldingNet 33.4±4.1 35.8±5.8 18.6±1.8 15.4±2.2
L-GAN 41.6±5.3 46.2±6.2 32.9±2.9 25.5±3.2
3D-Caps 42.3±5.5 53.0±5.9 38.0±4.5 27.2±4.7
PointNet 52.0±3.8 57.8±4.9 46.6±4.3 35.2±4.8
PointNet++ 38.5±4.4 42.4±4.5 23.1±2.2 18.8±1.7
PointCNN 65.4±2.8 68.6±2.2 46.6±1.5 50.0±2.3
RSCNN 65.4±8.9 68.6±7.0 46.6±4.8 50.0±7.2
DGCNN 31.6±2.8 40.8±4.6 19.9±2.1 16.9±1.5
Baseline 87.8±5.2 93.3±4.3 84.6±5.5 89.4±6.3
OcCo 94.0±3.6 95.9±2.3 89.4±5.1 92.4±4.6
Point-BERT 94.6±3.1 96.3±2.7 91.0±5.4 92.7±5.1
Ours 97.1±1.8 98.3±1.2 92.4±4.3 94.9±3.7
Table 3: Few-shot classification results on ModelNet40 Wu et al. (2015). We list the average accuracy (%) and the standard deviation over 10 independent experiments.

Few-shot learning

Few-shot learning aims to tackle new tasks containing limited labeled training examples using prior knowledge. Here we conduct experiments on ModelNet40 Wu et al. (2015) to evaluate our method. All our experiments follow the K-way, m-shot setting Sharma and Kaul (2020). Specifically, we randomly select K classes and sample m+20 samples for each class. The model is trained on K*m samples (the support set) and evaluated on the remaining K*20 samples (the query set). We employ a deeper classification head with three layers and adopt fine-tuning settings identical to the former classification experiments. We compare our method with other competitors under the "5-way 10-shot", "5-way 20-shot", "10-way 10-shot" and "10-way 20-shot" settings and report the mean and standard deviation over 10 runs. Our competitors can be roughly divided into three categories: (1) unsupervised methods; (2) supervised methods; (3) self-supervised pre-training methods. For the unsupervised methods Yang et al. (2018); Wu et al. (2016); Achlioptas et al. (2018); Zhao et al. (2019), we train a linear SVM on their extracted unsupervised representations. For the supervised methods Qi et al. (2017a, b); Wang et al. (2019); Li et al. (2018); Liu et al. (2019b), including our baseline, we train the models from scratch. For the self-supervised pre-training methods Yu et al. (2021b); Wang et al. (2021), including our Point-McBert, we fine-tune the models with the pre-trained weights as initialization.
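The K-way, m-shot episode construction above can be sketched as follows (pure Python; the dataset layout and function name are our own illustrative choices):

```python
import random

def sample_episode(dataset, k_way=5, m_shot=10, n_query=20, seed=0):
    """Build one K-way m-shot episode: pick K classes, then m support and
    20 query samples per class. `dataset` maps class label -> sample ids."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), k_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(dataset[c], m_shot + n_query)
        support += [(s, c) for s in picked[:m_shot]]   # K*m training samples
        query += [(s, c) for s in picked[m_shot:]]     # K*20 evaluation samples
    return support, query

# toy corpus: 40 classes with 40 samples each (stand-in for ModelNet40)
toy = {c: list(range(40)) for c in range(40)}
support, query = sample_episode(toy)
```

Repeating this sampling with different seeds yields the 10 independent runs over which the mean and standard deviation are reported.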

The results are shown in Table 3. We can see that when the labeled training data is insufficient, our method can still perform well. Our method has the highest average accuracy under four different settings. It’s noticeable that compared with other methods, our method commonly has a smaller standard deviation, which indicates our method is more stable. Our method also significantly improves the baseline by 9.3%, 5.2%, 7.8%, and 5.5%, which demonstrates the strong generalization ability of our method.

Category PointNet PointNet++ DGCNN Baseline OcCo Point-Bert Ours
aero 83.4 82.4 84 82.9 83.3 84.3 84.8
bag 78.7 79 83.4 85.4 85.2 84.8 85.1
cap 82.5 87.7 86.7 87.7 88.3 88 88.4
car 74.9 77.3 77.8 78.8 79.9 79.8 80.6
chair 89.6 90.8 90.6 90.5 90.7 91 91.5
earphone 73 71.8 74.7 80.8 74.1 81.7 80.5
guitar 91.5 91 91.2 91.1 91.9 91.6 91.8
knife 85.9 85.9 87.5 87.7 87.6 87.9 87.7
lamp 80.8 83.7 82.8 85.3 84.7 85.2 85.5
laptop 95.3 95.3 95.7 95.6 95.4 95.6 96.3
motor 65.2 71.6 66.3 73.9 75.5 75.6 76.3
mug 93 94.1 94.9 94.9 94.4 94.7 94.6
pistol 81.2 81.3 81.1 83.5 84.1 84.3 83.7
rocket 57.9 58.7 63.5 61.2 63.1 63.4 62.7
skateboard 72.8 76.4 74.5 74.9 75.7 76.3 78.4
table 80.6 82.6 82.6 80.6 80.8 81.5 82.2
80.4 81.9 82.3 83.4 83.4 84.1 84.4
83.7 85.1 85.2 85.1 85.1 85.6 86.1
Table 4: Part segmentation results on the ShapeNetPart Yi et al. (2016). We report the mean across all part categories (%) and the mean across all instances (%), as well as the (%) for each categories.
Figure 4: Qualitative results for part segmentation. We visualize the part segmentation results across all 16 object categories.

Point cloud part segmentation

Point cloud part segmentation is a challenging task aimed at predicting point-wise labels for a point cloud. Here, we evaluate our method on the widely used ShapeNetPart Yi et al. (2016) dataset.

Dataset: ShapeNetPart contains 16881 CAD models from 16 object categories, annotated with 50 parts. We follow the setting in Qi et al. (2017a) and randomly sample 2048 points from each CAD model for training and testing. We also double the number of local patches to 128 during pre-training for part segmentation.
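The fixed-size point sampling step can be sketched as follows (a minimal NumPy sketch; the function name is illustrative):

```python
import numpy as np

def sample_points(vertices, n=2048, rng=None):
    """Randomly sample a fixed number of points from a model's point set.
    If the model has fewer than n points, sample with replacement."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(vertices), size=n, replace=len(vertices) < n)
    return vertices[idx]
```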

Fine-tuning: We follow Point-Bert Yu et al. (2021b) in adopting an upsampling-based segmentation head for fine-tuning. We set the batch size to 16; other settings are the same as for the classification tasks.

Competitors: We compare our method with some widely used methods Qi et al. (2017a, b); Wang et al. (2019) and recently published self-supervised pre-training methods Yu et al. (2021b); Wang et al. (2021). We also adopt a standard transformer trained from scratch as our baseline.

Results: We evaluate all the methods on two types of mean IoU (mIoU): the mean across all part categories (cat. mIoU) and the mean across all instances (inst. mIoU). As shown in Table 4, our method achieves the best performance on both metrics. We also list the IoU for each category; our method outperforms the other methods on most categories. Our method also boosts the baseline's performance, while OcCo fails to do so. Moreover, we visualize the segmentation ground truths and our predictions in Figure 4. Although the cases are challenging, our predictions are quite close to the ground truth.
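The two metrics can be computed as follows (a minimal NumPy sketch; the helper names and the convention of scoring an absent part as IoU 1 are illustrative assumptions, not necessarily the authors' exact implementation):

```python
import numpy as np

def shape_iou(pred, gt, part_ids):
    """Mean IoU over the parts of one shape. part_ids lists the part
    labels belonging to the shape's object category."""
    ious = []
    for p in part_ids:
        inter = np.sum((pred == p) & (gt == p))
        union = np.sum((pred == p) | (gt == p))
        # A part absent from both prediction and ground truth counts as 1
        ious.append(1.0 if union == 0 else inter / union)
    return float(np.mean(ious))

def miou(shape_ious, shape_cats):
    """Return (category mIoU, instance mIoU) from per-shape IoUs.
    Instance mIoU averages over shapes; category mIoU first averages
    within each object category, then over categories."""
    inst = float(np.mean(shape_ious))
    cats = sorted(set(shape_cats))
    cat = float(np.mean([
        np.mean([iou for iou, c in zip(shape_ious, shape_cats) if c == k])
        for k in cats
    ]))
    return cat, inst
```

The averaging order explains why the two columns in Table 4 differ: small categories (e.g. rocket, earphone) weigh as much as large ones in cat. mIoU but not in inst. mIoU.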

Ablation study

In this section, we conduct extensive experiments on ModelNet40 Wu et al. (2015) to study the effect of hyperparameters.

The temperature coefficient

The temperature coefficient controls the smoothness of the supervision signals. When the temperature is small, we obtain a sharp probability distribution over token ids; conversely, when it is large, the distribution tends toward uniform. To study its effect, we conduct an ablation study, with results shown in Table 5. We also add a single-choice version, i.e., the same as in Point-Bert, as a competitor; this corresponds to the limit of zero temperature. From the results, our multi-choice strategy performs best when the temperature coefficient is set to 0.005, improving on the previous Bert-style pre-training.
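The multi-choice supervision amounts to softening the tokenizer's scores over the codebook with a temperature. A minimal sketch (assuming per-patch logits over the codebook are available; the function name is illustrative):

```python
import numpy as np

def token_distribution(logits, tau=0.005):
    """Soften tokenizer logits over the codebook into a probability
    distribution via a temperature-scaled softmax. Small tau gives a
    near one-hot (single-choice) target; large tau tends to uniform."""
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)
```

As tau approaches 0 the distribution collapses onto the argmax token id, recovering the single-choice supervision of Point-Bert.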

Temperature   single-choice  0.005  0.05  0.5   5
Acc (%)       93.8           94.1   93.6  93.5  93.6
Table 5: Ablation study on the temperature coefficient. "Single-choice" denotes Point-Bert-style supervision (the zero-temperature limit). The ablation is conducted on the point cloud classification downstream task on ModelNet40 Wu et al. (2015).

The weight coefficient

The weight coefficient balances the low-level semantics from the tokenizer against the high-level semantics from the transformer encoder. We conduct an ablation study on it, with results shown in Table 6. In this ablation, we compare our Point-McBert under different weight settings, as well as against a baseline, i.e., Point-Bert, and a version without the warm-up strategy. In the version without warm-up, the target weight is used from the very beginning of pre-training rather than being reached by progressively decreasing the weight after several epochs. As observed from the results, the version without warm-up collapses during pre-training, because the transformer is not yet well-trained enough to provide accurate semantics. Moreover, pre-training tends to perform better as the weight coefficient grows. But when the weight is set to 1, the supervision signals incorporate no high-level semantics, and pre-training matches the baseline. Both the warm-up strategy and the high-level semantics generated by the transformer therefore play important roles.
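The blending of the two signals and the warm-up on the weight can be sketched as follows (the linear decay schedule and its constants are purely illustrative assumptions; the paper does not specify the exact schedule):

```python
import numpy as np

def refined_targets(p_tokenizer, p_transformer, w):
    """Blend the low-level tokenizer distribution with the high-level
    transformer prediction; w = 1 keeps only the tokenizer signal,
    which corresponds to the baseline."""
    return w * p_tokenizer + (1.0 - w) * p_transformer

def weight_schedule(epoch, warmup_epochs=10, w_final=0.8):
    """Warm-up: keep w = 1 for the first epochs (the transformer is not
    yet reliable), then decrease it toward w_final. The decay rate and
    epoch counts here are illustrative."""
    if epoch < warmup_epochs:
        return 1.0
    return max(w_final, 1.0 - 0.05 * (epoch - warmup_epochs))
```

Since both inputs are probability distributions and the weights sum to 1, the blended target remains a valid distribution.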

Weight coefficient  0     0.2   0.4   0.6   0.8   1.0
Acc (%)             93.5  93.4  93.6  93.6  94.1  93.8

                    w/o warm up  Point-Bert
Acc (%)             collapse     93.8
Table 6: Ablation study on the weight coefficient. The ablation is conducted on the point cloud classification downstream task on ModelNet40 Wu et al. (2015).

Masking strategy and masking ratio

Bert-style pre-training follows a mask-and-then-predict paradigm, in which the masking strategy and masking ratio determine the difficulty of the prediction task. Here, we test two masking strategies: random masking and block-wise masking. The former randomly selects a proportion of patches to mask; the latter selects a proportion of spatially adjacent patches to mask. We test both strategies under different masking ratios; the results are shown in Table 7. Block-wise masking with a 25%-45% masking ratio performs best. Block-wise masking generally outperforms random masking, which is consistent with the findings of other similar works Yu et al. (2021b); Wang et al. (2022).
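The two masking strategies can be sketched as follows (a minimal NumPy sketch; in practice the ratio is drawn from the listed range for each input, which we omit here for brevity):

```python
import numpy as np

def random_mask(n_patches, ratio, rng=None):
    """Randomly mask a proportion of patches, independent of position."""
    rng = np.random.default_rng(rng)
    n_mask = int(n_patches * ratio)
    mask = np.zeros(n_patches, dtype=bool)
    mask[rng.choice(n_patches, n_mask, replace=False)] = True
    return mask

def block_mask(centers, ratio, rng=None):
    """Block-wise masking: pick a random seed patch and mask its nearest
    neighbours, so the masked region is spatially contiguous.
    centers: (N, 3) array of patch center coordinates."""
    rng = np.random.default_rng(rng)
    n = len(centers)
    n_mask = int(n * ratio)
    seed = rng.integers(n)
    d = np.linalg.norm(centers - centers[seed], axis=1)
    mask = np.zeros(n, dtype=bool)
    mask[np.argsort(d)[:n_mask]] = True
    return mask
```

Block-wise masking removes a whole local region, so the model cannot trivially interpolate a masked patch from its immediate neighbours, which plausibly explains its edge over random masking.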

Masking strategy  Masking ratio (%)  Acc (%)
Block             15-25              93.8
Block             25-45              94.1
Block             55-75              93.7
Random            15-25              93.6
Random            25-45              93.5
Random            55-75              93.9
Table 7: Ablation study on the masking strategy and masking ratio. The ablation is conducted on the point cloud classification downstream task on ModelNet40 Wu et al. (2015).


To further understand the effectiveness of our method, we visualize the learned features via t-SNE Van der Maaten and Hinton (2008). Figure 5(a) shows the learned features before fine-tuning. The features are well separated even though they were learned without annotations, making them well suited for model initialization. Figure 5(b) and (c) visualize features fine-tuned on ModelNet40 Wu et al. (2015) and ScanObjectNN Uy et al. (2019). The features form multiple clusters that are far apart from each other, indicating the effectiveness of our method.

Figure 5: Visualization of feature distributions. We utilize t-SNE Van der Maaten and Hinton (2008) to visualize the features learned by our Point-McBert. Features from different categories are visualized in different colors. (a) Features after pre-training; (b) Features fine-tuned on ModelNet40; (c) Features fine-tuned on ScanObjectNN.

Computational overhead

To demonstrate that our method improves performance while introducing almost no extra computational overhead, we compare its computational cost against that of Point-Bert Yu et al. (2021b) on the same device. Pre-training is run on an Intel Xeon Platinum 8260 CPU and two RTX 3090 GPUs. We use the wall-clock time of each pre-training epoch as the evaluation metric. As shown in Table 8, our method incurs only about 1% extra time per epoch, which is almost negligible.

Point-Bert Ours
Running time (s/epoch) 143.48 144.94
Table 8: Comparisons of the time overhead of pre-training.


In this paper, we propose Point-McBert, a Bert-style pre-training method for point clouds that tackles the problem of the imperfect tokenizer in previous work. We relax the previous strict single-choice constraint on patches and use the probability distribution over token ids as supervision signals, preventing semantically-similar patches from being assigned different token ids. In addition, we use the high-level semantics generated by the transformer to refine this distribution, further preventing semantically-dissimilar patches from being assigned the same token id. Extensive experiments on different datasets and downstream tasks evaluate the performance of our Point-McBert. The results show that Point-McBert not only improves the performance of previous work on all downstream tasks with almost no extra computational overhead, but also achieves new state-of-the-art results on point cloud classification and point cloud few-shot learning. The experiments also show that our pre-training method successfully transfers knowledge learned from unlabeled data to downstream tasks, which holds great potential for point cloud learning.