Self-Supervised Pre-training for 3D Point Clouds via View-Specific Point-to-Image Translation

12/29/2022
by Qijian Zhang, et al.

Self-supervised representation learning has become prevalent in the language and 2D vision communities over the past few years. However, these advances have not yet fully carried over to 3D point cloud learning. Unlike previous pre-training pipelines for 3D point clouds, which generally rely on either generative modeling or contrastive learning, this paper investigates a translative pre-training paradigm, PointVST, driven by a novel self-supervised pretext task: cross-modal translation from an input 3D object point cloud to its diverse forms of 2D rendered images (e.g., silhouette, depth, contour). Specifically, we first derive view-conditioned point-wise embeddings by inserting a viewpoint indicator, then adaptively aggregate them into a view-specific global codeword, which is fed into subsequent 2D convolutional translation heads for image generation. We conduct extensive experiments on common 3D shape analysis tasks, where PointVST consistently and clearly outperforms current state-of-the-art methods under diverse evaluation protocols. Our code will be made publicly available.
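
To make the data flow described above concrete, here is a minimal NumPy sketch of the three stages: inserting a viewpoint indicator into per-point features, adaptively pooling them into a view-specific codeword, and decoding that codeword into an image. It is an illustration under stated assumptions, not the authors' implementation. All names (`linear`, `view_specific_codeword`, `translate`), the dimensions, the random weights, and the softmax-attention pooling are hypothetical choices. A single dense layer with a sigmoid stands in for the 2D convolutional translation head, and random features stand in for the output of a point cloud backbone.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Hypothetical helper: small random weights for an untrained dense layer."""
    return rng.standard_normal((in_dim, out_dim)) * 0.1

def view_specific_codeword(point_feats, view_vec, w_fuse, w_score):
    """point_feats: (N, C) per-point features; view_vec: (V,) viewpoint indicator."""
    n = point_feats.shape[0]
    # Insert the viewpoint indicator by tiling it onto every point.
    tiled = np.broadcast_to(view_vec, (n, view_vec.shape[0]))
    fused = np.concatenate([point_feats, tiled], axis=1)   # (N, C + V)
    view_feats = np.maximum(fused @ w_fuse, 0.0)           # (N, D), ReLU
    # Adaptive aggregation (assumed form): softmax attention over the points.
    scores = view_feats @ w_score                          # (N, 1)
    scores = scores - scores.max()                         # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()        # sums to 1 over points
    codeword = (weights * view_feats).sum(axis=0)          # (D,)
    return codeword, weights

def translate(codeword, w_head, image_shape):
    """Stand-in for a convolutional translation head: dense layer + sigmoid."""
    logits = codeword @ w_head
    return (1.0 / (1.0 + np.exp(-logits))).reshape(image_shape)

# Toy sizes: points, feature dim, view dim, codeword dim, image height/width.
N, C, V, D, H, W = 128, 16, 3, 32, 8, 8
point_feats = rng.standard_normal((N, C))   # placeholder backbone output
w_fuse, w_score, w_head = linear(C + V, D), linear(D, 1), linear(D, H * W)

view_a = np.array([0.0, 0.0, 1.0])          # hypothetical viewing direction
view_b = np.array([1.0, 0.0, 0.0])

code_a, weights_a = view_specific_codeword(point_feats, view_a, w_fuse, w_score)
code_b, _ = view_specific_codeword(point_feats, view_b, w_fuse, w_score)
silhouette_a = translate(code_a, w_head, (H, W))
```

The same point features produce different codewords for `view_a` and `view_b`, which is what lets one shared encoder be supervised against renderings from many viewpoints. In a real pre-training setup, the weights would be learned by comparing each generated image with the corresponding rendered target.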


