Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos

08/18/2023
by Zhiqiang Shen, et al.

Recently, the community has made tremendous progress in developing effective methods for point cloud video understanding that learn from massive amounts of labeled data. However, annotating point cloud videos is notoriously expensive. Moreover, training on one or only a few traditional tasks (e.g., classification) may be insufficient to learn subtle details of the spatio-temporal structure of point cloud videos. In this paper, we propose a Masked Spatio-Temporal Structure Prediction (MaST-Pre) method to capture the structure of point cloud videos without human annotations. MaST-Pre is based on spatio-temporal point-tube masking and consists of two self-supervised learning tasks. First, by reconstructing masked point tubes, our method captures the appearance information of point cloud videos. Second, to learn motion, we propose a temporal cardinality difference prediction task that estimates the change in the number of points within a point tube. In this way, MaST-Pre is forced to model both the spatial and the temporal structure of point cloud videos. Extensive experiments on MSRAction-3D, NTU-RGBD, NvGesture, and SHREC'17 demonstrate the effectiveness of the proposed method.
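To make the two pretext tasks more concrete, here is a minimal sketch in plain Python of what a temporal cardinality difference target and a random tube mask could look like. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how a point tube is defined, what mask ratio is used, or how the targets are normalized. Here a tube is assumed to be a fixed-radius ball around a single anchor point that is held in place across frames, and the names `count_in_ball`, `cardinality_differences`, and `mask_tubes` are hypothetical.

```python
import math
import random


def count_in_ball(frame, center, radius):
    """Count the points of one frame that lie within `radius` of `center`."""
    return sum(1 for p in frame if math.dist(p, center) <= radius)


def cardinality_differences(frames, center, radius):
    """Change in point count between consecutive frames of one tube.

    Assumption: the tube is a ball of fixed radius around a fixed anchor,
    so a nonzero difference means points moved into or out of that region.
    """
    counts = [count_in_ball(f, center, radius) for f in frames]
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def mask_tubes(num_tubes, mask_ratio, seed=0):
    """Pick a random subset of tube indices to hide from the encoder.

    Assumption: uniform random masking; the ratio is a free parameter here.
    """
    rng = random.Random(seed)
    num_masked = int(round(num_tubes * mask_ratio))
    return sorted(rng.sample(range(num_tubes), num_masked))


# Three toy frames; the anchor sits at the origin with radius 1.
frames = [
    [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)],  # 2 points inside
    [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.9, 0.0, 0.0)],  # 3 points inside
    [(3.0, 0.0, 0.0)],                                    # 0 points inside
]
print(cardinality_differences(frames, (0.0, 0.0, 0.0), 1.0))  # → [1, -3]
print(len(mask_tubes(num_tubes=8, mask_ratio=0.75)))          # → 6
```

In this toy setup the signed differences act as a cheap motion signal: a positive value means points entered the tube region between two frames, and a negative value means they left it. A model trained to predict these values for masked tubes would need some notion of how the point cloud moves over time, which is the intuition the abstract describes.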


research
05/27/2022

PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences

Point cloud sequences are irregular and unordered in the spatial dimensi...
research
05/06/2023

PointCMP: Contrastive Mask Prediction for Self-supervised Learning on Point Cloud Videos

Self-supervised learning can extract representations of good quality fro...
research
03/11/2023

3DInAction: Understanding Human Actions in 3D Point Clouds

We propose a novel method for 3D point cloud action recognition. Underst...
research
07/31/2023

DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation

In this technical report, we present our findings from the research cond...
research
06/04/2023

Point Cloud Video Anomaly Detection Based on Point Spatio-Temporal Auto-Encoder

Video anomaly detection has great potential in enhancing safety in the p...
research
09/01/2021

Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds

To date, various 3D scene understanding tasks still lack practical and g...
research
07/30/2022

Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding

This paper proposes a 4D backbone for long-term point cloud video unders...
