SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-supervised Skeleton-based Action Recognition

09/11/2023
by   Cong Wu, et al.

Contrastive learning has achieved great success in skeleton-based action recognition. However, most existing approaches encode skeleton sequences as entangled spatiotemporal representations and confine the contrasts to the same level of representation. This paper instead introduces a novel contrastive learning framework, the Spatiotemporal Clues Disentanglement Network (SCD-Net). Specifically, we integrate a decoupling module with a feature extractor to derive explicit clues from the spatial and temporal domains, respectively. To train SCD-Net, we construct a global anchor and encourage interaction between the anchor and the extracted clues. Further, we propose a new masking strategy with structural constraints to strengthen contextual associations, bringing the latest developments in masked image modelling into SCD-Net. We conduct extensive evaluations on the NTU-RGB+D (60 & 120) and PKU-MMD (I & II) datasets, covering various downstream tasks such as action recognition, action retrieval, transfer learning, and semi-supervised learning. The experimental results demonstrate the effectiveness of our method, which significantly outperforms existing state-of-the-art (SOTA) approaches.
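
The abstract does not spell out the training objective, so the following is only a minimal sketch of how a contrast between a global anchor and separate spatial and temporal clues could look. It assumes an InfoNCE-style loss, in-batch negatives, and a simple sum of the two terms; the function names (`info_nce`, `anchor_clue_loss`), the temperature value, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchors, clues, temperature=0.1):
    """Mean InfoNCE loss over a batch (assumed objective, not from the paper).

    anchors[i] is pulled towards clues[i] (the positive) and pushed away
    from clues[j] for j != i (the in-batch negatives).
    """
    anchors = [l2_normalize(a) for a in anchors]
    clues = [l2_normalize(c) for c in clues]
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, c) / temperature for c in clues]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def anchor_clue_loss(global_anchor, spatial_clue, temporal_clue, temperature=0.1):
    """Hypothetical combined loss: the anchor is contrasted with each clue type."""
    return (info_nce(global_anchor, spatial_clue, temperature)
            + info_nce(global_anchor, temporal_clue, temperature))


# Toy batch of two samples with 2-D embeddings (illustrative values only).
anchor = [[1.0, 0.0], [0.0, 1.0]]
spatial = [[0.9, 0.1], [0.2, 0.8]]
temporal = [[0.8, 0.3], [0.1, 0.9]]
loss = anchor_clue_loss(anchor, spatial, temporal)
```

In this sketch the loss is small when each anchor is most similar to the clue from the same sample and grows when the pairing is wrong, which is the behaviour any anchor-to-clue contrast of this kind relies on.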

research
08/08/2021

Skeleton-Contrastive 3D Action Representation Learning

This paper strives for self-supervised learning of a feature space suita...
research
02/05/2023

Spatiotemporal Decouple-and-Squeeze Contrastive Learning for Semi-Supervised Skeleton-based Action Recognition

Contrastive learning has been successfully leveraged to learn action rep...
research
05/03/2023

Cross-Stream Contrastive Learning for Self-Supervised Skeleton-Based Action Recognition

Self-supervised skeleton-based action recognition enjoys a rapid growth ...
research
03/20/2023

Actionlet-Dependent Contrastive Learning for Unsupervised Skeleton-Based Action Recognition

The self-supervised pretraining paradigm has achieved great success in s...
research
02/17/2023

Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences

Self-supervised learning has demonstrated remarkable capability in repre...
research
02/04/2022

Bootstrapped Representation Learning for Skeleton-Based Action Recognition

In this work, we study self-supervised representation learning for 3D sk...
research
05/31/2023

Learning by Aligning 2D Skeleton Sequences in Time

This paper presents a novel self-supervised temporal video alignment fra...
