Spatiotemporal Decouple-and-Squeeze Contrastive Learning for Semi-Supervised Skeleton-based Action Recognition

02/05/2023
by Binqian Xu, et al.

Contrastive learning has been successfully used to learn action representations for semi-supervised skeleton-based action recognition. However, most contrastive learning-based methods contrast only global features that mix spatial and temporal information, which blurs the spatial-specific and temporal-specific information that carries different semantics at the frame level and the joint level. We therefore propose a novel Spatiotemporal Decouple-and-Squeeze Contrastive Learning (SDS-CL) framework that learns richer representations of skeleton-based actions by jointly contrasting spatial-squeezing features, temporal-squeezing features, and global features. In SDS-CL, we design a new Spatiotemporal-decoupling Intra-Inter Attention (SIIA) mechanism to obtain spatiotemporal-decoupling attentive features that capture spatiotemporal-specific information. SIIA computes spatial- and temporal-decoupling intra-attention maps among the joint features or the motion features, as well as spatial- and temporal-decoupling inter-attention maps between joint and motion features. Moreover, we present a new Spatial-squeezing Temporal-contrasting Loss (STL), a new Temporal-squeezing Spatial-contrasting Loss (TSL), and the Global-contrasting Loss (GL), which contrast the spatial-squeezing joint and motion features at the frame level, the temporal-squeezing joint and motion features at the joint level, and the global joint and motion features at the skeleton level, respectively. Extensive experiments on four public datasets show that the proposed SDS-CL achieves performance gains over other competitive methods.
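The squeeze-then-contrast idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the loss formulas, so this sketch assumes feature maps of shape (N, C, T, V) (batch, channels, frames, joints), mean pooling as the "squeeze" operation, and a standard InfoNCE objective in which the joint and motion features of the same sample (and the same frame or joint) form the positive pair. The function names `info_nce` and `squeeze_and_contrast` are hypothetical, and the SIIA attention step is omitted.

```python
import numpy as np

def info_nce(a, b, temperature=0.1):
    """InfoNCE loss between row-aligned embeddings a and b, each of shape (M, C).

    Row i of `a` and row i of `b` are treated as the positive pair; all other
    rows of `b` act as negatives. This is a generic contrastive loss, assumed
    here for illustration only.
    """
    eps = 1e-12
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    logits = a @ b.T / temperature                       # (M, M) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # subtract row max for stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def squeeze_and_contrast(joint, motion, temperature=0.1):
    """Return (stl, tsl, gl) for joint and motion features of shape (N, C, T, V).

    Assumption: "squeezing" is mean pooling over the squeezed axis.
    """
    n, c, t, v = joint.shape

    def frame_level(x):
        # Squeeze the spatial (joint) axis, keep one C-dim vector per frame: (N*T, C).
        return x.mean(axis=3).transpose(0, 2, 1).reshape(n * t, c)

    def joint_level(x):
        # Squeeze the temporal (frame) axis, keep one C-dim vector per joint: (N*V, C).
        return x.mean(axis=2).transpose(0, 2, 1).reshape(n * v, c)

    def skeleton_level(x):
        # Squeeze both axes, keep one C-dim vector per sequence: (N, C).
        return x.mean(axis=(2, 3))

    stl = info_nce(frame_level(joint), frame_level(motion), temperature)
    tsl = info_nce(joint_level(joint), joint_level(motion), temperature)
    gl = info_nce(skeleton_level(joint), skeleton_level(motion), temperature)
    return stl, tsl, gl

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    joint_feat = rng.standard_normal((4, 16, 8, 25))   # 4 clips, 16 channels, 8 frames, 25 joints
    motion_feat = rng.standard_normal((4, 16, 8, 25))
    stl, tsl, gl = squeeze_and_contrast(joint_feat, motion_feat)
    print("total loss:", stl + tsl + gl)  # unweighted sum, an assumption of this sketch
```

In this sketch the three terms differ only in which axis is pooled before contrasting, which is the distinction the abstract draws between frame-level, joint-level, and skeleton-level contrast. How the terms are weighted and which samples serve as negatives are choices made for the example.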
