ViA: View-invariant Skeleton Action Representation Learning via Motion Retargeting

08/31/2022
by Di Yang, et al.

Current self-supervised approaches for skeleton action representation learning often focus on constrained scenarios, where videos and skeleton data are recorded in laboratory settings. When dealing with estimated skeleton data in real-world videos, such methods perform poorly due to the large variations across subjects and camera viewpoints. To address this issue, we introduce ViA, a novel View-Invariant Autoencoder for self-supervised skeleton action representation learning. ViA uses motion retargeting between different human performers as a pretext task in order to disentangle the latent action-specific 'Motion' features on top of the visual representation of a 2D or 3D skeleton sequence. Such 'Motion' features are invariant to skeleton geometry and camera view, which allows ViA to facilitate both cross-subject and cross-view action classification tasks. We conduct a study focusing on transfer learning for skeleton-based action recognition with self-supervised pre-training on real-world data (e.g., Posetics). Our results show that skeleton representations learned from ViA are generic enough to improve upon state-of-the-art action classification accuracy, not only on 3D laboratory datasets such as NTU-RGB+D 60 and NTU-RGB+D 120, but also on real-world datasets where only 2D data are accurately estimated, e.g., Toyota Smarthome, UAV-Human and Penn Action.
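To make the idea of motion retargeting concrete, the following is a minimal toy sketch, not ViA's learned autoencoder. It assumes a planar kinematic chain in which a pose factors analytically into bone lengths (a stand-in for skeleton geometry) and relative joint angles (a stand-in for 'Motion'). The function names `forward_kinematics`, `decompose`, and `retarget` are hypothetical and chosen for illustration only; ViA learns such a factorization in a latent space rather than computing it in closed form.

```python
import math

def forward_kinematics(bone_lengths, joint_angles):
    """Return 2D joint positions of a planar chain rooted at the origin."""
    x, y, heading = 0.0, 0.0, 0.0
    joints = [(x, y)]
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle  # joint angles are relative to the parent bone
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        joints.append((x, y))
    return joints

def decompose(joints):
    """Split a pose into (bone_lengths, relative_joint_angles)."""
    lengths, angles, heading = [], [], 0.0
    for (x0, y0), (x1, y1) in zip(joints, joints[1:]):
        dx, dy = x1 - x0, y1 - y0
        lengths.append(math.hypot(dx, dy))
        absolute = math.atan2(dy, dx)
        relative = absolute - heading
        # Wrap the relative angle into (-pi, pi].
        angles.append(math.atan2(math.sin(relative), math.cos(relative)))
        heading = absolute
    return lengths, angles

def retarget(source_sequence, target_pose):
    """Replay the source motion (angles) on the target skeleton (lengths)."""
    target_lengths, _ = decompose(target_pose)
    return [
        forward_kinematics(target_lengths, decompose(pose)[1])
        for pose in source_sequence
    ]

# Toy usage: performer A moves, performer B only supplies a skeleton.
lengths_a, lengths_b = [1.0, 0.8], [0.5, 0.4]
motion = [[0.1, 0.2], [0.3, 0.5], [0.6, -0.4]]  # per-frame joint angles
sequence_a = [forward_kinematics(lengths_a, frame) for frame in motion]
pose_b = forward_kinematics(lengths_b, [0.0, 0.0])
retargeted = retarget(sequence_a, pose_b)
```

In this toy setting the recovered joint angles are identical for `sequence_a` and `retargeted`, even though the joint coordinates differ, which mirrors the sense in which a motion code can be invariant to skeleton geometry. Only skeleton geometry is factored out here; camera-view invariance is not modeled.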

research
10/12/2020

MS^2L: Multi-Task Self-Supervised Learning for Skeleton Based Action Recognition

In this paper, we address self-supervised representation learning from h...
research
08/28/2023

LAC: Latent Action Composition for Skeleton-based Action Segmentation

Skeleton-based action segmentation requires recognizing composable actio...
research
04/21/2022

Unsupervised Human Action Recognition with Skeletal Graph Laplacian and Self-Supervised Viewpoints Invariance

This paper presents a novel end-to-end method for the problem of skeleto...
research
10/01/2021

Unsupervised Motion Representation Learning with Capsule Autoencoders

We propose the Motion Capsule Autoencoder (MCAE), which addresses a key ...
research
08/15/2019

DeepHuMS: Deep Human Motion Signature for 3D Skeletal Sequences

3D Human Motion Indexing and Retrieval is an interesting problem due to ...
research
02/04/2022

Bootstrapped Representation Learning for Skeleton-Based Action Recognition

In this work, we study self-supervised representation learning for 3D sk...
research
08/14/2023

Masked Motion Predictors are Strong 3D Action Representation Learners

In 3D human action recognition, limited supervised data makes it challen...
