Unsupervised View-Invariant Human Posture Representation

by Faegheh Sardari, et al.
University of Heidelberg
University of Bristol

Most recent view-invariant action recognition and performance assessment approaches rely on a large amount of annotated 3D skeleton data to extract view-invariant features. However, acquiring 3D skeleton data can be cumbersome, if not impractical, in in-the-wild scenarios. To overcome this problem, we present a novel unsupervised approach that learns to extract a view-invariant 3D human pose representation from a 2D image without using 3D joint data. Our model is trained by exploiting the intrinsic view-invariant properties of human pose between simultaneous frames from different viewpoints and their equivariant properties between augmented frames from the same viewpoint. We evaluate the learned view-invariant pose representations on two downstream tasks. Comparative experiments show that our approach improves on the state-of-the-art unsupervised cross-view action classification accuracy on NTU RGB+D by a significant margin, on both RGB and depth images. We also show the efficiency of transferring the learned representations from NTU RGB+D to obtain the first-ever unsupervised cross-view and cross-subject rank correlation results on the multi-view human movement quality dataset, QMAR, and marginally improve on the state-of-the-art supervised results for this dataset. Finally, we carry out ablation studies to examine the contributions of the different components of our proposed network.
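The two training signals named in the abstract, invariance across simultaneous views and equivariance under augmentation, can be illustrated with a minimal sketch. It is not the authors' network or loss: the function names, the mean-squared-error penalty, the representation of a pose feature as a flat list of (x, y) values, and the horizontal-flip augmentation are all assumptions chosen only to make the two properties concrete.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def mean_squared_error(a: Vector, b: Vector) -> float:
    """Average squared difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def view_invariance_loss(pose_view_a: Vector, pose_view_b: Vector) -> float:
    """Hypothetical invariance term: pose features extracted from two
    simultaneous frames (different cameras) should be identical."""
    return mean_squared_error(pose_view_a, pose_view_b)


def equivariance_loss(
    pose_original: Vector,
    pose_augmented: Vector,
    transform: Callable[[Vector], List[float]],
) -> float:
    """Hypothetical equivariance term: the feature of an augmented frame
    should equal the same transform applied to the original feature."""
    return mean_squared_error(transform(pose_original), pose_augmented)


def flip_x(pose: Vector) -> List[float]:
    """Assumed augmentation: horizontal flip of a pose stored as flat
    (x, y) pairs, which negates every x coordinate."""
    return [-v if i % 2 == 0 else v for i, v in enumerate(pose)]


if __name__ == "__main__":
    # Toy feature vectors standing in for encoder outputs.
    same_pose_cam1 = [1.0, 2.0, 3.0, 4.0]
    same_pose_cam2 = [1.0, 2.0, 3.0, 4.0]
    flipped_frame = [-1.0, 2.0, -3.0, 4.0]

    print(view_invariance_loss(same_pose_cam1, same_pose_cam2))
    print(equivariance_loss(same_pose_cam1, flipped_frame, flip_x))
```

Both terms are zero when the features behave as desired, so in a training loop of this assumed form they would be minimised together without any 3D joint labels.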


