Understanding Pose and Appearance Disentanglement in 3D Human Pose Estimation

09/20/2023
by Krishna Kanth Nakka, et al.

As 3D human pose estimation can now be achieved with very high accuracy in the supervised learning scenario, the case where 3D pose annotations are not available has received increasing attention. In particular, several methods have proposed to learn image representations in a self-supervised fashion so as to disentangle appearance information from pose information. These methods then need only a small amount of supervised data to train a pose regressor that takes the pose-related latent vector as input, since this vector should be free of appearance information. In this paper, we carry out an in-depth analysis to understand to what degree state-of-the-art disentangled representation learning methods truly separate appearance information from pose information. First, we study disentanglement from the perspective of the self-supervised network, via diverse image synthesis experiments. Second, we investigate disentanglement with respect to the 3D pose regressor from an adversarial attack perspective. Specifically, we design an adversarial strategy that focuses on generating natural appearance changes of the subject, against which we would expect a disentangled network to be robust. Altogether, our analyses show that disentanglement in the three state-of-the-art disentangled representation learning frameworks is far from complete, and that their pose codes contain significant appearance information. We believe that our approach provides a valuable testbed for evaluating the degree of disentanglement of pose from appearance in self-supervised 3D human pose estimation.

