Adversarial Motion Modelling helps Semi-supervised Hand Pose Estimation

by Adrian Spurr, et al.

Hand pose estimation is difficult due to varying environmental conditions, object- and self-occlusion, and diversity in hand shape and appearance. Exhaustively covering this wide range of factors in fully annotated datasets has remained impractical, which poses significant challenges for the generalization of supervised methods. To address this challenge, we propose to combine ideas from adversarial training and motion modelling to tap into unlabeled videos. To this end, we propose what is, to the best of our knowledge, the first motion model for hands, and show that an adversarial formulation leads to better generalization of the hand pose estimator via semi-supervised training on unlabeled video sequences. In this setting, the pose predictor must produce a valid sequence of hand poses, as determined by a discriminative adversary. This adversary reasons over both the structural and the temporal domain, effectively exploiting the spatio-temporal structure of the task. The main advantage of our approach is that it can make use of unpaired videos and joint sequence data, both of which are much easier to obtain than paired training data. We perform an extensive evaluation, investigating the essential components of the proposed framework, and empirically demonstrate in two challenging settings that the proposed approach leads to significant improvements in pose estimation accuracy. In the lowest label setting, we attain an improvement of 40% in absolute mean joint error.
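The semi-supervised setup described above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the loss weighting `adv_weight`, the non-saturating generator loss, and in particular `toy_discriminator` (a hand-written smoothness score standing in for a learned adversary) are all assumptions made for illustration. It only shows how a supervised loss on labeled frames might be combined with an adversarial term computed on pose sequences predicted from unlabeled video.

```python
import math

def supervised_loss(pred, target):
    """Mean squared error between predicted and annotated joint coordinates of one frame."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def toy_discriminator(sequence):
    """Hypothetical stand-in for a learned adversary.

    Returns a score in (0, 1]; sequences with small frame-to-frame
    displacement are scored as more plausible. A real adversary would be
    a trained network, not a fixed rule like this one.
    """
    total, count = 0.0, 0
    for prev, cur in zip(sequence, sequence[1:]):
        for a, b in zip(prev, cur):
            total += (b - a) ** 2
            count += 1
    mean_sq_motion = total / count
    return 1.0 / (1.0 + mean_sq_motion)

def generator_adversarial_loss(pred_sequence, discriminator):
    """Non-saturating adversarial loss for the pose predictor (an assumed choice)."""
    return -math.log(discriminator(pred_sequence))

def semi_supervised_loss(labeled_pairs, unlabeled_sequences, discriminator, adv_weight=0.1):
    """Supervised term on labeled frames plus weighted adversarial term on unlabeled sequences."""
    sup = sum(supervised_loss(p, t) for p, t in labeled_pairs) / len(labeled_pairs)
    adv = sum(generator_adversarial_loss(s, discriminator)
              for s in unlabeled_sequences) / len(unlabeled_sequences)
    return sup + adv_weight * adv

# Toy data: each frame is a flat list of joint coordinates (two values here).
smooth = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.2]]
jittery = [[0.0, 0.0], [1.0, -1.0], [0.0, 0.0]]
labeled = [([0.0, 0.0], [0.0, 0.0])]  # (prediction, annotation) pairs

loss_smooth = semi_supervised_loss(labeled, [smooth], toy_discriminator)
loss_jittery = semi_supervised_loss(labeled, [jittery], toy_discriminator)
```

In this toy setting the temporally erratic sequence receives the larger loss, which is the kind of pressure an adversary reasoning over the temporal domain would put on the pose predictor.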



Semi-Supervised 3D Hand Shape and Pose Estimation with Label Propagation

To obtain 3D annotations, we are restricted to controlled environments o...

3D human pose estimation in video with temporal convolutions and semi-supervised training

In this work, we demonstrate that 3D poses in video can be effectively e...

Semi-supervised 3D Hand-Object Pose Estimation via Pose Dictionary Learning

3D hand-object pose estimation is an important issue to understand the i...

Multi-Scale Networks for 3D Human Pose Estimation with Inference Stage Optimization

Estimating 3D human poses from a monocular video is still a challenging ...

Exploiting temporal information for 3D pose estimation

In this work, we address the problem of 3D human pose estimation from a ...

Unsupervised Domain Adaptation with Temporal-Consistent Self-Training for 3D Hand-Object Joint Reconstruction

Deep learning-solutions for hand-object 3D pose and shape estimation are...

3D Pose Detection in Videos: Focusing on Occlusion

In this work, we build upon existing methods for occlusion-aware 3D pose...
