Disentangling Latent Hands for Image Synthesis and Pose Estimation

12/03/2018
by Linlin Yang, et al.

Hand image synthesis and pose estimation from RGB images are both highly challenging tasks due to the large discrepancy between factors of variation, which range from image background content to camera viewpoint. To better analyze these factors, we propose the use of disentangled representations and introduce a disentangled variational autoencoder (dVAE) that allows for specific sampling and inference of each factor. The objective derived from the variational lower bound and the proposed training strategy are both highly flexible, allowing us to handle cross-modal encoders and decoders as well as semi-supervised learning scenarios. Experiments show that our dVAE can synthesize highly realistic hand images specified by both pose and image background content, and can estimate 3D hand poses from RGB images with accuracy competitive with the state of the art on two public benchmarks.
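
The idea of sampling and inferring factors separately can be illustrated with a minimal, generic sketch. This is not the paper's objective or architecture: it assumes diagonal-Gaussian posteriors, a standard-normal prior, and two hypothetical latent blocks named "pose" and "background" with made-up dimensions and values. Under those assumptions the KL term of a variational lower bound splits into one term per factor, and a decoder input can be assembled from factors inferred from different images.

```python
import math
import random

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))

def sample(mu, log_var, rng):
    """Reparameterised sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def recombine(z_pose, z_background):
    """Concatenate per-factor latents into the vector a decoder would consume."""
    return list(z_pose) + list(z_background)

rng = random.Random(0)

# Hypothetical encoder outputs (mean, log-variance); values are illustrative only.
pose_from_image_a = ([0.5, -0.2], [0.0, 0.0])
background_from_image_b = ([1.0, 0.3, -0.7], [-1.0, -1.0, -1.0])

# Pose taken from one image, background from another.
z = recombine(sample(*pose_from_image_a, rng), sample(*background_from_image_b, rng))

# With independent blocks and a factorised prior, the KL terms simply add.
kl_total = gaussian_kl(*pose_from_image_a) + gaussian_kl(*background_from_image_b)
```

In a full model, `z` would be passed to a decoder and `kl_total` would be combined with a reconstruction term; both are omitted here because the abstract does not specify them.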


research
03/30/2018

Cross-modal Deep Variational Hand Pose Estimation

The human hand moves in complex and high-dimensional ways, making estima...
research
04/16/2019

Disentangling Pose from Appearance in Monochrome Hand Images

Hand pose estimation from the monocular 2D image is challenging due to t...
research
10/30/2022

Image-free Domain Generalization via CLIP for 3D Hand Pose Estimation

RGB-based 3D hand pose estimation has been successful for decades thanks...
research
11/25/2018

Generating Realistic Training Images Based on Tonality-Alignment Generative Adversarial Networks for Hand Pose Estimation

Hand pose estimation from a monocular RGB image is an important but chal...
research
02/08/2021

DEFT: Distilling Entangled Factors

Disentanglement is a highly desirable property of representation due to ...
research
10/02/2020

MM-Hand: 3D-Aware Multi-Modal Guided Hand Generative Network for 3D Hand Pose Synthesis

Estimating the 3D hand pose from a monocular RGB image is important but ...
research
03/05/2019

Unsupervised Domain-Specific Deblurring via Disentangled Representations

Image deblurring aims to restore the latent sharp images from the corres...
