
Learning to Regress Bodies from Images using Differentiable Semantic Rendering

by Sai Kumar Dwivedi, et al.

Learning to regress 3D human body shape and pose (e.g. SMPL parameters) from monocular images typically exploits losses on 2D keypoints, silhouettes, and/or part segmentation when 3D training data is not available. Such losses, however, are limited because 2D keypoints do not supervise body shape, and segmentations of people in clothing do not match projected minimally-clothed SMPL shapes. To exploit richer image information about clothed people, we introduce higher-level semantic information about clothing to penalize clothed and non-clothed regions of the image differently. To do so, we train a body regressor using a novel Differentiable Semantic Rendering (DSR) loss. For minimally-clothed regions, we define the DSR-MC loss, which encourages a tight match between a rendered SMPL body and the minimally-clothed regions of the image. For clothed regions, we define the DSR-C loss, which encourages the rendered SMPL body to lie inside the clothing mask. To ensure end-to-end differentiable training, we learn a semantic clothing prior for SMPL vertices from thousands of clothed human scans. We perform extensive qualitative and quantitative experiments to evaluate the role of clothing semantics on the accuracy of 3D human pose and shape estimation. We outperform all previous state-of-the-art methods on 3DPW and Human3.6M and obtain on-par results on MPI-INF-3DHP. Code and trained models are available for research at
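The difference between the two DSR terms described in the abstract can be illustrated with a toy sketch. This is not the paper's formulation: the function names, the mean-squared and one-sided penalties, and the 4x4 masks are all assumptions chosen only to show the idea that minimally-clothed regions get a two-sided (tight-match) penalty while clothed regions get a one-sided (stay-inside) penalty. It uses plain NumPy for the forward computation; actual training would need a differentiable renderer and an autodiff framework.

```python
import numpy as np

def dsr_mc_loss(rendered_mc, target_mc):
    """Hypothetical two-sided term: any mismatch between the rendered
    minimally-clothed body region and the image mask is penalized."""
    return float(np.mean((rendered_mc - target_mc) ** 2))

def dsr_c_loss(rendered_c, clothing_mask):
    """Hypothetical one-sided term: only rendered body pixels that fall
    OUTSIDE the clothing mask are penalized; a body smaller than the
    clothing costs nothing."""
    return float(np.mean(rendered_c * (1.0 - clothing_mask)))

# Toy 4x4 example (all values are illustrative assumptions).
clothing_mask = np.zeros((4, 4))
clothing_mask[1:4, :] = 1.0          # clothing covers rows 1..3

body_inside = np.zeros((4, 4))
body_inside[2:4, 1:3] = 1.0          # rendered body fully under the clothing

body_outside = np.zeros((4, 4))
body_outside[0:2, 1:3] = 1.0         # row 0 sticks out of the clothing mask

skin_mask = np.zeros((4, 4))
skin_mask[0, 1:3] = 1.0              # minimally-clothed region in the image

loss_inside = dsr_c_loss(body_inside, clothing_mask)
loss_outside = dsr_c_loss(body_outside, clothing_mask)
loss_mc_match = dsr_mc_loss(skin_mask, skin_mask)
loss_mc_miss = dsr_mc_loss(np.zeros((4, 4)), skin_mask)

print(loss_inside, loss_outside, loss_mc_match, loss_mc_miss)
```

In this sketch, a body that is thinner than the clothing incurs no clothed-region penalty, which reflects the abstract's point that clothing segmentations do not match the minimally-clothed SMPL shape, whereas any gap or overshoot in the minimally-clothed region is penalized.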



