Deep Multitask Architecture for Integrated 2D and 3D Human Sensing

01/31/2017
by Alin-Ionut Popa, et al.

We propose a deep multitask architecture for fully automatic 2d and 3d human sensing (DMHS), including recognition and reconstruction, in monocular images. The system computes the figure-ground segmentation, semantically identifies the human body parts at pixel level, and estimates the 2d and 3d pose of the person. The model supports the joint training of all components by means of multi-task losses, where early processing stages recursively feed into advanced ones for increasingly complex calculations, accuracy and robustness. The design allows us to tie a complete training protocol by taking advantage of multiple datasets that would otherwise restrictively cover only some of the model components: complex 2d image data with no body part labeling and without associated 3d ground truth, or complex 3d data with limited 2d background variability. In detailed experiments based on several challenging 2d and 3d datasets (LSP, HumanEva, Human3.6M), we evaluate the sub-structures of the model and the effect of various types of training data in the multitask loss, and demonstrate that state-of-the-art results can be achieved at all processing levels. We also show that in the wild our monocular RGB architecture is perceptually competitive with a state-of-the-art (commercial) Kinect system based on RGB-D data.
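The idea of training on datasets that each cover only some of the model components can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal example, assuming each training sample carries a per-task loss value only for the tasks it has ground truth for, with `None` marking a missing annotation. The task names, the `multitask_loss` function, and the uniform default weights are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional

# Hypothetical task names (assumption): the abstract lists these outputs
# but does not give identifiers for them.
TASKS = ("pose_2d", "body_parts", "pose_3d")


def multitask_loss(
    per_task_losses: Dict[str, Optional[float]],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Sum the weighted losses of the tasks that have ground truth.

    A value of None (or a missing key) means the sample has no annotation
    for that task, so the task contributes nothing to the total.
    """
    # Uniform weights are an assumption, not a value taken from the paper.
    weights = weights or {task: 1.0 for task in TASKS}
    total = 0.0
    for task in TASKS:
        loss = per_task_losses.get(task)
        if loss is not None:
            total += weights[task] * loss
    return total


# A sample with 2d pose labels only: no body part labeling, no 3d ground truth.
loss_2d_only = multitask_loss({"pose_2d": 0.5, "body_parts": None, "pose_3d": None})

# A sample with annotations for all three tasks.
loss_full = multitask_loss({"pose_2d": 0.5, "body_parts": 0.25, "pose_3d": 1.0})
```

With this kind of masking, a batch can mix samples from differently annotated datasets, and each sample only drives the parts of the model it has labels for.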


