DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion models

11/29/2022
by Karl Holmquist, et al.

Traditionally, monocular 3D human pose estimation employs a machine learning model to predict the most likely 3D pose for a given input image. However, a single image can be highly ambiguous and admit multiple plausible solutions for the 2D-to-3D lifting step, which makes single-pose predictors overly confident. To address this, we propose DiffPose, a conditional diffusion model that predicts multiple hypotheses for a given input image. In comparison to similar approaches, our diffusion model is straightforward and avoids intensive hyperparameter tuning, complex network structures, mode collapse, and unstable training. Moreover, we tackle a problem of the common two-step approach, which first estimates a distribution of 2D joint locations via joint-wise heatmaps and subsequently approximates them based on first- or second-moment statistics. Since such a simplification of the heatmaps removes valid information about possibly correct, though labeled unlikely, joint locations, we propose to represent the heatmaps as a set of 2D joint candidate samples. To extract information about the original distribution from these samples, we introduce an embedding transformer that conditions the diffusion model. Experimentally, we show that DiffPose slightly improves upon the state of the art for multi-hypothesis pose estimation for simple poses and outperforms it by a large margin for highly ambiguous poses.
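The difference between summarising a heatmap by its moments and keeping a set of candidate samples can be illustrated with a small sketch. This is not the authors' implementation: the function name `sample_joint_candidates`, the toy two-peak heatmap, and the choice to treat the normalised heatmap as a categorical distribution over pixels are all assumptions made for illustration.

```python
import numpy as np


def sample_joint_candidates(heatmap, num_samples, rng):
    """Draw 2D joint candidates (x, y) from a single joint's heatmap.

    Hypothetical helper: treats the normalised heatmap as a categorical
    distribution over pixel locations and samples from it.
    """
    h, w = heatmap.shape
    # Clip negative values and normalise to a valid probability vector.
    probs = np.clip(heatmap, 0.0, None).ravel()
    total = probs.sum()
    if total <= 0:
        raise ValueError("heatmap has no positive mass")
    probs = probs / total
    # Sample flat pixel indices, then convert them back to (row, col).
    flat_idx = rng.choice(h * w, size=num_samples, p=probs)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    return np.stack([xs, ys], axis=1)


# Toy heatmap (assumed data) with two competing joint locations.
rng = np.random.default_rng(0)
heatmap = np.zeros((64, 64))
heatmap[20, 30] = 0.7  # likely location, (x, y) = (30, 20)
heatmap[40, 10] = 0.3  # less likely but plausible location, (x, y) = (10, 40)

candidates = sample_joint_candidates(heatmap, 200, rng)

# First-moment summary of the same heatmap: the expected (x, y) location.
ys, xs = np.mgrid[0:64, 0:64]
mean_xy = np.array([(xs * heatmap).sum(), (ys * heatmap).sum()]) / heatmap.sum()
```

In this toy case the mean falls between the two peaks, at a location the heatmap assigns no mass to, whereas the sampled candidates land on both peaks. That is the kind of information the abstract argues is lost when heatmaps are reduced to first- or second-moment statistics.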


