Geometric Pose Affordance: 3D Human Pose with Scene Constraints

by Zhe Wang, et al.

Full 3D estimation of human pose from a single image remains a challenging task despite many recent advances. In this paper, we explore the hypothesis that strong prior information about scene geometry can be used to improve pose estimation accuracy. To test this hypothesis empirically, we have assembled a novel Geometric Pose Affordance dataset, consisting of multi-view imagery of people interacting with a variety of rich 3D environments. We used a commercial motion capture system to collect gold-standard estimates of pose and to construct accurate geometric 3D CAD models of the scene itself. To inject prior knowledge of scene constraints into existing frameworks for pose estimation from images, we introduce a novel, view-based representation of scene geometry, a multi-layer depth map, which employs multi-hit ray tracing to concisely encode multiple surface entry and exit points along each camera view ray direction. We propose two different mechanisms for integrating multi-layer depth information into pose estimation: first, as encoded ray features used as input when lifting 2D pose to full 3D, and second, as a differentiable loss that encourages learned models to favor geometrically consistent pose estimates. We show experimentally that these techniques can improve the accuracy of 3D pose estimates, particularly in the presence of occlusion and complex scene geometry.
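The multi-layer depth idea described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a toy scene made of disjoint axis-aligned boxes, uses a standard slab test to record each entry and exit point along one camera ray, and then scores a joint with a simple piecewise-linear penalty when it falls inside an occupied interval. The function names (`ray_box_interval`, `multi_layer_depth`, `penetration_penalty`) and the form of the penalty are hypothetical choices made for illustration only.

```python
import math

def ray_box_interval(origin, direction, box_min, box_max):
    """Slab test: return (t_entry, t_exit) along the ray, or None on a miss."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < 1e-12:
            # Ray is parallel to this slab: it must already lie inside it.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return (t_near, t_far)

def multi_layer_depth(origin, direction, boxes):
    """Sorted (entry, exit) pairs for one ray; assumes the boxes do not overlap."""
    hits = []
    for box_min, box_max in boxes:
        interval = ray_box_interval(origin, direction, box_min, box_max)
        if interval is not None:
            hits.append(interval)
    return sorted(hits)

def penetration_penalty(t_joint, layers):
    """Illustrative penalty: depth of a joint inside any occupied interval, else 0."""
    return sum(max(0.0, min(t_joint - t_in, t_out - t_joint))
               for t_in, t_out in layers)

# Toy scene: two boxes stacked along the viewing direction (+z).
boxes = [((-1, -1, 2), (1, 1, 3)), ((-1, -1, 5), (1, 1, 6))]
layers = multi_layer_depth((0, 0, 0), (0, 0, 1), boxes)
print(layers)
print(penetration_penalty(2.5, layers), penetration_penalty(4.0, layers))
```

With a unit-length ray direction, each `t` value is a distance from the camera, so the sorted pairs play the role of successive depth layers for that pixel. Because the penalty is piecewise linear in the joint depth, a version written in an autodiff framework would have gradients almost everywhere, which is the property a loss of this kind needs. The loss actually used in the paper may differ.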




Scene-aware Egocentric 3D Human Pose Estimation

Egocentric 3D human pose estimation with a single head-mounted fisheye c...

Deep Reinforcement Learning for Active Human Pose Estimation

Most 3d human pose estimation methods assume that input – be it images o...

Unsupervised 3D Keypoint Estimation with Multi-View Geometry

Given enough annotated training data, 3D human pose estimation models ca...

Multi-layer Depth and Epipolar Feature Transformers for 3D Scene Reconstruction

We tackle the problem of automatically reconstructing a complete 3D mode...

Robust Single-view Cone-beam X-ray Pose Estimation with Neural Tuned Tomography (NeTT) and Masked Neural Radiance Fields (mNeRF)

Many tasks performed in image-guided, mini-invasive, medical procedures ...

A Versatile Scene Model with Differentiable Visibility Applied to Generative Pose Estimation

Generative reconstruction methods compute the 3D configuration (such as ...

Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation

We present a lightweight solution to recover 3D pose from multi-view ima...
