Learning to Disambiguate Strongly Interacting Hands via Probabilistic Per-pixel Part Segmentation

07/01/2021
by Zicong Fan, et al.

In natural conversation and interaction, our hands often overlap or are in contact with each other. Due to the homogeneous appearance of hands, this makes estimating the 3D pose of interacting hands from images difficult. In this paper we demonstrate that self-similarity, and the resulting ambiguities in assigning pixel observations to the respective hands and their parts, is a major cause of the final 3D pose error. Motivated by this insight, we propose DIGIT, a novel method for estimating the 3D poses of two interacting hands from a single monocular image. The method consists of two interwoven branches that process the input imagery into a per-pixel semantic part segmentation mask and a visual feature volume. In contrast to prior work, we do not decouple the segmentation from the pose estimation stage, but rather leverage the per-pixel probabilities directly in the downstream pose estimation task. To do so, the part probabilities are merged with the visual features and processed via fully-convolutional layers. We experimentally show that the proposed approach achieves new state-of-the-art performance on the InterHand2.6M dataset for both single and interacting hands across all metrics. We provide detailed ablation studies to demonstrate the efficacy of our method and to provide insights into how the modelling of pixel ownership affects single and interacting hand pose estimation. Our code will be released for research purposes.
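
The fusion step described above, where per-pixel part probabilities are merged with visual features and passed to convolutional layers, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the merge is done, so concatenation along the channel axis is an assumption here, as are the three part classes, the two feature channels, the weights, and the helper names `softmax`, `fuse`, and `conv1x1`. A single 1x1 convolution stands in for the fully-convolutional layers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(part_logits, features):
    """Concatenate per-pixel part probabilities with visual features.

    part_logits: H x W x C_parts nested lists (raw segmentation scores)
    features:    H x W x C_feat nested lists (visual feature volume)
    returns:     H x W x (C_parts + C_feat) nested lists

    Assumption: "merging" is modelled as channel-wise concatenation.
    """
    return [
        [softmax(pl) + list(f) for pl, f in zip(row_l, row_f)]
        for row_l, row_f in zip(part_logits, features)
    ]


def conv1x1(volume, weights, bias):
    """Apply a 1x1 convolution (a per-pixel linear map) followed by ReLU.

    weights: C_out x C_in nested lists, bias: list of length C_out.
    """
    return [
        [
            [
                max(0.0, sum(w * x for w, x in zip(w_row, px)) + b)
                for w_row, b in zip(weights, bias)
            ]
            for px in row
        ]
        for row in volume
    ]


# Hypothetical 1 x 2 image with 3 part classes and 2 feature channels.
part_logits = [[[0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]]
features = [[[1.0, -1.0], [0.5, 0.5]]]

fused = fuse(part_logits, features)  # 1 x 2 x 5 volume

# Illustrative weights only: 2 output channels over 5 input channels.
weights = [[1.0, 0.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 1.0, 0.0]]
bias = [0.0, 0.0]
out = conv1x1(fused, weights, bias)  # 1 x 2 x 2 volume
```

The point of the sketch is that the soft probabilities, rather than a hard argmax mask, reach the downstream layers, so a pixel that is ambiguous between the two hands keeps that uncertainty in its fused feature vector.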


research
08/21/2020

InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image

Analysis of hand-hand interactions is a crucial step towards better unde...
research
07/22/2022

3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal

Estimating 3D interacting hand pose from a single RGB image is essential...
research
10/11/2020

PI-Net: Pose Interacting Network for Multi-Person Monocular 3D Pose Estimation

Recent literature addressed the monocular 3D pose estimation task very s...
research
10/24/2021

A Dynamic Keypoints Selection Network for 6DoF Pose Estimation

6 DoF poses estimation problem aims to estimate the rotation and transla...
research
04/25/2021

Parallel mesh reconstruction streams for pose estimation of interacting hands

We present a new multi-stream 3D mesh reconstruction network (MSMR-Net) ...
research
02/05/2023

See You Soon: Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image

Reconstructing interacting hands from a single RGB image is a very chall...
