HandOccNet: Occlusion-Robust 3D Hand Mesh Estimation Network

03/28/2022
by   JoonKyu Park, et al.

Hands are often severely occluded by objects, which makes 3D hand mesh estimation challenging. Previous works have often disregarded information in occluded regions. However, we argue that occluded regions are strongly correlated with hands, so they can provide highly beneficial information for complete 3D hand mesh estimation. Thus, in this work, we propose a novel 3D hand mesh estimation network, HandOccNet, that can fully exploit the information in occluded regions as a secondary means to enhance image features and make them much richer. To this end, we design two successive Transformer-based modules, called the feature injecting transformer (FIT) and the self-enhancing transformer (SET). FIT injects hand information into the occluded region by considering their correlation. SET refines the output of FIT using a self-attention mechanism. By injecting hand information into the occluded region, HandOccNet reaches state-of-the-art performance on 3D hand mesh benchmarks that contain challenging hand-object occlusions. The code is available at https://github.com/namepllet/HandOccNet.
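
The two-stage idea in the abstract (cross-attention that injects hand information into occluded-region features, followed by self-attention that refines the result) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the token shapes, the choice of occluded-region tokens as queries and hand tokens as keys and values, the residual connections, and the absence of learned projections are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    """Scaled dot-product attention without learned projections (illustrative)."""
    dim = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(dim), axis=-1)
    return weights @ values, weights


def inject_then_refine(hand_tokens, occluded_tokens):
    """Hypothetical FIT-like injection followed by SET-like refinement."""
    # Injection step (assumed form): each occluded-region token attends to the
    # hand tokens and receives a correlation-weighted mix of hand features.
    injected, _ = attention(occluded_tokens, hand_tokens, hand_tokens)
    occluded_enriched = occluded_tokens + injected  # residual is an assumption

    # Refinement step (assumed form): plain self-attention over all tokens.
    all_tokens = np.concatenate([hand_tokens, occluded_enriched], axis=0)
    refined, _ = attention(all_tokens, all_tokens, all_tokens)
    return all_tokens + refined


# Toy data: 6 hand tokens and 4 occluded-region tokens, each of dimension 8.
rng = np.random.default_rng(0)
hand = rng.normal(size=(6, 8))
occ = rng.normal(size=(4, 8))
out = inject_then_refine(hand, occ)
print(out.shape)
```

A real network would add learned query, key, and value projections, multiple heads, and normalization; the sketch only shows how information flows from the hand tokens to the occluded-region tokens and then through a self-attention pass.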

research
06/24/2021

Feature Completion for Occluded Person Re-Identification

Person re-identification (reID) plays an important role in computer visi...
research
03/09/2023

Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation

Accurately estimating 3D hand pose is crucial for understanding how huma...
research
04/06/2021

Learning to Estimate Hidden Motions with Global Motion Aggregation

Occlusions pose a significant challenge to optical flow algorithms that ...
research
03/27/2023

Recovering 3D Hand Mesh Sequence from a Single Blurry Image: A New Dataset and Temporal Unfolding

Hands, one of the most dynamic parts of our body, suffer from blur due t...
research
06/17/2021

To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision

3D face reconstruction from a single image is challenging due to its ill...
research
03/13/2021

OCID-Ref: A 3D Robotic Dataset with Embodied Language for Clutter Scene Grounding

To effectively apply robots in working environments and assist humans, i...
research
02/23/2023

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

Humans can easily imagine the complete 3D geometry of occluded objects a...
