Pyramid Deep Fusion Network for Two-Hand Reconstruction from RGB-D Images

07/12/2023
by   Jinwei Ren, et al.

Accurately recovering the dense 3D mesh of both hands from monocular images poses considerable challenges due to occlusions and projection ambiguity. Most existing methods extract features from color images to estimate root-aligned hand meshes, neglecting the crucial depth and scale information available in the real world. Given noisy sensor measurements with limited resolution, depth-based methods typically predict 3D keypoints rather than a dense mesh. These limitations motivate us to exploit these two complementary inputs to acquire dense hand meshes at real-world scale. In this work, we propose an end-to-end framework for recovering dense meshes of both hands, which employs single-view RGB-D image pairs as input. The primary challenge lies in effectively utilizing two different input modalities to mitigate blurring effects in RGB images and noise in depth images. Instead of directly treating depth maps as additional channels of RGB images, we encode the depth information into an unordered point cloud to preserve more geometric details. Specifically, our framework employs ResNet50 and PointNet++ to derive features from the RGB image and the point cloud, respectively. Additionally, we introduce a novel pyramid deep fusion network (PDFNet) to aggregate features at different scales, which demonstrates superior efficacy compared to previous fusion strategies. Furthermore, we employ a GCN-based decoder to process the fused features and recover the corresponding 3D pose and dense mesh. Through comprehensive ablation experiments, we not only demonstrate the effectiveness of our proposed fusion algorithm but also outperform state-of-the-art approaches on publicly available datasets. To reproduce the results, we will make our source code and models publicly available at <https://github.com/zijinxuxu/PDFNet>.
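To make the two-branch, multi-scale idea concrete, the following is a minimal NumPy sketch of pyramid-style fusion: per-scale RGB and point-cloud feature vectors are concatenated, then aggregated across scales. The function name `fuse_pyramid` and the averaging step are hypothetical stand-ins for illustration only; the actual PDFNet uses learned fusion modules on ResNet50 and PointNet++ features, not this fixed scheme.

```python
import numpy as np

def fuse_pyramid(rgb_feats, point_feats):
    """Toy cross-modal pyramid fusion (hypothetical stand-in for PDFNet).

    rgb_feats:   list of 1-D feature vectors, one per pyramid scale (e.g. from a CNN)
    point_feats: list of 1-D feature vectors at matching scales (e.g. from a point network)
    Returns a single fused descriptor.
    """
    # Per-scale fusion: concatenate the two modalities at each scale
    fused_per_scale = [np.concatenate([r, p]) for r, p in zip(rgb_feats, point_feats)]
    # Cross-scale aggregation: zero-pad to a common length, then average.
    # (A fixed average replaces the learned aggregation of the real model.)
    max_len = max(f.shape[0] for f in fused_per_scale)
    padded = [np.pad(f, (0, max_len - f.shape[0])) for f in fused_per_scale]
    return np.mean(padded, axis=0)

# Example: two pyramid scales with different channel widths per modality
rgb_pyramid = [np.ones(64), np.ones(128)]     # e.g. coarse and fine CNN features
point_pyramid = [np.ones(32), np.ones(64)]    # e.g. matching point-cloud features
descriptor = fuse_pyramid(rgb_pyramid, point_pyramid)
print(descriptor.shape)  # one fused vector at the widest fused scale: (192,)
```

In the paper's setting, the fused features would then be passed to the GCN-based decoder to regress the 3D pose and dense mesh; this sketch only illustrates why per-scale concatenation preserves modality-specific detail that a single late concatenation would lose.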


