POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo

04/08/2023
by Lixin Yang, et al.

Enabling neural networks to capture 3D geometry-aware features is essential in multi-view vision tasks. Previous methods usually encode the 3D information of multi-view stereo into 2D features. In contrast, we present a novel method, named POEM, that operates directly on the 3D POints Embedded in the Multi-view stereo to reconstruct the hand mesh within it. A point is a natural form of 3D information and an ideal medium for fusing features across views, as it has a different projection on each view. Our method is thus built on a simple yet effective idea: a complex 3D hand mesh can be represented by a set of 3D points that 1) are embedded in the multi-view stereo, 2) carry features from the multi-view images, and 3) encircle the hand. To leverage the power of points, we design two operations: point-based feature fusion and a cross-set point attention mechanism. Evaluation on three challenging multi-view datasets shows that POEM outperforms the state of the art in hand mesh reconstruction. Code and models are available for research at https://github.com/lixiny/POEM.
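
The idea that one 3D point has a different projection on each view, and can therefore gather features from all views, can be illustrated with a minimal sketch. This is not the POEM implementation: the function names (`project`, `sample_nearest`, `fuse_point_features`), the pinhole camera model without skew, the nearest-neighbor sampling, and the plain averaging used as the fusion step are all assumptions made for illustration; the abstract does not specify how the fusion is computed.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat = Sequence[Sequence[float]]
Camera = Tuple[Mat, Mat, Vec3]  # (intrinsics K, rotation R, translation t)


def project(point: Vec3, K: Mat, R: Mat, t: Vec3) -> Tuple[float, float]:
    """Project a world-space 3D point to pixel coordinates (pinhole, no skew)."""
    # World -> camera coordinates: X_cam = R @ X_world + t
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    # Camera -> pixel coordinates using focal lengths and principal point.
    u = K[0][0] * cam[0] / cam[2] + K[0][2]
    v = K[1][1] * cam[1] / cam[2] + K[1][2]
    return u, v


def sample_nearest(feature_map, u: float, v: float) -> List[float]:
    """Return the feature vector at the pixel nearest to (u, v).

    feature_map[row][col] is a list of channel values; coordinates that
    fall outside the map are clamped to the border.
    """
    height, width = len(feature_map), len(feature_map[0])
    col = min(max(int(round(u)), 0), width - 1)
    row = min(max(int(round(v)), 0), height - 1)
    return feature_map[row][col]


def fuse_point_features(point: Vec3, cameras: Sequence[Camera],
                        feature_maps) -> List[float]:
    """Collect one feature per view for a 3D point and average them.

    Averaging is a placeholder (assumption) for a learned fusion step.
    """
    samples = []
    for (K, R, t), fmap in zip(cameras, feature_maps):
        u, v = project(point, K, R, t)
        samples.append(sample_nearest(fmap, u, v))
    n_views = len(samples)
    n_channels = len(samples[0])
    return [sum(s[c] for s in samples) / n_views for c in range(n_channels)]
```

Each camera contributes the feature found at its own projection of the point, so the fused vector mixes evidence from every view without first collapsing the 3D information into a single 2D feature map.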

research
11/30/2020

How Good MVSNets Are at Depth Fusion

We study the effects of the additional input to deep multi-view stereo m...
research
05/03/2015

Detail-preserving and Content-aware Variational Multi-view Stereo Reconstruction

Accurate recovery of 3D geometrical surfaces from calibrated 2D multi-vi...
research
12/06/2020

MVHM: A Large-Scale Multi-View Hand Mesh Benchmark for Accurate 3D Hand Pose Estimation

Estimating 3D hand poses from a single RGB image is challenging because ...
research
04/14/2018

Physics-driven Fire Modeling from Multi-view Images

Fire effects are widely used in various computer graphics applications s...
research
01/26/2022

DIREG3D: DIrectly REGress 3D Hands from Multiple Cameras

In this paper, we present DIREG3D, a holistic framework for 3D Hand Trac...
research
03/29/2023

ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance

Understanding 3D scenes from multi-view inputs has been proven to allevi...
research
04/15/2022

MVSTER: Epipolar Transformer for Efficient Multi-View Stereo

Learning-based Multi-View Stereo (MVS) methods warp source images into t...
