Attentional Aggregation of Deep Feature Sets for Multi-view 3D Reconstruction

08/02/2018
by Bo Yang, et al.

We study the problem of recovering an underlying 3D shape from a set of images. Existing learning-based approaches usually resort to recurrent neural nets, e.g., GRU, or intuitive pooling operations, e.g., max/mean pooling, to fuse multiple deep features encoded from input images. However, GRU-based approaches cannot estimate 3D shapes consistently given the same set of input images, because the recurrent unit is permutation variant: its output depends on the order in which the images are fed. They are also unlikely to refine the 3D shape when given more images, due to the long-term memory loss of GRU. The widely used pooling approaches are limited to capturing only first-order (moment) information, ignoring other valuable features. In this paper, we present a new feed-forward neural module, named AttSets, together with a dedicated training algorithm, named JTSO, to attentionally aggregate an arbitrarily sized deep feature set for multi-view 3D reconstruction. AttSets is permutation invariant, computationally efficient, flexible, and robust to multiple input images. We thoroughly evaluate various properties of AttSets on large public datasets. Extensive experiments show that AttSets, together with the JTSO algorithm, significantly outperforms existing aggregation approaches.
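The abstract does not give the internals of AttSets, so the following is only a minimal sketch of the general idea of attentional aggregation over a feature set, not the authors' module. It assumes a single hypothetical scoring matrix `weights`, computes per-element attention scores, normalizes them with a softmax across the set dimension, and returns the weighted sum. The function name `attentional_aggregate` and all shapes are illustrative assumptions.

```python
import numpy as np

def attentional_aggregate(features, weights):
    """Fuse a set of feature vectors into one vector.

    features: array of shape (n, d), one row per input view (n may vary).
    weights:  array of shape (d, d), a hypothetical learned scoring matrix.
    Returns an array of shape (d,).
    """
    # Per-element, per-channel attention scores.
    scores = features @ weights
    # Subtract the column-wise max for numerical stability of the softmax.
    scores = scores - scores.max(axis=0, keepdims=True)
    attn = np.exp(scores)
    # Softmax across the set dimension, so the weights for each channel sum to 1.
    attn = attn / attn.sum(axis=0, keepdims=True)
    # Weighted sum over the set; a sum does not depend on element order.
    return (attn * features).sum(axis=0)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
feats = rng.normal(size=(5, 8))          # 5 views, 8-dimensional features

fused = attentional_aggregate(feats, W)
fused_shuffled = attentional_aggregate(feats[rng.permutation(5)], W)
```

Because both the softmax normalization and the final sum run over the set dimension, reordering the input rows leaves the result unchanged up to floating-point rounding, which is the permutation invariance that a recurrent unit lacks. With an all-zero `weights` matrix the attention becomes uniform and the sketch reduces to mean pooling, so under these assumptions mean pooling is a special case.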

research
01/31/2019

Pix2Vox: Context-aware 3D Reconstruction from Single and Multi-view Images

Recovering the 3D representation of an object from single-view or multi-...
research
03/14/2022

VPFusion: Joint 3D Volume and Pixel-Aligned Feature Fusion for Single and Multi-view 3D Reconstruction

We introduce a unified single and multi-view neural implicit 3D reconstr...
research
03/26/2018

Learning the Multiple Traveling Salesmen Problem with Permutation Invariant Pooling Networks

While there are optimal TSP solvers as well as recent learning-based app...
research
04/01/2019

Equivariant Multi-View Networks

Several approaches to 3D vision tasks process multiple views of the inpu...
research
08/27/2019

HRGE-Net: Hierarchical Relational Graph Embedding Network for Multi-view 3D Shape Recognition

View-based approach that recognizes 3D shape through its projected 2D im...
research
04/01/2023

SeSDF: Self-evolved Signed Distance Field for Implicit 3D Clothed Human Reconstruction

We address the problem of clothed human reconstruction from a single ima...
research
04/21/2022

Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images

We study the problem of shape generation in 3D mesh representation from ...
