3DVNet: Multi-View Depth Prediction and Volumetric Refinement

by Alexander Rich, et al.
The Regents of the University of California

We present 3DVNet, a novel multi-view stereo (MVS) depth-prediction method that combines the advantages of previous depth-based and volumetric MVS approaches. Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions, resulting in highly accurate predictions that agree on the underlying scene geometry. Unlike existing depth-prediction techniques, our method uses a volumetric 3D convolutional neural network (CNN) that operates in world space on all depth maps jointly. The network can therefore learn meaningful scene-level priors. Furthermore, unlike existing volumetric MVS techniques, our 3D CNN operates on a feature-augmented point cloud, allowing for effective aggregation of multi-view information and flexible iterative refinement of depth maps. Experimental results show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics on the ScanNet dataset, as well as on a selection of scenes from the TUM-RGBD and ICL-NUIM datasets. This shows that our method is both effective and able to generalize to new settings.
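
To make the idea of a feature-augmented point cloud in world space more concrete, the following is a minimal sketch of one way to lift several depth maps into a single joint point cloud. It is not the authors' implementation. It assumes a pinhole camera model, and the function names (`backproject_depth`, `fuse_views`), the intrinsics `K`, the camera-to-world poses, and the per-pixel feature arrays are all hypothetical placeholders introduced for illustration.

```python
import numpy as np

def backproject_depth(depth, K, cam_to_world):
    """Lift an (H, W) depth map to an (H*W, 3) array of world-space points.

    Assumes a pinhole camera with 3x3 intrinsics K and a 4x4
    camera-to-world pose (both are assumptions of this sketch).
    """
    h, w = depth.shape
    # Pixel grid: u indexes columns (x), v indexes rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1)
    pixels = pixels.reshape(-1, 3).astype(np.float64)
    # Camera-frame rays with unit z, then scaled by the predicted depth.
    rays = pixels @ np.linalg.inv(K).T
    points_cam = rays * depth.reshape(-1, 1)
    # Rigid transform from the camera frame into the world frame.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return points_cam @ R.T + t

def fuse_views(depths, feats, Ks, poses):
    """Concatenate all views into one point cloud with per-point features.

    feats[i] is a hypothetical (H, W, C) feature map for view i.
    """
    points = [backproject_depth(d, K, T) for d, K, T in zip(depths, Ks, poses)]
    features = [f.reshape(-1, f.shape[-1]) for f in feats]
    return np.concatenate(points), np.concatenate(features)

if __name__ == "__main__":
    K = np.array([[2.0, 0.0, 1.0], [0.0, 2.0, 1.0], [0.0, 0.0, 1.0]])
    depth = np.full((2, 2), 2.0)
    shifted = np.eye(4)
    shifted[:3, 3] = [1.0, 0.0, 0.0]  # second camera moved 1 unit along x
    pts, fts = fuse_views(
        [depth, depth],
        [np.zeros((2, 2, 5)), np.ones((2, 2, 5))],
        [K, K],
        [np.eye(4), shifted],
    )
    print(pts.shape, fts.shape)
```

Because every view is expressed in the same world frame, a 3D network applied to the fused points sees all depth maps jointly. In an iterative scheme such as the one the abstract describes, the depth maps would be updated and the point cloud rebuilt between refinement steps; that loop is omitted here.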
