DeepVideoMVS: Multi-View Stereo on Video with Recurrent Spatio-Temporal Fusion

12/03/2020
by Arda Düzçeker, et al.

We propose an online multi-view depth prediction approach for posed video streams, in which the scene geometry information computed in previous time steps is propagated to the current time step in an efficient and geometrically plausible way. The backbone of our approach is a real-time capable, lightweight encoder-decoder that relies on cost volumes computed from pairs of images. We extend it by placing a ConvLSTM cell at the bottleneck layer, which compresses an arbitrary amount of past information into its states. The novelty lies in propagating the hidden state of the cell while accounting for the viewpoint changes between time steps: at a given time step, we warp the previous hidden state into the current camera plane using the previous depth prediction. Our extension adds only a small overhead in computation time and memory consumption, while improving the depth predictions significantly. As a result, we outperform the existing state-of-the-art multi-view stereo methods on most of the evaluated metrics in hundreds of indoor scenes while maintaining real-time performance. Code is available at https://github.com/ardaduz/deep-video-mvs
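The geometric idea of moving a feature map from the previous camera to the current one can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a simple forward splat with nearest-pixel rounding and a z-buffer, a pinhole intrinsic matrix `K` already scaled to the feature-map resolution, and a 4x4 relative pose `T_cur_from_prev`. The function name and all parameter names are hypothetical.

```python
import numpy as np


def warp_hidden_state(hidden_prev, depth_prev, K, T_cur_from_prev, eps=1e-6):
    """Forward-warp a (C, H, W) feature map from the previous view to the current one.

    Illustrative assumptions: depth_prev is (H, W) in the previous view, K is a
    3x3 pinhole matrix at feature resolution, T_cur_from_prev is a 4x4 rigid
    transform. Pixels that receive no source point are left at zero.
    """
    C, H, W = hidden_prev.shape

    # Homogeneous pixel coordinates of the previous view, shape (3, H*W).
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)]).astype(np.float64)

    # Back-project with the previous depth, then move into the current frame.
    pts_prev = (np.linalg.inv(K) @ pix) * depth_prev.ravel()
    pts_cur = T_cur_from_prev[:3, :3] @ pts_prev + T_cur_from_prev[:3, 3:4]

    # Project into the current image plane and round to the nearest pixel.
    z = pts_cur[2]
    z_safe = np.where(z > eps, z, 1.0)
    proj = K @ pts_cur
    u_cur = np.round(proj[0] / z_safe).astype(np.int64)
    v_cur = np.round(proj[1] / z_safe).astype(np.int64)
    valid = (z > eps) & (u_cur >= 0) & (u_cur < W) & (v_cur >= 0) & (v_cur < H)

    # Z-buffer: for each target pixel keep only the nearest source point.
    idx = np.flatnonzero(valid)
    target = v_cur[idx] * W + u_cur[idx]
    order = np.lexsort((z[idx], target))  # sort by target, then by depth
    _, first = np.unique(target[order], return_index=True)
    src = idx[order][first]

    warped = np.zeros_like(hidden_prev)
    feats = hidden_prev.reshape(C, -1)
    warped[:, v_cur[src], u_cur[src]] = feats[:, src]
    return warped
```

With an identity pose the function returns the input unchanged; with a pure sideways translation over a constant-depth plane it shifts the features by a whole number of pixels and leaves the newly exposed column at zero. A learned pipeline would more likely use a differentiable sampling scheme, which this sketch does not attempt.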


research
02/27/2019

Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference

Deep learning has recently demonstrated its excellent performance for mu...
research
12/09/2021

IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo

We present IterMVS, a new data-driven method for high-resolution multi-v...
research
04/12/2019

Multi-View Stereo by Temporal Nonparametric Fusion

We propose a novel idea for depth estimation from unstructured multi-vie...
research
04/04/2022

RayMVSNet: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo

Learning-based multi-view stereo (MVS) has by far centered around 3D con...
research
12/04/2021

Generalized Binary Search Network for Highly-Efficient Multi-View Stereo

Multi-view Stereo (MVS) with known camera parameters is essentially a 1D...
research
04/22/2022

Multi-view Information Bottleneck Without Variational Approximation

By "intelligently" fusing the complementary information across different...
research
03/09/2023

3D Video Loops from Asynchronous Input

Looping videos are short video clips that can be looped endlessly withou...
