TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers

11/29/2021
by Yikang Ding, et al.

In this paper, we present TransMVSNet, based on our exploration of feature matching in multi-view stereo (MVS). We treat MVS as what it fundamentally is, a feature matching task, and therefore propose a powerful Feature Matching Transformer (FMT) that uses intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images. To help the FMT adapt, we use an Adaptive Receptive Field (ARF) module to ensure a smooth transition in the scope of features, and we bridge different stages with a feature pathway that passes transformed features and gradients across scales. In addition, we apply pair-wise feature correlation to measure similarity between features, and adopt an ambiguity-reducing focal loss to strengthen the supervision. To the best of our knowledge, TransMVSNet is the first attempt to apply Transformers to the task of MVS. Our method achieves state-of-the-art performance on the DTU dataset, the Tanks and Temples benchmark, and the BlendedMVS dataset. The code of our method will be made available at https://github.com/MegviiRobot/TransMVSNet.
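To make the three ingredients named in the abstract more concrete, here is a minimal pure-Python sketch of intra- and inter-attention, pair-wise feature correlation, and a focal loss. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the attention is plain scaled dot-product attention with no learned projections, the correlation is assumed to be a channel-wise inner product per depth hypothesis, and the focal loss is the standard form -(1 - p)^gamma * log(p) applied to the probability of the ground-truth depth hypothesis.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors.

    Assumption: no learned query/key/value projections, for brevity.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in keys])
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(dim)])
    return out


def intra_attention(feats):
    """Self-attention: each feature attends to features of the same image."""
    return attention(feats, feats, feats)


def inter_attention(src_feats, ref_feats):
    """Cross-attention: source-image features attend to reference features."""
    return attention(src_feats, ref_feats, ref_feats)


def pairwise_correlation(ref_feat, src_feats_per_depth):
    """Similarity between one reference feature and the source feature
    sampled at each depth hypothesis (assumed: channel-wise inner product)."""
    return [dot(ref_feat, s) for s in src_feats_per_depth]


def focal_loss(probs, gt_index, gamma=2.0):
    """Standard focal loss on the ground-truth depth hypothesis.

    gamma = 0 reduces to ordinary cross-entropy; larger gamma down-weights
    hypotheses that are already predicted confidently.
    """
    p = probs[gt_index]
    return -((1.0 - p) ** gamma) * math.log(p)


if __name__ == "__main__":
    ref = [[1.0, 0.0], [0.0, 1.0]]   # two reference-image features
    src = [[0.5, 0.5], [1.0, 0.0]]   # two source-image features
    src = inter_attention(intra_attention(src), intra_attention(ref))
    scores = pairwise_correlation(ref[0], src)  # one score per hypothesis
    print(focal_loss(softmax(scores), gt_index=1))
```

In this toy setup each "image" is just a short list of feature vectors; a real network would operate on dense feature maps and on source features warped to the reference view at each depth hypothesis.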

research
05/28/2022

WT-MVSNet: Window-based Transformers for Multi-view Stereo

Recently, Transformers were shown to enhance the performance of multi-vi...
research
08/17/2023

Long-Range Grouping Transformer for Multi-View 3D Reconstruction

Nowadays, transformer networks have demonstrated superior performance in...
research
08/06/2023

Multi-scale Alternated Attention Transformer for Generalized Stereo Matching

Recent stereo matching networks achieve dramatic performance by introdu...
research
05/10/2020

Epipolar Transformers

A common approach to localize 3D human joints in a synchronized and cali...
research
04/24/2023

Explicit Correspondence Matching for Generalizable Neural Radiance Fields

We present a new generalizable NeRF method that is able to directly gene...
research
08/09/2021

AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network

In this paper, we present a novel recurrent multi-view stereo network ba...
research
06/21/2022

Enhancing Multi-view Stereo with Contrastive Matching and Weighted Focal Loss

Learning-based multi-view stereo (MVS) methods have made impressive prog...
