Multi-scale Alternated Attention Transformer for Generalized Stereo Matching

08/06/2023
by Wei Miao, et al.

Recent stereo matching networks achieve remarkable performance by introducing the epipolar line constraint to limit the matching range between the two views. However, in complicated real-world scenarios, feature information drawn from within the epipolar line alone is too weak to support stereo matching. In this paper, we present a simple but highly effective network called the Alternated Attention U-shaped Transformer (AAUformer), which balances the impact of the epipolar line in the dual-view and single-view settings to achieve excellent generalization performance. Compared to other models, our model has two main designs: 1) to better exploit the local semantic features of each single view at the pixel level, we introduce window self-attention to break the limits of intra-row self-attention and completely replace the convolutional network, yielding denser features before cross-matching; 2) we design a multi-scale alternated attention backbone network to extract invariant features and achieve a coarse-to-fine matching process for hard-to-discriminate regions. We performed a series of comparative and ablation studies on several mainstream stereo matching datasets. The results demonstrate that our model achieves state-of-the-art performance on the Scene Flow dataset, and its fine-tuned performance is competitive on the KITTI 2015 dataset. In addition, in cross-dataset generalization experiments on synthetic and real-world datasets, our model outperforms several state-of-the-art works.
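
The contrast between intra-row and window self-attention can be illustrated with a small sketch. This is not the AAUformer implementation: it assumes rectified images (so each image row corresponds to an epipolar line), a single attention head, identity query/key/value projections, and no positional encoding. The helper names (`attend`, `row_groups`, `window_groups`, `grouped_self_attention`) are hypothetical and chosen only for this example.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(tokens):
    # Scaled dot-product self-attention over a list of feature vectors.
    # Assumption: identity Q/K/V projections and a single head.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        out.append([sum(wi * v[c] for wi, v in zip(w, tokens)) for c in range(d)])
    return out

def row_groups(h, w):
    # Intra-row attention: each pixel attends only to pixels in its own row.
    return [[(r, c) for c in range(w)] for r in range(h)]

def window_groups(h, w, win):
    # Window attention: each pixel attends to pixels in its win x win window,
    # which spans several rows.
    groups = []
    for r0 in range(0, h, win):
        for c0 in range(0, w, win):
            groups.append([(r, c)
                           for r in range(r0, min(r0 + win, h))
                           for c in range(c0, min(c0 + win, w))])
    return groups

def grouped_self_attention(feat, groups):
    # Apply self-attention independently inside each group of pixel coordinates.
    h, w = len(feat), len(feat[0])
    out = [[None] * w for _ in range(h)]
    for g in groups:
        res = attend([feat[r][c] for r, c in g])
        for (r, c), v in zip(g, res):
            out[r][c] = v
    return out

# Toy 4x4 feature map whose features vary only across rows.
feat = [[[float(r), 1.0] for _ in range(4)] for r in range(4)]
row_out = grouped_self_attention(feat, row_groups(4, 4))
win_out = grouped_self_attention(feat, window_groups(4, 4, 2))
print(row_out[0][0], win_out[0][0])
```

On this toy input, every pixel in a row carries the same feature, so intra-row attention returns the input unchanged, while 2x2 window attention mixes in information from the neighbouring row. This is only meant to show why restricting single-view attention to one row can leave features uninformative.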

research
04/25/2019

Multi-scale Cross-form Pyramid Network for Stereo Matching

Stereo matching plays an indispensable part in autonomous driving, robot...
research
11/29/2021

TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers

In this paper, we present TransMVSNet, based on our exploration of featu...
research
07/31/2021

Multi-scale Matching Networks for Semantic Correspondence

Deep features have been proven powerful in building accurate dense seman...
research
07/05/2021

What Makes for Hierarchical Vision Transformer?

Recent studies show that hierarchical Vision Transformer with interleave...
research
03/22/2022

Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation

With the advent of convolutional neural networks, stereo matching algori...
research
07/31/2021

Towards Adversarially Robust and Domain Generalizable Stereo Matching by Rethinking DNN Feature Backbones

Stereo matching has recently witnessed remarkable progress using Deep Ne...
research
07/11/2023

ResMatch: Residual Attention Learning for Local Feature Matching

Attention-based graph neural networks have made great progress in featur...
