A Re-Parameterized Vision Transformer (ReVT) for Domain-Generalized Semantic Segmentation

08/25/2023
by Jan-Aike Termöhlen, et al.

The task of semantic segmentation requires a model to assign a semantic label to each pixel of an image. However, the performance of such models degrades when they are deployed in an unseen domain whose data distribution differs from that of the training domain. We present a new augmentation-driven approach to domain generalization for semantic segmentation using a re-parameterized vision transformer (ReVT), in which the weights of multiple models are averaged after training. We evaluate our approach on several benchmark datasets and achieve state-of-the-art mIoU scores of 47.3 and 50.1. At the same time, our method requires fewer parameters and reaches a higher frame rate than the best prior art. It is also easy to implement and, unlike network ensembles, does not add any computational complexity during inference.
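The weight averaging mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which parameters are averaged or with what weighting, so the code below assumes a uniform mean over all parameters of identically structured models. The `Params` format, the function name `average_weights`, and the toy values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical parameter format: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def average_weights(models: Sequence[Params]) -> Params:
    """Return the element-wise mean of the parameters of several models.

    Assumption: all models share the same architecture, i.e. the same
    parameter names and the same number of values per parameter.
    """
    if not models:
        raise ValueError("need at least one model")
    names = models[0].keys()
    for model in models[1:]:
        if model.keys() != names:
            raise ValueError("all models must share the same parameter names")

    averaged: Params = {}
    for name in names:
        tensors = [model[name] for model in models]
        length = len(tensors[0])
        if any(len(t) != length for t in tensors):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # zip(*tensors) groups the i-th value of every model together.
        averaged[name] = [sum(vals) / len(models) for vals in zip(*tensors)]
    return averaged


# Toy example: three "models", e.g. trained with different augmentations.
model_a = {"encoder.weight": [1.0, 2.0], "encoder.bias": [0.0]}
model_b = {"encoder.weight": [3.0, 4.0], "encoder.bias": [1.0]}
model_c = {"encoder.weight": [5.0, 6.0], "encoder.bias": [2.0]}

merged = average_weights([model_a, model_b, model_c])
print(merged)
```

Because the merged result has the same parameter count as any single input model, running it costs one forward pass, whereas an ensemble of the three models would need three. This is consistent with the abstract's claim that the method adds no computational complexity at inference time.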
