Multimodal Transformer for Material Segmentation

09/07/2023
by Md Kaykobad Reza, et al.

Leveraging information across diverse modalities is known to enhance performance on multimodal segmentation tasks. However, effectively fusing information from different modalities remains challenging because each modality has its own characteristics. In this paper, we propose a novel fusion strategy that can effectively fuse information from different combinations of four modalities: RGB, Angle of Linear Polarization (AoLP), Degree of Linear Polarization (DoLP) and Near-Infrared (NIR). We also propose a new model named Multi-Modal Segmentation Transformer (MMSFormer) that incorporates the proposed fusion strategy to perform multimodal material segmentation. MMSFormer achieves 52.05% mIoU on the Multimodal Material Segmentation (MCubeS) dataset. For instance, our method provides a significant improvement in detecting gravel (+10.4%). Ablation studies show that the different modules in the fusion block are crucial for overall model performance. Our ablation studies also highlight how different input modalities improve the identification of different types of materials. The code and pretrained models will be made available at https://github.com/csiplab/MMSFormer.

research
08/31/2022

NestedFormer: Nested Modality-Aware Transformer for Brain Tumor Segmentation

Multi-modal MR imaging is routinely used in clinical practice to diagnos...
research
08/26/2022

TFusion: Transformer based N-to-One Multimodal Fusion Block

People perceive the world with different senses, such as sight, hearing,...
research
04/26/2022

TranSiam: Fusing Multimodal Visual Features Using Transformer for Medical Image Segmentation

Automatic segmentation of medical images based on multi-modality is an i...
research
07/04/2023

H-DenseFormer: An Efficient Hybrid Densely Connected Transformer for Multimodal Tumor Segmentation

Recently, deep learning methods have been widely used for tumor segmenta...
research
11/10/2020

Deep Multimodal Fusion by Channel Exchanging

Deep multimodal fusion by using multiple sources of data for classificat...
research
08/21/2023

Deep Metric Loss for Multimodal Learning

Multimodal learning often outperforms its unimodal counterparts by explo...
research
03/29/2022

Balanced Multimodal Learning via On-the-fly Gradient Modulation

Multimodal learning helps to comprehensively understand the world, by in...
