STM-UNet: An Efficient U-shaped Architecture Based on Swin Transformer and Multi-scale MLP for Medical Image Segmentation

by Lei Shi, et al.

Automated medical image segmentation can help doctors diagnose faster and more accurately. Deep-learning-based models for medical image segmentation have made great progress in recent years. However, existing models do not effectively combine Transformers and MLPs to improve the U-shaped architecture efficiently. In addition, the multi-scale features of the MLP have not been fully extracted in the bottleneck of the U-shaped architecture. In this paper, we propose an efficient U-shaped architecture based on the Swin Transformer and a multi-scale MLP, namely STM-UNet. Specifically, a Swin Transformer block is added to the skip connections of STM-UNet in the form of a residual connection, which enhances the modeling of global features and long-range dependencies. Meanwhile, a novel PCAS-MLP with a parallel convolution module is designed and placed in the bottleneck of our architecture to improve segmentation performance. Experimental results on ISIC 2016 and ISIC 2018 demonstrate the effectiveness of the proposed method, which outperforms several state-of-the-art methods in terms of IoU and Dice and achieves a better trade-off between high segmentation accuracy and low model complexity.
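The two metrics named in the abstract, IoU and Dice, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `iou_and_dice`, the flat-list mask representation, and the convention that two empty masks score 1.0 are all assumptions made here for illustration. It uses the standard definitions IoU = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|).

```python
from typing import Sequence, Tuple


def iou_and_dice(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Return (IoU, Dice) for two flattened binary masks of equal length.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


# A 2x4 predicted mask and ground-truth mask, flattened row by row.
pred = [1, 1, 0, 0,
        1, 1, 0, 0]
target = [1, 0, 0, 0,
          1, 1, 1, 0]

iou, dice = iou_and_dice(pred, target)
print(f"IoU = {iou:.2f}, Dice = {dice:.2f}")  # IoU = 0.60, Dice = 0.75
```

For a single pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU), so they rank that pair identically. Averages over a dataset can still differ, which is one reason papers commonly report both.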




