
nnFormer: Interleaved Transformer for Volumetric Segmentation

by   Hong-Yu Zhou, et al.
Xiamen University
Association for Computing Machinery
The University of Hong Kong

Transformers, the default model of choice in natural language processing, have drawn scant attention from the medical imaging community. Given their ability to exploit long-term dependencies, transformers are promising candidates for helping typical convolutional neural networks (convnets) overcome their inherent shortcomings of spatial inductive bias. However, most recently proposed transformer-based segmentation approaches simply treat transformers as assisting modules that encode global context into convolutional representations, without investigating how to optimally combine self-attention (i.e., the core of transformers) with convolution. To address this issue, in this paper we introduce nnFormer (i.e., not-another transFormer), a powerful segmentation model with an interleaved architecture based on an empirical combination of self-attention and convolution. In practice, nnFormer learns volumetric representations from 3D local volumes. Compared to the naive voxel-level self-attention implementation, such volume-based operations help to reduce the computational complexity by approximately 98% and 99.5% on the Synapse and ACDC datasets, respectively. In comparison to prior network configurations, nnFormer achieves tremendous improvements over previous transformer-based methods on the two commonly used datasets Synapse and ACDC. For instance, nnFormer outperforms Swin-UNet by over 7 percent on Synapse. Even when compared to nnUNet, currently the best-performing fully convolutional medical segmentation network, nnFormer still provides slightly better performance on Synapse and ACDC.
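The complexity reduction claimed above comes from restricting self-attention to local 3D volumes (windows) instead of letting every voxel attend to every other voxel. The sketch below illustrates the arithmetic behind that saving; the feature-map and window sizes are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: count attention pairs for global voxel-level self-attention
# versus local 3D-volume (window-based) self-attention. Sizes are assumed for
# illustration and do not reflect nnFormer's actual hyperparameters.

def attention_pairs_global(d: int, h: int, w: int) -> int:
    """Naive voxel-level attention: every voxel attends to every voxel."""
    n = d * h * w
    return n * n

def attention_pairs_windowed(d: int, h: int, w: int,
                             wd: int, wh: int, ww: int) -> int:
    """Volume-based attention: voxels attend only within their local window."""
    num_windows = (d // wd) * (h // wh) * (w // ww)
    tokens_per_window = wd * wh * ww
    return num_windows * tokens_per_window * tokens_per_window

# Example: a 64x128x128 feature map with 4x4x4 local volumes (assumptions).
g = attention_pairs_global(64, 128, 128)
l = attention_pairs_windowed(64, 128, 128, 4, 4, 4)
print(f"relative reduction in attention pairs: {1 - l / g:.4%}")
```

Because the global cost grows quadratically in the total number of voxels while the windowed cost grows only linearly (quadratic only in the fixed window size), the saving exceeds 99% at realistic volumetric resolutions, which is consistent in spirit with the ~98-99.5% figures quoted in the abstract.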




