STU-Net: Scalable and Transferable Medical Image Segmentation Models Empowered by Large-Scale Supervised Pre-training

by Ziyan Huang, et al.

Large-scale models pre-trained on large-scale datasets have profoundly advanced the development of deep learning. However, state-of-the-art models for medical image segmentation are still small-scale, with parameter counts only in the tens of millions. Scaling them up by further orders of magnitude has rarely been explored. An overarching goal of exploring large-scale models is to train them on large-scale medical segmentation datasets for better transfer capacity. In this work, we design a series of Scalable and Transferable U-Net (STU-Net) models, with parameter sizes ranging from 14 million to 1.4 billion. Notably, the 1.4B STU-Net is the largest medical image segmentation model to date. Our STU-Net is based on the nnU-Net framework because of its popularity and impressive performance. We first refine the default convolutional blocks in nnU-Net to make them scalable. Then, we empirically evaluate different scaling combinations of network depth and width, finding that it is optimal to scale model depth and width together. We train our scalable STU-Net models on the large-scale TotalSegmentator dataset and find that increasing model size brings a stronger performance gain. This observation suggests that large models are promising for medical image segmentation. Furthermore, we evaluate the transferability of our model on 14 downstream datasets for direct inference and 3 datasets for further fine-tuning, covering various modalities and segmentation targets. We observe good performance of our pre-trained model in both direct inference and fine-tuning. The code and pre-trained models are available at
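To make the depth and width scaling mentioned in the abstract concrete, the following is a minimal sketch of how the parameter count of a plain stack of 3D convolutions grows when depth (convolutions per stage) and width (channels per stage) are increased. The helper names `conv3d_params` and `encoder_params`, the stage widths, and the 3x3x3 kernel are illustrative assumptions; they are not the STU-Net configuration or code.

```python
def conv3d_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights plus biases of one 3D convolution with a k*k*k kernel."""
    return c_in * c_out * k ** 3 + c_out


def encoder_params(widths, depth: int, in_channels: int = 1) -> int:
    """Parameter count of a toy encoder (hypothetical, not STU-Net).

    widths: output channels of each stage (the "width" axis).
    depth:  number of convolutions stacked in each stage (the "depth" axis).
    """
    total = 0
    prev = in_channels
    for width in widths:
        for _ in range(depth):
            total += conv3d_params(prev, width)
            prev = width
    return total


if __name__ == "__main__":
    base = encoder_params([4, 8], depth=1)
    deeper = encoder_params([4, 8], depth=2)   # scale depth only
    wider = encoder_params([8, 16], depth=1)   # scale width only
    both = encoder_params([8, 16], depth=2)    # scale depth and width together
    print(base, deeper, wider, both)
```

In this toy setting, doubling the width multiplies most convolution weights by about four, while adding depth adds whole extra layers; the sketch only illustrates the parameter budget and says nothing about which combination gives the best segmentation accuracy.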




Segment Anything in Medical Images

Segment anything model (SAM) has revolutionized natural image segmentati...

From Patch to Image Segmentation using Fully Convolutional Networks - Application to Retinal Images

In general, deep learning based models require a tremendous amount of sa...

A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation

Although deep learning has revolutionized abdominal multi-organ segment...

MedNeXt: Transformer-driven Scaling of ConvNets for Medical Image Segmentation

There has been exploding interest in embracing Transformer-based archite...

Prompt-based Tuning of Transformer Models for Multi-Center Medical Image Segmentation

Medical image segmentation is a vital healthcare endeavor requiring prec...

UniSeg: A Prompt-driven Universal Segmentation Model as well as A Strong Representation Learner

The universal model emerges as a promising trend for medical image segme...

Evaluate Fine-tuning Strategies for Fetal Head Ultrasound Image Segmentation with U-Net

Fetal head segmentation is a crucial step in measuring the fetal head ci...
