Revisiting Fine-Tuning Strategies for Self-supervised Medical Imaging Analysis
Despite the rapid progress in self-supervised learning (SSL), end-to-end fine-tuning remains the dominant strategy for adapting pre-trained models to medical imaging analysis. However, it is unclear whether this approach is truly optimal for utilizing the pre-trained knowledge, especially considering the diverse categories of SSL that capture different types of features. In this paper, we first establish strong contrastive and restorative SSL baselines that outperform SOTA methods across four diverse downstream tasks. Building upon these strong baselines, we conduct an extensive fine-tuning analysis across multiple pre-training and fine-tuning datasets, as well as various fine-tuning dataset sizes. Contrary to the conventional wisdom of fine-tuning only the last few layers of a pre-trained network, we show that fine-tuning intermediate layers is more effective: fine-tuning the second quarter (25-50%) of the network is optimal for contrastive SSL, whereas fine-tuning the third quarter (50-75%) is optimal for restorative SSL. Moreover, our best fine-tuning strategy, which fine-tunes a shallower network consisting of the first three quarters (0-75%) of the pre-trained network, outperforms the de-facto standard of end-to-end fine-tuning. Additionally, using these insights, we propose a simple yet effective method to leverage the complementary strengths of multiple SSL models, resulting in enhancements of up to 3.57%. Hence, our fine-tuning strategies not only enhance the performance of individual SSL models, but also enable effective utilization of the complementary strengths offered by multiple SSL models, leading to significant improvements in self-supervised medical imaging analysis.
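A minimal PyTorch sketch of the kind of quarter-wise partial fine-tuning described above: all parameters are frozen except those in a chosen fraction of the backbone's block sequence (here the second quarter, as suggested for contrastive SSL). The ResNet-50 stand-in backbone, the block grouping, and the helper select_quarter are illustrative assumptions, not the paper's actual implementation.

# Hypothetical sketch: fine-tune only a chosen quarter of a pre-trained
# encoder's blocks, keeping the remaining parameters frozen.
import torch
import torch.nn as nn
import torchvision.models as models

def select_quarter(blocks, start_frac, end_frac):
    """Return the sub-list of blocks lying in [start_frac, end_frac) of the depth."""
    n = len(blocks)
    lo, hi = int(start_frac * n), int(end_frac * n)
    return blocks[lo:hi]

# Stand-in backbone; in practice this would be the SSL pre-trained encoder.
backbone = models.resnet50(weights=None)
blocks = [backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4]

# Freeze everything first.
for p in backbone.parameters():
    p.requires_grad = False

# Unfreeze only the second quarter (25-50%) of the block sequence,
# the setting found effective for contrastive SSL in the abstract.
for block in select_quarter(blocks, 0.25, 0.50):
    for p in block.parameters():
        p.requires_grad = True

# The downstream task head is always trained (new Linear params default to trainable).
num_classes = 4  # placeholder for the downstream task
backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)

optimizer = torch.optim.Adam(
    [p for p in backbone.parameters() if p.requires_grad], lr=1e-4
)

The same helper could be called with (0.50, 0.75) for a restorative SSL model, or the fourth quarter of blocks could be dropped altogether to mimic the truncated 0-75% network mentioned in the abstract; the exact layer grouping used in the paper may differ.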