Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space

07/07/2022
by Wenqi Shao, et al.

This paper addresses the important problem of ranking pre-trained deep neural networks and screening the most transferable ones for downstream tasks. The problem is challenging because the ground-truth model ranking for each task can only be obtained by fine-tuning every pre-trained model on the target dataset, which is brute-force and computationally expensive. Recent methods have proposed several lightweight transferability metrics to predict fine-tuning results, but these approaches capture only static representations and neglect the fine-tuning dynamics. To this end, this paper proposes a new transferability metric called Self-challenging Fisher Discriminant Analysis (SFDA), which has several appealing properties that existing methods lack. First, SFDA embeds static features into a Fisher space and refines them for better separability between classes. Second, SFDA uses a self-challenging mechanism to encourage different pre-trained models to differentiate themselves on hard examples. Third, SFDA makes it easy to select multiple pre-trained models for a model ensemble. Extensive experiments on 33 pre-trained models across 11 downstream tasks show that SFDA is efficient, effective, and robust in measuring the transferability of pre-trained models. For instance, compared with the state-of-the-art method NLEEP, SFDA achieves an average gain of 59.1% while delivering a 22.5x speedup in wall-clock time. The code will be available at <https://github.com/TencentARC/SFDA>.

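The Fisher-space idea behind the metric can be illustrated with a small sketch. The snippet below is a simplified illustration rather than the released SFDA implementation: it uses scikit-learn's LinearDiscriminantAnalysis as a stand-in for the Fisher projection and scores a candidate model by how confidently its frozen features classify the target data after that projection. The function name `score_fisher_separability` and the synthetic usage data are assumptions for illustration only; the self-challenging reweighting of hard examples described in the abstract is omitted here.

```python
# Simplified illustration of Fisher-space transferability scoring. This is an
# assumed sketch, not the official SFDA code from https://github.com/TencentARC/SFDA.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def score_fisher_separability(features: np.ndarray, labels: np.ndarray) -> float:
    """Return a transferability proxy: the mean posterior probability of the
    true class after projecting frozen features into a discriminant (Fisher) space."""
    lda = LinearDiscriminantAnalysis()
    lda.fit(features, labels)                    # learn the discriminant projection
    probs = lda.predict_proba(features)          # class posteriors in that space
    idx = np.searchsorted(lda.classes_, labels)  # map labels to probability columns
    return float(probs[np.arange(len(labels)), idx].mean())

# Hypothetical usage: extract features from each candidate pre-trained model on
# the target dataset, then rank models by this score (higher = more transferable).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(200, 64))           # stand-in for frozen model features
    labs = rng.integers(0, 5, size=200)          # stand-in for target-task labels
    print(score_fisher_separability(feats, labs))
```

In this simplified view, a model whose frozen features separate the target classes well in the Fisher space receives a higher score and is predicted to fine-tune better, which is the intuition the abstract describes.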

Related research

08/29/2023 - Exploring Model Transferability through the Lens of Potential Energy
Transfer learning has become crucial in computer vision tasks due to the...

08/11/2023 - Foundation Model is Efficient Multimodal Multitask Model Selector
This paper investigates an under-explored but important problem: given a...

10/17/2022 - ZooD: Exploiting Model Zoo for Out-of-Distribution Generalization
Recent advances on large-scale pre-training have shown great potentials...

02/22/2021 - LogME: Practical Assessment of Pre-trained Models for Transfer Learning
This paper studies task adaptive pre-trained model selection, an underex...

08/03/2023 - ETran: Energy-Based Transferability Estimation
This paper addresses the problem of ranking pre-trained models for objec...

09/14/2021 - YES SIR! Optimizing Semantic Space of Negatives with Self-Involvement Ranker
Pre-trained model such as BERT has been proved to be an effective tool f...

03/16/2023 - Patch-Token Aligned Bayesian Prompt Learning for Vision-Language Models
For downstream applications of vision-language pre-trained models, there...
