LogME: Practical Assessment of Pre-trained Models for Transfer Learning

02/22/2021
by Kaichao You, et al.

This paper studies task-adaptive pre-trained model selection, an underexplored problem of assessing pre-trained models so that models suitable for a target task can be selected from the model zoo without fine-tuning. A pilot work <cit.> addressed the problem of transferring supervised pre-trained models to classification tasks, but it cannot handle emerging unsupervised pre-trained models or regression tasks. In pursuit of a practical assessment method, we propose to estimate the maximum evidence (marginalized likelihood) of labels given the features extracted by a pre-trained model. The maximum evidence is less prone to over-fitting than the likelihood, and its expensive computation can be dramatically reduced by our carefully designed algorithm. The Logarithm of Maximum Evidence (LogME) can be used to assess pre-trained models for transfer learning: a pre-trained model with a high LogME value is likely to have good transfer performance. LogME is fast, accurate, and general, making it the first practical assessment method for transfer learning. Compared to brute-force fine-tuning, LogME brings over 3000× speedup in wall-clock time. It outperforms prior methods by a large margin in their setting, applies to new settings that prior methods cannot handle, and generalizes to diverse pre-trained models (supervised and unsupervised), downstream tasks (classification and regression), and modalities (vision and language). Code is at <https://github.com/thuml/LogME>.
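To make the idea concrete, below is a minimal sketch (not the authors' released code; see the repository above for that) of a LogME-style score for a single regression target. It uses the standard Bayesian linear model y = w^T f + eps with w ~ N(0, alpha^{-1} I) and eps ~ N(0, beta^{-1}), the textbook evidence-maximization (MacKay) fixed-point updates for alpha and beta, and an economy SVD of the feature matrix so each iteration is cheap. The function name `logme_score`, the defaults, and the numerical guards are our assumptions for illustration.

```python
import numpy as np

def logme_score(features, labels, max_iter=100, tol=1e-6):
    """Sketch of a LogME-style score for one scalar target.

    features: (n, d) array of fixed features from a pre-trained model.
    labels:   (n,) array of scalar targets.
    Returns the per-sample log maximum evidence log p(y | F) / n.
    """
    f = np.asarray(features, dtype=np.float64)
    y = np.asarray(labels, dtype=np.float64)
    n, d = f.shape

    # Economy SVD: every quantity below reduces to O(min(n, d)) work per step.
    u, s, _ = np.linalg.svd(f, full_matrices=False)
    s2 = s ** 2
    uy = u.T @ y            # y projected onto the left singular vectors
    y2 = y @ y
    alpha, beta = 1.0, 1.0  # precisions of the weight prior and the noise

    for _ in range(max_iter):
        t = alpha + beta * s2
        gamma = np.sum(beta * s2 / t)                    # effective #parameters
        m2 = np.sum(beta ** 2 * s2 * uy ** 2 / t ** 2)   # ||m||^2 (posterior mean)
        res2 = np.sum((alpha / t) ** 2 * uy ** 2) + (y2 - uy @ uy)  # ||F m - y||^2
        alpha_new = gamma / (m2 + 1e-12)                 # MacKay fixed-point updates
        beta_new = (n - gamma) / (res2 + 1e-12)
        converged = (abs(alpha_new - alpha) / alpha < tol and
                     abs(beta_new - beta) / beta < tol)
        alpha, beta = alpha_new, beta_new
        if converged:
            break

    # Log evidence at the fixed point (Bayesian linear regression):
    # log p(y|F) = d/2 ln a + n/2 ln b - n/2 ln 2pi - b/2 ||Fm-y||^2
    #              - a/2 ||m||^2 - 1/2 ln|A|,  A = a I_d + b F^T F
    t = alpha + beta * s2
    m2 = np.sum(beta ** 2 * s2 * uy ** 2 / t ** 2)
    res2 = np.sum((alpha / t) ** 2 * uy ** 2) + (y2 - uy @ uy)
    logdet_a = np.sum(np.log(t)) + (d - len(s2)) * np.log(alpha)
    evidence = (d / 2) * np.log(alpha) + (n / 2) * np.log(beta) \
        - (n / 2) * np.log(2 * np.pi) \
        - (beta / 2) * res2 - (alpha / 2) * m2 - 0.5 * logdet_a
    return evidence / n
```

For a K-way classification task one would score each one-hot label column separately and average the K values; to rank a model zoo, extract features once per model and pick the model with the highest score. The released implementation additionally includes the paper's carefully tuned fixed-point algorithm and engineering optimizations, so its exact numerics may differ from this sketch.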


Related research

TransTailor: Pruning the Pre-trained Model for Improved Transfer Learning (03/02/2021)
The increasing number of pre-trained models has significantly facilitated the p...

Ranking and Tuning Pre-trained Models: A New Paradigm of Exploiting Model Hubs (10/20/2021)
Pre-trained model hubs with many pre-trained models (PTMs) have been a c...

Exploring Model Transferability through the Lens of Potential Energy (08/29/2023)
Transfer learning has become crucial in computer vision tasks due to the...

Deep Ensembles for Low-Data Transfer Learning (10/14/2020)
In the low-data regime, it is difficult to train good supervised models ...

ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse (08/17/2023)
The rapid expansion of foundation pre-trained models and their fine-tune...

Duality Diagram Similarity: a generic framework for initialization selection in task transfer learning (08/05/2020)
In this paper, we tackle an open research question in transfer learning,...

Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space (07/07/2022)
This paper addresses an important problem of ranking the pre-trained dee...
