Where to start? Analyzing the potential value of intermediate models

10/31/2022
by Leshem Choshen, et al.

Previous studies observed that finetuned models may be better base models than the vanilla pretrained model: a model finetuned on some source dataset may provide a better starting point for a new finetuning process on a desired target dataset. Here, we perform a systematic analysis of this intertraining scheme over a wide range of English classification tasks. Surprisingly, our analysis suggests that the potential intertraining gain can be analyzed independently for the target dataset under consideration and for the base model being considered as a starting point. This is in contrast to the current perception that the alignment between the target dataset and the source dataset used to generate the base model is a major factor in determining intertraining success. We analyze the different aspects that contribute to each. Furthermore, we leverage our analysis to propose a practical and efficient approach to determine if and how to select a base model in real-world settings. Finally, we release a continuously updated ranking of the best models in the HuggingFace hub per architecture at https://ibm.github.io/model-recycling/.
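The intertraining scheme the abstract describes (finetune a pretrained model on a source dataset, then use the result as the starting point for finetuning on the target dataset) can be illustrated with a toy logistic-regression sketch. This is a conceptual illustration, not the paper's experimental setup; the tasks, model, and hyperparameters below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(w_true, n=200):
    # Toy binary classification task whose labels are defined by w_true.
    X = rng.normal(size=(n, 5))
    y = (X @ w_true > 0).astype(float)
    return X, y

def finetune(w, X, y, lr=0.1, steps=200):
    # Logistic-regression gradient descent, starting from weights w.
    w = w.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return float(((X @ w > 0) == (y > 0.5)).mean())

# A source task and a related target task (their weight vectors are close).
w_source_task = rng.normal(size=5)
w_target_task = w_source_task + 0.3 * rng.normal(size=5)

Xs, ys = make_task(w_source_task)           # plentiful source data
Xt, yt = make_task(w_target_task, n=20)     # small target training set
Xt_test, yt_test = make_task(w_target_task, n=500)

w0 = np.zeros(5)                            # stand-in for the pretrained model

direct = finetune(w0, Xt, yt)               # vanilla: pretrained -> target
base = finetune(w0, Xs, ys)                 # intermediate model: pretrained -> source
intertrained = finetune(base, Xt, yt)       # intertraining: source model -> target

print("direct finetuning:", accuracy(direct, Xt_test, yt_test))
print("intertraining:    ", accuracy(intertrained, Xt_test, yt_test))
```

When the source task is related to the target and target data is scarce, the intertrained start tends to help; the paper's surprising finding is that in practice the gain decomposes into a target-side term and a base-model-side term rather than depending primarily on source-target alignment.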


Related research

- Fusing finetuned models for better pretraining (04/06/2022)
- An empirical study of pretrained representations for few-shot classification (10/03/2019)
- Transfer learning for time series classification using synthetic data generation (07/16/2022)
- DePT: Decoupled Prompt Tuning (09/14/2023)
- Miðeind's WMT 2021 submission (09/15/2021)
- A Systematic Evaluation of Transfer Learning and Pseudo-labeling with BERT-based Ranking Models (03/04/2021)
- Towards Sustainable Self-supervised Learning (10/20/2022)
