Fusing finetuned models for better pretraining

04/06/2022
by Leshem Choshen, et al.

Pretrained models are the standard starting point for training, and this approach consistently outperforms random initialization. However, pretraining is a costly endeavour that few can undertake. In this paper, we create better base models at hardly any cost by fusing multiple existing finetuned models into one. Specifically, we fuse by averaging the weights of these models. We show that results with the fused model surpass those with the pretrained model. We also show that fusing is often better than intertraining, and that fusing is less dependent on the target task. Furthermore, weight decay nullifies the effects of intertraining but not those of fusing.
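As a rough illustration of the averaging step described in the abstract, the following is a minimal sketch assuming PyTorch-style state dicts from finetuned checkpoints that share the same architecture; the checkpoint paths and helper names are hypothetical, not the authors' code.

```python
# Minimal sketch: fuse several finetuned models by averaging their weights
# (assumes PyTorch state dicts with identical keys and shapes).
import torch


def fuse_by_averaging(state_dicts):
    """Return a state dict whose parameters are the elementwise mean
    of the corresponding parameters in the given state dicts."""
    fused = {}
    for name in state_dicts[0]:
        # Stack the corresponding tensors from each model and average them.
        fused[name] = torch.stack(
            [sd[name].float() for sd in state_dicts]
        ).mean(dim=0)
    return fused


# Hypothetical usage: load several finetuned checkpoints, fuse them, and use
# the result as the base model for further finetuning.
# checkpoints = [torch.load(p, map_location="cpu") for p in checkpoint_paths]
# base_model.load_state_dict(fuse_by_averaging(checkpoints))
```

The averaging is done parameter by parameter, so the fused model has the same architecture and size as each of its inputs and can be loaded and finetuned exactly like an ordinary pretrained checkpoint.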

Related research

- Multi-pretrained Deep Neural Network (06/02/2016): Pretraining is widely used in deep neural network and one of the most f...
- A generalized intelligent quality-based approach for fusing multi-source information (10/11/2019): In this paper, we propose a generalized intelligent quality-based approa...
- Where to start? Analyzing the potential value of intermediate models (10/31/2022): Previous studies observed that finetuned models may be better base model...
- Representation Deficiency in Masked Language Modeling (02/04/2023): Masked Language Modeling (MLM) has been one of the most prominent approa...
- Prompt Tuning for Generative Multimodal Pretrained Models (08/04/2022): Prompt tuning has become a new paradigm for model tuning and it has demo...
- Cost-Performance Tradeoffs in Fusing Unreliable Computational Units (05/22/2017): We investigate fusing several unreliable computational units that perfor...
- Context Exploitation using Hierarchical Bayesian Models (05/30/2018): We consider the problem of how to improve automatic target recognition b...
