Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning

10/17/2022
by Dongze Lian, et al.

Existing fine-tuning methods either tune all parameters of the pre-trained model (full fine-tuning), which is not efficient, or only tune the last linear layer (linear probing), which suffers a significant accuracy drop compared to full fine-tuning. In this paper, we propose a new parameter-efficient fine-tuning method termed SSF, meaning that one only needs to Scale and Shift the deep Features extracted by a pre-trained model to match the performance of full fine-tuning. Surprisingly, SSF also outperforms other parameter-efficient fine-tuning approaches even with a smaller number of tunable parameters. Furthermore, unlike some existing parameter-efficient fine-tuning methods (e.g., Adapter or VPT) that introduce extra parameters and computational cost in both the training and inference stages, SSF only adds learnable parameters during training, and these additional parameters can be merged into the original pre-trained model weights via re-parameterization in the inference phase. With the proposed SSF, our model obtains a 2.46% Top-1 accuracy improvement on FGVC and VTAB-1k compared to full fine-tuning while tuning only about 0.3M parameters. We also conduct extensive experiments across various model families (CNNs, Transformers, and MLPs) and datasets. Results on 26 image classification datasets in total and 3 robustness and out-of-distribution datasets demonstrate the effectiveness of SSF. Code is available at https://github.com/dongzelian/SSF.
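The core idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the module and function names (SSFScaleShift, merge_into_linear) are assumptions for the example. During training, each feature channel is modulated as y = gamma * x + beta with learnable gamma and beta; at inference, when the feature comes from a linear layer x = W h + b, the scale and shift can be folded into that layer as W' = diag(gamma) W and b' = gamma * b + beta, so no extra parameters or compute remain.

```python
# Hedged sketch of SSF-style scale-and-shift plus re-parameterization.
# Names and structure are illustrative, not taken from the official repo.
import torch
import torch.nn as nn


class SSFScaleShift(nn.Module):
    """Per-channel scale (gamma) and shift (beta) applied to features."""

    def __init__(self, dim: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(dim))   # identity scale at init
        self.beta = nn.Parameter(torch.zeros(dim))   # zero shift at init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., dim); gamma/beta broadcast over leading dimensions
        return x * self.gamma + self.beta


def merge_into_linear(linear: nn.Linear, ssf: SSFScaleShift) -> nn.Linear:
    """Fold y = gamma * (W h + b) + beta into one layer:
    W' = diag(gamma) @ W,  b' = gamma * b + beta."""
    merged = nn.Linear(linear.in_features, linear.out_features, bias=True)
    with torch.no_grad():
        merged.weight.copy_(ssf.gamma.unsqueeze(1) * linear.weight)
        bias = linear.bias if linear.bias is not None else torch.zeros(linear.out_features)
        merged.bias.copy_(ssf.gamma * bias + ssf.beta)
    return merged


if __name__ == "__main__":
    # Quick check: the merged layer reproduces the train-time computation.
    lin, ssf = nn.Linear(8, 16), SSFScaleShift(16)
    nn.init.normal_(ssf.gamma)
    nn.init.normal_(ssf.beta)
    h = torch.randn(4, 8)
    assert torch.allclose(ssf(lin(h)), merge_into_linear(lin, ssf)(h), atol=1e-6)
```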

