Parameter-efficient Fine-tuning for Vision Transformers

03/29/2022
by Xuehai He, et al.

In computer vision, adapting large-scale pretrained vision models (e.g., the Vision Transformer) to downstream tasks via fine-tuning has achieved great success. Common approaches to fine-tuning either update all model parameters or use linear probes. In this paper, we study parameter-efficient fine-tuning strategies for Vision Transformers on vision tasks. We formulate efficient fine-tuning as a subspace training problem and perform a comprehensive benchmark of different efficient fine-tuning methods. We conduct an empirical study of each efficient fine-tuning method, focusing on its performance alongside its parameter cost. Furthermore, we propose a parameter-efficient fine-tuning framework that first selects submodules by measuring their local intrinsic dimensions and then projects them into a subspace for further decomposition via a novel Kronecker Adaptation method. We analyze and compare our method with a diverse set of baseline fine-tuning methods (including state-of-the-art methods for pretrained language models). Our method achieves the best tradeoff between accuracy and parameter efficiency across three commonly used image classification datasets.
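The abstract does not spell out the exact decomposition, so the following is only a minimal PyTorch sketch of the general idea: keep a pretrained linear layer frozen and train a small weight update expressed as a Kronecker product of a tiny matrix with a low-rank factor. The class name KroneckerAdapterLinear and the block/rank hyperparameters are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a Kronecker-product adapter for a frozen linear layer,
# assuming the update is parameterized as delta_W = A kron (u @ v),
# with A small and (u @ v) low-rank. Hypothetical, not the paper's code.
import torch
import torch.nn as nn


class KroneckerAdapterLinear(nn.Module):
    def __init__(self, base: nn.Linear, block: int = 4, rank: int = 1):
        super().__init__()
        out_f, in_f = base.weight.shape
        assert out_f % block == 0 and in_f % block == 0, "dims must divide by block"
        self.base = base
        for p in self.base.parameters():            # freeze pretrained weights
            p.requires_grad = False
        # Trainable factors; v starts at zero so the update is zero at init,
        # in the spirit of LoRA-style adapters.
        self.A = nn.Parameter(torch.randn(block, block) * 0.01)
        self.u = nn.Parameter(torch.randn(out_f // block, rank) * 0.01)
        self.v = nn.Parameter(torch.zeros(rank, in_f // block))

    def delta_weight(self) -> torch.Tensor:
        # Kronecker product expands the small factors to a full (out_f x in_f)
        # update while keeping the trainable parameter count tiny.
        return torch.kron(self.A, self.u @ self.v)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + nn.functional.linear(x, self.delta_weight())


# Usage example: adapt one projection of a ViT-sized layer (dim 768).
proj = nn.Linear(768, 768)
adapted = KroneckerAdapterLinear(proj, block=4, rank=1)
y = adapted(torch.randn(2, 197, 768))               # (batch, tokens, dim)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(y.shape, trainable)                            # only A, u, v are trainable
```

Under these assumptions, the trainable parameters per adapted layer are block*block + rank*(out_f/block + in_f/block), a small fraction of the full weight matrix.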

research 10/07/2022
Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks
Adapting large-scale pretrained models to various downstream tasks via f...

research 03/28/2023
Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
This paper presents a systematic overview and comparison of parameter-ef...

research 12/12/2022
Parameter-Efficient Finetuning of Transformers for Source Code
Pretrained Transformers achieve state-of-the-art performance in various ...

research 05/24/2023
READ: Recurrent Adaptation of Large Transformers
Fine-tuning large-scale Transformers has led to the explosion of many AI...

research 07/06/2023
Parameter-Efficient Fine-Tuning of LLaMA for the Clinical Domain
Adapting pretrained language models to novel domains, such as clinical a...

research 07/25/2023
E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning
As the size of transformer-based models continues to grow, fine-tuning t...

research 11/17/2022
How to Fine-Tune Vision Models with SGD
SGD (with momentum) and AdamW are the two most used optimizers for fine-...
