FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer

12/06/2022
by Shibo Jie, et al.

Recent work has explored the potential to adapt a pre-trained vision transformer (ViT) by updating only a few parameters so as to improve storage efficiency, an approach called parameter-efficient transfer learning (PETL). Current PETL methods have shown that by tuning only 0.5% of the parameters, ViT can be adapted to downstream tasks with even better performance than full fine-tuning. In this paper, we aim to further promote the efficiency of PETL to meet the extreme storage constraints of real-world applications. To this end, we propose a tensorization-decomposition framework to store the weight increments, in which the weights of each ViT are tensorized into a single 3D tensor, and their increments are then decomposed into lightweight factors. In the fine-tuning process, only the factors need to be updated and stored, a scheme we term Factor-Tuning (FacT). On the VTAB-1K benchmark, our method performs on par with NOAH, the state-of-the-art PETL method, while being 5x more parameter-efficient. We also present a tiny version that uses only 8K parameters (0.01% of ViT's parameters) yet outperforms full fine-tuning and many other PETL methods such as VPT and BitFit. In few-shot settings, FacT also beats all PETL baselines while using the fewest parameters, demonstrating its strong capability in the low-data regime.
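The key observation behind such a framework is that per-layer weight increments can share structure, so instead of storing a full dense d x d update for every weight matrix, one stores a few small factors, some shared across layers. As a rough illustration, below is a minimal PyTorch sketch assuming a simple cross-layer low-rank form ΔW_l = s · U Σ_l Vᵀ, with U, V shared across layers and a tiny per-layer r x r core Σ_l. This is a hypothetical simplification for intuition: the actual FacT decompositions (tensor-train and Tucker over the stacked 3D weight tensor) differ in detail, and all names below are illustrative rather than the paper's code.

```python
import torch
import torch.nn as nn

class FactorizedIncrement(nn.Module):
    """Cross-layer low-rank factorization of weight increments (sketch only).

    All layers share the factors U and V; each layer owns only a tiny
    r x r core, so the stored parameters scale as O(d*r + L*r*r) instead
    of O(L*d*d). This mimics the spirit of FacT, not its exact algorithm.
    """

    def __init__(self, dim: int, num_layers: int, rank: int = 8, scale: float = 0.1):
        super().__init__()
        self.scale = scale
        # Factors shared by all layers. V starts at zero so every
        # reconstructed increment is zero at initialization.
        self.U = nn.Parameter(torch.randn(dim, rank) * 0.02)
        self.V = nn.Parameter(torch.zeros(dim, rank))
        # One lightweight r x r core per layer.
        self.cores = nn.Parameter(torch.zeros(num_layers, rank, rank))

    def delta(self, layer: int) -> torch.Tensor:
        """Reconstruct the dense d x d increment for one layer on the fly."""
        return self.scale * self.U @ self.cores[layer] @ self.V.T

    def apply_to(self, weight: torch.Tensor, layer: int) -> torch.Tensor:
        """Frozen pre-trained weight plus its factorized increment."""
        return weight + self.delta(layer)

# Example in a ViT-B/16-like setting (d=768, 12 blocks): roughly 13K
# trainable values here, versus ~7M for dense d x d increments per layer.
fact = FactorizedIncrement(dim=768, num_layers=12, rank=8)
w_pretrained = torch.randn(768, 768)   # stands in for a frozen ViT weight
w_adapted = fact.apply_to(w_pretrained, layer=0)
print(w_adapted.shape)                 # torch.Size([768, 768])
```

Because V is initialized to zero, fine-tuning starts exactly at the pre-trained weights; during training only the factors receive gradients while the backbone stays frozen, and only the factors need to be stored per task.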

Related research

10/31/2022 · AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning
Standard fine-tuning of large pre-trained language models (PLMs) for dow...

09/15/2023 · SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels
Pre-trained vision transformers have strong representation benefits to v...

04/04/2023 · Strong Baselines for Parameter Efficient Few-Shot Fine-tuning
Few-shot classification (FSC) entails learning novel classes given only ...

08/15/2022 · Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets
While parameter efficient tuning (PET) methods have shown great potentia...

07/31/2023 · Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy
Current state-of-the-art results in computer vision depend in part on fi...

09/11/2023 · Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
The Mixture of Experts (MoE) is a widely known neural architecture where...

07/14/2022 · Convolutional Bypasses Are Better Vision Transformer Adapters
The pretrain-then-finetune paradigm has been widely adopted in computer ...
