Prompt Generation Networks for Efficient Adaptation of Frozen Vision Transformers

10/12/2022
by Jochem Loedeman, et al.

Large-scale pretrained models, especially those trained on vision-language data, have demonstrated the tremendous value that can be gained from both larger training datasets and larger models. To benefit from these developments, there is renewed interest in transfer learning and in adapting models from large-scale general pretraining to particular downstream tasks. However, the continuously increasing size of these models means that even the classic approach of finetuning is becoming infeasible for all but the largest institutions. Prompt learning has emerged as a flexible way to adapt models by learning only additional inputs to a model that is kept frozen, but so far its performance has remained inferior to finetuning. To address this, we propose the Prompt Generation Network (PGN), which generates input-dependent prompts by sampling from a learned library of tokens. We show that the PGN is effective in adapting pretrained models to various new datasets. It surpasses previous prompt-learning methods by a large margin, and even full finetuning on 5 out of 12 datasets, while requiring 100x fewer parameters. The PGN can also be used for training and inference on multiple datasets simultaneously, learning to allocate tokens between domains. Given these findings, we conclude that the PGN is a viable and scalable approach for downstream adaptation of frozen models. Code is available at https://github.com/jochemloedeman/PGN.
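The core mechanism, input-dependent prompts formed from a learned token library, can be illustrated with a minimal PyTorch sketch. This is a hypothetical reconstruction rather than the authors' implementation: the module names, the soft selection over the library (a differentiable stand-in for sampling), and all dimensions (feat_dim, library_size, num_prompts) are illustrative assumptions; see the official repository for the actual code.

```python
import torch
import torch.nn as nn

class PromptGenerationNetwork(nn.Module):
    """Hypothetical sketch of a PGN: a lightweight head produces
    input-dependent prompts as combinations of a learned token library."""

    def __init__(self, feat_dim=512, embed_dim=768, library_size=256, num_prompts=8):
        super().__init__()
        # Learned library of candidate prompt tokens, shared across all inputs.
        self.token_library = nn.Parameter(torch.randn(library_size, embed_dim) * 0.02)
        # Small head mapping input features to per-prompt logits over the library.
        self.selector = nn.Linear(feat_dim, num_prompts * library_size)
        self.num_prompts = num_prompts
        self.library_size = library_size

    def forward(self, feats):
        # feats: (B, feat_dim) features from a cheap feature extractor.
        logits = self.selector(feats).view(-1, self.num_prompts, self.library_size)
        # Soft selection over the library; a differentiable stand-in for sampling.
        weights = logits.softmax(dim=-1)         # (B, P, L)
        prompts = weights @ self.token_library   # (B, P, embed_dim)
        return prompts

# Usage: prepend the generated prompts to a frozen ViT's token sequence.
pgn = PromptGenerationNetwork()
feats = torch.randn(4, 512)          # placeholder input features
tokens = torch.randn(4, 197, 768)    # placeholder ViT patch/class tokens
augmented = torch.cat([pgn(feats), tokens], dim=1)  # (4, 8 + 197, 768)
```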
