Prompt-aligned Gradient for Prompt Tuning

05/30/2022
by Beier Zhu, et al.

Thanks to large pre-trained vision-language models (VLMs) like CLIP, we can craft a zero-shot classifier by "prompt", e.g., the confidence score of an image being "[CLASS]" can be obtained from the VLM-provided similarity between the image and the prompt sentence "a photo of a [CLASS]". Prompting therefore shows great potential for fast adaptation of VLMs to downstream tasks if we fine-tune the prompt-based similarity measure. However, we find a common failure: improper fine-tuning can undermine the prompt's inherent prediction not only for the task-related classes but also for other classes in the VLM vocabulary. Existing methods still address this problem with traditional anti-overfitting techniques such as early stopping and data augmentation, which lack a principled solution specific to prompts. We present Prompt-aligned Gradient, dubbed ProGrad, to prevent prompt tuning from forgetting the general knowledge learned from VLMs. In particular, ProGrad only updates the prompt whose gradient is aligned with (or non-conflicting with) the "general direction", represented as the gradient of the KL loss of the pre-defined prompt prediction. Extensive experiments demonstrate the stronger few-shot generalization ability of ProGrad over state-of-the-art prompt tuning methods. Code is available at https://github.com/BeierZhu/Prompt-align.
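To illustrate the update rule described above, the following PyTorch-style sketch projects the task gradient away from directions that conflict with the zero-shot ("general") prediction. This is not the authors' released implementation (see the linked repository); the function name prograd_update, the lambda_ weight, the learning rate, and the assumption that logits_tuned is produced from the learnable prompt tensor inside the current autograd graph are illustrative choices.

```python
import torch
import torch.nn.functional as F

def prograd_update(prompt, logits_tuned, logits_zeroshot, labels,
                   lr=2e-3, lambda_=1.0):
    """One prompt update following the ProGrad idea (illustrative sketch).

    prompt:          learnable prompt tensor (leaf, requires_grad=True)
    logits_tuned:    logits computed from the learnable prompt (same graph)
    logits_zeroshot: logits from the hand-crafted prompt "a photo of a [CLASS]"
    """
    # Task gradient: cross-entropy on the few-shot downstream labels.
    ce_loss = F.cross_entropy(logits_tuned, labels)
    g_ce = torch.autograd.grad(ce_loss, prompt, retain_graph=True)[0]

    # "General direction": gradient of the KL divergence between the tuned
    # prediction and the frozen zero-shot prediction.
    kl_loss = F.kl_div(F.log_softmax(logits_tuned, dim=-1),
                       F.softmax(logits_zeroshot, dim=-1),
                       reduction="batchmean")
    g_kl = torch.autograd.grad(kl_loss, prompt)[0]

    # If the task gradient conflicts with the general direction (negative
    # inner product), remove the conflicting component; otherwise keep it.
    dot = torch.sum(g_ce * g_kl)
    if dot < 0:
        g_ce = g_ce - lambda_ * dot / (g_kl.norm() ** 2 + 1e-12) * g_kl

    # Plain SGD step on the prompt vectors.
    with torch.no_grad():
        prompt -= lr * g_ce
```

Projecting only when the inner product is negative keeps the full task gradient whenever it already agrees with the zero-shot knowledge, so adaptation is slowed only where it would overwrite the pre-trained prediction.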
