Improving Zero-Shot Generalization for CLIP with Synthesized Prompts

07/14/2023
by   Zhengbo Wang, et al.

With the growing interest in pretrained vision-language models like CLIP, recent research has focused on adapting these models to downstream tasks. Despite achieving promising results, most existing methods require labeled data for all classes, an assumption that often fails in real-world applications, where class frequencies follow a long-tailed (Zipfian) distribution. For example, emerging concepts may lack labeled data entirely. To address this problem, we propose a plug-and-play generative approach called SyntHesIzed Prompts (SHIP) to improve existing fine-tuning methods. Specifically, following the variational autoencoder framework, we introduce a generator that reconstructs visual features by feeding synthesized prompts, together with the corresponding class names, to the textual encoder of CLIP. In this manner, we can easily obtain synthesized features for the remaining label-only classes. We then fine-tune CLIP with off-the-shelf methods on the combination of labeled and synthesized features. Extensive experiments on base-to-new generalization, cross-dataset transfer learning, and generalized zero-shot learning demonstrate the superiority of our approach. The code is available at <https://github.com/mrflogs/SHIP>.
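To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of the generative step: a VAE-style generator maps a visual feature to latent prompt vectors, which are combined with a class-name embedding and passed through a stand-in for CLIP's frozen text encoder to reconstruct the feature. All names (`PromptGenerator`, `text_encoder`), dimensions, the pooling step, and the cosine reconstruction loss are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the SHIP idea, assuming CLIP feature dim 512 and a
# frozen text encoder (replaced here by a toy nn.Linear for runnability).
import torch
import torch.nn as nn
import torch.nn.functional as F

D = 512          # CLIP feature dimension (assumed)
LATENT = 64      # VAE latent size (assumed)
PROMPT_LEN = 4   # number of synthesized prompt vectors (assumed)

class PromptGenerator(nn.Module):
    """VAE-style generator: visual feature -> latent z -> prompt vectors."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(D, 256), nn.ReLU())
        self.mu = nn.Linear(256, LATENT)
        self.logvar = nn.Linear(256, LATENT)
        self.dec = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(),
                                 nn.Linear(256, PROMPT_LEN * D))

    def forward(self, feat):
        h = self.enc(feat)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        prompts = self.dec(z).view(-1, PROMPT_LEN, D)
        return prompts, mu, logvar

def reconstruct(prompts, class_embed, text_encoder):
    """Combine synthesized prompts with the class-name embedding and pass
    the result through the (stand-in) text encoder."""
    tokens = torch.cat([prompts, class_embed.unsqueeze(1)], dim=1)
    return text_encoder(tokens.mean(dim=1))  # simple pooling as a placeholder

# Toy stand-ins: in practice these come from the frozen CLIP model.
text_encoder = nn.Linear(D, D)
gen = PromptGenerator()

feat = F.normalize(torch.randn(8, D), dim=-1)   # labeled visual features (dummy)
class_embed = torch.randn(8, D)                 # class-name embeddings (dummy)

prompts, mu, logvar = gen(feat)
recon = F.normalize(reconstruct(prompts, class_embed, text_encoder), dim=-1)

# Standard VAE objective: reconstruction term + KL divergence to N(0, I).
recon_loss = (1 - (recon * feat).sum(-1)).mean()  # cosine reconstruction (assumed choice)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon_loss + kl
loss.backward()
print(f"loss = {loss.item():.4f}")
```

Once trained, one would sample z from the standard normal prior and decode prompts for the label-only classes, pairing them with those class names to synthesize features, which are then mixed with the labeled features when fine-tuning CLIP with an off-the-shelf method, as the abstract describes.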

