Towards Robust Prompts on Vision-Language Models

04/17/2023
by Jindong Gu, et al.

With the advent of vision-language models (VLMs) that can perform in-context and prompt-based learning, how can we design prompting approaches that generalize robustly under distribution shift and can be used on novel classes outside the support set of the prompts? In this work, we first define two types of robustness to distribution shift on VLMs: robustness on base classes (the classes included in the support set of the prompts) and robustness on novel classes. We then study the robustness of existing in-context learning and prompt learning approaches, and find that prompt learning performs robustly on test images from base classes but does not generalize well to images from novel classes. We propose robust prompt learning, which integrates multi-scale image features into the prompt and improves both types of robustness. Comprehensive experiments on six benchmarks study both types of robustness and show the effectiveness of our proposal.
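The distinction between the two robustness types can be illustrated with a minimal sketch that reports accuracy separately for base and novel classes. This is not the paper's evaluation code: the function name `split_accuracy`, the class names, and the toy predictions are hypothetical, and the sketch assumes the labels and predictions come from a distribution-shifted test set.

```python
from typing import Dict, Sequence, Set


def split_accuracy(
    labels: Sequence[str],
    preds: Sequence[str],
    base_classes: Set[str],
) -> Dict[str, float]:
    """Return accuracy on base classes and on novel classes separately.

    A class counts as "base" if it appears in the support set of the
    prompts (passed in as `base_classes`); every other class is "novel".
    """
    hits = {"base": 0, "novel": 0}
    totals = {"base": 0, "novel": 0}
    for true_label, predicted in zip(labels, preds):
        group = "base" if true_label in base_classes else "novel"
        totals[group] += 1
        hits[group] += int(true_label == predicted)
    # Report NaN for a group with no test images instead of dividing by zero.
    return {
        group: (hits[group] / totals[group] if totals[group] else float("nan"))
        for group in totals
    }


# Hypothetical toy data: "cat" and "dog" are in the prompt support set.
toy_labels = ["cat", "dog", "dog", "fox", "owl"]
toy_preds = ["cat", "dog", "cat", "fox", "fox"]
result = split_accuracy(toy_labels, toy_preds, base_classes={"cat", "dog"})
print(result)
```

Reporting the two numbers separately, rather than a single pooled accuracy, keeps a drop on novel classes from being hidden by strong performance on base classes.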


research
03/10/2022

Conditional Prompt Learning for Vision-Language Models

With the rise of powerful pre-trained vision-language models like CLIP, ...
research
10/24/2022

The Robustness Limits of SoTA Vision Models to Natural Variation

Recent state-of-the-art vision models introduced new architectures, lear...
research
06/02/2023

MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models

Large-scale language models have shown the ability to adapt to a new tas...
research
11/25/2019

Learning to Learn Words from Narrated Video

When we travel, we often encounter new scenarios we have never experienc...
research
10/28/2022

Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers

Large pre-trained language models have shown remarkable performance over...
research
11/06/2022

Robust Lottery Tickets for Pre-trained Language Models

Recent works on Lottery Ticket Hypothesis have shown that pre-trained la...
research
09/14/2023

Gradient constrained sharpness-aware prompt learning for vision-language models

This paper targets a novel trade-off problem in generalizable prompt lea...
