How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?

06/09/2023
by Yifei Ming, et al.

Recent large vision-language models such as CLIP have shown remarkable out-of-distribution (OOD) detection and generalization performance. However, their zero-shot in-distribution (ID) accuracy is often limited on downstream datasets. Recent CLIP-based fine-tuning methods such as prompt learning have demonstrated significant improvements in ID classification and in OOD generalization where OOD labels are available. Nonetheless, it remains unclear whether the model is reliable under semantic shifts without OOD labels. In this paper, we aim to bridge this gap and present a comprehensive study of how fine-tuning impacts OOD detection for few-shot downstream tasks. By framing OOD detection as multi-modal concept matching, we establish a connection between fine-tuning methods and various OOD scores. Our results suggest that a proper choice of OOD score is essential for CLIP-based fine-tuning. In particular, the maximum concept matching (MCM) score consistently provides a promising solution. We also show that prompt learning achieves state-of-the-art OOD detection performance, improving over the zero-shot counterpart.
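
To make the idea of scoring by concept matching concrete, here is a minimal sketch of an MCM-style score. It assumes the common formulation in which the score is the maximum softmax probability over temperature-scaled cosine similarities between one image embedding and one text embedding per ID class. The function name `mcm_score`, the default temperature, and the toy vectors are illustrative assumptions and are not taken from the abstract; in practice the embeddings would come from a CLIP-like model.

```python
import numpy as np


def mcm_score(image_feat, text_feats, temperature=1.0):
    """Maximum softmax probability over image-text cosine similarities.

    image_feat: array of shape (d,), one image embedding (assumed given).
    text_feats: array of shape (k, d), one text embedding per ID class.
    temperature: softmax temperature (hypothetical default).
    """
    # L2-normalize so that dot products become cosine similarities.
    img = image_feat / np.linalg.norm(image_feat)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)

    sims = txt @ img                # one similarity per class concept
    logits = sims / temperature
    logits = logits - logits.max()  # shift for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(probs.max())


# Toy example: three orthogonal "concept" embeddings.
concepts = np.eye(3)
aligned = np.array([1.0, 0.0, 0.0])    # matches the first concept
ambiguous = np.array([1.0, 1.0, 1.0])  # equally close to all concepts

score_aligned = mcm_score(aligned, concepts, temperature=0.1)
score_ambiguous = mcm_score(ambiguous, concepts, temperature=0.1)
```

Under this sketch, an input that matches one concept strongly receives a higher score than one that is equally close to every concept, so thresholding the score is one way to flag likely OOD inputs.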


