Towards Realistic Unsupervised Fine-tuning with CLIP

08/24/2023
by Jian Liang, et al.

The emergence of vision-language models (VLMs), such as CLIP, has spurred a significant research effort towards their application for downstream supervised learning tasks. Although some previous studies have explored the unsupervised fine-tuning of CLIP, they often rely on prior knowledge in the form of class names associated with ground truth labels. In this paper, we delve into a realistic unsupervised fine-tuning scenario by assuming that the unlabeled data might contain out-of-distribution samples from unknown classes. Furthermore, we emphasize the importance of simultaneously enhancing out-of-distribution detection capabilities alongside the recognition of instances associated with predefined class labels. To tackle this problem, we present a simple, efficient, and effective fine-tuning approach called Universal Entropy Optimization (UEO). UEO leverages sample-level confidence to approximately minimize the conditional entropy of confident instances and maximize the marginal entropy of less confident instances. Apart from optimizing the textual prompts, UEO also incorporates optimization of channel-wise affine transformations within the visual branch of CLIP. Through extensive experiments conducted across 15 domains and 4 different types of prior knowledge, we demonstrate that UEO surpasses baseline methods in terms of both generalization and out-of-distribution detection.
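As a rough illustration of the objective described above, the following sketch computes an entropy loss over a batch of class logits. The abstract does not specify the exact weighting, so several choices here are assumptions: confidence is taken to be the maximum softmax probability, the conditional-entropy term is weighted in proportion to that confidence, and the marginal-entropy term weights each sample by the inverse of its confidence. The function name `ueo_style_loss` is hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p + eps) for p in probs)


def ueo_style_loss(batch_logits):
    """Confidence-weighted entropy objective (illustrative sketch only).

    Assumption: confidence = max softmax probability per sample.
    Returns (loss, conditional_term, marginal_term), where
    loss = conditional_term - marginal_term, so minimizing the loss
    lowers the entropy of confident samples and raises the entropy of
    the marginal prediction dominated by less confident samples.
    """
    probs = [softmax(row) for row in batch_logits]
    conf = [max(p) for p in probs]

    # Weights emphasizing confident samples (normalized to sum to 1).
    conf_sum = sum(conf)
    w_conf = [c / conf_sum for c in conf]

    # Weights emphasizing less confident samples (assumed: inverse confidence).
    inv = [1.0 / c for c in conf]
    inv_sum = sum(inv)
    w_unconf = [i / inv_sum for i in inv]

    # Weighted average of per-sample entropies (conditional entropy term).
    conditional = sum(w * entropy(p) for w, p in zip(w_conf, probs))

    # Entropy of the weighted average prediction (marginal entropy term).
    num_classes = len(probs[0])
    marginal_probs = [
        sum(w * p[k] for w, p in zip(w_unconf, probs))
        for k in range(num_classes)
    ]
    marginal = entropy(marginal_probs)

    return conditional - marginal, conditional, marginal
```

For example, `ueo_style_loss([[10.0, 0.0], [0.0, 10.0]])` yields a negative loss, since each sample is predicted confidently while the batch as a whole spreads across both classes. In an actual fine-tuning setup the logits would come from CLIP image-text similarities and the loss would be computed with a differentiable framework; this pure-Python version only shows the arithmetic.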
