Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models

05/29/2023
by Shuai Zhao, et al.

Misalignment between the outputs of a vision-language (VL) model and the task goal hinders its deployment. This issue can worsen when there are distribution shifts between the training and test data. To address this problem, prevailing fully test-time adaptation (TTA) methods bootstrap themselves through entropy minimization. However, minimizing the entropy of its own predictions causes the model to overfit to its incorrect output distributions. In this work, we propose TTA with feedback to avoid such overfitting and to align the model with task goals. Specifically, we adopt CLIP as a reward model that provides feedback to VL models during test time on various tasks, including image classification, image-text retrieval, and image captioning. Given a single test sample, the model aims to maximize the CLIP reward through reinforcement learning. We adopt a reward design that uses the average CLIP score of sampled candidates as the baseline. This design is simple and surprisingly effective when combined with various task-specific sampling strategies. The entire system is flexible: the reward model can be extended with multiple CLIP models, and a momentum buffer can be used to memorize and leverage knowledge learned from multiple test samples. Extensive experiments demonstrate that our method significantly improves different VL models after TTA.
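The reward design described above can be sketched as a REINFORCE-style update that uses the mean CLIP score of the sampled candidates as the baseline. The sketch below is a minimal illustration assembled from the abstract alone, not the authors' released code; the function names, the choice of the open-source CLIP ViT-B/32 checkpoint, and the shape of log_probs are all assumptions.

import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
reward_model, preprocess = clip.load("ViT-B/32", device=device)
reward_model.eval()  # the CLIP reward model stays frozen during TTA

@torch.no_grad()
def clip_rewards(pil_image, candidate_texts):
    # CLIP similarity between one test image and each sampled candidate text.
    image_feat = reward_model.encode_image(preprocess(pil_image).unsqueeze(0).to(device))
    text_feat = reward_model.encode_text(clip.tokenize(candidate_texts).to(device))
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    return (image_feat @ text_feat.t()).squeeze(0).float()  # (num_candidates,)

def tta_loss(log_probs, rewards):
    # REINFORCE loss with the average CLIP score of the sampled candidates
    # as the baseline: candidates scoring above average are reinforced,
    # those below average are suppressed.
    baseline = rewards.mean()
    advantage = rewards - baseline
    return -(advantage * log_probs).mean()

Here log_probs would be the adapted VL model's log-probabilities of the same candidates, e.g. top-k class prompts in classification or sampled captions in captioning; one or a few gradient steps on tta_loss for the single test sample implement the adaptation.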

Related research

09/15/2022  Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Pre-trained vision-language models (e.g., CLIP) have shown promising zer...

06/18/2020  Fully Test-time Adaptation by Entropy Minimization
Faced with new and different data during testing, a model must adapt its...

04/25/2023  Test-Time Adaptation with Perturbation Consistency Learning
Currently, pre-trained language models (PLMs) do not cope well with the ...

11/23/2022  ActMAD: Activation Matching to Align Distributions for Test-Time-Training
Test-Time-Training (TTT) is an approach to cope with out-of-distribution...

11/21/2022  TEMPERA: Test-Time Prompting via Reinforcement Learning
Careful prompt design is critical to the use of large language models in...

10/20/2022  TTTFlow: Unsupervised Test-Time Training with Normalizing Flow
A major problem of deep neural networks for image classification is thei...

08/14/2023  Towards Open-Set Test-Time Adaptation Utilizing the Wisdom of Crowds in Entropy Minimization
Test-time adaptation (TTA) methods, which generally rely on the model's ...
