Robust fine-tuning of zero-shot models

09/04/2021
by Mitchell Wortsman et al.

Large pre-trained models such as CLIP offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning approaches substantially improve accuracy in-distribution, they also reduce out-of-distribution robustness. We address this tension by introducing a simple and effective method for improving robustness: ensembling the weights of the zero-shot and fine-tuned models. Compared to standard fine-tuning, the resulting weight-space ensembles provide large accuracy improvements out-of-distribution, while matching or improving in-distribution accuracy. On ImageNet and five derived distribution shifts, weight-space ensembles improve out-of-distribution accuracy by 2 to 10 percentage points while increasing in-distribution accuracy by nearly 1 percentage point relative to standard fine-tuning. These improvements come at no additional computational cost during fine-tuning or inference.
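The weight-space ensemble described above amounts to a per-parameter linear interpolation between the zero-shot and fine-tuned weights. Here is a minimal sketch using plain Python dicts of scalars to stand in for model state dicts; the function name `wise_ft` and the mixing coefficient `alpha` are illustrative, not taken from the abstract:

```python
def wise_ft(theta_zero_shot, theta_fine_tuned, alpha=0.5):
    """Weight-space ensemble: interpolate each parameter linearly.

    alpha = 0 recovers the zero-shot model, alpha = 1 the fine-tuned one;
    intermediate values trade off in-distribution accuracy against
    out-of-distribution robustness.
    """
    assert theta_zero_shot.keys() == theta_fine_tuned.keys()
    return {
        name: (1 - alpha) * theta_zero_shot[name] + alpha * theta_fine_tuned[name]
        for name in theta_zero_shot
    }

# Toy example: scalar "parameters" standing in for weight tensors.
zero_shot = {"w": 1.0, "b": 0.0}
fine_tuned = {"w": 3.0, "b": 2.0}
print(wise_ft(zero_shot, fine_tuned, alpha=0.5))  # {'w': 2.0, 'b': 1.0}
```

Because the ensemble is formed once in weight space (rather than averaging the outputs of two models at test time), inference runs a single model and costs no more than standard fine-tuning, consistent with the claim above.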


Related research

- 11/06/2022: Momentum-based Weight Interpolation of Strong Zero-Shot Models for Continual Learning
- 03/10/2022: Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
- 08/10/2022: Patching open-vocabulary models by interpolating weights
- 04/21/2023: Benchmarking Low-Shot Robustness to Natural Distribution Shifts
- 05/05/2023: Using ChatGPT for Entity Matching
- 11/25/2021: Amortized Prompt: Lightweight Fine-Tuning for CLIP in Domain Generalization
- 06/30/2021: The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning
