Momentum-based Weight Interpolation of Strong Zero-Shot Models for Continual Learning

11/06/2022
by Zafir Stojanovski, et al.

Large pre-trained, zero-shot-capable models have shown considerable success in standard transfer and adaptation tasks, and are notably robust to distribution shifts. Subsequent fine-tuning can further improve performance on a chosen downstream task, but naive fine-tuning causes these zero-shot models to lose much of their generalizability and robustness to distribution shifts. This is especially problematic in Continual Learning (CL), where the model must keep adapting as new task distributions are introduced sequentially. In this work, we show that where naive fine-tuning falls short in adapting such zero-shot-capable models, simple momentum-based weight interpolation provides consistent improvements on CL tasks in both memory-free and memory-based settings. In particular, we find improvements of over +4% on standard CL benchmarks, while in parts reducing the error gap to the upper bound of jointly training on all tasks at once by more than half, allowing the continual learner to move closer to the joint-training limit.
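To make the general idea concrete, below is a minimal PyTorch sketch of momentum-based weight interpolation over a stream of tasks: a fast copy of a pre-trained model is fine-tuned on each incoming task, while a slow copy, initialized from the same pre-trained weights, is updated as an exponential moving average of the fast weights and used for evaluation. The linear stand-in model, the dummy task stream, the per-step update schedule, and the momentum value of 0.999 are illustrative assumptions, not the paper's exact setup; in practice the backbone would be something like a CLIP encoder with a zero-shot classification head.

```python
import copy
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(slow_model, fast_model, momentum=0.999):
    # Interpolate the slow (momentum) weights towards the fast (fine-tuned) weights.
    for slow_p, fast_p in zip(slow_model.parameters(), fast_model.parameters()):
        slow_p.mul_(momentum).add_(fast_p, alpha=1.0 - momentum)

# Stand-in for a pre-trained zero-shot backbone (assumption for illustration);
# in practice this would be e.g. a CLIP image encoder plus classification head.
fast_model = torch.nn.Linear(512, 100)          # copy fine-tuned on each incoming task
slow_model = copy.deepcopy(fast_model)          # momentum copy, used for evaluation
for p in slow_model.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.SGD(fast_model.parameters(), lr=1e-3)

# Dummy stream of two sequential tasks, each a list of (features, labels) batches.
task_stream = [
    [(torch.randn(32, 512), torch.randint(0, 100, (32,))) for _ in range(10)]
    for _ in range(2)
]

for task_loader in task_stream:                 # tasks arrive sequentially
    for features, labels in task_loader:
        loss = F.cross_entropy(fast_model(features), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # After each gradient step, pull the slow weights a small step towards
        # the fast weights instead of overwriting them outright.
        ema_update(slow_model, fast_model)

# slow_model is the model that gets evaluated: it adapts to the new tasks while
# staying closer to the original (more robust) pre-trained weights.
```

The key design choice is that the fine-tuned weights never replace the evaluated model directly; they only nudge it, so the interpolated model retains more of the zero-shot robustness while still tracking the sequentially introduced tasks.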

Related research

09/04/2021 · Robust fine-tuning of zero-shot models
Large pre-trained models such as CLIP offer consistent accuracy across a...

10/06/2022 · CLIP model is an Efficient Continual Learner
The continual learning setting aims to learn new tasks over time without...

05/24/2023 · Zero-shot Task Preference Addressing Enabled by Imprecise Bayesian Continual Learning
Like generic multi-task learning, continual learning has the nature of m...

11/25/2021 · Amortized Prompt: Lightweight Fine-Tuning for CLIP in Domain Generalization
Domain generalization (DG) is a difficult transfer learning problem aimi...

04/21/2023 · Benchmarking Low-Shot Robustness to Natural Distribution Shifts
Robustness to natural distribution shifts has seen remarkable progress t...

08/03/2023 · Efficient Model Adaptation for Continual Learning at the Edge
Most machine learning (ML) systems assume stationary and matching data d...

02/02/2023 · CLIPood: Generalizing CLIP to Out-of-Distributions
Out-of-distribution (OOD) generalization, where the model needs to handl...
