A Simple Baseline that Questions the Use of Pretrained-Models in Continual Learning

10/10/2022
by Paul Janson, et al.

With the success of pretraining techniques in representation learning, a number of continual learning methods based on pretrained models have been proposed. Some of these methods design continual learning mechanisms on top of the pretrained representations and allow only minimal updates, or even no updates, of the backbone model during continual learning. In this paper, we question whether the complexity of these models is needed to achieve good performance by comparing them against a simple baseline that we designed. We argue that the pretrained feature extractor itself can be strong enough to achieve competitive or even better continual learning performance on the Split-CIFAR100 and CoRe50 benchmarks. To validate this, we construct a very simple baseline that 1) uses the frozen pretrained model to extract image features for every class encountered during the continual learning stage and computes the corresponding mean feature per class on the training data, and 2) predicts the class of a test sample by the nearest-neighbor distance between the sample and the class mean features, i.e., a Nearest Mean Classifier (NMC). This baseline is single-headed, exemplar-free, and can be task-free (by updating the means continually). It achieves 88.53% accuracy, surpassing most state-of-the-art continual learning methods that are all initialized from the same pretrained transformer model. We hope our baseline encourages future progress in designing learning systems that can continually improve the quality of their learned representations even when starting from pretrained weights.
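The abstract fully specifies the baseline: freeze the pretrained backbone, keep one mean feature per class, and classify each test sample by its nearest class mean. Below is a minimal PyTorch sketch of such a Nearest Mean Classifier; the running-sum bookkeeping, the class/function names, and the `timm` backbone in the usage comment are illustrative assumptions rather than the authors' released code.

```python
import torch


class NearestMeanClassifier:
    """Exemplar-free Nearest Mean Classifier over frozen pretrained features.

    Class means are kept as running sums and counts so they can be updated
    continually as new classes arrive (single-headed and task-free: no task
    identity is needed at test time).
    """

    def __init__(self):
        self.sums = {}    # class id -> running sum of feature vectors
        self.counts = {}  # class id -> number of training samples seen

    @torch.no_grad()
    def update(self, features, labels):
        # features: (N, D) tensor from the frozen backbone; labels: (N,) ints
        for f, y in zip(features, labels.tolist()):
            if y not in self.sums:
                self.sums[y] = torch.zeros_like(f)
                self.counts[y] = 0
            self.sums[y] += f
            self.counts[y] += 1

    @torch.no_grad()
    def predict(self, features):
        # Stack the current class means and assign each sample to the nearest one.
        classes = sorted(self.sums)
        means = torch.stack([self.sums[c] / self.counts[c] for c in classes])  # (C, D)
        dists = torch.cdist(features, means)                                   # (N, C)
        nearest = dists.argmin(dim=1)
        return torch.tensor([classes[i] for i in nearest.tolist()])


# Hypothetical usage with a frozen ViT backbone from timm (an assumption, not
# necessarily the authors' exact setup):
#   import timm
#   backbone = timm.create_model("vit_base_patch16_224", pretrained=True,
#                                num_classes=0).eval()
#   feats = backbone(images)            # (N, D) pooled features
#   nmc = NearestMeanClassifier()
#   nmc.update(feats, labels)           # repeat for each task / batch of new classes
#   preds = nmc.predict(test_feats)
```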


Related research

11/24/2020  Energy-Based Models for Continual Learning
We motivate Energy-Based Models (EBMs) as a promising model class for co...

06/24/2020  OvA-INN: Continual Learning with Invertible Neural Networks
In the field of Continual Learning, the objective is to learn several ta...

05/29/2019  Meta-Learning Representations for Continual Learning
A continual learning agent should be able to build on top of existing kn...

04/25/2023  Towards Compute-Optimal Transfer Learning
The field of transfer learning is undergoing a significant shift with th...

02/15/2023  À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting
We introduce À-la-carte Prompt Tuning (APT), a transformer-based scheme ...

10/01/2021  DualNet: Continual Learning, Fast and Slow
According to Complementary Learning Systems (CLS) theory in neuro...

07/24/2023  Online Continual Learning in Keyword Spotting for Low-Resource Devices via Pooling High-Order Temporal Statistics
Keyword Spotting (KWS) models on embedded devices should adapt fast to n...
