A Closer Look at Rehearsal-Free Continual Learning

03/31/2022
by James Seale Smith, et al.

Continual learning describes a setting where machine learning models learn novel concepts from continuously shifting training data while avoiding degradation of knowledge on previously seen classes, which may disappear from the training data for extended periods of time (a phenomenon known as the catastrophic forgetting problem). Current approaches for continual learning of a single expanding task (also known as class-incremental continual learning) require extensive rehearsal of previously seen data to avoid this degradation of knowledge. Unfortunately, rehearsal comes at a steep cost in memory and computation, and it may also violate data privacy. Instead, we explore combining knowledge distillation and parameter regularization in new ways to achieve strong continual learning performance without rehearsal. Specifically, we take a deep dive into common continual learning techniques: prediction distillation, feature distillation, L2 parameter regularization, and EWC parameter regularization. We first disprove the common assumption that parameter regularization techniques fail for rehearsal-free continual learning of a single, expanding task. Next, we explore how to leverage knowledge from a pre-trained model in rehearsal-free continual learning and find that vanilla L2 parameter regularization outperforms EWC parameter regularization and feature distillation. We then highlight the impact of the rehearsal-free continual learning settings with a classifier expansion benchmark, showing that a strategy based on our findings, combined with a positive/negative label balancing heuristic, can close the performance gap between the upper bound and existing strategies by up to roughly 50%. Finally, we show that a method consisting of pre-training, L2 regularization, and prediction distillation can even outperform rehearsal-based methods on the common CIFAR-100 benchmark.
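For intuition, below is a minimal PyTorch-style sketch of the kind of rehearsal-free objective the abstract alludes to: cross-entropy on current-task data, an L2 penalty pulling parameters toward the frozen previous-task weights, and prediction (logit) distillation from the previous-task model. The function name, hyperparameters (lambda_l2, lambda_kd, temperature T), and overall wiring are illustrative assumptions, not the authors' exact implementation.

import copy
import torch
import torch.nn.functional as F


def rehearsal_free_loss(model, prev_model, prev_params, x, y,
                        lambda_l2=1.0, lambda_kd=1.0, T=2.0):
    # Illustrative combined loss: cross-entropy on the current data,
    # L2 parameter regularization toward the previous-task weights,
    # and prediction (logit) distillation from the frozen previous model.
    logits = model(x)
    loss = F.cross_entropy(logits, y)

    # L2 parameter regularization (assumes parameter shapes match, e.g. a
    # classifier head pre-allocated for all classes).
    l2 = sum(((p - p_old) ** 2).sum()
             for p, p_old in zip(model.parameters(), prev_params))
    loss = loss + lambda_l2 * l2

    # Prediction distillation, restricted to the classes the previous model knows.
    with torch.no_grad():
        old_logits = prev_model(x)
    n_old = old_logits.shape[1]
    kd = F.kl_div(F.log_softmax(logits[:, :n_old] / T, dim=1),
                  F.softmax(old_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return loss + lambda_kd * kd


# Before training on a new task, snapshot the current model so it can serve
# as the regularization anchor and distillation teacher:
#   prev_model = copy.deepcopy(model).eval()
#   prev_params = [p.detach().clone() for p in model.parameters()]

Setting lambda_kd=0 in this sketch recovers plain L2 parameter regularization, which the abstract reports already outperforms EWC and feature distillation when starting from a pre-trained model; the temperature-scaled KL term is standard prediction (knowledge) distillation.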

Related research:

11/23/2022 · CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning
Computer vision models suffer from a phenomenon known as catastrophic fo...

08/18/2023 · Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning
In this work, we investigate exemplar-free class incremental learning (C...

08/31/2023 · Continual Learning From a Stream of APIs
Continual learning (CL) aims to learn new tasks without forgetting previ...

03/28/2023 · Projected Latent Distillation for Data-Agnostic Consolidation in Distributed Continual Learning
Distributed learning on the edge often comprises self-centered devices (...

09/23/2021 · Recent Advances of Continual Learning in Computer Vision: An Overview
In contrast to batch learning where all training data is available at on...

11/15/2022 · Exploring the Joint Use of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding
Continual learning refers to a dynamical framework in which a model or a...

09/13/2023 · Continual Learning with Dirichlet Generative-based Rehearsal
Recent advancements in data-driven task-oriented dialogue systems (ToDs)...