Gradient Projection Memory for Continual Learning

03/17/2021
by   Gobinda Saha, et al.

The ability to learn continually without forgetting past tasks is a desired attribute of artificial learning systems. Existing approaches to enabling such learning in artificial neural networks usually rely on network growth, importance-based weight updates, or replay of old data from memory. In contrast, we propose a novel approach in which a neural network learns new tasks by taking gradient steps orthogonal to the gradient subspaces deemed important for past tasks. We find the bases of these subspaces by analyzing network representations (activations) after learning each task with Singular Value Decomposition (SVD) in a single-shot manner and store them in memory as the Gradient Projection Memory (GPM). Through qualitative and quantitative analyses, we show that such orthogonal gradient descent induces minimal to no interference with past tasks, thereby mitigating forgetting. We evaluate our algorithm on diverse image classification datasets with short and long task sequences and report better or on-par performance compared to state-of-the-art approaches.
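
The two operations described above, extracting a subspace basis with SVD and projecting gradients onto its orthogonal complement, can be illustrated with a short PyTorch sketch. This is not the authors' reference implementation: the function names (extend_memory, project_gradient), the single linear-layer setting, and the simple cumulative-energy threshold for choosing how many singular vectors to keep are simplifying assumptions made here.

import torch

def extend_memory(activations, memory_basis=None, energy_threshold=0.95):
    """Single-shot SVD of a layer's representation matrix; keep the leading
    left singular vectors and append them to the Gradient Projection Memory.

    activations  : (in_features, n_samples) inputs seen by the layer on the current task.
    memory_basis : (in_features, k) orthonormal basis from past tasks, or None.
    """
    if memory_basis is not None:
        # Only the part of the representation not already spanned by the memory matters.
        activations = activations - memory_basis @ (memory_basis.T @ activations)

    U, S, _ = torch.linalg.svd(activations, full_matrices=False)

    # Smallest number of singular vectors capturing `energy_threshold` of the residual energy.
    energy = torch.cumsum(S**2, dim=0) / torch.sum(S**2)
    k = int((energy < energy_threshold).sum().item()) + 1
    new_basis = U[:, :k]

    if memory_basis is None:
        return new_basis
    return torch.cat([memory_basis, new_basis], dim=1)

def project_gradient(weight_grad, memory_basis):
    """Remove the gradient component lying in the stored subspace so the update
    is orthogonal to directions deemed important for past tasks.

    weight_grad  : (out_features, in_features) gradient of a linear layer.
    memory_basis : (in_features, k) orthonormal basis, or None before the first task.
    """
    if memory_basis is None:
        return weight_grad
    return weight_grad - (weight_grad @ memory_basis) @ memory_basis.T

In this sketch, project_gradient would be applied to each layer's weight gradient just before the optimizer step while training a new task, and extend_memory would be called once per layer after the task finishes, so the stored basis only grows by directions not already covered by the memory.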

Related research

09/10/2021 · Saliency Guided Experience Packing for Replay in Continual Learning
Artificial learning systems aspire to mimic human intelligence by contin...

01/29/2022 · Continual Learning with Recursive Gradient Optimization
Learning multiple tasks sequentially without forgetting previous knowled...

10/22/2020 · Continual Learning in Low-rank Orthogonal Subspaces
In continual learning (CL), a learner is faced with a sequence of tasks,...

10/15/2019 · Orthogonal Gradient Descent for Continual Learning
Neural networks are achieving state of the art and sometimes super-human...

06/22/2019 · Beneficial perturbation network for continual learning
Sequential learning of multiple tasks in artificial neural networks usin...

10/09/2021 · Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning
The backpropagation networks are notably susceptible to catastrophic for...

07/05/2020 · Pseudo-Rehearsal for Continual Learning with Normalizing Flows
Catastrophic forgetting (CF) happens whenever a neural network overwrite...