Scaling the Number of Tasks in Continual Learning

07/10/2022
by   Timothée Lesort, et al.

Standard gradient descent algorithms applied to sequences of tasks are known to produce catastrophic forgetting in deep neural networks. When trained on a new task in a sequence, the model updates its parameters on the current task and forgets past knowledge. This article explores scenarios where we scale the number of tasks in a finite environment. These scenarios consist of a long sequence of tasks with reoccurring data. We show that in such a setting, stochastic gradient descent can learn, progress, and converge to a solution that, according to the existing literature, would require a continual learning algorithm. In other words, we show that the model performs knowledge retention and accumulation without specific memorization mechanisms. We propose a new experimentation framework, SCoLe (Scaling Continual Learning), to study the knowledge retention and accumulation of algorithms in potentially infinite sequences of tasks. To explore this setting, we performed a large number of experiments on sequences of 1,000 tasks to better understand this new family of settings. We also propose slight modifications to vanilla stochastic gradient descent to facilitate continual learning in this setting. The SCoLe framework is a good simulation of practical training environments with reoccurring situations and allows the study of convergence behavior in long sequences. Our experiments show that previous results on short scenarios cannot always be extrapolated to longer scenarios.
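To make the setting concrete, below is a minimal sketch of a SCoLe-style experiment: a finite pool of classes, a long sequence of small tasks whose classes reoccur over time, and plain SGD with no replay or regularization. All specifics (synthetic Gaussian data, 2 classes per task, 5 epochs per task, network size) are illustrative assumptions, not the paper's exact protocol.

```python
import random
import torch
from torch import nn

# Assumed SCoLe-style setup: a finite set of classes; each "task" is a small
# random subset of classes, so classes reoccur many times over the sequence.
NUM_CLASSES = 10          # finite environment
CLASSES_PER_TASK = 2      # assumption: two classes per task
NUM_TASKS = 1000          # long sequence, as in the paper's experiments
SAMPLES_PER_CLASS = 100   # synthetic stand-in for a real dataset

# Synthetic data: one Gaussian cluster per class (placeholder for, e.g., MNIST).
torch.manual_seed(0)
means = torch.randn(NUM_CLASSES, 32) * 3.0
data = {c: means[c] + torch.randn(SAMPLES_PER_CLASS, 32) for c in range(NUM_CLASSES)}

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))
opt = torch.optim.SGD(model.parameters(), lr=0.01)  # vanilla SGD, no replay buffer
loss_fn = nn.CrossEntropyLoss()

for task_id in range(NUM_TASKS):
    # Sample the current task's classes; reoccurrence emerges over the sequence.
    classes = random.sample(range(NUM_CLASSES), CLASSES_PER_TASK)
    x = torch.cat([data[c] for c in classes])
    y = torch.cat([torch.full((SAMPLES_PER_CLASS,), c) for c in classes])
    for _ in range(5):  # a few SGD epochs on the current task only
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

# Track knowledge accumulation: accuracy over all classes, not just the last task.
with torch.no_grad():
    xs = torch.cat([data[c] for c in range(NUM_CLASSES)])
    ys = torch.cat([torch.full((SAMPLES_PER_CLASS,), c) for c in range(NUM_CLASSES)])
    acc = (model(xs).argmax(dim=1) == ys).float().mean()
    print(f"accuracy over all classes after {NUM_TASKS} tasks: {acc:.3f}")
```

Evaluating global accuracy after the full sequence (rather than per-task accuracy right after training) is what reveals whether knowledge is retained and accumulated as tasks reoccur.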


