Understanding the Role of Training Regimes in Continual Learning

by Seyed-Iman Mirzadeh, et al.

Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially. From the perspective of the well-established plasticity-stability dilemma, neural networks tend to be overly plastic and lack the stability necessary to retain previous knowledge, so as learning progresses they forget previously seen tasks. This phenomenon, known in the continual learning literature as catastrophic forgetting, has attracted much attention lately, and several families of approaches have been proposed with varying degrees of success. However, there has been limited prior work extensively analyzing the impact that different training regimes (learning rate, batch size, regularization method) can have on forgetting. In this work, we depart from the typical approach of altering the learning algorithm to improve stability. Instead, we hypothesize that the geometrical properties of the local minima found for each task play an important role in the overall degree of forgetting. In particular, we study the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima and consequently help the network avoid catastrophic forgetting. Our study provides practical insights to improve stability via simple yet effective techniques that outperform alternative baselines.
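
To make the notion of a "training regime" concrete, the following is a minimal sketch of how the three knobs named in the abstract (dropout, learning rate decay, batch size) might be bundled into one configuration. It is an illustration only, not the authors' implementation: the class name `TrainingRegime`, the method `lr_for_task`, the per-task exponential decay schedule, and all numeric values are assumptions introduced here, since the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRegime:
    """Hypothetical bundle of the hyperparameters named in the abstract."""
    initial_lr: float = 0.1   # assumed starting learning rate
    lr_decay: float = 0.8     # assumed multiplicative decay applied per task
    batch_size: int = 10      # assumed (small) batch size
    dropout: float = 0.5      # assumed dropout probability

    def lr_for_task(self, task_index: int) -> float:
        # Assumption: the learning rate shrinks exponentially with each new
        # task (task_index starts at 0), so later tasks take smaller steps.
        return self.initial_lr * self.lr_decay ** task_index


def describe(regime: TrainingRegime, num_tasks: int) -> list:
    # One (task, learning rate, batch size, dropout) tuple per sequential task.
    return [
        (t, regime.lr_for_task(t), regime.batch_size, regime.dropout)
        for t in range(num_tasks)
    ]


if __name__ == "__main__":
    for row in describe(TrainingRegime(), num_tasks=3):
        print(row)
```

Under these assumed values, each later task is trained with a smaller learning rate while dropout and batch size stay fixed; a real experiment would pass these settings to whatever optimizer and model code is in use.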



Importance Driven Continual Learning for Segmentation Across Domains

The ability of neural networks to continuously learn and adapt to new ta...

Overcoming Catastrophic Forgetting in Massively Multilingual Continual Learning

Real-life multilingual systems should be able to efficiently incorporate...

Dropout as an Implicit Gating Mechanism For Continual Learning

In recent years, neural networks have demonstrated an outstanding abilit...

Linear Mode Connectivity in Multitask and Continual Learning

Continual (sequential) training and multitask (simultaneous) training ar...

Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation

Neural networks tend to gradually forget the previously learned knowledg...

Continual Learning in Recurrent Neural Networks with Hypernetworks

The last decade has seen a surge of interest in continual learning (CL),...

Fortuitous Forgetting in Connectionist Networks

Forgetting is often seen as an unwanted characteristic in both human and...
