Natural continual learning: success is a journey, not (just) a destination

06/15/2021
by   Ta-Chu Kao, et al.
0

Biological agents are known to learn many different tasks over the course of their lives, and to be able to revisit previous tasks and behaviors with little to no loss in performance. In contrast, artificial agents are prone to 'catastrophic forgetting' whereby performance on previous tasks deteriorates rapidly as new ones are acquired. This shortcoming has recently been addressed using methods that encourage parameters to stay close to those used for previous tasks. This can be done by (i) using specific parameter regularizers that map out suitable destinations in parameter space, or (ii) guiding the optimization journey by projecting gradients into subspaces that do not interfere with previous tasks. However, parameter regularization has been shown to be relatively ineffective in recurrent neural networks (RNNs), a setting relevant to the study of neural dynamics supporting biological continual learning. Similarly, projection based methods can reach capacity and fail to learn any further as the number of tasks increases. To address these limitations, we propose Natural Continual Learning (NCL), a new method that unifies weight regularization and projected gradient descent. NCL uses Bayesian weight regularization to encourage good performance on all tasks at convergence and combines this with gradient projections designed to prevent catastrophic forgetting during optimization. NCL formalizes gradient projection as a trust region algorithm based on the Fisher information metric, and achieves scalability via a novel Kronecker-factored approximation strategy. Our method outperforms both standard weight regularization techniques and projection based approaches when applied to continual learning problems in RNNs. The trained networks evolve task-specific dynamics that are strongly preserved as new tasks are learned, similar to experimental findings in biological circuits.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 21

08/02/2019

Weight Friction: A Simple Method to Overcome Catastrophic Forgetting and Enable Continual Learning

In recent years, deep neural networks have found success in replicating ...
04/24/2019

Facilitating Bayesian Continual Learning by Natural Gradients and Stein Gradients

Continual learning aims to enable machine learning models to learn a gen...
10/15/2019

Orthogonal Gradient Descent for Continual Learning

Neural networks are achieving state of the art and sometimes super-human...
03/12/2022

Sparsity and Heterogeneous Dropout for Continual Learning in the Null Space of Neural Activations

Continual/lifelong learning from a non-stationary input data stream is a...
10/09/2021

Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning

The backpropagation networks are notably susceptible to catastrophic for...
06/22/2020

Continual Learning in Recurrent Neural Networks with Hypernetworks

The last decade has seen a surge of interest in continual learning (CL),...
11/25/2020

Continual learning with direction-constrained optimization

This paper studies a new design of the optimization algorithm for traini...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.