An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks

by   Ian J. Goodfellow, et al.
Beijing University of Posts and Telecommunications
Université de Montréal

Catastrophic forgetting is a problem faced by many machine learning models and algorithms. When trained on one task, then trained on a second task, many machine learning models "forget" how to perform the first task. This is widely believed to be a serious problem for neural networks. Here, we investigate the extent to which the catastrophic forgetting problem occurs for modern neural networks, comparing both established and recent gradient-based training algorithms and activation functions. We also examine the effect of the relationship between the first task and the second task on catastrophic forgetting. We find that it is always best to train using the dropout algorithm--the dropout algorithm is consistently best at adapting to the new task, remembering the old task, and has the best tradeoff curve between these two extremes. We find that different tasks and relationships between tasks result in very different rankings of activation function performance. This suggests the choice of activation function should always be cross-validated.


page 1

page 2

page 3

page 4


Quantum Continual Learning Overcoming Catastrophic Forgetting

Catastrophic forgetting describes the fact that machine learning models ...

Measuring Catastrophic Forgetting in Neural Networks

Deep neural networks are used in many state-of-the-art systems for machi...

Does Standard Backpropagation Forget Less Catastrophically Than Adam?

Catastrophic forgetting remains a severe hindrance to the broad applicat...

Gradient Acceleration in Activation Functions

Dropout has been one of standard approaches to train deep neural network...

On Training Recurrent Neural Networks for Lifelong Learning

Capacity saturation and catastrophic forgetting are the central challeng...

Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively

In lifelong learning systems, especially those based on artificial neura...

Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks

A long-term goal of AI is to produce agents that can learn a diversity o...

Please sign up or login with your details

Forgot password? Click here to reset