An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks

12/21/2013
by Ian J. Goodfellow, et al.

Catastrophic forgetting is a problem faced by many machine learning models and algorithms. When trained on one task and then trained on a second task, many machine learning models "forget" how to perform the first task. This is widely believed to be a serious problem for neural networks. Here, we investigate the extent to which the catastrophic forgetting problem occurs for modern neural networks, comparing both established and recent gradient-based training algorithms and activation functions. We also examine how the relationship between the first task and the second task affects catastrophic forgetting. We find that it is always best to train using the dropout algorithm: dropout is consistently best at adapting to the new task, best at remembering the old task, and has the best tradeoff curve between these two extremes. We also find that different tasks, and different relationships between tasks, produce very different rankings of activation function performance, which suggests that the choice of activation function should always be cross-validated.
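The experimental protocol behind these findings can be summarized in code. Below is a minimal sketch, assuming a PyTorch setup: train a network on an "old" task, record its test accuracy, then train the same network on a "new" task and measure how much old-task accuracy is lost. The architecture, input size, datasets, and hyperparameters here (a two-hidden-layer ReLU MLP with dropout, SGD) are illustrative placeholders, not the configurations used in the paper, and the forgetting_experiment helper is hypothetical.

```python
# Sketch of a sequential-training ("old" task, then "new" task) forgetting
# experiment. Model size, tasks, and hyperparameters are placeholders,
# not the settings used by the authors.
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_mlp(in_dim=784, hidden=256, out_dim=10, p_drop=0.5):
    # Two-hidden-layer ReLU network; set p_drop=0.0 to disable dropout.
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(hidden, out_dim),
    )


def accuracy(model, loader):
    # Classification accuracy on a test DataLoader.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            pred = model(x).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / max(total, 1)


def train(model, loader, epochs=5, lr=0.1):
    # Plain SGD training with cross-entropy loss.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            opt.step()


def forgetting_experiment(old_train, old_test, new_train, new_test, p_drop=0.5):
    # Train on the old task, record its accuracy, then train on the new task
    # and re-measure old-task accuracy; the drop quantifies forgetting.
    model = make_mlp(p_drop=p_drop)
    train(model, old_train)
    acc_old_before = accuracy(model, old_test)
    train(model, new_train)
    return {
        "old_task_before": acc_old_before,
        "old_task_after": accuracy(model, old_test),
        "new_task": accuracy(model, new_test),
    }
```

Running forgetting_experiment with p_drop=0.5 versus p_drop=0.0 on the same task pair would compare a dropout-trained network against a standard one; sweeping activation functions would amount to swapping the ReLU for each candidate and cross-validating, as the abstract recommends.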


Related research

08/05/2021: Quantum Continual Learning Overcoming Catastrophic Forgetting
Catastrophic forgetting describes the fact that machine learning models ...

08/07/2017: Measuring Catastrophic Forgetting in Neural Networks
Deep neural networks are used in many state-of-the-art systems for machi...

02/15/2021: Does Standard Backpropagation Forget Less Catastrophically Than Adam?
Catastrophic forgetting remains a severe hindrance to the broad applicat...

06/26/2018: Gradient Acceleration in Activation Functions
Dropout has been one of standard approaches to train deep neural network...

11/16/2018: On Training Recurrent Neural Networks for Lifelong Learning
Capacity saturation and catastrophic forgetting are the central challeng...

05/25/2019: Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively
In lifelong learning systems, especially those based on artificial neura...

05/20/2017: Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks
A long-term goal of AI is to produce agents that can learn a diversity o...
