Does Standard Backpropagation Forget Less Catastrophically Than Adam?

02/15/2021
by   Dylan R. Ashley, et al.

Catastrophic forgetting remains a severe hindrance to the broad application of artificial neural networks (ANNs); however, it continues to be a poorly understood phenomenon. Despite the extensive amount of work on catastrophic forgetting, we argue that it is still unclear exactly how the phenomenon should be quantified and, moreover, to what degree the choices we make when designing learning systems affect the amount of catastrophic forgetting. Using various testbeds from the reinforcement learning and supervised learning literature, we (1) provide evidence that the choice of gradient-based optimization algorithm used to train an ANN has a significant impact on the amount of catastrophic forgetting, and show that, surprisingly, in many instances classical algorithms such as vanilla SGD experience less catastrophic forgetting than more modern algorithms such as Adam; and (2) empirically compare four existing metrics for quantifying catastrophic forgetting, and show that the measured degree of forgetting is sufficiently sensitive to the metric used that switching from one principled metric to another can dramatically change the conclusions of a study. Our results suggest that a much more rigorous experimental methodology is required when studying catastrophic forgetting. Based on our results, we recommend that inter-task forgetting in supervised learning be measured with both retention and relearning metrics concurrently, and that intra-task forgetting in reinforcement learning be measured, at the very least, with pairwise interference.
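As a purely illustrative sketch of the kind of comparison the abstract describes (not the paper's actual experiments), the toy example below fits one shared weight vector to two synthetic linear-regression tasks in sequence and then re-tests task 1, a simple retention-style measurement. The tasks, hyperparameters, and hand-rolled Adam update here are all assumptions made for the demo.

```python
import math
import random

def make_task(seed, n=200, d=8):
    # Hypothetical synthetic task: linear targets from a random true weight vector.
    r = random.Random(seed)
    w_true = [r.gauss(0, 1) for _ in range(d)]
    xs = [[r.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    ys = [sum(wi * xi for wi, xi in zip(w_true, x)) for x in xs]
    return xs, ys

def mse(w, xs, ys):
    return sum(
        (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
        for x, y in zip(xs, ys)
    ) / len(ys)

def train(w, xs, ys, opt, steps=400, lr=0.02):
    # Full-batch training with either plain SGD or a standard Adam update.
    d = len(w)
    m = [0.0] * d
    v = [0.0] * d
    b1, b2, eps = 0.9, 0.999, 1e-8
    for t in range(1, steps + 1):
        g = [0.0] * d  # gradient of the MSE loss
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for i in range(d):
                g[i] += 2 * err * x[i] / len(ys)
        if opt == "sgd":
            w = [wi - lr * gi for wi, gi in zip(w, g)]
        else:  # Adam with bias-corrected moment estimates
            for i in range(d):
                m[i] = b1 * m[i] + (1 - b1) * g[i]
                v[i] = b2 * v[i] + (1 - b2) * g[i] ** 2
            w = [
                wi - lr * (m[i] / (1 - b1 ** t)) / (math.sqrt(v[i] / (1 - b2 ** t)) + eps)
                for i, wi in enumerate(w)
            ]
    return w

xs1, ys1 = make_task(1)
xs2, ys2 = make_task(2)
for opt in ("sgd", "adam"):
    w = [0.0] * len(xs1[0])
    w = train(w, xs1, ys1, opt)   # learn task 1
    before = mse(w, xs1, ys1)
    w = train(w, xs2, ys2, opt)   # then learn task 2
    after = mse(w, xs1, ys1)      # retention: re-test task 1
    print(f"{opt}: task-1 MSE {before:.4f} -> {after:.4f}")
```

The gap between the task-1 error before and after training on task 2 is one retention-style forgetting measure; the abstract's point is that conclusions can differ depending on whether such a retention metric, a relearning metric, or pairwise interference is used.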

Related research

08/20/2018: Catastrophic Importance of Catastrophic Forgetting
This paper describes some of the possibilities of artificial neural netw...

12/21/2013: An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
Catastrophic forgetting is a problem faced by many machine learning mode...

11/16/2018: On Training Recurrent Neural Networks for Lifelong Learning
Capacity saturation and catastrophic forgetting are the central challeng...

04/05/2019: Reducing catastrophic forgetting when evolving neural networks
A key stepping stone in the development of an artificial general intelli...

05/25/2019: Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively
In lifelong learning systems, especially those based on artificial neura...

05/20/2017: Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks
A long-term goal of AI is to produce agents that can learn a diversity o...

05/18/2022: Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation
Continual learning - learning new tasks in sequence while maintaining pe...
