Orthogonal Gradient Descent for Continual Learning

10/15/2019
by Mehrdad Farajtabar, et al.

Neural networks are achieving state-of-the-art and sometimes superhuman performance on learning tasks across a variety of domains. Whenever these problems require learning in a continual or sequential manner, however, neural networks suffer from the problem of catastrophic forgetting: they forget how to solve previous tasks after being trained on a new task, despite having the capacity to solve both tasks if they were trained on both simultaneously. In this paper, we propose to address this issue from a parameter-space perspective and study an approach that restricts the direction of the gradient updates to avoid forgetting previously learned data. We present the Orthogonal Gradient Descent (OGD) method, which accomplishes this goal by projecting the gradients from new tasks onto a subspace in which the neural network's outputs on previous tasks do not change, while the projected gradient remains a useful direction for learning the new task. Our approach utilizes the high capacity of the neural network more efficiently and does not require storing previously learned data, which might raise privacy concerns. Experiments on common benchmarks demonstrate the effectiveness of the proposed OGD method.
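
The core operation described in the abstract is a gradient projection: gradients of the network outputs on earlier tasks span a subspace that should be left undisturbed, and each new-task gradient is projected onto the orthogonal complement of that subspace before the parameter update. The sketch below illustrates this idea in plain NumPy under simplifying assumptions (a flattened parameter vector and a Gram-Schmidt-style orthonormal basis); the class and variable names are illustrative and not the authors' implementation.

```python
import numpy as np

class OGDProjector:
    """Minimal sketch of the OGD-style projection step.

    Maintains an orthonormal basis spanning gradients of the network
    outputs on previously seen tasks, and projects every new-task
    gradient onto the orthogonal complement of that subspace.
    """

    def __init__(self, eps=1e-10):
        self.basis = []   # orthonormal directions to protect
        self.eps = eps

    def add_direction(self, v):
        """Add one previous-task output gradient via Gram-Schmidt."""
        v = np.asarray(v, dtype=np.float64).copy()
        for b in self.basis:
            v -= np.dot(b, v) * b          # remove components already covered
        norm = np.linalg.norm(v)
        if norm > self.eps:                # keep only genuinely new directions
            self.basis.append(v / norm)

    def project(self, g):
        """Return the component of gradient g orthogonal to the basis."""
        g = np.asarray(g, dtype=np.float64).copy()
        for b in self.basis:
            g -= np.dot(b, g) * b
        return g


# Toy usage: protect one stored direction, then take a projected SGD step.
rng = np.random.default_rng(0)
theta = rng.normal(size=5)                 # flattened model parameters
proj = OGDProjector()
proj.add_direction(rng.normal(size=5))     # stand-in for an old-task output gradient
g_new = rng.normal(size=5)                 # stand-in for the new-task loss gradient
theta -= 0.1 * proj.project(g_new)         # update stays orthogonal to the protected direction
```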

Related research

06/21/2020
Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent
In continual learning settings, deep neural networks are prone to catast...

08/02/2019
Weight Friction: A Simple Method to Overcome Catastrophic Forgetting and Enable Continual Learning
In recent years, deep neural networks have found success in replicating ...

07/28/2022
One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares
While deep neural networks are capable of achieving state-of-the-art per...

01/30/2017
PathNet: Evolution Channels Gradient Descent in Super Neural Networks
For artificial general intelligence (AGI) it would be efficient if multi...

02/02/2023
Continual Learning with Scaled Gradient Projection
In neural networks, continual learning results in gradient interference ...

06/15/2021
Natural continual learning: success is a journey, not (just) a destination
Biological agents are known to learn many different tasks over the cours...

05/18/2018
Overcoming catastrophic forgetting problem by weight consolidation and long-term memory
Sequential learning of multiple tasks in artificial neural networks usin...
