Mitigating Catastrophic Forgetting in Long Short-Term Memory Networks

05/26/2023
by Ketaki Joshi et al.

Continual learning on sequential data is critical for many machine learning (ML) deployments. Unfortunately, LSTM networks, which are commonly used to learn from sequential data, suffer from catastrophic forgetting and are limited in their ability to learn multiple tasks continually. We discover that catastrophic forgetting in LSTM networks can be overcome in two novel and readily implementable ways: separating the LSTM memory either for each task or for each target label. Our approach eschews the need for explicit regularization, hypernetworks, and other complex methods. We quantify the benefits of our approach on recently proposed LSTM networks for computer memory access prefetching, an important sequential learning problem in ML-based computer system optimization. Compared to state-of-the-art weight regularization methods for mitigating catastrophic forgetting, our approach is simple, effective, and enables faster learning. We also show that our proposal enables the use of small, non-regularized LSTM networks for complex natural language processing in the offline learning scenario, a setting previously considered difficult.
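As a concrete illustration of the task-separated variant, here is a minimal PyTorch sketch of one plausible reading of the abstract: the LSTM weights are shared across tasks, while each task maintains its own persistent hidden/cell state pair, so training on a new task never overwrites another task's recurrent memory. The class and parameter names (TaskSeparatedLSTM, num_tasks, task_id) are illustrative assumptions, not the authors' code or API.

import torch
import torch.nn as nn

class TaskSeparatedLSTM(nn.Module):
    """Shared LSTM weights; separate (h, c) memory per task (illustrative sketch)."""

    def __init__(self, input_size, hidden_size, num_tasks, num_classes):
        super().__init__()
        # One LSTM cell shared by all tasks; only the memory is separated.
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.head = nn.Linear(hidden_size, num_classes)
        # Persistent per-task hidden and cell states, kept out of the
        # optimizer via register_buffer and detached between updates.
        self.register_buffer("h", torch.zeros(num_tasks, hidden_size))
        self.register_buffer("c", torch.zeros(num_tasks, hidden_size))

    def forward(self, x_seq, task_id):
        # x_seq: (seq_len, input_size), one sequence belonging to one task.
        # Clone so the in-place buffer update below cannot corrupt autograd.
        h = self.h[task_id].clone().unsqueeze(0)
        c = self.c[task_id].clone().unsqueeze(0)
        for x in x_seq:
            h, c = self.cell(x.unsqueeze(0), (h, c))
        # Save this task's memory for its next sequence; other tasks'
        # states are untouched, which is the point of the separation.
        self.h[task_id] = h.detach().squeeze(0)
        self.c[task_id] = c.detach().squeeze(0)
        return self.head(h)

# Usage sketch (hypothetical sizes):
# model = TaskSeparatedLSTM(input_size=64, hidden_size=128, num_tasks=3, num_classes=10)
# logits = model(torch.randn(20, 64), task_id=0)

The per-label variant would index the same (h, c) buffers by target label instead of task_id. In both cases no regularization term is added to the loss, consistent with the abstract's claim of eschewing explicit regularization.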

Related research

12/09/2021 · Provable Continual Learning via Sketched Jacobian Approximations
An important problem in machine learning is the ability to learn tasks i...

11/25/2022 · Overcoming Catastrophic Forgetting by XAI
Explaining the behaviors of deep neural networks, usually considered as ...

05/03/2020 · Explaining How Deep Neural Networks Forget by Deep Visualization
Explaining the behaviors of deep neural networks, usually considered as ...

03/22/2021 · Catastrophic Forgetting in Deep Graph Networks: an Introductory Benchmark for Graph Classification
In this work, we study the phenomenon of catastrophic forgetting in the ...

06/22/2020 · Continual Learning in Recurrent Neural Networks with Hypernetworks
The last decade has seen a surge of interest in continual learning (CL),...

01/11/2023 · ML-FEED: Machine Learning Framework for Efficient Exploit Detection (Extended version)
Machine learning (ML)-based methods have recently become attractive for ...

03/03/2023 · EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization
This paper presents a simple yet effective approach that improves contin...
