Continual Learning in Recurrent Neural Networks with Hypernetworks

by Benjamin Ehret, et al.

The last decade has seen a surge of interest in continual learning (CL), and a variety of methods have been developed to alleviate catastrophic forgetting. However, most prior work has focused on tasks with static data, while CL on sequential data has remained largely unexplored. Here we address this gap in two ways. First, we evaluate the performance of established CL methods when applied to recurrent neural networks (RNNs). We primarily focus on elastic weight consolidation, which is limited by a stability-plasticity trade-off, and explore the particularities of this trade-off when using sequential data. We show that high working memory requirements, but not necessarily sequence length, lead to an increased need for stability at the cost of decreased performance on subsequent tasks. Second, to overcome this limitation we employ a recent method based on hypernetworks and apply it to RNNs to address catastrophic forgetting on sequential data. By generating the weights of a main RNN in a task-dependent manner, our approach disentangles stability and plasticity, and outperforms alternative methods in a range of experiments. Overall, our work provides several key insights on the differences between CL in feedforward networks and in RNNs, while offering a novel solution to effectively tackle CL on sequential data.
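The idea of generating the weights of a main RNN in a task-dependent manner can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear hypernetwork, the vanilla tanh RNN, the untrained random parameters, and every name used here (`init_hypernet`, `generate_rnn_weights`, `rnn_forward`, the dimensions) are assumptions chosen for illustration. The sketch only shows the forward path in which a per-task embedding is mapped to a full set of recurrent weights; it does not include training or any regularization against forgetting.

```python
import numpy as np

def init_hypernet(rng, emb_dim, n_in, n_hid, n_tasks):
    """Create a (hypothetical) linear hypernetwork and one embedding per task."""
    # Total number of main-RNN parameters: input weights, recurrent weights, bias.
    n_main = n_hid * n_in + n_hid * n_hid + n_hid
    return {
        "emb": rng.normal(0.0, 1.0, (n_tasks, emb_dim)),  # task embeddings
        "W": rng.normal(0.0, 0.1, (n_main, emb_dim)),     # hypernetwork weights
        "b": np.zeros(n_main),                            # hypernetwork bias
        "shapes": (n_in, n_hid),
    }

def generate_rnn_weights(hnet, task_id):
    """Map a task embedding to the flattened main-RNN weights, then unpack them."""
    n_in, n_hid = hnet["shapes"]
    flat = hnet["W"] @ hnet["emb"][task_id] + hnet["b"]
    i = n_hid * n_in
    j = i + n_hid * n_hid
    W_xh = flat[:i].reshape(n_hid, n_in)
    W_hh = flat[i:j].reshape(n_hid, n_hid)
    b_h = flat[j:]
    return W_xh, W_hh, b_h

def rnn_forward(weights, xs):
    """Run a vanilla tanh RNN over a sequence and return the final hidden state."""
    W_xh, W_hh, b_h = weights
    h = np.zeros(W_hh.shape[0])
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
    return h

rng = np.random.default_rng(0)
hnet = init_hypernet(rng, emb_dim=8, n_in=3, n_hid=5, n_tasks=2)
xs = rng.normal(size=(10, 3))  # one input sequence: 10 time steps, 3 features

# The same input sequence is processed with weights generated for each task.
h_task0 = rnn_forward(generate_rnn_weights(hnet, 0), xs)
h_task1 = rnn_forward(generate_rnn_weights(hnet, 1), xs)
```

In this sketch the main RNN has no trainable parameters of its own; everything it uses comes from the hypernetwork and the selected task embedding, so switching tasks amounts to switching the embedding.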




Continual Learning with Gated Incremental Memories for sequential data processing

The ability to learn in dynamic, nonstationary environments without forg...

Continual Learning with Dependency Preserving Hypernetworks

Humans learn continually throughout their lifespan by accumulating diver...

The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks

Neural networks have seen an explosion of usage and research in the past...

Formalizing the Generalization-Forgetting Trade-off in Continual Learning

We formulate the continual learning (CL) problem via dynamic programming...

Understanding the Role of Training Regimes in Continual Learning

Catastrophic forgetting affects the training of neural networks, limitin...

Mitigating Catastrophic Forgetting in Long Short-Term Memory Networks

Continual learning on sequential data is critical for many machine learn...

Continual Learning for Recurrent Neural Networks: a Review and Empirical Evaluation

Learning continuously during all model lifetime is fundamental to deploy...
