Revisiting Parameter Reuse to Overcome Catastrophic Forgetting in Neural Networks

by   Yuqing Zhao, et al.

Neural networks tend to forget previously learned knowledge when continuously learning on datasets with varying distributions, a phenomenon known as catastrophic forgetting. More significant distribution shifts among datasets lead to more forgetting. Recently, parameter-isolation-based approaches have shown great potential in overcoming forgetting with significant distribution shifts. However, they suffer from poor generalization as they fix the neural path for each dataset during training and require dataset labels during inference. In addition, they do not support backward knowledge transfer as they prioritize past data over future ones. In this paper, we propose a new adaptive learning method, named AdaptCL, that fully reuses and grows on learned parameters to overcome catastrophic forgetting and allows the positive backward transfer without requiring dataset labels. Our proposed technique adaptively grows on the same neural path by allowing optimal reuse of frozen parameters. Besides, it uses parameter-level data-driven pruning to assign equal priority to the data. We conduct extensive experiments on MNIST Variants, DomainNet, and Food Freshness Detection datasets under different intensities of distribution shifts without requiring dataset labels. Results demonstrate that our proposed method is superior to alternative baselines in minimizing forgetting and enabling positive backward knowledge transfer.


Attention-Based Structural-Plasticity

Catastrophic forgetting/interference is a critical problem for lifelong ...

The Importance of Robust Features in Mitigating Catastrophic Forgetting

Continual learning (CL) is an approach to address catastrophic forgettin...

Synaptic Metaplasticity in Binarized Neural Networks

While deep neural networks have surpassed human performance in multiple ...

Memory-based Parameter Adaptation

Deep neural networks have excelled on a wide range of problems, from vis...

Bayesian Gradient Descent: Online Variational Bayes Learning with Increased Robustness to Catastrophic Forgetting and Weight Pruning

We suggest a novel approach for the estimation of the posterior distribu...

Iterative Network Pruning with Uncertainty Regularization for Lifelong Sentiment Classification

Lifelong learning capabilities are crucial for sentiment classifiers to ...

Please sign up or login with your details

Forgot password? Click here to reset