
Continual learning with hypernetworks

by Johannes von Oswald, et al. (ETH Zurich)

Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. To overcome this problem, we present a novel approach based on task-conditioned hypernetworks, i.e., networks that generate the weights of a target model based on task identity. Continual learning (CL) is less difficult for this class of models thanks to a simple key observation: instead of relying on recalling the input-output relations of all previously seen data, task-conditioned hypernetworks only require rehearsing previous weight realizations, which can be maintained in memory using a simple regularizer. Besides achieving good performance on standard CL benchmarks, additional experiments on long task sequences reveal that task-conditioned hypernetworks display an unprecedented capacity to retain previous memories. Notably, such long memory lifetimes are achieved in a compressive regime, where the number of trainable weights is comparable to or smaller than the size of the target network. We provide insight into the structure of low-dimensional task embedding spaces (the input space of the hypernetwork) and show that task-conditioned hypernetworks demonstrate transfer learning properties. Finally, forward information transfer is further supported by empirical results on a challenging CL benchmark based on the CIFAR-10/100 image datasets.
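The core mechanism described above can be sketched in a few lines: a hypernetwork maps a learned per-task embedding to a flat weight vector for the target model, and the CL regularizer penalizes drift between the weights currently generated for old tasks and snapshots of those weights taken before training on the new task. This is a minimal illustrative sketch, not the paper's implementation; the linear hypernetwork, the class and function names, and the snapshot dictionary are all assumptions (the paper uses a deeper, chunked hypernetwork).

```python
import numpy as np

rng = np.random.default_rng(0)


class TaskConditionedHypernet:
    """Minimal sketch of a task-conditioned hypernetwork: one trainable
    embedding per task is mapped (here, linearly) to a flat weight vector
    for the target network."""

    def __init__(self, num_tasks, embed_dim, target_params):
        # Task embeddings form the hypernetwork's low-dimensional input space.
        self.embeddings = 0.1 * rng.standard_normal((num_tasks, embed_dim))
        self.W = 0.1 * rng.standard_normal((embed_dim, target_params))

    def generate(self, task_id):
        # Return the generated target-network weights for one task.
        return self.embeddings[task_id] @ self.W


def cl_regularizer(hnet, snapshots):
    """Sum of squared drifts between the weights currently generated for old
    tasks and the snapshots stored before learning the new task."""
    return sum(
        np.sum((hnet.generate(t) - w) ** 2) for t, w in snapshots.items()
    )


hnet = TaskConditionedHypernet(num_tasks=3, embed_dim=8, target_params=100)
# Before training on task 2, snapshot the generated weights of tasks 0 and 1;
# only these weight realizations (not old data) need to be rehearsed.
snapshots = {t: hnet.generate(t).copy() for t in (0, 1)}
print(cl_regularizer(hnet, snapshots))  # 0.0 until the hypernetwork changes
```

During training on a new task, this regularizer would be added (with some weighting) to the task loss, so that updates to the shared hypernetwork parameters leave the weight realizations of earlier tasks approximately intact.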

