Continual learning with hypernetworks

06/03/2019
by Johannes von Oswald, et al.

Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. To overcome this problem, we present a novel approach based on task-conditioned hypernetworks, i.e., networks that generate the weights of a target model based on task identity. Continual learning (CL) is less difficult for this class of models thanks to a simple key observation: instead of relying on recalling the input-output relations of all previously seen data, task-conditioned hypernetworks only require rehearsing previous weight realizations, which can be maintained in memory using a simple regularizer. Besides achieving good performance on standard CL benchmarks, additional experiments on long task sequences reveal that task-conditioned hypernetworks display an unprecedented capacity to retain previous memories. Notably, such long memory lifetimes are achieved in a compressive regime, when the number of trainable weights is comparable to or smaller than the target network size. We provide insight into the structure of low-dimensional task embedding spaces (the input space of the hypernetwork) and show that task-conditioned hypernetworks demonstrate transfer learning properties. Finally, forward information transfer is further supported by empirical results on a challenging CL benchmark based on the CIFAR-10/100 image datasets.
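
For concreteness, the idea sketched in the abstract can be written down compactly: a hypernetwork maps a trainable task embedding to the flat weight vector of a target network, and a simple output regularizer penalizes drift in the weights generated for previously seen tasks. The following is a minimal illustrative sketch in PyTorch, not the authors' implementation; the names (TaskConditionedHypernetwork, cl_regularizer), the two-layer architecture, and the regularization strength beta are assumptions made for the example.

```python
# Minimal sketch (not the authors' code) of a task-conditioned hypernetwork:
# a small network maps a learned task embedding to the full weight vector of a
# target model, and a simple regularizer keeps previously generated weights
# close to stored snapshots while a new task is being learned.
import torch
import torch.nn as nn


class TaskConditionedHypernetwork(nn.Module):
    def __init__(self, embedding_dim, hidden_dim, target_num_weights, num_tasks):
        super().__init__()
        # One trainable embedding per task (the low-dimensional input space
        # of the hypernetwork mentioned in the abstract).
        self.task_embeddings = nn.Embedding(num_tasks, embedding_dim)
        # The hypernetwork itself: task embedding -> flat target weight vector.
        self.net = nn.Sequential(
            nn.Linear(embedding_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, target_num_weights),
        )

    def forward(self, task_id):
        emb = self.task_embeddings(torch.tensor([task_id]))
        # Flat weight vector to be reshaped into the target network's layers.
        return self.net(emb).squeeze(0)


def cl_regularizer(hnet, stored_weights, beta=0.01):
    """Penalize drift of the weights generated for previously seen tasks.

    stored_weights: dict mapping task_id -> snapshot of the hypernetwork
    output for that task, taken before training on the current task started.
    """
    penalty = torch.zeros(())
    for task_id, w_old in stored_weights.items():
        w_new = hnet(task_id)
        penalty = penalty + ((w_new - w_old.detach()) ** 2).sum()
    return beta * penalty
```

In this sketch, training on task T would minimize the task loss computed with the generated weights plus cl_regularizer, where stored_weights holds one snapshot of the hypernetwork output per earlier task; only these weight realizations, not the earlier data, are rehearsed.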

Related research

- CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning (11/23/2022)
- Continual Learning by Modeling Intra-Class Variation (10/11/2022)
- Lifelong GAN: Continual Learning for Conditional Image Generation (07/23/2019)
- Defeating Catastrophic Forgetting via Enhanced Orthogonal Weights Modification (11/19/2021)
- Active Long Term Memory Networks (06/07/2016)
- CaSpeR: Latent Spectral Regularization for Continual Learning (01/09/2023)
- Supermasks in Superposition (06/26/2020)
