Routing Networks with Co-training for Continual Learning

by   Mark Collier, et al.

The core challenge with continual learning is catastrophic forgetting, the phenomenon that when neural networks are trained on a sequence of tasks they rapidly forget previously learned tasks. It has been observed that catastrophic forgetting is most severe when tasks are dissimilar to each other. We propose the use of sparse routing networks for continual learning. For each input, these network architectures activate a different path through a network of experts. Routing networks have been shown to learn to route similar tasks to overlapping sets of experts and dissimilar tasks to disjoint sets of experts. In the continual learning context this behaviour is desirable as it minimizes interference between dissimilar tasks while allowing positive transfer between related tasks. In practice, we find it is necessary to develop a new training method for routing networks, which we call co-training which avoids poorly initialized experts when new tasks are presented. When combined with a small episodic memory replay buffer, sparse routing networks with co-training outperform densely connected networks on the MNIST-Permutations and MNIST-Rotations benchmarks.


Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning

Continual Learning (CL) methods aim to enable machine learning models to...

Overcoming Catastrophic Interference by Conceptors

Catastrophic interference has been a major roadblock in the research of ...

Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting

The goal of continual learning (CL) is to learn a sequence of tasks with...

Reducing Catastrophic Forgetting in Modular Neural Networks by Dynamic Information Balancing

Lifelong learning is a very important step toward realizing robust auton...

HC-Net: Memory-based Incremental Dual-Network System for Continual learning

Training a neural network for a classification task typically assumes th...

Continual Learning of Predictive Models in Video Sequences via Variational Autoencoders

This paper proposes a method for performing continual learning of predic...

Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments

A key challenge for AI is to build embodied systems that operate in dyna...