The Effectiveness of Memory Replay in Large Scale Continual Learning

10/06/2020
by   Yogesh Balaji, et al.

We study continual learning in the large-scale setting, where tasks in the input sequence are not limited to classification and the outputs can be high-dimensional. Among multiple state-of-the-art methods, we find that vanilla experience replay (ER) remains highly competitive in both performance and scalability, despite its simplicity. However, ER's performance degrades when the replay memory is small. A visualization of the feature space reveals that the intermediate representations undergo a distributional drift. While existing methods usually replay only the input-output pairs, we hypothesize that this regularization is inadequate for complex deep models and diverse tasks when the replay buffer is small. Following this observation, we propose to replay the activations of intermediate layers in addition to the input-output pairs. Since saving raw activation maps can dramatically increase memory and compute costs, we propose the Compressed Activation Replay technique, in which compressed representations of layer activations are saved to the replay buffer. We show that this approach achieves a superior regularization effect while adding negligible memory overhead to the replay method. Experiments on both the large-scale Taskonomy benchmark, with its diverse set of tasks, and standard datasets (Split-CIFAR and Split-miniImageNet) demonstrate the effectiveness of the proposed method.
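As a concrete illustration, the sketch below shows one way compressed activation replay could be layered on top of vanilla experience replay. It is a minimal sketch under stated assumptions, not the paper's exact implementation: the model interface (a forward pass that returns both a prediction and an intermediate feature), the `compressor` module, the reservoir-sampling buffer, and the `car_weight` loss coefficient are all illustrative choices.

```python
# Minimal sketch: vanilla experience replay (ER) plus a compressed
# activation replay (CAR) regularizer. All names and interfaces here are
# illustrative assumptions, not the authors' released code.

import random
import torch
import torch.nn.functional as F


class ReservoirBuffer:
    """Fixed-size replay buffer filled with reservoir sampling."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []          # list of (input, target, compressed_activation)
        self.num_seen = 0

    def add(self, example):
        self.num_seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            idx = random.randrange(self.num_seen)
            if idx < self.capacity:
                self.data[idx] = example

    def sample(self, batch_size):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, ys, zs = zip(*batch)
        return torch.stack(xs), torch.stack(ys), torch.stack(zs)


def train_step(model, compressor, buffer, x, y, optimizer,
               replay_batch=32, car_weight=1.0):
    """One ER + CAR update: task loss on the current batch, replay loss on
    buffered input-output pairs, and an activation-matching loss on the
    buffered compressed activations."""
    optimizer.zero_grad()

    # Assumed model API: forward pass returns (prediction, intermediate feature).
    pred, feat = model(x)
    loss = F.cross_entropy(pred, y)          # classification task for simplicity

    if buffer.data:
        xr, yr, zr = buffer.sample(replay_batch)
        pred_r, feat_r = model(xr)
        # Vanilla ER term: replay the stored input-output pairs.
        loss = loss + F.cross_entropy(pred_r, yr)
        # CAR term: keep the compressed activation of replayed inputs close
        # to the compressed activation that was stored at buffering time.
        loss = loss + car_weight * F.mse_loss(compressor(feat_r), zr)

    loss.backward()
    optimizer.step()

    # Store current examples with their compressed activations
    # (device handling is simplified; in practice these would move to CPU).
    with torch.no_grad():
        z = compressor(feat)
    for xi, yi, zi in zip(x, y, z):
        buffer.add((xi, yi, zi))
    return loss.item()
```

In this sketch, `compressor` stands for any small module (for example, a 1x1 convolution followed by pooling) that maps an activation map to a low-dimensional code, so the buffered activations add little memory on top of the stored input-output pairs.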

Related research

ACAE-REMIND for Online Continual Learning with Compressed Feature Replay (05/18/2021)
Online continual learning aims to learn from a non-IID stream of data fr...

Marginal Replay vs Conditional Replay for Continual Learning (10/29/2018)
We present a new replay-based method of continual classification learnin...

Transformer with Memory Replay (05/19/2022)
Transformers achieve state-of-the-art performance for natural language p...

Class Incremental Online Streaming Learning (10/20/2021)
A wide variety of methods have been developed to enable lifelong learnin...

Integral Continual Learning Along the Tangent Vector Field of Tasks (11/23/2022)
We propose a continual learning method which incorporates information fr...

Batch Model Consolidation: A Multi-Task Model Consolidation Framework (05/25/2023)
In Continual Learning (CL), a model is required to learn a stream of tas...

Improving information retention in large scale online continual learning (10/12/2022)
Given a stream of data sampled from non-stationary distributions, online...
