Continual learning: a feature extraction formalization, an efficient algorithm, and fundamental obstructions

03/27/2022
by Binghui Peng et al.

Continual learning is an emerging paradigm in machine learning, wherein a model is exposed in an online fashion to data from multiple different distributions (i.e. environments) and is expected to adapt to the distribution change. Precisely, the goal is to perform well in the new environment while simultaneously retaining performance on the previous environments (i.e. avoiding "catastrophic forgetting"), without increasing the size of the model. While this setup has enjoyed a lot of attention in the applied community, there has been no theoretical work that even formalizes the desired guarantees. In this paper, we propose a formalization of continual learning through the lens of feature extraction, namely one in which features, as well as a classifier, are trained with each environment. When the features are linear, we design an efficient gradient-based algorithm, DPGD, that is guaranteed to perform well on the current environment and to avoid catastrophic forgetting. In the general case, when the features are non-linear, we show that no such algorithm can exist, whether efficient or not.
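The abstract does not spell out how DPGD works, but the setting it describes (a shared linear feature map plus a per-environment classifier, trained by gradient descent without growing the model) can be made concrete. Below is a minimal numpy sketch under loud assumptions: squared loss, a projection rule that keeps feature-map updates orthogonal to the span of previously seen inputs, and a hypothetical function name train_continual. It illustrates the feature-extraction formalization only; it is not the paper's actual DPGD algorithm.

```python
import numpy as np

def train_continual(envs, d, k, lr=0.1, steps=500):
    """Toy continual learner (illustration only, not the paper's DPGD).

    Shared linear features W (k x d) plus one linear head per environment.
    After each environment, feature-map gradients are projected to be
    orthogonal to the span of previously seen inputs, so old features
    (and hence old heads' predictions) are preserved exactly.
    """
    rng = np.random.default_rng(0)
    W = rng.standard_normal((k, d)) / np.sqrt(d)  # shared feature extractor
    heads, past_inputs = [], []                   # per-env heads, past data
    for X, y in envs:                             # X: (n, d), y: (n,) in {-1, +1}
        v = np.zeros(k)                           # fresh head for this env
        proj = None
        if past_inputs:
            # Orthonormal basis of the subspace spanned by old inputs;
            # proj annihilates that subspace.
            Q, _ = np.linalg.qr(np.concatenate(past_inputs).T)
            proj = np.eye(d) - Q @ Q.T
        for _ in range(steps):
            pred = X @ W.T @ v                    # linear features, linear head
            g = 2.0 * (pred - y) / len(y)         # squared-loss gradient
            gW = np.outer(v, g @ X)               # gradient w.r.t. W
            gv = (g @ X) @ W.T                    # gradient w.r.t. v
            if proj is not None:                  # keep feature updates out of
                gW = gW @ proj                    # the old environments' span
            W -= lr * gW
            v -= lr * gv
        heads.append(v.copy())
        past_inputs.append(X)
    return W, heads

if __name__ == "__main__":
    # Two synthetic environments in d = 10 dimensions with k = 3 features.
    rng = np.random.default_rng(1)
    d, k, n = 10, 3, 40
    envs = []
    for _ in range(2):
        X = rng.standard_normal((n, d))
        y = np.sign(X @ rng.standard_normal(d))
        envs.append((X, y))
    W, heads = train_continual(envs, d, k)
```

Because each update of W is projected away from the span of old inputs, W x is unchanged on every previously seen x, so earlier heads keep their predictions exactly in this linear toy. The flip side is the obstruction the projection makes visible: once past inputs span all of R^d, feature updates vanish and the fixed-size model can no longer adapt. Storing past inputs to build the projector is itself a simplification.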

Related research:

- 02/07/2023: Utility-based Perturbed Gradient Descent: An Optimizer for Continual Learning
  Modern representation learning methods may fail to adapt quickly under n...
- 04/18/2021: Dynamically Addressing Unseen Rumor via Continual Learning
  Rumors are often associated with newly emerging events, thus, an ability...
- 07/07/2020: Continual BERT: Continual Learning for Adaptive Extractive Summarization of COVID-19 Literature
  The scientific community continues to publish an overwhelming amount of ...
- 04/29/2023: The Ideal Continual Learner: An Agent That Never Forgets
  The goal of continual learning is to find a model that solves multiple l...
- 03/27/2023: CoDeC: Communication-Efficient Decentralized Continual Learning
  Training at the edge utilizes continuously evolving data generated at di...
- 02/25/2020: Training Binary Neural Networks using the Bayesian Learning Rule
  Neural networks with binary weights are computation-efficient and hardwa...
- 12/06/2021: CSG0: Continual Urban Scene Generation with Zero Forgetting
  With the rapid advances in generative adversarial networks (GANs), the v...
