Does Continual Learning Equally Forget All Parameters?

04/09/2023
by   Haiyan Zhao, et al.

Distribution shift (e.g., task or domain shift) in continual learning (CL) usually results in catastrophic forgetting of neural networks. Although it can be alleviated by repeatedly replaying buffered data, the every-step replay is time-consuming. In this paper, we study which modules in neural networks are more prone to forgetting by investigating their training dynamics during CL. Our proposed metrics show that only a few modules are more task-specific and change sensitively between tasks, while others can be shared across tasks as common knowledge. Hence, we attribute forgetting mainly to the former and find that finetuning them only on a small buffer at the end of any CL method can bring non-trivial improvement. Due to the small number of finetuned parameters, such “Forgetting Prioritized Finetuning (FPF)” is computationally efficient. We further propose a simpler and more efficient method that entirely removes the every-step replay and replaces it with only k occurrences of FPF periodically triggered during CL. Surprisingly, this “k-FPF” performs comparably to FPF and outperforms the SOTA CL methods while significantly reducing their computational overhead and cost. In experiments on several benchmarks of class- and domain-incremental CL, FPF consistently improves existing CL methods by a large margin, and k-FPF further excels in efficiency without degrading the accuracy. We also empirically study the impact of buffer size, epochs per task, and the choice of finetuned modules on the cost and accuracy of our methods.
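To make the FPF step concrete, the sketch below shows one plausible way to implement it in PyTorch: freeze the network, unfreeze only the modules deemed forgetting-prone, and finetune them on the small buffer. The function name, the keyword-based module selection, and all hyperparameters here are illustrative assumptions, not the authors' implementation; in the paper the task-specific modules are identified by the proposed training-dynamics metrics.

import torch
from torch import nn
from torch.utils.data import DataLoader

def forgetting_prioritized_finetune(model, buffer_dataset, module_keywords,
                                    epochs=1, lr=1e-3, device="cpu"):
    """Finetune only the forgetting-prone modules of `model` on a small buffer.

    `module_keywords` (e.g. ["bn", "classifier"]) is a hypothetical stand-in for
    the task-specific modules identified in the paper; here they are simply
    selected by parameter name.
    """
    model.to(device)

    # Freeze everything, then unfreeze only the selected modules.
    for name, p in model.named_parameters():
        p.requires_grad = any(k in name for k in module_keywords)

    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=lr)
    criterion = nn.CrossEntropyLoss()
    loader = DataLoader(buffer_dataset, batch_size=32, shuffle=True)

    model.train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model

Under this reading, k-FPF would simply call this routine at k evenly spaced points during training instead of once at the end, with no per-step replay in between.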

