Exclusive Supermask Subnetwork Training for Continual Learning

10/18/2022
by Prateek Yadav et al.

Continual Learning (CL) methods mainly focus on avoiding catastrophic forgetting and on learning representations that transfer to new tasks. Recently, Wortsman et al. (2020) proposed a CL method, SupSup, which uses a randomly initialized, fixed base network (model) and finds a supermask for each new task that selectively keeps or removes each weight to produce a subnetwork. Because the network weights are never updated, SupSup prevents forgetting entirely. However, its performance is sub-optimal: the fixed weights restrict the supermask's representational power, and no knowledge is accumulated or transferred inside the model as new tasks are learned. Hence, we propose ExSSNeT (Exclusive Supermask SubNEtwork Training), which trains subnetwork weights exclusively and without overlap across tasks. This avoids conflicting updates to shared weights by subsequent tasks, improving performance while still preventing forgetting. Furthermore, we propose a novel KNN-based Knowledge Transfer (KKT) module that dynamically initializes a new task's mask from the masks of previous tasks to improve knowledge transfer. We demonstrate that ExSSNeT outperforms SupSup and other strong prior methods on both text classification and vision tasks while preventing forgetting. Moreover, ExSSNeT is particularly advantageous for sparse masks that activate only 2-10% of the model parameters, where it outperforms SupSup by a wider margin. Additionally, ExSSNeT scales to a large number of tasks (100), and our KKT module helps learn new tasks faster while improving overall performance. Our code is available at https://github.com/prateeky2806/exessnet
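To make the mechanism concrete, the sketch below illustrates, under simplified assumptions, how a binary supermask selects a subnetwork from fixed random weights and how only the weights exclusively owned by the current task (i.e., not yet trained by any earlier task) receive gradient updates. This is a hypothetical PyTorch illustration, not the authors' implementation: the single-layer setup, the train_task function, the used_mask bookkeeping, and the random (rather than learned) mask are all assumptions made for the sketch.

import torch

torch.manual_seed(0)

# Fixed, randomly initialized base weights shared by all tasks (as in SupSup).
base_weights = torch.randn(256, 128)

# Bookkeeping (an assumption of this sketch): which weights have already been
# trained by earlier tasks and must therefore stay frozen.
used_mask = torch.zeros_like(base_weights, dtype=torch.bool)

def train_task(task_mask, x, y, steps=100, lr=0.01):
    """Train only the weights selected by task_mask that no prior task used."""
    global used_mask
    exclusive = task_mask & ~used_mask            # weights this task may update
    weights = base_weights.clone().requires_grad_(True)
    opt = torch.optim.SGD([weights], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        # Forward pass through the subnetwork chosen by the binary supermask.
        logits = x @ (weights * task_mask.float()).t()
        loss = torch.nn.functional.cross_entropy(logits, y)
        loss.backward()
        # Zero gradients outside the exclusive set so shared or previously
        # trained weights are never modified; this is what prevents forgetting.
        weights.grad *= exclusive.float()
        opt.step()

    # Commit the exclusively trained weights back into the shared base network.
    with torch.no_grad():
        base_weights[exclusive] = weights[exclusive]
    used_mask |= task_mask

# Toy usage: a random ~10%-sparse mask and dummy data, purely for illustration.
mask = torch.rand(256, 128) < 0.10
x, y = torch.randn(32, 128), torch.randint(0, 256, (32,))
train_task(mask, x, y)

In the actual method, the per-task mask is itself learned rather than random, and the KKT module would initialize that mask from the masks of similar previous tasks; the random mask above stands in only to keep the sketch self-contained.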


Related research

- Adversarial Continual Learning (03/21/2020): Continual learning aims to learn new tasks without forgetting previously...
- Ternary Feature Masks: continual learning without any forgetting (01/23/2020): In this paper, we propose an approach without any forgetting to continua...
- Optimizing Reusable Knowledge for Continual Learning via Metalearning (06/09/2021): When learning tasks over time, artificial neural networks suffer from a ...
- Incremental Task Learning with Incremental Rank Updates (07/19/2022): Incremental Task Learning (ITL) is a category of continual learning that...
- ImpressLearn: Continual Learning via Combined Task Impressions (10/05/2022): This work proposes a new method to sequentially train a deep neural netw...
- Supermasks in Superposition (06/26/2020): We present the Supermasks in Superposition (SupSup) model, capable of se...
- Continual Learning via Local Module Composition (11/15/2021): Modularity is a compelling solution to continual learning (CL), the prob...
