Overcoming Multi-Model Forgetting

02/21/2019
by Yassine Benyahia, et al.

We identify a phenomenon, which we refer to as multi-model forgetting, that occurs when sequentially training multiple deep networks with partially-shared parameters; the performance of previously-trained models degrades as one optimizes a subsequent one, due to the overwriting of shared parameters. To overcome this, we introduce a statistically-justified weight plasticity loss that regularizes the learning of a model's shared parameters according to their importance for the previous models, and demonstrate its effectiveness when training two models sequentially and for neural architecture search. Adding weight plasticity in neural architecture search preserves the best models to the end of the search and yields improved results in both natural language processing and computer vision tasks.
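To make the idea concrete, here is a minimal, hedged sketch of a weight-plasticity-style penalty in PyTorch. It uses a diagonal-Fisher (squared-gradient) importance estimate and a quadratic anchor penalty on shared parameters, in the spirit of elastic weight consolidation; the function names, the `shared_names` set, and the importance estimate are illustrative assumptions, not the paper's exact statistically-justified loss.

```python
# Hypothetical sketch: penalize changes to shared parameters in proportion to
# their estimated importance for a previously trained model. This is an
# EWC-style proxy for a weight plasticity loss, not the authors' exact method.
import torch


def estimate_importance(model, data_loader, loss_fn, shared_names):
    """Diagonal-Fisher-style importance: mean squared gradient per shared weight."""
    importance = {n: torch.zeros_like(p) for n, p in model.named_parameters()
                  if n in shared_names}
    n_batches = 0
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if n in importance and p.grad is not None:
                importance[n] += p.grad.detach() ** 2
        n_batches += 1
    return {n: v / max(n_batches, 1) for n, v in importance.items()}


def weight_plasticity_penalty(model, anchor_params, importance, strength=1.0):
    """Quadratic penalty keeping shared weights near their post-model-A values."""
    penalty = 0.0
    for n, p in model.named_parameters():
        if n in importance:
            penalty = penalty + (importance[n] * (p - anchor_params[n]) ** 2).sum()
    return strength * penalty
```

In a sequential setting, one would snapshot the shared parameters and their importance after training model A, then add `weight_plasticity_penalty(model_B, anchor_params, importance)` to model B's task loss so that updates to shared weights are damped where they matter most for model A.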
