Continual Learning with Extended Kronecker-factored Approximate Curvature

04/16/2020
by Janghyeon Lee et al.

We propose a quadratic penalty method for continual learning of neural networks that contain batch normalization (BN) layers. The Hessian of the loss function defines the curvature of the quadratic penalty, and Kronecker-factored approximate curvature (K-FAC) is widely used to approximate the Hessian of a neural network in practice. However, the approximation is not valid if there is dependence between examples, which is typically introduced by BN layers in deep network architectures. We extend K-FAC so that inter-example relations are taken into account and the Hessian of deep neural networks can be properly approximated under practical assumptions. We also propose a weight-merging and reparameterization method to properly handle the statistical parameters of BN, which play a critical role in continual learning with BN, as well as a method for selecting hyperparameters without source task data. Our method outperforms baselines on the permuted MNIST task with BN layers and on sequential learning from ImageNet classification to fine-grained classification tasks with ResNet-50, without any explicit or implicit use of source task data for hyperparameter selection.
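To make the quadratic penalty concrete: for a fully connected layer with source-task weights W*, the standard K-FAC penalty that this work extends has the form (lambda/2) * vec(W - W*)^T (A* kron G*) vec(W - W*), where A* and G* are the source task's activation and pre-activation-gradient covariance factors, and the Kronecker structure lets it be evaluated as a trace without forming the full matrix. Below is a minimal PyTorch sketch of this baseline penalty under those assumptions; the function name, shapes, and identity covariance factors are illustrative, not the authors' implementation, and the paper's extended inter-example version is not reproduced here.

```python
import torch

def kfac_quadratic_penalty(weight, weight_star, A_star, G_star, lam=1.0):
    """Baseline K-FAC quadratic penalty for one linear layer.

    Computes (lam/2) * vec(dW)^T (A* kron G*) vec(dW) via the identity
    vec(dW)^T (A kron G) vec(dW) = trace(G @ dW @ A @ dW.T),
    which holds for symmetric A (covariance factors are symmetric).
    """
    dW = weight - weight_star                       # (out_dim, in_dim)
    return 0.5 * lam * torch.trace(G_star @ dW @ A_star @ dW.T)

# Toy usage with hypothetical shapes and identity factors.
out_dim, in_dim = 4, 3
W_star = torch.randn(out_dim, in_dim)               # frozen source-task weights
W = (W_star + 0.1 * torch.randn_like(W_star)).requires_grad_()
A = torch.eye(in_dim)                               # activation covariance (source task)
G = torch.eye(out_dim)                              # gradient covariance (source task)

penalty = kfac_quadratic_penalty(W, W_star, A, G, lam=100.0)
penalty.backward()                                  # gradient pulls W back toward W*
```

With identity factors the penalty reduces to a plain L2 anchor on W - W*; richer A* and G* estimated on the source task weight the penalty by curvature, which is the property BN's inter-example dependence breaks and the paper's extension repairs.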


