Continual Learning by Asymmetric Loss Approximation with Single-Side Overestimation

by Dongmin Park, et al.

Catastrophic forgetting is a critical challenge in training deep neural networks. Although continual learning has been investigated as a countermeasure, existing methods often require additional network components and scale poorly to a large number of tasks. We propose a novel approach to continual learning that approximates the true loss function with an asymmetric quadratic function, one side of which is overestimated. Our algorithm is motivated by the empirical observation that updates of network parameters affect target loss functions asymmetrically. In the proposed framework, when training on new tasks we estimate an asymmetric loss function for previously learned tasks by properly overestimating its unobserved side, while deriving accurate model parameters for the observed side. In contrast to existing approaches, our method is free from side effects and achieves state-of-the-art results that are close to the upper-bound performance on several challenging benchmark datasets.
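To make the idea of an asymmetric quadratic approximation concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a per-parameter quadratic penalty around the previous task's solution, with the curvature on the unobserved side inflated by a constant factor. The names `theta_star`, `curvature`, `observed_sign`, and `alpha` are hypothetical, and the abstract does not say how the observed side or the degree of overestimation is actually determined.

```python
import numpy as np

def asymmetric_quadratic_penalty(theta, theta_star, curvature, observed_sign, alpha=2.0):
    """Illustrative asymmetric quadratic surrogate for a past-task loss.

    theta         : current parameters
    theta_star    : parameters found for the past task (assumed given)
    curvature     : per-parameter curvature estimate on the observed side (assumed given)
    observed_sign : -1 or +1 per parameter, the side of theta_star assumed to be observed
    alpha         : factor > 1 that overestimates curvature on the unobserved side
    """
    delta = theta - theta_star
    # A parameter lies on the observed side when its offset has the observed sign.
    # An offset of exactly zero contributes nothing, whichever branch it takes.
    on_observed = np.sign(delta) == observed_sign
    scale = np.where(on_observed, 1.0, alpha)
    return 0.5 * np.sum(scale * curvature * delta ** 2)

theta_star = np.zeros(2)
curvature = np.ones(2)
observed_sign = np.array([-1.0, -1.0])

# The same step size is penalized more heavily on the unobserved side.
penalty_observed = asymmetric_quadratic_penalty(
    np.array([-1.0, 0.0]), theta_star, curvature, observed_sign)
penalty_unobserved = asymmetric_quadratic_penalty(
    np.array([1.0, 0.0]), theta_star, curvature, observed_sign)
```

In a continual learning loop, a term of this kind would be added to the new task's loss so that moving parameters into the unobserved region is discouraged more strongly than moving them within the region already explored.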




