Heterogeneous Continual Learning

by Divyam Madaan, et al.
NYU college

We propose a novel framework and a solution to tackle the continual learning (CL) problem with changing network architectures. Most CL methods focus on adapting a single architecture to a new task or class by modifying its weights. However, with rapid progress in architecture design, adapting existing solutions to novel architectures becomes a relevant problem. To address this limitation, we propose Heterogeneous Continual Learning (HCL), in which a wide range of evolving network architectures emerges continually together with novel data and tasks. As a solution, we build on the distillation family of techniques and adapt it to a new setting in which a weaker model takes the role of the teacher while a new, stronger architecture acts as the student. Furthermore, we consider a setup with limited access to previous data and propose Quick Deep Inversion (QDI) to recover visual features of prior tasks in support of knowledge transfer. QDI significantly reduces computational cost compared to previous solutions and improves overall performance. In summary, we propose a new setup for CL with a modified knowledge distillation paradigm and design a quick data inversion method to enhance distillation. Our evaluation on various benchmarks shows a significant improvement in accuracy over state-of-the-art methods across various network architectures.
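To make the teacher/student roles concrete, here is a minimal sketch of a generic temperature-scaled distillation loss, in which the student is trained to match the teacher's softened output distribution. This is the standard soft-target formulation, not necessarily the exact loss used in HCL; the function names, the temperature value, and the example logits are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Generic soft-target distillation; assumed here for illustration,
    not taken from the HCL paper.
    """
    p = softmax(teacher_logits, temperature)  # teacher: earlier, weaker model
    q = softmax(student_logits, temperature)  # student: newer, stronger model
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical logits for a 3-class problem.
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
loss = distillation_loss(teacher, student)
```

In the heterogeneous setting described above, the teacher would be the previously trained architecture and the student the newly introduced one; the loss is zero only when both produce the same softened distribution.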




