Continual Learning of Generative Models with Limited Data: From Wasserstein-1 Barycenter to Adaptive Coalescence

01/22/2021
by Mehmet Dedeoglu, et al.

Learning generative models is challenging for a network edge node with limited data and computing power. Since tasks in similar environments share model similarity, it is plausible to leverage pre-trained generative models from the cloud or other edge nodes. Appealing to optimal transport theory tailored to Wasserstein-1 generative adversarial networks (WGANs), this study aims to develop a framework that systematically optimizes continual learning of generative models using local data at the edge node while exploiting adaptive coalescence of pre-trained generative models. Specifically, by treating the knowledge transfer from other nodes as Wasserstein balls centered around their pre-trained models, continual learning of generative models is cast as a constrained optimization problem, which is further reduced to a Wasserstein-1 barycenter problem. A two-stage approach is devised accordingly: 1) The barycenters among the pre-trained models are computed offline, where displacement interpolation is used as the theoretical foundation for finding adaptive barycenters via a "recursive" WGAN configuration; 2) The barycenter computed offline is used as meta-model initialization for continual learning, and fast adaptation is then carried out to find the generative model using the local samples at the target edge node. Finally, a weight ternarization method, based on joint optimization of the weights and the quantization threshold, is developed to compress the generative model further.
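A plausible reading of the formulation sketched in the abstract, in notation of our own choosing (\hat{\mu}_0 for the empirical distribution of the local samples, \mu_k for the k-th pre-trained model, \eta_k for the Wasserstein-ball radii, \lambda_k for barycenter weights; the paper's exact symbols may differ), is

\min_{\mu} \; W_1(\mu, \hat{\mu}_0) \quad \text{subject to} \quad W_1(\mu, \mu_k) \le \eta_k, \qquad k = 1, \dots, K,

whose Lagrangian relaxation is the Wasserstein-1 barycenter problem

\min_{\mu} \; \sum_{k=0}^{K} \lambda_k \, W_1(\mu, \mu_k), \qquad \mu_0 := \hat{\mu}_0, \quad \lambda_k \ge 0.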
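To make the barycenter computation concrete, below is a minimal PyTorch sketch of one training step in which a single generator is pulled toward several frozen source generators at once, each through its own critic, so the objective approximates the weighted sum of W1 distances above. All names (barycenter_wgan_step, sources, lambdas) are ours, and the weight-clipping surrogate is the original WGAN heuristic, not necessarily the paper's exact "recursive" configuration, which coalesces models pairwise.

import torch

def barycenter_wgan_step(G, critics, sources, lambdas, opt_G, opt_D,
                         z_dim=64, n=128, clip=0.01):
    # One step of barycentric WGAN training. `sources` are frozen pre-trained
    # generators (samples from mu_k); opt_D is assumed to cover the
    # parameters of all critics.
    device = next(G.parameters()).device

    # Critic update: each critic maximizes its W1 surrogate E[D(real)] - E[D(fake)].
    fake = G(torch.randn(n, z_dim, device=device)).detach()
    d_loss = 0.0
    for D, src, lam in zip(critics, sources, lambdas):
        with torch.no_grad():
            real = src(torch.randn(n, z_dim, device=device))  # samples from mu_k
        d_loss = d_loss + lam * (D(fake).mean() - D(real).mean())
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    for D in critics:  # weight clipping, the original WGAN Lipschitz heuristic
        for p in D.parameters():
            p.data.clamp_(-clip, clip)

    # Generator update: minimize the weighted sum of W1 estimates.
    fake = G(torch.randn(n, z_dim, device=device))
    g_loss = -sum(lam * D(fake).mean() for D, lam in zip(critics, lambdas))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()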
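For the compression stage, the abstract specifies joint optimization of the weights and the quantization threshold. One way to realize that is a ternary layer whose scale and threshold are trainable parameters receiving gradients through a smooth straight-through surrogate; the surrogate below is our own choice for illustration, not necessarily the paper's estimator.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    # Linear layer whose effective weights are ternary: {-alpha, 0, +alpha}.
    # The scale alpha and threshold delta are trained jointly with the
    # latent full-precision weights.
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(0.05 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.alpha = nn.Parameter(torch.tensor(0.05))  # ternary magnitude
        self.delta = nn.Parameter(torch.tensor(0.02))  # pruning threshold

    def forward(self, x):
        w = self.weight
        # Hard ternary weights used for the forward computation.
        q = self.alpha * torch.sign(w) * (w.abs() > self.delta.abs()).float()
        # Smooth surrogate supplying gradients for w, alpha, and delta.
        s = self.alpha * torch.tanh(w / (self.delta.abs() + 1e-6))
        w_t = s + (q - s).detach()  # forward = q, backward = through s
        return F.linear(x, w_t, self.bias)

# e.g. y = TernaryLinear(128, 64)(torch.randn(32, 128))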
