MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning

07/31/2023
by Baoquan Zhang, et al.

Equipping a deep model with the ability of few-shot learning, i.e., learning quickly from only a few examples, is a core challenge for artificial intelligence. Gradient-based meta-learning approaches effectively address this challenge by learning how to learn novel tasks. Their key idea is to learn a deep model in a bi-level optimization manner, where the outer-loop process learns a shared gradient descent algorithm (i.e., its hyperparameters), while the inner-loop process leverages it to optimize a task-specific model using only a few labeled examples. Although these existing methods have shown superior performance, the outer-loop process requires calculating second-order derivatives along the inner optimization path, which imposes a considerable memory burden and the risk of vanishing gradients. Drawing inspiration from recent progress in diffusion models, we find that the inner-loop gradient descent process can actually be viewed as a reverse (i.e., denoising) process of diffusion, where the target of denoising is the model weights rather than the original data. Based on this observation, in this paper we propose to model the gradient descent optimizer as a diffusion model and present a novel task-conditional diffusion-based meta-learning method, called MetaDiff, which effectively models the optimization of model weights from Gaussian noise to target weights in a denoising manner. Thanks to the training efficiency of diffusion models, MetaDiff does not need to differentiate through the inner-loop path, so the memory burden and the risk of vanishing gradients can be effectively alleviated. Experimental results show that MetaDiff outperforms state-of-the-art gradient-based meta-learning methods on few-shot learning tasks.
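To make the core idea concrete, below is a minimal sketch, not the authors' released code, of what an inner loop based on conditional diffusion could look like: a small network predicts the noise component of a flattened weight vector, conditioned on a task embedding, and a standard DDPM-style reverse process denoises from Gaussian noise toward task-specific weights. All names (WeightDenoiser, sample_task_weights, cond, num_steps) and the linear beta schedule are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a diffusion-based inner loop over model weights.
# Conditioning, architecture, and schedule are assumptions for illustration.
import torch
import torch.nn as nn

class WeightDenoiser(nn.Module):
    """Predicts the noise component of a (flattened) weight vector w_t,
    conditioned on a task embedding and the diffusion timestep."""
    def __init__(self, weight_dim, cond_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(weight_dim + cond_dim + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, weight_dim),
        )

    def forward(self, w_t, cond, t):
        # t is a scalar timestep in [0, 1]; concatenate weights, condition, and time.
        t_feat = t.expand(w_t.shape[0], 1)
        return self.net(torch.cat([w_t, cond, t_feat], dim=-1))

@torch.no_grad()
def sample_task_weights(denoiser, cond, weight_dim, num_steps=50):
    """Reverse (denoising) process: start from Gaussian noise and iteratively
    denoise toward task-specific weights, conditioned on the task embedding.
    No gradients are backpropagated through this inner-loop path."""
    betas = torch.linspace(1e-4, 0.02, num_steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    w = torch.randn(1, weight_dim)  # w_T ~ N(0, I)
    for step in reversed(range(num_steps)):
        t = torch.tensor([[step / num_steps]], dtype=torch.float32)
        eps_hat = denoiser(w, cond, t)
        # DDPM posterior mean update toward the denoised weights.
        coef = betas[step] / torch.sqrt(1.0 - alpha_bars[step])
        w = (w - coef * eps_hat) / torch.sqrt(alphas[step])
        if step > 0:
            w = w + torch.sqrt(betas[step]) * torch.randn_like(w)
    return w  # denoised task-specific weights w_0
```

In this sketch the outer loop would train only the denoiser (e.g., with a standard noise-prediction loss on weights), which is why no second-order derivatives along the inner optimization path are required, in contrast to MAML-style bi-level optimization.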

Related research:

04/04/2023  Meta-Learning with a Geometry-Adaptive Preconditioner
  Model-agnostic meta-learning (MAML) is one of the most successful meta-l...

09/10/2019  Meta-Learning with Implicit Gradients
  A core capability of intelligent systems is the ability to quickly learn...

09/08/2021  Do What Nature Did To Us: Evolving Plastic Recurrent Neural Networks For Task Generalization
  While artificial neural networks (ANNs) have been widely adopted in mach...

06/08/2020  Multi-step Estimation for Gradient-based Meta-learning
  Gradient-based meta-learning approaches have been successful in few-shot...

06/06/2018  Meta Learning by the Baldwin Effect
  The scope of the Baldwin effect was recently called into question by two...

11/10/2020  Fast Slow Learning: Incorporating Synthetic Gradients in Neural Memory Controllers
  Neural Memory Networks (NMNs) have received increased attention in recen...

06/05/2020  UFO-BLO: Unbiased First-Order Bilevel Optimization
  Bilevel optimization (BLO) is a popular approach with many applications ...
