Continuous-Time Meta-Learning with Forward Mode Differentiation

by Tristan Deleu, et al.

Drawing inspiration from gradient-based meta-learning methods with infinitely small gradient steps, we introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field. Specifically, representations of the inputs are meta-learned such that a task-specific linear classifier is obtained as a solution of an ordinary differential equation (ODE). Treating the learning process as an ODE offers the notable advantage that the length of the trajectory is now continuous, as opposed to a fixed and discrete number of gradient steps. As a consequence, we can optimize the amount of adaptation necessary to solve a new task using stochastic gradient descent, in addition to learning the initial conditions as is standard practice in gradient-based meta-learning. Importantly, in order to compute the exact meta-gradients required for the outer-loop updates, we devise an efficient algorithm based on forward mode differentiation, whose memory requirements do not scale with the length of the learning trajectory, thus allowing longer adaptation in constant memory. We provide analytical guarantees for the stability of COMLN, show empirically its efficiency in terms of runtime and memory usage, and illustrate its effectiveness on a range of few-shot image classification problems.
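
The idea of adapting along a gradient flow while carrying forward-mode sensitivities can be illustrated on a toy problem. The sketch below is not the COMLN algorithm from the paper: it assumes a scalar parameter w, a hypothetical quadratic loss L(w) = 0.5 * a * (w - b)^2, and plain Euler integration, all chosen only for illustration. It integrates dw/dt = -a (w - b) up to a continuous horizon T and, in the same loop, propagates the sensitivity s = dw/dw0, so memory use does not grow with the number of integration steps.

```python
import math

def adapt_with_sensitivity(w0, T, a=2.0, b=1.0, n_steps=10_000):
    """Toy gradient-flow adaptation with forward-mode sensitivities.

    Assumed (hypothetical) loss: L(w) = 0.5 * a * (w - b)**2.
    Returns w(T), dw(T)/dw0 and dw(T)/dT.
    """
    dt = T / n_steps
    w, s = w0, 1.0  # s = dw/dw0 starts at 1 because w(0) = w0
    for _ in range(n_steps):
        grad = a * (w - b)        # dL/dw at the current point
        # Euler step for the state and for its sensitivity (ds/dt = -a * s).
        # Only (w, s) is kept, so memory is constant in n_steps.
        w, s = w - dt * grad, s - dt * a * s
    dw_dT = -a * (w - b)          # the vector field evaluated at the final state
    return w, s, dw_dT

w_T, dw_dw0, dw_dT = adapt_with_sensitivity(w0=0.0, T=1.5)
```

Because both w0 and T receive derivatives here, an outer loop could in principle update the initial condition and the amount of adaptation with the same gradient-based optimizer, which is the property the abstract highlights.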




Meta-Learning with Warped Gradient Descent

A versatile and effective approach to meta-learning is to infer a gradie...

Meta-Learning with Adaptive Layerwise Metric and Subspace

Recent advances in meta-learning demonstrate that deep representations c...

Meta-Learning with Implicit Gradients

A core capability of intelligent systems is the ability to quickly learn...

Meta-Learning with Adjoint Methods

Model Agnostic Meta-Learning (MAML) is widely used to find a good initia...

Towards Understanding Generalization in Gradient-Based Meta-Learning

In this work we study generalization of neural networks in gradient-base...

Why Does MAML Outperform ERM? An Optimization Perspective

Model-Agnostic Meta-Learning (MAML) has demonstrated widespread success ...

Trajectory-Based Meta-Learning for Out-Of-Vocabulary Word Embedding Learning

Word embedding learning methods require a large number of occurrences of...