Model agnostic methods meta-learn despite misspecifications

03/02/2023
by Oguz Yuksel, et al.

Due to its empirical success in few-shot classification and reinforcement learning, meta-learning has recently received considerable interest. Meta-learning leverages data from previous tasks to quickly learn a new task despite limited data. In particular, model-agnostic methods look for initialisation points from which gradient descent quickly adapts to any new task. Although it has been empirically suggested that such methods learn a good shared representation during training, there is no strong theoretical evidence of such behaviour. More importantly, it is unclear whether these methods truly are model agnostic, i.e., whether they still learn a shared structure despite architecture misspecifications. To fill this gap, this work shows, in the limit of an infinite number of tasks, that first-order ANIL with a linear two-layer network architecture successfully learns a linear shared representation. Moreover, this result holds despite misspecification: a width that is large with respect to the hidden dimension of the shared representation does not harm the algorithm's performance. The learnt parameters then yield a small test loss after a single gradient step on any new task. Overall, this illustrates how well model-agnostic methods can adapt to any (unknown) model structure.
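
To make the setting concrete, below is a minimal, untuned sketch of first-order ANIL on synthetic linear regression tasks with a linear two-layer network, in the spirit of the setup described above. The data model, hyperparameters, and the alignment check are illustrative assumptions, not the authors' code: the inner loop adapts only the head, and the outer update ignores second-order terms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: input d, true hidden dimension k, network width > k (misspecified).
d, k, width = 20, 3, 10
n_samples = 30            # samples per task (illustrative)
inner_lr, outer_lr = 0.1, 0.01
n_meta_steps = 2000       # one fresh task per meta-step, approximating the infinite-task limit

# Assumed synthetic data model: tasks share a column space B_star, y = x^T B_star w_t + noise.
B_star = np.linalg.qr(rng.standard_normal((d, k)))[0]   # d x k, orthonormal columns

def sample_task():
    """Draw one linear regression task from the shared representation."""
    w_t = rng.standard_normal(k)
    X = rng.standard_normal((n_samples, d))
    y = X @ B_star @ w_t + 0.01 * rng.standard_normal(n_samples)
    return X, y

# Two-layer linear network f(x) = w^T (B x), with B in R^{width x d}.
B = rng.standard_normal((width, d)) / np.sqrt(d)
w = np.zeros(width)

for _ in range(n_meta_steps):
    X, y = sample_task()
    Z = X @ B.T                                   # features, n_samples x width
    # Inner loop (ANIL): one gradient step on the head w only.
    w_adapted = w - inner_lr * Z.T @ (Z @ w - y) / n_samples
    # First-order outer step: gradient of the post-adaptation loss at (B, w_adapted),
    # with w_adapted treated as a constant (no differentiation through the inner step).
    resid = Z @ w_adapted - y
    B -= outer_lr * np.outer(w_adapted, resid @ X) / n_samples
    w -= outer_lr * Z.T @ resid / n_samples

# Did B learn the shared representation? The cosines of the principal angles between
# the row space of B and the column space of B_star should all be close to 1.
U = np.linalg.svd(B.T, full_matrices=False)[0]    # orthonormal basis of B's row space
print("alignment:", np.round(np.linalg.svd(B_star.T @ U, compute_uv=False), 3))

# Test-time adaptation: a single head gradient step on a brand-new task.
X_new, y_new = sample_task()
Z_new = X_new @ B.T
w_new = w - inner_lr * Z_new.T @ (Z_new @ w - y_new) / n_samples
print("post-adaptation MSE:", np.mean((Z_new @ w_new - y_new) ** 2))
```

With enough meta-steps, the printed cosines should approach 1, reflecting the paper's claim that the learnt row space of B captures the k-dimensional shared representation even though width > k, and the post-adaptation loss after a single head step on a fresh task should be small.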


Related research

Probabilistic Model-Agnostic Meta-Learning (06/07/2018)
Meta-learning for few-shot learning entails acquiring a prior over previ...

Why Does MAML Outperform ERM? An Optimization Perspective (10/27/2020)
Model-Agnostic Meta-Learning (MAML) has demonstrated widespread success ...

MAML and ANIL Provably Learn Representations (02/07/2022)
Recent empirical evidence has driven conventional wisdom to believe that...

Fast & Slow Learning: Incorporating Synthetic Gradients in Neural Memory Controllers (11/10/2020)
Neural Memory Networks (NMNs) have received increased attention in recen...

Discrete Infomax Codes for Meta-Learning (05/28/2019)
Learning compact discrete representations of data is itself a key task i...

Meta Learning by the Baldwin Effect (06/06/2018)
The scope of the Baldwin effect was recently called into question by two...
