Gradient Agreement as an Optimization Objective for Meta-Learning

10/18/2018
by   Amir Erfan Eshratifar, et al.
0

This paper presents a novel optimization method for maximizing generalization over tasks in meta-learning. The goal of meta-learning is to learn a model for an agent adapting rapidly when presented with previously unseen tasks. Tasks are sampled from a specific distribution which is assumed to be similar for both seen and unseen tasks. We focus on a family of meta-learning methods learning initial parameters of a base model which can be fine-tuned quickly on a new task, by few gradient steps (MAML). Our approach is based on pushing the parameters of the model to a direction in which tasks have more agreement upon. If the gradients of a task agree with the parameters update vector, then their inner product will be a large positive value. As a result, given a batch of tasks to be optimized for, we associate a positive (negative) weight to the loss function of a task, if the inner product between its gradients and the average of the gradients of all tasks in the batch is a positive (negative) value. Therefore, the degree of the contribution of a task to the parameter updates is controlled by introducing a set of weights on the loss function of the tasks. Our method can be easily integrated with the current meta-learning algorithms for neural networks. Our experiments demonstrate that it yields models with better generalization compared to MAML and Reptile.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/16/2019

Towards Understanding Generalization in Gradient-Based Meta-Learning

In this work we study generalization of neural networks in gradient-base...
research
02/14/2021

Large-Scale Meta-Learning with Continual Trajectory Shifting

Meta-learning of shared initialization parameters has shown to be highly...
research
03/08/2018

Reptile: a Scalable Metalearning Algorithm

This paper considers metalearning problems, where there is a distributio...
research
02/23/2023

MetaLDC: Meta Learning of Low-Dimensional Computing Classifiers for Fast On-Device Adaption

Fast model updates for unseen tasks on intelligent edge devices are cruc...
research
06/06/2018

Meta Learning by the Baldwin Effect

The scope of the Baldwin effect was recently called into question by two...
research
02/25/2021

Multi-Domain Learning by Meta-Learning: Taking Optimal Steps in Multi-Domain Loss Landscapes by Inner-Loop Learning

We consider a model-agnostic solution to the problem of Multi-Domain Lea...
research
06/03/2022

Dynamic Kernel Selection for Improved Generalization and Memory Efficiency in Meta-learning

Gradient based meta-learning methods are prone to overfit on the meta-tr...

Please sign up or login with your details

Forgot password? Click here to reset