General-Purpose In-Context Learning by Meta-Learning Transformers

12/08/2022
by Louis Kirsch, et al.

Modern machine learning requires system designers to specify aspects of the learning pipeline, such as losses, architectures, and optimizers. Meta-learning, or learning-to-learn, instead aims to learn those aspects, and promises to unlock greater capabilities with less manual effort. One particularly ambitious goal of meta-learning is to train general-purpose in-context learning algorithms from scratch, using only black-box models with minimal inductive bias. Such a model takes in training data and produces test-set predictions across a wide range of problems, without any explicit definition of an inference model, training loss, or optimization algorithm. In this paper, we show that Transformers and other black-box models can be meta-trained to act as general-purpose in-context learners. We characterize phase transitions between algorithms that generalize, algorithms that memorize, and algorithms that fail to meta-train at all, induced by changes in model size, number of tasks, and meta-optimization. We further show that the capabilities of meta-trained algorithms are bottlenecked by the accessible state size (memory) determining the next prediction, unlike standard models, which are thought to be bottlenecked by parameter count. Finally, we propose practical interventions, such as biasing the training distribution, that improve the meta-training and meta-generalization of general-purpose learning algorithms.
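
To make the interface described above concrete, the following is a minimal sketch of what "takes in training data and produces test-set predictions" could look like at call time. Everything named here is an assumption made for illustration and does not come from the paper: `build_context`, `in_context_predict`, the one-hot token layout, and in particular `nearest_neighbour_learner`, a hand-written 1-nearest-neighbour rule that merely stands in for a meta-trained black-box model such as a Transformer. The point is only that the caller supplies labeled examples and a query, and no loss or optimizer appears anywhere in the call.

```python
from typing import Callable, List, Sequence, Tuple

Token = Tuple[float, ...]
Learner = Callable[[List[Token], int], int]


def build_context(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[int],
    query_x: Sequence[float],
    num_classes: int,
) -> List[Token]:
    """Flatten a labeled training set plus one query into a token sequence.

    Hypothetical layout: each token is the input features followed by a
    one-hot label. The query token carries an all-zero label slot.
    """
    tokens: List[Token] = []
    for x, y in zip(train_x, train_y):
        one_hot = [1.0 if c == y else 0.0 for c in range(num_classes)]
        tokens.append(tuple(x) + tuple(one_hot))
    tokens.append(tuple(query_x) + tuple([0.0] * num_classes))
    return tokens


def nearest_neighbour_learner(tokens: List[Token], num_classes: int) -> int:
    """Stand-in for a meta-trained model (NOT the paper's method).

    Reads the whole context and returns a class index for the final
    (query) token by copying the label of the closest training token.
    """
    *support, query = tokens
    feat_dim = len(query) - num_classes
    q = query[:feat_dim]

    def sq_dist(tok: Token) -> float:
        return sum((a - b) ** 2 for a, b in zip(tok[:feat_dim], q))

    best = min(support, key=sq_dist)
    label = best[feat_dim:]
    return max(range(num_classes), key=lambda c: label[c])


def in_context_predict(
    learner: Learner,
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[int],
    query_x: Sequence[float],
    num_classes: int,
) -> int:
    """Single forward call: training data and query in, prediction out."""
    return learner(build_context(train_x, train_y, query_x, num_classes), num_classes)


if __name__ == "__main__":
    xs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    ys = [0, 0, 1]
    print(in_context_predict(nearest_neighbour_learner, xs, ys, [4.5, 5.0], 2))
```

In this sketch, replacing `nearest_neighbour_learner` with a trained sequence model would leave the calling code unchanged, which is the sense in which the learning algorithm lives inside the model rather than in an explicit training loop.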


