Meta-Learning for Simple Regret Minimization

02/25/2022
by MohammadJavad Azizi, et al.

We develop a meta-learning framework for simple regret minimization in bandits. In this framework, a learning agent interacts with a sequence of bandit tasks, which are sampled i.i.d. from an unknown prior distribution, and learns its meta-parameters to perform better on future tasks. We propose the first Bayesian and frequentist algorithms for this meta-learning problem. The Bayesian algorithm has access to a prior distribution over the meta-parameters, and its meta simple regret over m bandit tasks with horizon n is only Õ(m/√n). In contrast, we show that the meta simple regret of the frequentist algorithm is Õ(√m n + m/√n), which is worse. However, the frequentist algorithm is more general, because it does not need a prior distribution over the meta-parameters, and it is easier to implement for various distributions. We instantiate our algorithms for several classes of bandit problems. Our algorithms are general, and we complement our theory by evaluating them empirically in several environments.
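The setting described in the abstract can be illustrated with a minimal simulation. The sketch below is not either of the paper's algorithms: it assumes Gaussian arm means and Gaussian reward noise, and it uses a naive baseline (round-robin exploration followed by recommending the empirically best arm) that learns nothing across tasks. All function names and default parameters are hypothetical and chosen only for illustration. It shows how simple regret is measured per task and summed over m tasks.

```python
import random


def sample_task(prior_means, prior_std, rng):
    """Draw one bandit task: each arm mean is sampled from a Gaussian prior.

    Assumption: the prior is Gaussian with a shared standard deviation.
    """
    return [rng.gauss(mu, prior_std) for mu in prior_means]


def explore_uniformly(arm_means, horizon, noise_std, rng):
    """Naive baseline (not from the paper): pull arms round-robin for
    `horizon` rounds, then recommend the arm with the highest empirical mean."""
    num_arms = len(arm_means)
    totals = [0.0] * num_arms
    counts = [0] * num_arms
    for t in range(horizon):
        arm = t % num_arms
        totals[arm] += rng.gauss(arm_means[arm], noise_std)
        counts[arm] += 1
    # Arms that were never pulled (horizon < num_arms) are never recommended.
    estimates = [
        totals[a] / counts[a] if counts[a] else float("-inf")
        for a in range(num_arms)
    ]
    return max(range(num_arms), key=lambda a: estimates[a])


def meta_simple_regret(num_tasks, horizon, prior_means,
                       prior_std=0.5, noise_std=1.0, seed=0):
    """Sum of per-task simple regrets over `num_tasks` i.i.d. tasks.

    Simple regret of a task = best arm mean - mean of the recommended arm.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_tasks):
        arm_means = sample_task(prior_means, prior_std, rng)
        recommended = explore_uniformly(arm_means, horizon, noise_std, rng)
        total += max(arm_means) - arm_means[recommended]
    return total


if __name__ == "__main__":
    # m = 100 tasks, horizon n = 60, three arms (illustrative values only).
    print(meta_simple_regret(100, 60, [0.0, 0.5, 1.0]))
```

Because this baseline treats every task independently, it gives a reference point for the quantity that a meta-learning agent aims to reduce by estimating the prior from earlier tasks.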


research
05/12/2022

Multi-Environment Meta-Learning in Stochastic Linear Bandits

In this work we investigate meta-learning (or learning-to-learn) approac...
research
02/25/2022

Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms

We study a sequential decision problem where the learner faces a sequenc...
research
02/11/2021

Meta-Thompson Sampling

Efficient exploration in multi-armed bandits is a fundamental online lea...
research
07/13/2021

No Regrets for Learning the Prior in Bandits

We propose AdaTS, a Thompson sampling algorithm that adapts sequentially...
research
02/26/2022

Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

Online learning in large-scale structured bandits is known to be challen...
research
09/29/2021

Dynamic Regret Analysis for Online Meta-Learning

The online meta-learning framework has arisen as a powerful tool for the...
research
12/29/2022

Eliminating Meta Optimization Through Self-Referential Meta Learning

Meta Learning automates the search for learning algorithms. At the same ...
