Multi-Environment Meta-Learning in Stochastic Linear Bandits

05/12/2022
by Ahmadreza Moradipari, et al.

In this work we investigate meta-learning (or learning-to-learn) approaches in multi-task linear stochastic bandit problems that can originate from multiple environments. Inspired by the work of [1] on meta-learning in a sequence of linear bandit problems whose parameters are sampled from a single distribution (i.e., a single environment), here we consider the feasibility of meta-learning when task parameters are drawn from a mixture distribution instead. For this problem, we propose a regularized version of the OFUL algorithm that, when trained on tasks with labeled environments, achieves low regret on a new task without requiring knowledge of the environment from which the new task originates. Specifically, our regret bound for the new algorithm captures the effect of environment misclassification and highlights the benefits over learning each task separately or meta-learning without recognition of the distinct mixture components.
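To make the idea of a regularized OFUL concrete, here is a minimal sketch of one common form of it: a ridge estimate that is shrunk toward a bias vector instead of toward zero, combined with an optimistic (upper-confidence) arm choice. This is an illustration under stated assumptions, not the algorithm from the paper. The class name `BiasedOFUL`, the parameters `lam` and `beta`, and the fixed confidence width are all hypothetical. The abstract does not say how the bias is obtained from the labeled environments, so the sketch simply takes it as an input.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, v) for row in M]


class BiasedOFUL:
    """Ridge regression shrunk toward a bias vector, with a UCB arm rule.

    Assumption: the estimate minimizes
        sum_s (x_s^T theta - y_s)^2 + lam * ||theta - bias||^2,
    and beta is a fixed, hand-tuned confidence width.
    """

    def __init__(self, bias, lam=1.0, beta=1.0):
        d = len(bias)
        self.bias = list(bias)
        self.lam = lam
        self.beta = beta
        # V = lam * I at the start, so its inverse is I / lam.
        self.V_inv = [[(1.0 / lam if i == j else 0.0) for j in range(d)]
                      for i in range(d)]
        # Accumulates x * (y - x^T bias): rewards centered by the bias prediction.
        self.b = [0.0] * d

    def estimate(self):
        """Return bias + V^{-1} b, the minimizer of the objective above."""
        delta = mat_vec(self.V_inv, self.b)
        return [h + dl for h, dl in zip(self.bias, delta)]

    def select(self, arms):
        """Return the index of the arm with the largest optimistic value."""
        theta = self.estimate()

        def ucb(x):
            width = math.sqrt(max(dot(x, mat_vec(self.V_inv, x)), 0.0))
            return dot(x, theta) + self.beta * width

        return max(range(len(arms)), key=lambda i: ucb(arms[i]))

    def update(self, x, y):
        """Add one (arm, reward) pair; Sherman-Morrison keeps V^{-1} current."""
        Vx = mat_vec(self.V_inv, x)
        denom = 1.0 + dot(x, Vx)
        d = len(x)
        for i in range(d):
            for j in range(d):
                self.V_inv[i][j] -= Vx[i] * Vx[j] / denom
        residual = y - dot(x, self.bias)
        for i in range(d):
            self.b[i] += x[i] * residual
```

Before any data arrives, `estimate()` returns the bias itself, so a bias close to the true task parameter gives a good starting point, while a poorly matched bias has to be corrected by observed rewards. Which bias to use when the environment of the new task is unknown is the question the paper addresses, and it is left outside this sketch.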

research
02/25/2022

Meta-Learning for Simple Regret Minimization

We develop a meta-learning framework for simple regret minimization in b...
research
05/18/2020

Meta-learning with Stochastic Linear Bandits

We investigate meta-learning procedures in the setting of stochastic lin...
research
05/24/2018

Been There, Done That: Meta-Learning with Episodic Recall

Meta-learning agents excel at rapidly learning new tasks from open-ended...
research
12/01/2021

Meta Arcade: A Configurable Environment Suite for Meta-Learning

Most approaches to deep reinforcement learning (DRL) attempt to solve a ...
research
05/30/2022

Meta Representation Learning with Contextual Linear Bandits

Meta-learning seeks to build algorithms that rapidly learn how to solve ...
research
01/21/2022

Meta Learning MDPs with Linear Transition Models

We study meta-learning in Markov Decision Processes (MDP) with linear tr...
research
02/26/2022

Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

Online learning in large-scale structured bandits is known to be challen...
