Thompson Sampling for Robust Transfer in Multi-Task Bandits

06/17/2022
by Zhi Wang, et al.

We study online multi-task learning in which the tasks are performed within similar but not necessarily identical multi-armed bandit environments. In particular, we study how a learner can improve its overall performance across multiple related tasks through robust transfer of knowledge. An upper confidence bound (UCB)-based algorithm has recently been shown to achieve nearly optimal performance guarantees when all tasks are solved concurrently, but it remains unclear whether Thompson sampling (TS) algorithms, which generally perform better empirically, share similar theoretical properties. In this work, we present a TS-type algorithm for a more general online multi-task learning protocol that extends the concurrent setting. We provide a frequentist analysis of the algorithm and prove that it is also nearly optimal, using a novel concentration inequality for multi-task data aggregation at random stopping times. Finally, we evaluate the algorithm on synthetic data and show that it outperforms both the UCB-based algorithm and a baseline that runs TS on each task individually without transfer.
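
For orientation, the no-transfer baseline mentioned above (TS run separately on each task) can be sketched as follows. This is a minimal illustration only, not the paper's TS-type transfer algorithm, which the abstract does not specify. It assumes Bernoulli rewards with a uniform Beta(1, 1) prior on each arm; the names `thompson_sampling` and `run_independent_tasks`, and the arm means in the usage note, are hypothetical.

```python
import random


def thompson_sampling(arm_means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling on a single task (assumed reward model).

    arm_means: true success probability of each arm, used only to simulate
               rewards and to compute pseudo-regret.
    Returns (pulls per arm, cumulative pseudo-regret).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    successes = [0] * k
    failures = [0] * k
    best_mean = max(arm_means)
    regret = 0.0

    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(k)
        ]
        # Play the arm whose posterior sample is largest.
        arm = max(range(k), key=lambda a: samples[a])
        # Simulate a Bernoulli reward and update that arm's posterior counts.
        if rng.random() < arm_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        regret += best_mean - arm_means[arm]

    pulls = [successes[a] + failures[a] for a in range(k)]
    return pulls, regret


def run_independent_tasks(task_means, horizon, seed=0):
    """Run TS on each task separately; no data is shared between tasks."""
    return [
        thompson_sampling(means, horizon, seed + i)
        for i, means in enumerate(task_means)
    ]
```

For example, `run_independent_tasks([[0.1, 0.9], [0.15, 0.85]], horizon=500)` simulates two similar but not identical tasks. Each task learns from its own pulls only, which is the inefficiency that transfer between related tasks aims to reduce.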

research
05/24/2017

Multi-Task Learning for Contextual Bandits

Contextual bandits are a form of multi-armed bandit in which the agent h...
research
10/29/2020

Multitask Bandit Learning through Heterogeneous Feedback Aggregation

In many real-world applications, multiple agents seek to learn how to pe...
research
02/21/2022

Multi-task Representation Learning with Stochastic Linear Bandits

We study the problem of transfer-learning in the setting of stochastic l...
research
03/31/2023

Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness

Representation multi-task learning (MTL) and transfer learning (TL) have...
research
05/10/2023

Efficient Training of Multi-task Neural Solver with Multi-armed Bandits

Efficiently training a multi-task neural solver for various combinatoria...
research
02/21/2016

Multi-Task Learning with Labeled and Unlabeled Tasks

In multi-task learning, a learner is given a collection of prediction ta...
research
06/19/2018

Dynamic Multi-Level Multi-Task Learning for Sentence Simplification

Sentence simplification aims to improve readability and understandabilit...
