Nearly Minimax Algorithms for Linear Bandits with Shared Representation

03/29/2022
by Jiaqi Yang et al.

We give novel algorithms for multi-task and lifelong linear bandits with a shared representation. Specifically, we consider the setting where we play M linear bandits of dimension d, each for T rounds, and these M bandit tasks share a common k (≪ d)-dimensional linear representation. For both the multi-task setting, where we play the tasks concurrently, and the lifelong setting, where we play the tasks sequentially, we develop novel algorithms that achieve O(d√(kMT) + kM√(T)) regret bounds. These bounds match the known minimax regret lower bound up to logarithmic factors and close the gap in existing results [Yang et al., 2021]. Our main techniques include a more efficient estimator for the low-rank linear feature extractor and an accompanying novel analysis of this estimator.
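To make the shared-representation model concrete: each task parameter is θ_m = B w_m, where B ∈ ℝ^{d×k} is a feature extractor common to all M tasks and w_m ∈ ℝ^k is task-specific. The sketch below is an illustrative toy, not the paper's algorithm: it recovers the shared subspace by taking the top-k SVD of stacked per-task least-squares estimates, on synthetic regression data. All dimensions and the noise level are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, M, n = 20, 3, 10, 500  # ambient dim, shared dim, number of tasks, samples per task

# Shared feature extractor B (d x k, orthonormal columns) and per-task weights w_m.
B, _ = np.linalg.qr(rng.standard_normal((d, k)))
W = rng.standard_normal((k, M))
Theta = B @ W                       # task parameters theta_m = B w_m; rank-k matrix

# Per-task data: y = x^T theta_m + Gaussian noise.
X = rng.standard_normal((M, n, d))
Y = np.einsum('mnd,dm->mn', X, Theta) + 0.1 * rng.standard_normal((M, n))

# Naive per-task ridge estimates, stacked into a d x M matrix.
Theta_hat = np.stack(
    [np.linalg.solve(X[m].T @ X[m] + np.eye(d), X[m].T @ Y[m]) for m in range(M)],
    axis=1,
)

# Recover the shared subspace from the top-k left singular vectors.
U, s, _ = np.linalg.svd(Theta_hat, full_matrices=False)
B_hat = U[:, :k]

# Subspace recovery error: norm of the component of B outside span(B_hat).
err = np.linalg.norm((np.eye(d) - B_hat @ B_hat.T) @ B)
print(B_hat.shape, float(err) < 0.5)
```

The point of the sketch is the sample-efficiency intuition behind the O(d√(kMT) + kM√(T)) bound: the expensive d-dimensional object B is learned once from the pooled tasks, while each task only needs its cheap k-dimensional weights w_m.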


Related research

02/08/2021 · Near-optimal Representation Learning for Linear Bandits and Linear RL
This paper studies representation learning for multi-task linear bandits...

02/21/2022 · Multi-task Representation Learning with Stochastic Linear Bandits
We study the problem of transfer-learning in the setting of stochastic l...

06/24/2022 · Joint Representation Training in Sequential Tasks with Shared Structure
Classical theory in reinforcement learning (RL) predominantly focuses on...

06/11/2022 · Squeeze All: Novel Estimator and Self-Normalized Bound for Linear Contextual Bandits
We propose a novel algorithm for linear contextual bandits with O(√(dT l...

12/06/2019 · Solving Bernoulli Rank-One Bandits with Unimodal Thompson Sampling
Stochastic Rank-One Bandits (Katarya et al, (2017a,b)) are a simple fram...

06/20/2019 · Sequential Experimental Design for Transductive Linear Bandits
In this paper we introduce the transductive linear bandit problem: given...

05/30/2022 · Meta Representation Learning with Contextual Linear Bandits
Meta-learning seeks to build algorithms that rapidly learn how to solve ...
