Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits

06/26/2023
by Yuwei Luo, et al.

This paper is motivated by recent developments in the linear bandit literature, which have revealed a discrepancy between the promising empirical performance of algorithms such as Thompson sampling and Greedy and their pessimistic theoretical regret bounds. The discrepancy arises because these algorithms may perform poorly on certain problem instances yet generally excel on typical ones. To address this, we propose a new data-driven technique that tracks the geometry of the uncertainty ellipsoid, enabling us to establish an instance-dependent frequentist regret bound for a broad class of algorithms, including Greedy, OFUL, and Thompson sampling. This result allows us to identify and “course-correct” instances in which the base algorithms perform poorly. The course-corrected algorithms achieve the minimax optimal regret of order 𝒪̃(d√T), while retaining most of the desirable properties of the base algorithms. We present simulation results to validate our findings and compare the performance of our algorithms with the baselines.
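
The abstract does not spell out how the uncertainty ellipsoid is tracked, so the following is only a rough sketch of the standard ingredients it refers to, not the paper's course-correction method. It assumes the usual regularized least-squares setup for linear bandits, with V = λI + Σ x xᵀ and estimate θ̂ = V⁻¹ Σ r x, and shows a Greedy arm choice together with the quantity xᵀV⁻¹x, which measures how wide the ellipsoid is along direction x. The names `RidgeState`, `width`, and `greedy_arm` are hypothetical and chosen for illustration.

```python
def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


class RidgeState:
    """Illustrative state: V = lam*I + sum x x^T (stored as V^{-1}), b = sum r x."""

    def __init__(self, d, lam=1.0):
        self.d = d
        self.V_inv = [[(1.0 / lam if i == j else 0.0) for j in range(d)]
                      for i in range(d)]
        self.b = [0.0] * d

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of V^{-1}; valid because V is symmetric.
        Vx = matvec(self.V_inv, x)
        denom = 1.0 + dot(x, Vx)
        for i in range(self.d):
            for j in range(self.d):
                self.V_inv[i][j] -= Vx[i] * Vx[j] / denom
        self.b = [b_i + reward * x_i for b_i, x_i in zip(self.b, x)]

    def theta_hat(self):
        # Regularized least-squares estimate V^{-1} b.
        return matvec(self.V_inv, self.b)

    def width(self, x):
        # x^T V^{-1} x: large values mean direction x is still poorly explored.
        return dot(x, matvec(self.V_inv, x))


def greedy_arm(state, arms):
    """Greedy choice: index of the arm with the highest estimated reward."""
    theta = state.theta_hat()
    return max(range(len(arms)), key=lambda k: dot(arms[k], theta))
```

In this sketch, comparing `width` across arms gives a crude view of the ellipsoid's shape: a Greedy run that keeps pulling one direction shrinks the width there while leaving others wide. That is the kind of instance the abstract describes as needing correction, although the paper's actual criterion may differ.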

research
02/04/2022

An Experimental Design Approach for Regret Minimization in Logistic Bandits

In this work we consider the problem of regret minimization for logistic...
research
03/12/2022

Instance-Dependent Regret Analysis of Kernelized Bandits

We study the kernelized bandit problem, that involves designing an adapt...
research
06/05/2020

Adaptation to the Range in K-Armed Bandits

We consider stochastic bandit problems with K arms, each associated with...
research
12/03/2018

Thompson Sampling for Noncompliant Bandits

Thompson sampling, a Bayesian method for balancing exploration and explo...
research
02/10/2021

On the Suboptimality of Thompson Sampling in High Dimensions

In this paper we consider Thompson Sampling for combinatorial semi-bandi...
research
08/29/2023

Exploiting Problem Geometry in Safe Linear Bandits

The safe linear bandit problem is a version of the classic linear bandit...