An Experimental Design Approach for Regret Minimization in Logistic Bandits

02/04/2022
by Blake Mason, et al.

In this work we consider the problem of regret minimization for logistic bandits. The main challenge of logistic bandits is reducing the dependence on a potentially large problem-dependent constant κ that can, at worst, scale exponentially with the norm of the unknown parameter θ_∗. Abeille et al. (2021) applied self-concordance of the logistic function to remove this worst-case dependence, providing regret guarantees like O(d log^2(κ) √(μ̇T) log(|𝒳|)), where d is the dimensionality, T is the time horizon, and μ̇ is the variance of the best arm. This work improves upon this bound in the fixed arm setting by employing an experimental design procedure that achieves a minimax regret of O(√(d μ̇ T log(|𝒳|))). Our regret bound in fact takes a tighter instance-dependent (i.e., gap-dependent) form, the first of its kind for logistic bandits. We also propose a new warmup sampling algorithm that can dramatically reduce the lower-order term in the regret in general, and we prove that it can reduce the lower-order term's dependence on κ to log^2(κ) for some instances. Finally, we discuss the impact of the bias of the MLE on the logistic bandit problem, providing an example where the d^2 lower-order regret (compared with d for linear bandits) may not be improved as long as the MLE is used, and we discuss how bias-corrected estimators may be used to bring it closer to d.
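
To illustrate why κ can be so large, the sketch below evaluates it on a toy instance. It assumes a definition that is common in the logistic bandit literature but is not stated in this abstract: κ = max over arms x of 1/μ̇(xᵀθ_∗), where μ̇ is the derivative of the logistic function. The function names, arm set, and parameter values are hypothetical and chosen only for illustration.

```python
import math

def logistic_derivative(z):
    """Derivative of the logistic function: mu(z) * (1 - mu(z))."""
    mu = 1.0 / (1.0 + math.exp(-z))
    return mu * (1.0 - mu)

def kappa(arms, theta):
    """Assumed definition: max over arms of 1 / mu_dot(x^T theta)."""
    return max(
        1.0 / logistic_derivative(sum(x_i * t_i for x_i, t_i in zip(x, theta)))
        for x in arms
    )

# Hypothetical fixed arm set: the two standard basis vectors in R^2.
arms = [(1.0, 0.0), (0.0, 1.0)]

# As the norm S of theta grows, kappa grows roughly like exp(S).
for S in (0.0, 2.0, 5.0, 10.0):
    print(S, kappa(arms, (S, 0.0)))
```

For this arm set, 1/μ̇(S) equals 2 + e^S + e^(−S), so κ is 4 when θ_∗ is zero and grows exponentially with the norm of θ_∗, which is the worst-case behavior the abstract refers to.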


research
11/23/2020

Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

We propose improved fixed-design confidence bounds for the linear logist...
research
10/23/2020

Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits

Logistic Bandits have recently attracted substantial attention, by provi...
research
02/18/2020

Improved Optimistic Algorithms for Logistic Bandits

The generalized linear bandit framework has attracted a lot of attention...
research
01/06/2022

Jointly Efficient and Optimal Algorithms for Logistic Bandits

Logistic Bandits have recently undergone careful scrutiny by virtue of t...
research
05/03/2022

Norm-Agnostic Linear Bandits

Linear bandits have a wide variety of applications including recommendat...
research
06/26/2023

Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits

This paper is motivated by recent developments in the linear bandit lite...
research
02/12/2022

Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits

We consider the problem of combining and learning over a set of adversar...
