Pessimism for Offline Linear Contextual Bandits using ℓ_p Confidence Sets

05/21/2022
by Gene Li, et al.

We present a family {π̂_p}_{p≥1} of pessimistic learning rules for offline learning of linear contextual bandits, relying on confidence sets with respect to different ℓ_p norms, where π̂_2 corresponds to Bellman-consistent pessimism (BCP), while π̂_∞ is a novel generalization of lower confidence bound (LCB) to the linear setting. We show that the novel π̂_∞ learning rule is, in a sense, adaptively optimal, as it achieves the minimax performance (up to log factors) against all ℓ_q-constrained problems, and as such it strictly dominates all other predictors in the family, including π̂_2.
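
To make the idea of pessimism over an ℓ_p confidence set concrete, the following is a minimal sketch and not the construction from the paper. It assumes a hypothetical confidence set of the form {θ : ‖Σ^{1/2}(θ − θ̂)‖_p ≤ β}, where θ̂ is a point estimate, Σ is a positive-definite data covariance, and β is a radius; all of these names, and the per-action simplification, are illustrative assumptions. Under that assumption, Hölder's inequality gives the worst-case reward of a feature vector v in closed form as ⟨v, θ̂⟩ − β‖Σ^{-1/2}v‖_q with 1/p + 1/q = 1, so p = ∞ pairs with an ℓ_1 penalty.

```python
import numpy as np


def dual_exponent(p):
    """Return q with 1/p + 1/q = 1 (q = inf for p = 1, q = 1 for p = inf)."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)


def inv_sqrt_psd(sigma):
    """Inverse square root of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(sigma)
    return (eigvecs / np.sqrt(eigvals)) @ eigvecs.T


def pessimistic_value(v, theta_hat, sigma, beta, p):
    """Worst-case <v, theta> over the assumed set
    {theta : ||sigma^{1/2} (theta - theta_hat)||_p <= beta}."""
    q = dual_exponent(p)
    penalty = beta * np.linalg.norm(inv_sqrt_psd(sigma) @ v, ord=q)
    return float(v @ theta_hat - penalty)


def pessimistic_action(features, theta_hat, sigma, beta, p):
    """Index of the action whose feature vector has the largest worst-case value.
    Simplification: pessimism is applied per action for a single context."""
    values = [pessimistic_value(v, theta_hat, sigma, beta, p) for v in features]
    return int(np.argmax(values))


# Hypothetical toy problem: two actions in two dimensions.
theta_hat = np.array([1.0, 1.0])
sigma = np.diag([4.0, 1.0])
features = np.array([[2.0, 1.0], [0.0, 1.0]])
choice = pessimistic_action(features, theta_hat, sigma, beta=1.0, p=np.inf)
```

In this sketch, a larger p shrinks nothing by itself: the radius β would have to be calibrated separately for each p so that the set contains the true parameter with high probability, and that calibration is outside the scope of the example.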
