Dynamic Batch Learning in High-Dimensional Sparse Linear Contextual Bandits

08/27/2020
by Zhimei Ren, et al.

We study dynamic batch learning in high-dimensional sparse linear contextual bandits. A decision maker faces a constraint on the maximum number of batches and observes rewards only at the end of each batch. At the end of the current batch, the decision maker can dynamically decide how many individuals to include in the next batch and what personalized action-selection scheme to adopt within it. Such batch constraints are ubiquitous in practice, including personalized product offerings in marketing and medical treatment selection in clinical trials. We characterize the fundamental learning limit of this problem via a regret lower bound and provide a matching upper bound (up to log factors), thus prescribing an optimal scheme. To the best of our knowledge, our work provides the first inroad into a theoretical understanding of dynamic batch learning in high-dimensional sparse linear contextual bandits. Notably, even a special case of our result (when no batch constraint is present) yields the first minimax optimal Õ(√(s_0 T)) regret bound for standard online learning in high-dimensional linear contextual bandits (for the no-margin case), where s_0 is the sparsity parameter (or an upper bound thereof) and T is the learning horizon. Both parts of this result, that Õ(√(s_0 T)) is achievable and that Ω(√(s_0 T)) is a lower bound, appear to be unknown in the emerging literature on high-dimensional contextual bandits.
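To make the batch protocol concrete, the following is a minimal simulation sketch and not the scheme analyzed in the paper. It assumes Gaussian contexts, a hypothetical fixed batch grid `batch_ends` (the paper's setting allows the grid to be chosen dynamically), greedy action selection, and a LASSO estimate refit only at batch ends. The names `lasso_ista` and `run_batched_bandit`, and the choice of regularization level, are illustrative assumptions.

```python
import numpy as np


def lasso_ista(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n, d = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2 + 1e-12)
    beta = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        # Soft-thresholding is the proximal operator of the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta


def run_batched_bandit(d=50, s0=3, K=2, batch_ends=(100, 500, 2000), seed=0):
    """Simulate a greedy batched policy; returns (cumulative regret, final estimate)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(d)
    theta[:s0] = 1.0              # assumed s0-sparse true parameter
    theta_hat = np.zeros(d)       # no information before the first batch ends
    X_hist, y_hist = [], []
    regret, start = 0.0, 0
    for end in batch_ends:
        for _ in range(start, end):
            contexts = rng.normal(size=(K, d))      # one feature vector per action
            # Greedy choice under the estimate frozen at the last batch end
            # (with theta_hat = 0 in the first batch this always picks action 0).
            a = int(np.argmax(contexts @ theta_hat))
            means = contexts @ theta
            regret += means.max() - means[a]
            X_hist.append(contexts[a])
            y_hist.append(means[a] + rng.normal())  # reward revealed only at batch end
        # Rewards become visible here, so the estimate is refit once per batch.
        X, y = np.array(X_hist), np.array(y_hist)
        lam = 2.0 * np.sqrt(np.log(d) / len(y))     # assumed regularization level
        theta_hat = lasso_ista(X, y, lam)
        start = end
    return regret, theta_hat
```

Calling `run_batched_bandit()` with different `batch_ends` grids gives a rough feel for how the number and placement of batches affect cumulative regret in this toy setting; it does not reproduce the paper's bounds.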


research
04/14/2020

Sequential Batch Learning in Finite-Action Linear Contextual Bandits

We study the sequential batch learning problem in linear contextual band...
research
10/15/2021

Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits

We study the optimal batch-regret tradeoff for batch linear contextual b...
research
11/11/2022

Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits

We consider the stochastic linear contextual bandit problem with high-di...
research
07/04/2020

Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design

Motivated by practical needs such as large-scale learning, we study the ...
research
02/14/2022

The Impact of Batch Learning in Stochastic Linear Bandits

We consider a special case of bandit problems, named batched bandits, in...
research
03/20/2019

Contextual Bandits with Random Projection

Contextual bandits with linear payoffs, which are also known as linear b...
research
11/03/2021

The Impact of Batch Learning in Stochastic Bandits

We consider a special case of bandit problems, namely batched bandits. M...
