Regret Lower Bound and Optimal Algorithm for High-Dimensional Contextual Linear Bandit

09/23/2021
by Ke Li, et al.
In this paper, we consider the multi-armed bandit problem with high-dimensional features. First, we prove a minimax lower bound, 𝒪((log d)^{(α+1)/2} T^{(1-α)/2} + log T), for the cumulative regret, in terms of horizon T, dimension d and a margin parameter α∈[0,1], which controls the separation between the optimal and the sub-optimal arms. This new lower bound unifies existing regret bound results that have different dependencies on T due to the use of different values of margin parameter α explicitly implied by their assumptions. Second, we propose a simple and computationally efficient algorithm inspired by the general Upper Confidence Bound (UCB) strategy that achieves a regret upper bound matching the lower bound. The proposed algorithm uses a properly centered ℓ_1-ball as the confidence set in contrast to the commonly used ellipsoid confidence set. In addition, the algorithm does not require any forced sampling step and is thereby adaptive to the practically unknown margin parameter. Simulations and a real data analysis are conducted to compare the proposed method with existing ones in the literature.
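To make the ℓ_1-ball confidence set concrete, here is a minimal sketch of a UCB-style arm selection rule. It is not the paper's algorithm: the abstract does not say how the ball is centered or how its radius is chosen, so `center` and `radius` are hypothetical inputs and the function names are invented for illustration. The only fact used is norm duality: the maximum of ⟨x, θ⟩ over the ball {θ : ‖θ - center‖_1 ≤ radius} equals ⟨x, center⟩ + radius · ‖x‖_∞.

```python
from typing import Sequence


def l1_ball_ucb_index(
    x: Sequence[float], center: Sequence[float], radius: float
) -> float:
    """Largest value of <x, theta> over the l1-ball around `center`.

    Because the l1 and l-infinity norms are dual, the maximum is
    <x, center> + radius * max_j |x_j|.
    `center` and `radius` are assumed to be given (hypothetical inputs).
    """
    if len(x) != len(center):
        raise ValueError("x and center must have the same dimension")
    point_estimate = sum(xi * ci for xi, ci in zip(x, center))
    bonus = radius * max(abs(xi) for xi in x)
    return point_estimate + bonus


def select_arm(
    contexts: Sequence[Sequence[float]],
    center: Sequence[float],
    radius: float,
) -> int:
    """Return the index of the arm with the largest optimistic index."""
    scores = [l1_ball_ucb_index(x, center, radius) for x in contexts]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Two arms with 3-dimensional features (illustrative values only).
    contexts = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
    center = [0.5, 0.1, 0.0]
    print(select_arm(contexts, center, radius=0.2))
    print(select_arm(contexts, center, radius=1.0))
```

With an ellipsoid confidence set the exploration bonus would instead be a weighted ℓ_2 norm of x. In this sketch it depends only on the largest coordinate of x in absolute value, so it costs one pass over the d features per arm.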
