Problem-Complexity Adaptive Model Selection for Stochastic Linear Bandits

06/04/2020
by Avishek Ghosh, et al.

We consider the problem of model selection for two popular stochastic linear bandit settings, and propose algorithms that adapt to the unknown problem complexity. In the first setting, we consider K-armed mixture bandits, where the mean reward of arm i ∈ [K] at time t is μ_i + ⟨α_{i,t}, θ^*⟩, with α_{i,t} ∈ R^d being a known context vector, and μ_i ∈ [-1,1] and θ^* being unknown parameters. We define ‖θ^*‖ as the problem complexity and consider a sequence of nested hypothesis classes, each positing a different upper bound on ‖θ^*‖. Exploiting this, we propose Adaptive Linear Bandit (ALB), a novel phase-based algorithm that adapts to the true problem complexity, ‖θ^*‖. We show that ALB achieves regret scaling of O(‖θ^*‖√(T)), even though ‖θ^*‖ is a priori unknown. As a corollary, when θ^* = 0, ALB recovers the minimax regret of the simple bandit algorithm without such knowledge of θ^*. ALB is the first algorithm that uses the parameter norm as a model selection criterion for linear bandits. Prior state-of-the-art algorithms <cit.> achieve a regret of O(L√(T)), where L is an upper bound on ‖θ^*‖ that must be supplied as input to the algorithm. In the second setting, we consider the standard linear bandit problem (with possibly an infinite number of arms) where the sparsity of θ^*, denoted by d^* ≤ d, is unknown to the algorithm. Defining d^* as the problem complexity, we show that ALB achieves O(d^*√(T)) regret, matching that of an oracle who knew the true sparsity level. This is the first algorithm that achieves such model selection guarantees, resolving an open problem in <cit.>. We further verify through synthetic and real-data experiments that the performance gains are fundamental and not artifacts of mathematical bounds.
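The phase-based norm adaptation described above can be illustrated with a short sketch. The Python snippet below is a minimal, hypothetical rendering of the idea (not the authors' exact pseudocode): each phase runs an OFUL/LinUCB-style base algorithm under the current working bound on ‖θ^*‖, then re-estimates θ^* by regularized least squares and refines the bound for the next, longer phase. The environment interface (env.contexts(), env.pull()), the helper names, the doubling schedule, and the slack term are illustrative assumptions, and the sketch treats the pure linear-reward case, ignoring the bias terms μ_i.

```python
import numpy as np

def ridge_estimate(X, y, lam=1.0):
    """Regularized least-squares estimate of theta from one phase of data."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def run_oful_phase(env, norm_bound, phase_len, d, lam=1.0, delta=0.01):
    """One phase of an OFUL/LinUCB-style base algorithm, run with the current
    guess `norm_bound` on ||theta*||.  Returns the collected (context, reward)
    pairs so the norm can be re-estimated afterwards."""
    V = lam * np.eye(d)
    b = np.zeros(d)
    X, y = [], []
    for _ in range(phase_len):
        contexts = env.contexts()            # hypothetical: K x d array of alpha_{i,t}
        theta_hat = np.linalg.solve(V, b)
        # OFUL-style confidence radius; it scales with the assumed norm bound
        _, logdet = np.linalg.slogdet(V)
        beta = np.sqrt(lam) * norm_bound + np.sqrt(
            2 * np.log(1 / delta) + logdet - d * np.log(lam))
        V_inv = np.linalg.inv(V)
        widths = np.sqrt(np.einsum('ki,ij,kj->k', contexts, V_inv, contexts))
        ucb = contexts @ theta_hat + beta * widths
        arm = int(np.argmax(ucb))
        reward = env.pull(arm)               # hypothetical: noisy <alpha_{arm,t}, theta*>
        x = contexts[arm]
        V += np.outer(x, x)
        b += reward * x
        X.append(x)
        y.append(reward)
    return np.array(X), np.array(y)

def adaptive_linear_bandit(env, d, T, initial_bound=1.0, slack=1.0):
    """Hypothetical phase-based adaptation: refine the working bound on
    ||theta*|| between successively doubling phases."""
    norm_bound = initial_bound
    phase_len, elapsed = max(d, 32), 0
    while elapsed < T:
        L = min(phase_len, T - elapsed)
        X, y = run_oful_phase(env, norm_bound, L, d)
        theta_hat = ridge_estimate(X, y)
        # next phase uses the estimated norm plus a shrinking slack term
        norm_bound = float(np.linalg.norm(theta_hat)) + slack / np.sqrt(L)
        elapsed += L
        phase_len *= 2
    return norm_bound
```

To try the sketch, one would supply an environment object whose contexts() returns a K × d array of context vectors and whose pull(i) returns a noisy reward for arm i; because the bound fed to the base algorithm tracks the estimated ‖θ^*‖ rather than a fixed worst-case L, the confidence widths, and hence the regret, shrink toward what an oracle that knew ‖θ^*‖ would incur.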

