Parameter and Feature Selection in Stochastic Linear Bandits

06/09/2021
by   Ahmadreza Moradipari, et al.

We study two model selection settings in stochastic linear bandits (LB). In the first setting, the reward parameter of the LB problem is arbitrarily selected from M models represented as (possibly) overlapping balls in ℝ^d. However, the agent only has access to misspecified models, i.e., estimates of the centers and radii of the balls. We refer to this setting as parameter selection. In the second setting, which we refer to as feature selection, the expected reward of the LB problem is in the linear span of at least one of M feature maps (models). For each setting, we develop and analyze an algorithm that is based on a reduction from bandits to full-information problems. This allows us to obtain regret bounds that are not worse (up to a √(log M) factor) than the case where the true model is known. Our parameter selection algorithm is OFUL-style and the one for feature selection is based on the SquareCB algorithm. We also show that the regret of our parameter selection algorithm scales logarithmically with model misspecification.
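To make the feature selection setting more concrete, here is a minimal sketch, not the authors' algorithm, of how a SquareCB-style learner might combine M candidate models. It assumes a finite set of K actions, per-model reward predictions already available as an M x K array, an exponential-weights (Hedge) aggregator over models with squared loss standing in for the full-information learner, and inverse gap weighting for action sampling. The function names and the parameters gamma and eta are hypothetical and chosen only for illustration.

```python
import numpy as np


def inverse_gap_weighting(pred, gamma):
    """Map predicted rewards for K actions to a sampling distribution.

    Actions with a larger gap to the greedy action get less probability;
    gamma controls how sharply (larger gamma = greedier).
    """
    pred = np.asarray(pred, dtype=float)
    k = pred.size
    best = int(np.argmax(pred))
    probs = 1.0 / (k + gamma * (pred[best] - pred))
    probs[best] = 0.0
    probs[best] = 1.0 - probs.sum()  # remaining mass goes to the greedy action
    return probs


def aggregate(model_preds, log_weights):
    """Weighted average of the M models' predictions (shape (M, K) -> (K,))."""
    w = np.exp(log_weights - np.max(log_weights))  # stabilised softmax
    w /= w.sum()
    return w @ np.asarray(model_preds, dtype=float)


def update_log_weights(log_weights, preds_for_played_action, reward, eta):
    """Hedge update: penalise each model by its squared prediction error."""
    err = np.asarray(preds_for_played_action, dtype=float) - reward
    return np.asarray(log_weights, dtype=float) - eta * err ** 2


def play_round(model_preds, log_weights, reward_fn, gamma, eta, rng):
    """One round: aggregate, sample an action, observe reward, reweight models."""
    model_preds = np.asarray(model_preds, dtype=float)
    probs = inverse_gap_weighting(aggregate(model_preds, log_weights), gamma)
    action = int(rng.choice(probs.size, p=probs))
    reward = reward_fn(action)
    new_log_weights = update_log_weights(
        log_weights, model_preds[:, action], reward, eta
    )
    return action, reward, new_log_weights
```

In this sketch the only bandit-specific step is the sampling rule. The model weights are updated as in a full-information prediction problem, which is the kind of reduction the abstract refers to. A model whose predictions track the observed rewards accumulates less squared loss and so gains weight over time.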


