Structured Stochastic Linear Bandits

06/17/2016
by   Nicholas Johnson, et al.
0

The stochastic linear bandit problem proceeds in rounds where at each round the algorithm selects a vector from a decision set after which it receives a noisy linear loss parameterized by an unknown vector. The goal in such a problem is to minimize the (pseudo) regret which is the difference between the total expected loss of the algorithm and the total expected loss of the best fixed vector in hindsight. In this paper, we consider settings where the unknown parameter has structure, e.g., sparse, group sparse, low-rank, which can be captured by a norm, e.g., L_1, L_(1,2), nuclear norm. We focus on constructing confidence ellipsoids which contain the unknown parameter across all rounds with high-probability. We show the radius of such ellipsoids depend on the Gaussian width of sets associated with the norm capturing the structure. Such characterization leads to tighter confidence ellipsoids and, therefore, sharper regret bounds compared to bounds in the existing literature which are based on the ambient dimensionality.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/01/2013

Stochastic continuum armed bandit problem of few linear parameters in high dimensions

We consider a stochastic continuum armed bandit problem where the arms a...
research
01/19/2015

Implementable confidence sets in high dimensional regression

We consider the setting of linear regression in high dimension. We focus...
research
02/26/2020

Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis

Bandit learning algorithms typically involve the balance of exploration ...
research
02/08/2022

Budgeted Combinatorial Multi-Armed Bandits

We consider a budgeted combinatorial multi-armed bandit setting where, i...
research
11/23/2020

Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

We propose improved fixed-design confidence bounds for the linear logist...
research
06/26/2019

Orthogonal Projection in Linear Bandits

The expected reward in a linear stochastic bandit model is an unknown li...

Please sign up or login with your details

Forgot password? Click here to reset