A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis

03/27/2017
by Tor Lattimore, et al.

Existing strategies for finite-armed stochastic bandits mostly depend on a parameter of scale that must be known in advance. Sometimes this takes the form of a bound on the payoffs, or of a known variance or subgaussian parameter. The notable exceptions are the analysis of Gaussian bandits with unknown mean and variance by Cowan and Katehakis [2015] and of uniform distributions with unknown support [Cowan and Katehakis, 2015]. The results derived in these specialised cases are generalised here to the non-parametric setup, where the learner knows only a bound on the kurtosis of the noise, which is a scale free measure of the extremity of outliers.
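
To illustrate why kurtosis is described as scale free, here is a minimal sketch that assumes the standard definition, the fourth central moment divided by the squared variance. The helper `kurtosis` and the sample values are hypothetical and are not taken from the paper. Shifting and rescaling the data changes the mean and variance but leaves the kurtosis unchanged.

```python
def kurtosis(xs):
    """Empirical kurtosis: fourth central moment over squared variance.

    Assumes the standard (non-excess) definition, under which a
    Gaussian has kurtosis 3. Requires a sample with non-zero variance.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    fourth = sum((x - mean) ** 4 for x in xs) / n
    return fourth / var ** 2


# Hypothetical payoffs with one large outlier.
sample = [0.0, 1.0, 1.0, 2.0, 2.0, 2.0, 3.0, 9.0]

# The same payoffs in different units and with a different offset.
rescaled = [1000.0 * x - 7.0 for x in sample]

# The two values agree up to floating-point rounding.
print(kurtosis(sample), kurtosis(rescaled))
```

A bound on the payoffs or on the variance would have to change with the units of `rescaled`, whereas a bound on the kurtosis applies to both samples equally.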

research
02/28/2023

Asymptotically Optimal Thompson Sampling Based Policy for the Uniform Bandits and the Gaussian Bandits

Thompson sampling (TS) for the parametric stochastic multi-armed bandits...
research
03/03/2020

Bounded Regret for Finitely Parameterized Multi-Armed Bandits

We consider the problem of finitely parameterized multi-armed bandits wh...
research
03/19/2019

Adaptivity, Variance and Separation for Adversarial Bandits

We make three contributions to the theory of k-armed adversarial bandits...
research
10/22/2019

Smoothness-Adaptive Stochastic Bandits

We consider the problem of non-parametric multi-armed bandits with stoch...
research
05/24/2019

Polynomial Cost of Adaptation for X-Armed Bandits

In the context of stochastic continuum-armed bandits, we present an algo...
research
05/30/2019

Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits

We consider the setup of stochastic multi-armed bandits in the case when...
research
10/26/2015

A Parallel algorithm for X-Armed bandits

The target of X-armed bandit problem is to find the global maximum of an...
