An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems

05/03/2018
by   Hyeong Soo Chang, et al.

For the stochastic multi-armed bandit (MAB) problem under a constrained model that generalizes the classical one, we show that asymptotic optimality is achievable by a simple strategy that extends the ϵ_t-greedy strategy. We provide a finite-time lower bound on the probability of correctly selecting an optimal near-feasible arm that holds for all time steps. Under some conditions, the bound approaches one as the time t goes to infinity. A particular example sequence {ϵ_t}, achieving an asymptotic convergence rate on the order of (1-1/t)^4 for all sufficiently large t, is also discussed.
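To make the flavor of such a strategy concrete, below is a minimal Python sketch of an ϵ_t-greedy rule adapted to a constrained bandit: each pull returns a (reward, cost) pair, an arm is treated as near-feasible when its empirical mean cost stays within a threshold, and exploitation picks the best empirical reward among near-feasible arms. The (reward, cost) feedback model, the schedule ϵ_t = min(1, d/t), and the names epsilon_t_greedy_constrained and c_threshold are illustrative assumptions, not the paper's exact construction.

import random

def epsilon_t_greedy_constrained(arms, horizon, c_threshold, d=1.0):
    """Sketch: epsilon_t-greedy selection over arms with a cost constraint.

    arms        : list of callables; each call returns one (reward, cost) sample
    horizon     : number of time steps to run
    c_threshold : an arm is near-feasible if its empirical mean cost <= c_threshold
    d           : constant in the assumed schedule epsilon_t = min(1, d/t)
    """
    n = len(arms)
    counts = [0] * n
    reward_sums = [0.0] * n
    cost_sums = [0.0] * n

    for t in range(1, horizon + 1):
        eps = min(1.0, d / t)  # exploration probability decays over time
        feasible = [i for i in range(n)
                    if counts[i] > 0 and cost_sums[i] / counts[i] <= c_threshold]
        if random.random() < eps or not feasible:
            i = random.randrange(n)  # explore: sample an arm uniformly at random
        else:
            # exploit: highest empirical mean reward among near-feasible arms
            i = max(feasible, key=lambda j: reward_sums[j] / counts[j])
        reward, cost = arms[i]()
        counts[i] += 1
        reward_sums[i] += reward
        cost_sums[i] += cost
    return counts

For example, arms = [lambda: (random.gauss(0.5, 1), random.random())] models one arm with Gaussian reward and uniform cost; the sketch only illustrates how the decaying exploration schedule interacts with the feasibility filter.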
