research
∙
01/30/2020
Finite-time Analysis of Kullback-Leibler Upper Confidence Bounds for Optimal Adaptive Allocation with Multiple Plays and Markovian Rewards
We study an extension of the classic stochastic multi-armed bandit probl...
research
∙
01/05/2020
A Hoeffding Inequality for Finite State Markov Chains and its Applications to Markovian Bandits
This paper develops a Hoeffding inequality for the partial sums ∑_{k=1}^{n} ...
research
∙
12/02/2019
Optimal Best Markovian Arm Identification with Fixed Confidence
We give a complete characterization of the sampling complexity of best M...
research
∙
07/10/2019