Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets

05/30/2022
by   Zongqi Wan, et al.

Multi-armed bandits (MAB) and stochastic linear bandits (SLB) are important models in reinforcement learning, and it is well known that classical algorithms for bandits with time horizon T suffer Ω(√T) regret. In this paper, we study MAB and SLB with quantum reward oracles and propose quantum algorithms for both models with O(poly(log T)) regret, exponentially improving the dependence on T. To the best of our knowledge, this is the first provable quantum speedup for the regret of bandit problems and, more generally, for exploitation in reinforcement learning. Compared to previous literature on quantum exploration algorithms for MAB and reinforcement learning, our quantum input model is simpler and only assumes quantum oracles for each individual arm.
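
For readers new to the setting, the sketch below shows how cumulative regret is measured in a classical Bernoulli MAB, using the standard UCB1 rule as the learner. It is an illustration of the classical baseline only, not the quantum algorithm proposed in the paper; the function name `ucb1_regret`, the arm means, the horizons, and the seed are all assumptions chosen for the example.

```python
import math
import random


def ucb1_regret(means, horizon, seed=0):
    """Run classical UCB1 on Bernoulli arms; return cumulative pseudo-regret.

    `means` are the true (hidden) success probabilities of the arms.
    All names and parameters here are illustrative, not from the paper.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # total observed reward per arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull each arm once to initialise the estimates
        else:
            # choose the arm with the largest upper confidence bound
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # pseudo-regret: gap between the best mean and the chosen arm's mean
        regret += best - means[arm]
    return regret


if __name__ == "__main__":
    for horizon in (1_000, 10_000):
        print(horizon, round(ucb1_regret([0.5, 0.6], horizon), 1))
```

Running the script prints the cumulative regret at two horizons, which gives a rough feel for how regret grows with T on one fixed instance. The regret bounds quoted in the abstract concern how this quantity scales with T.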

research
07/14/2020

Quantum exploration algorithms for multi-armed bandits

Identifying the best arm of a multi-armed bandit is a central problem in...
research
10/30/2020

The Combinatorial Multi-Bandit Problem and its Application to Energy Management

We study a Combinatorial Multi-Bandit Problem motivated by applications ...
research
02/15/2020

Quantum Bandits

We consider the quantum version of the bandit problem known as best arm...
research
09/26/2022

Quantum Speedups of Optimizing Approximately Convex Functions with Applications to Logarithmic Regret Stochastic Convex Bandits

We initiate the study of quantum algorithms for optimizing approximately...
research
09/15/2021

Estimation of Warfarin Dosage with Reinforcement Learning

In this paper, it has attempted to use Reinforcement learning to model t...
research
04/21/2020

Algorithms for slate bandits with non-separable reward functions

In this paper, we study a slate bandit problem where the function that d...
research
07/02/2019

Bandit Learning Through Biased Maximum Likelihood Estimation

We propose BMLE, a new family of bandit algorithms, that are formulated ...
