Quantum Heavy-tailed Bandits

01/23/2023
by   Yulian Wu, et al.
0

In this paper, we study multi-armed bandits (MAB) and stochastic linear bandits (SLB) with heavy-tailed rewards and quantum reward oracle. Unlike the previous work on quantum bandits that assumes bounded/sub-Gaussian distributions for rewards, here we investigate the quantum bandits problem under a weaker assumption that the distributions of rewards only have bounded (1+v)-th moment for some v∈ (0,1]. In order to achieve regret improvements for heavy-tailed bandits, we first propose a new quantum mean estimator for heavy-tailed distributions, which is based on the Quantum Monte Carlo Mean Estimator and achieves a quadratic improvement of estimation error compared to the classical one. Based on our quantum mean estimator, we focus on quantum heavy-tailed MAB and SLB and propose quantum algorithms based on the Upper Confidence Bound (UCB) framework for both problems with O(T^1-v/1+v) regrets, polynomially improving the dependence in terms of T as compared to classical (near) optimal regrets of O(T^1/1+v), where T is the number of rounds. Finally, experiments also support our theoretical results and show the effectiveness of our proposed methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/04/2021

Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits

In this paper we study the problem of stochastic multi-armed bandits (MA...
research
08/27/2021

Quantum Sub-Gaussian Mean Estimator

We present a new quantum algorithm for estimating the mean of a real-val...
research
10/26/2021

Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs

Despite a large amount of effort in dealing with heavy-tailed error in m...
research
03/07/2022

Bandits Corrupted by Nature: Lower Bounds on Regret and Robust Optimistic Algorithm

In this paper, we study the stochastic bandits problem with k unknown he...
research
03/28/2019

Large Deviations of Linear Models with Regularly-Varying Tails: Asymptotics and Efficient Estimation

We analyze the Large Deviation Probability (LDP) of linear factor models...
research
02/06/2023

On Private and Robust Bandits

We study private and robust multi-armed bandits (MABs), where the agent ...
research
07/06/2021

Distributed Adaptive Huber Regression

Distributed data naturally arise in scenarios involving multiple sources...

Please sign up or login with your details

Forgot password? Click here to reset