
Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits

by Youming Tao, et al.

In this paper, we study the problem of stochastic multi-armed bandits (MAB) in the (local) differential privacy (DP/LDP) model. Unlike previous results, which assume bounded reward distributions, we focus mainly on the case where the reward distribution of each arm has only a finite (1+v)-th moment for some v ∈ (0, 1]. In the first part, we study the problem in the central ϵ-DP model. We first provide a near-optimal result by developing a private and robust Upper Confidence Bound (UCB) algorithm. We then improve this result via a private and robust version of the Successive Elimination (SE) algorithm. Finally, we show that the instance-dependent regret bound of our improved algorithm is optimal by establishing a lower bound. In the second part of the paper, we study the problem in the ϵ-LDP model. We propose an algorithm that can be seen as a locally private and robust version of the SE algorithm, and show that it achieves (near) optimal rates for both instance-dependent and instance-independent regret. These results also reveal the differences between private MAB with bounded rewards and private MAB with heavy-tailed rewards. To achieve these (near) optimal rates, we develop several new hard instances and private robust estimators as byproducts, which may be useful for other related problems. Finally, experimental results support our theoretical analysis and show the effectiveness of our algorithms.
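The abstract does not spell out the private robust estimators it mentions. As a rough illustration of the general idea only, and not the authors' construction, the sketch below clips each reward to a bounded range so that a heavy tail cannot dominate the average, then adds Laplace noise calibrated to the sensitivity of the clipped mean. The function names, the `threshold` parameter, and the choice of the Laplace mechanism are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as a difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_truncated_mean(rewards, threshold, epsilon, rng):
    """Hypothetical helper: clipped mean plus Laplace noise.

    Each reward is clipped to [-threshold, threshold]. Changing one reward
    then moves the mean by at most 2 * threshold / n, so Laplace noise with
    scale 2 * threshold / (n * epsilon) makes this single release epsilon-DP.
    """
    n = len(rewards)
    clipped = [max(-threshold, min(threshold, r)) for r in rewards]
    sensitivity = 2.0 * threshold / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Pareto rewards with shape 1.5 have a finite mean but infinite variance,
    # which is one example of a heavy-tailed reward distribution.
    rewards = [rng.paretovariate(1.5) for _ in range(5000)]
    estimate = private_truncated_mean(rewards, threshold=50.0, epsilon=1.0, rng=rng)
    print(f"private truncated mean: {estimate:.3f}")
```

The threshold trades bias against noise: a small value discards more of the tail, while a large value increases the sensitivity and therefore the noise. How to set it in terms of the sample size, ϵ, and the moment bound is not specified in the abstract, so it is left here as a free parameter.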




When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits

We study the problem of multi-armed bandits with ϵ-global Differential P...

On Private and Robust Bandits

We study private and robust multi-armed bandits (MABs), where the agent ...

Quantum Heavy-tailed Bandits

In this paper, we study multi-armed bandits (MAB) and stochastic linear ...

(Locally) Differentially Private Combinatorial Semi-Bandits

In this paper, we study Combinatorial Semi-Bandits (CSB) that is an exte...

Optimal Algorithms for Private Online Learning in a Stochastic Environment

We consider two variants of private stochastic online learning. The firs...

An Optimal Private Stochastic-MAB Algorithm Based on an Optimal Private Stopping Rule

We present a provably optimal differentially private algorithm for the s...

Differentially Private ℓ_1-norm Linear Regression with Heavy-tailed Data

We study the problem of Differentially Private Stochastic Convex Optimiz...