Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits

06/04/2021
by   Youming Tao, et al.

In this paper, we study the problem of stochastic multi-armed bandits (MAB) in the (local) differential privacy (DP/LDP) model. Unlike previous results, which need to assume bounded reward distributions, here we mainly focus on the case where the reward distribution of each arm only has a finite (1+v)-th moment for some v ∈ (0, 1]. In the first part, we study the problem in the central ϵ-DP model. We first provide a near-optimal result by developing a private and robust Upper Confidence Bound (UCB) algorithm. Then, we improve the result via a private and robust version of the Successive Elimination (SE) algorithm. Finally, we show that the instance-dependent regret bound of our improved algorithm is optimal by proving a matching lower bound. In the second part of the paper, we study the problem in the ϵ-LDP model. We propose an algorithm that can be seen as a locally private and robust version of the SE algorithm, and show that it achieves (near-)optimal rates for both instance-dependent and instance-independent regret. These results also reveal the differences between private MAB with bounded rewards and private MAB with heavy-tailed rewards. To achieve these (near-)optimal rates, we develop several new hard instances and private robust estimators as byproducts, which may be useful for other related problems. Finally, experimental results support our theoretical analysis and show the effectiveness of our algorithms.
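
To make the idea of a "private robust estimator" concrete, the following is a minimal sketch of one standard construction, not the paper's estimator: clip each reward to a fixed range so that heavy-tailed outliers have bounded influence, then add Laplace noise calibrated to the resulting sensitivity. The function name `private_truncated_mean`, the fixed clipping bound `bound`, and the Pareto test data are assumptions made for illustration; the abstract does not specify how the truncation threshold is chosen.

```python
import random


def private_truncated_mean(rewards, epsilon, bound, rng=random):
    """Hypothetical helper: epsilon-DP mean estimate via clipping + Laplace noise.

    Assumption: `bound` is a fixed clipping threshold chosen in advance,
    independently of the data.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if epsilon <= 0 or bound <= 0:
        raise ValueError("epsilon and bound must be positive")

    n = len(rewards)
    # Clip every reward to [-bound, bound] to limit the effect of heavy tails.
    clipped = [max(-bound, min(bound, r)) for r in rewards]

    # Replacing one reward changes the clipped mean by at most 2 * bound / n.
    sensitivity = 2.0 * bound / n
    scale = sensitivity / epsilon

    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with scale `scale`.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / n + noise


if __name__ == "__main__":
    rng = random.Random(0)
    # Pareto with shape 1.5 has mean 3 but infinite variance (heavy-tailed).
    rewards = [rng.paretovariate(1.5) for _ in range(20000)]
    print(private_truncated_mean(rewards, epsilon=1.0, bound=50.0, rng=rng))
```

Clipping introduces a bias toward zero, while a larger `bound` increases the noise scale; balancing the two as a function of the sample size, ϵ, and v is the kind of trade-off a private robust estimator has to manage.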

research
06/01/2023

Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards

In this paper, we study the problem of (finite horizon tabular) Markov d...
research
01/23/2023

Quantum Heavy-tailed Bandits

In this paper, we study multi-armed bandits (MAB) and stochastic linear ...
research
02/06/2023

On Private and Robust Bandits

We study private and robust multi-armed bandits (MABs), where the agent ...
research
06/01/2020

(Locally) Differentially Private Combinatorial Semi-Bandits

In this paper, we study Combinatorial Semi-Bandits (CSB) that is an exte...
research
09/06/2022

When Privacy Meets Partial Information: A Refined Analysis of Differentially Private Bandits

We study the problem of multi-armed bandits with ϵ-global Differential P...
research
02/16/2021

Optimal Algorithms for Private Online Learning in a Stochastic Environment

We consider two variants of private stochastic online learning. The firs...
research
05/22/2019

An Optimal Private Stochastic-MAB Algorithm Based on an Optimal Private Stopping Rule

We present a provably optimal differentially private algorithm for the s...
