Interactive and Concentrated Differential Privacy for Bandits

09/01/2023
by Achraf Azize, et al.

Bandits play a crucial role in interactive learning schemes and modern recommender systems. However, these systems often rely on sensitive user data, making privacy a critical concern. This paper investigates privacy in bandits with a trusted centralized decision-maker through the lens of interactive Differential Privacy (DP). While bandits under pure ϵ-global DP have been well-studied, we contribute to the understanding of bandits under zero Concentrated DP (zCDP). We provide minimax and problem-dependent lower bounds on regret for finite-armed and linear bandits, which quantify the cost of ρ-global zCDP in these settings. These lower bounds reveal two hardness regimes based on the privacy budget ρ and suggest that ρ-global zCDP incurs less regret than pure ϵ-global DP. We propose two ρ-global zCDP bandit algorithms, AdaC-UCB and AdaC-GOPE, for finite-armed and linear bandits respectively. Both algorithms use a common recipe of Gaussian mechanism and adaptive episodes. We analyze the regret of these algorithms to show that AdaC-UCB achieves the problem-dependent regret lower bound up to multiplicative constants, while AdaC-GOPE achieves the minimax regret lower bound up to poly-logarithmic factors. Finally, we provide experimental validation of our theoretical results under different settings.
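
To make the "Gaussian mechanism" ingredient above concrete, here is a minimal sketch of releasing a ρ-zCDP estimate of an arm's mean reward. It is not the authors' AdaC-UCB or AdaC-GOPE; it only illustrates the standard fact that adding Gaussian noise with standard deviation Δ/√(2ρ) to a statistic of sensitivity Δ satisfies ρ-zCDP. The function names, the assumption that rewards lie in [0, 1], and the replace-one notion of neighbouring reward sequences are all assumptions made for this illustration.

```python
import math
import random


def gaussian_sigma_for_zcdp(sensitivity: float, rho: float) -> float:
    """Noise scale for which the Gaussian mechanism satisfies rho-zCDP.

    Uses the standard calibration rho = sensitivity**2 / (2 * sigma**2).
    """
    if sensitivity <= 0 or rho <= 0:
        raise ValueError("sensitivity and rho must be positive")
    return sensitivity / math.sqrt(2.0 * rho)


def private_mean(rewards, rho, rng):
    """Hypothetical helper: noisy mean of one batch (episode) of rewards.

    Assumes each reward lies in [0, 1]; values outside are clipped so that
    replacing one reward changes the mean by at most 1 / n.
    """
    n = len(rewards)
    if n == 0:
        raise ValueError("need at least one reward")
    clipped = [min(1.0, max(0.0, r)) for r in rewards]
    sensitivity = 1.0 / n  # replace-one sensitivity of the empirical mean
    sigma = gaussian_sigma_for_zcdp(sensitivity, rho)
    return sum(clipped) / n + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    episode_rewards = [rng.random() for _ in range(200)]
    print(private_mean(episode_rewards, rho=0.5, rng=rng))
```

In this sketch the noise scale shrinks as 1/n, so longer batches of rewards give more accurate private means for the same ρ. That observation is only meant to give intuition for computing private statistics over episodes; the exact episode schedule and index used in the paper are not reproduced here.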


