Log In Sign Up

Differentially Private Contextual Linear Bandits

by   Roshan Shariff, et al.

We study the contextual linear bandit problem, a version of the standard stochastic multi-armed bandit (MAB) problem where a learner sequentially selects actions to maximize a reward which depends also on a user provided per-round context. Though the context is chosen arbitrarily or adversarially, the reward is assumed to be a stochastic function of a feature vector that encodes the context and selected action. Our goal is to devise private learners for the contextual linear bandit problem. We first show that using the standard definition of differential privacy results in linear regret. So instead, we adopt the notion of joint differential privacy, where we assume that the action chosen on day t is only revealed to user t and thus needn't be kept private that day, only on following days. We give a general scheme converting the classic linear-UCB algorithm into a joint differentially private algorithm using the tree-based algorithm. We then apply either Gaussian noise or Wishart noise to achieve joint-differentially private algorithms and bound the resulting algorithms' regrets. In addition, we give the first lower bound on the additional regret any private algorithms for the MAB problem must incur.


page 1

page 2

page 3

page 4


The Price of Differential Privacy For Online Learning

We design differentially private algorithms for the problem of online li...

Mitigating Bias in Adaptive Data Gathering via Differential Privacy

Data that is gathered adaptively --- via bandit algorithms, for example ...

Algorithms for Differentially Private Multi-Armed Bandits

We present differentially private algorithms for the stochastic Multi-Ar...

Dynamic Global Sensitivity for Differentially Private Contextual Bandits

Bandit algorithms have become a reference solution for interactive recom...

Contextual Bandits for adapting to changing User preferences over time

Contextual bandits provide an effective way to model the dynamic data pr...

Concurrent Shuffle Differential Privacy Under Continual Observation

We introduce the concurrent shuffle model of differential privacy. In th...

Differentially Private Online Submodular Optimization

In this paper we develop the first algorithms for online submodular mini...