Differentially Private Contextual Linear Bandits

09/28/2018
by   Roshan Shariff, et al.

We study the contextual linear bandit problem, a version of the standard stochastic multi-armed bandit (MAB) problem in which a learner sequentially selects actions to maximize a reward that also depends on a user-provided per-round context. Though the context is chosen arbitrarily or adversarially, the reward is assumed to be a stochastic function of a feature vector that encodes the context and the selected action. Our goal is to devise private learners for the contextual linear bandit problem. We first show that using the standard definition of differential privacy results in linear regret. Instead, we adopt the notion of joint differential privacy, where we assume that the action chosen on day t is revealed only to user t and thus need not be kept private on that day, only on following days. We give a general scheme that converts the classic linear-UCB algorithm into a jointly differentially private algorithm using the tree-based aggregation algorithm. We then apply either Gaussian noise or Wishart noise to obtain jointly differentially private algorithms and bound the resulting algorithms' regret. In addition, we give the first lower bound on the additional regret that any private algorithm for the MAB problem must incur.
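The tree-based aggregation mechanism mentioned above releases running sums privately: each node of a binary tree over the time horizon holds a noised partial sum, so every prefix sum is assembled from O(log T) noisy nodes instead of T noisy increments. The sketch below, assuming Gaussian noise of a fixed (uncalibrated) scale `sigma`, shows the counter itself; in the paper's setting the vectors being summed would be the statistics LinUCB maintains (e.g. flattened feature outer products and reward-weighted features), and `sigma` would be calibrated to the privacy parameters. All names here are illustrative, not the authors' code.

```python
import numpy as np

class TreeBasedSum:
    """Binary-tree mechanism for private running sums of d-dimensional
    vectors (a sketch; noise calibration for (eps, delta)-DP omitted)."""

    def __init__(self, dim, T, sigma):
        self.dim = dim
        self.sigma = sigma
        self.depth = max(1, int(np.ceil(np.log2(T))))  # tree height for horizon T
        self.t = 0
        self.alpha = [None] * (self.depth + 1)        # true sums at each tree level
        self.alpha_noisy = [None] * (self.depth + 1)  # noised copies, released as-is

    def add(self, x):
        """Insert x_t and return a noisy prefix sum of x_1, ..., x_t."""
        self.t += 1
        # Level i = lowest set bit of t: the node created at this step.
        i = 0
        while (self.t >> i) & 1 == 0:
            i += 1
        # Merge all lower levels (which cover 1, ..., t-1's trailing block) plus x.
        s = x.astype(float).copy()
        for j in range(i):
            s += self.alpha[j]
            self.alpha[j] = None
            self.alpha_noisy[j] = None
        self.alpha[i] = s
        self.alpha_noisy[i] = s + np.random.normal(0.0, self.sigma, self.dim)
        # Prefix sum = sum of the noisy nodes at the set bits of t.
        out = np.zeros(self.dim)
        for j in range(self.depth + 1):
            if (self.t >> j) & 1:
                out += self.alpha_noisy[j]
        return out
```

Because each increment touches only O(log T) nodes, the total noise in any released prefix sum also scales with log T rather than T, which is what keeps the private LinUCB variants' extra regret small.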


Related research

- The Price of Differential Privacy For Online Learning (01/27/2017): We design differentially private algorithms for the problem of online li...
- Robust and differentially private stochastic linear bandits (04/23/2023): In this paper, we study the stochastic linear bandit problem under the a...
- Mitigating Bias in Adaptive Data Gathering via Differential Privacy (06/06/2018): Data that is gathered adaptively, via bandit algorithms, for example ...
- Algorithms for Differentially Private Multi-Armed Bandits (11/27/2015): We present differentially private algorithms for the stochastic Multi-Ar...
- Dynamic Global Sensitivity for Differentially Private Contextual Bandits (08/30/2022): Bandit algorithms have become a reference solution for interactive recom...
- Contextual Bandits for adapting to changing User preferences over time (09/21/2020): Contextual bandits provide an effective way to model the dynamic data pr...
- Differentially Private Online Submodular Optimization (07/06/2018): In this paper we develop the first algorithms for online submodular mini...
