Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost?

05/29/2019 ∙ by Debabrota Basu, et al.

We introduce a number of privacy definitions for the multi-armed bandit problem, based on differential privacy. We relate them through a unifying graphical model representation and connect them to existing definitions. We then derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions. We show that for all of them, the learner's regret is increased by a multiplicative factor dependent on the privacy level ϵ, but that the dependency is weaker when we do not require local differential privacy for the rewards.
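To make the notion of local differential privacy for rewards concrete, here is a minimal sketch that is not taken from the paper. It assumes Bernoulli rewards in [0, 1] and uses the standard Laplace mechanism with scale 1/ϵ, so each reward is perturbed before the learner sees it. The function names `privatize_reward` and `private_mean_estimates` are hypothetical and chosen only for illustration.

```python
import random


def privatize_reward(reward, epsilon, rng):
    """Return reward + Laplace(0, 1/epsilon) noise.

    Assumes the reward lies in [0, 1], so its sensitivity is 1 and the
    noisy value is an epsilon-locally-private release of that reward.
    """
    if not 0.0 <= reward <= 1.0:
        raise ValueError("reward must lie in [0, 1]")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The difference of two independent Exp(rate=epsilon) draws is
    # Laplace-distributed with location 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return reward + noise


def private_mean_estimates(arm_means, epsilon, pulls_per_arm, seed=0):
    """Estimate each arm's mean from privatized Bernoulli rewards only."""
    rng = random.Random(seed)
    estimates = []
    for mu in arm_means:
        total = 0.0
        for _ in range(pulls_per_arm):
            reward = 1.0 if rng.random() < mu else 0.0  # illustrative Bernoulli arm
            total += privatize_reward(reward, epsilon, rng)
        estimates.append(total / pulls_per_arm)
    return estimates
```

Under these assumptions the added noise has variance 2/ϵ², so a smaller ϵ (stronger privacy) means more pulls are needed to separate two arms. This is one intuition for why privacy inflates regret, though the sketch does not reproduce the paper's bounds.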





