Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes

10/19/2021 ∙ by Chonghua Liao, et al.

Reinforcement learning (RL) algorithms can be used to provide personalized services, which rely on users' private and sensitive data. To protect users' privacy, privacy-preserving RL algorithms are in demand. In this paper, we study RL with linear function approximation and local differential privacy (LDP) guarantees. We propose a novel (ε, δ)-LDP algorithm for learning a class of Markov decision processes (MDPs) dubbed linear mixture MDPs, which achieves a regret of 𝒪̃(d^{5/4} H^{7/4} T^{3/4} (log(1/δ))^{1/4} √(1/ε)), where d is the dimension of the feature mapping, H is the length of the planning horizon, and T is the number of interactions with the environment. We also prove a lower bound of Ω(dH√T / (e^ε (e^ε − 1))) for learning linear mixture MDPs under the ε-LDP constraint. Experiments on synthetic datasets verify the effectiveness of our algorithm. To the best of our knowledge, this is the first provable privacy-preserving RL algorithm with linear function approximation.
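To make the scaling of the two bounds concrete, the following minimal sketch evaluates only the rates stated in the abstract. It is not the paper's algorithm: the function names `regret_upper_rate` and `regret_lower_rate` are hypothetical, and all absolute constants and the polylogarithmic factors hidden by the 𝒪̃ notation are dropped, so the outputs are meaningful only as growth rates, not as actual regret values.

```python
import math


def regret_upper_rate(d: int, H: int, T: int, eps: float, delta: float) -> float:
    """Rate of the stated upper bound under (eps, delta)-LDP.

    Computes d^{5/4} H^{7/4} T^{3/4} (log(1/delta))^{1/4} sqrt(1/eps).
    Assumption: constants and hidden log factors are omitted.
    """
    return (
        d ** 1.25
        * H ** 1.75
        * T ** 0.75
        * math.log(1.0 / delta) ** 0.25
        * math.sqrt(1.0 / eps)
    )


def regret_lower_rate(d: int, H: int, T: int, eps: float) -> float:
    """Rate of the stated lower bound under eps-LDP.

    Computes d * H * sqrt(T) / (e^eps * (e^eps - 1)).
    Assumption: the constant hidden by the Omega notation is omitted.
    """
    return d * H * math.sqrt(T) / (math.exp(eps) * (math.exp(eps) - 1.0))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    d, H, eps, delta = 5, 10, 1.0, 1e-3
    for T in (10**4, 10**5, 10**6):
        upper = regret_upper_rate(d, H, T, eps, delta)
        lower = regret_lower_rate(d, H, T, eps)
        print(f"T={T:>8}  upper rate={upper:.3e}  lower rate={lower:.3e}")
```

Both rates grow sublinearly in T, but the upper bound scales as T^{3/4} while the lower bound scales as √T. Note also that the two bounds are stated under different privacy notions, (ε, δ)-LDP and ε-LDP respectively, so the comparison is indicative rather than exact.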





