Near-Optimal Differentially Private Reinforcement Learning

12/09/2022
by Dan Qiao, et al.

Motivated by personalized healthcare and other applications involving sensitive data, we study online exploration in reinforcement learning with differential privacy (DP) constraints. Existing work on this problem established that no-regret learning is possible under joint differential privacy (JDP) and local differential privacy (LDP) but did not provide an algorithm with optimal regret. We close this gap for the JDP case by designing an ϵ-JDP algorithm with a regret of O(√(SAH^2T) + S^2AH^3/ϵ), which matches the information-theoretic lower bound of non-private learning for all choices of ϵ > S^1.5 A^0.5 H^2/√T. In the above, S and A denote the number of states and actions, H denotes the planning horizon, and T is the number of steps. To the best of our knowledge, this is the first private RL algorithm that achieves privacy for free asymptotically as T→∞. Our techniques, which could be of independent interest, include privately releasing Bernstein-type exploration bonuses and an improved method for releasing visitation statistics. The same techniques also imply a slightly improved regret bound for the LDP case.
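
As a rough illustration of why the stated threshold on ϵ is the point where privacy becomes "free", the sketch below evaluates the two terms of the regret bound with all constants and logarithmic factors dropped. The function names and the example values of S, A, H, and T are assumptions made for illustration only; they do not come from the paper. At ϵ = S^1.5 A^0.5 H^2/√T the privacy term S^2AH^3/ϵ equals the non-private term √(SAH^2T), and for larger ϵ the non-private term dominates.

```python
import math


def regret_terms(S, A, H, T, eps):
    """Return the two terms of the regret bound, ignoring constants and logs.

    nonprivate: sqrt(S * A * H^2 * T), the term matching the non-private lower bound
    privacy:    S^2 * A * H^3 / eps, the additional cost of eps-JDP
    """
    nonprivate = math.sqrt(S * A * H**2 * T)
    privacy = S**2 * A * H**3 / eps
    return nonprivate, privacy


def eps_threshold(S, A, H, T):
    """Value of eps at which the two terms are equal: S^1.5 * A^0.5 * H^2 / sqrt(T)."""
    return S**1.5 * A**0.5 * H**2 / math.sqrt(T)


# Hypothetical problem sizes, chosen only for illustration.
S, A, H, T = 10, 5, 20, 10**8

eps0 = eps_threshold(S, A, H, T)
nonprivate_at_eps0, privacy_at_eps0 = regret_terms(S, A, H, T, eps0)

# Above the threshold, the privacy term is the smaller of the two.
nonprivate_above, privacy_above = regret_terms(S, A, H, T, 10 * eps0)

print(f"threshold eps       : {eps0:.4f}")
print(f"terms at threshold  : {nonprivate_at_eps0:.1f} vs {privacy_at_eps0:.1f}")
print(f"terms at 10x thresh.: {nonprivate_above:.1f} vs {privacy_above:.1f}")
```

Because the threshold shrinks like 1/√T, any fixed ϵ eventually exceeds it as T grows, which is the asymptotic sense of "privacy for free" in the abstract.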


research
01/27/2017

The Price of Differential Privacy For Online Learning

We design differentially private algorithms for the problem of online li...
research
01/18/2022

Differentially Private Reinforcement Learning with Linear Function Approximation

Motivated by the wide adoption of reinforcement learning (RL) in real-wo...
research
02/02/2022

Improved Regret for Differentially Private Exploration in Linear MDP

We study privacy-preserving exploration in sequential decision-making fo...
research
08/26/2021

Adaptive Control of Differentially Private Linear Quadratic Systems

In this paper, we study the problem of regret minimization in reinforcem...
research
02/27/2023

On Differentially Private Online Predictions

In this work we introduce an interactive variant of joint differential p...
research
09/18/2020

Private Reinforcement Learning with PAC and Regret Guarantees

Motivated by high-stakes decision-making domains like personalized medic...
research
01/31/2020

Locally Private Distributed Reinforcement Learning

We study locally differentially private algorithms for reinforcement lea...
