Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients

12/30/2020
by Chris Cundy, et al.

As reinforcement learning techniques are increasingly applied to real-world decision problems, attention has turned to how these algorithms use potentially sensitive information. We consider the task of training a policy that maximizes reward while minimizing the disclosure of certain sensitive state variables through its actions. We give examples of how this setting covers real-world privacy problems in sequential decision-making. We solve this problem in the policy gradient framework by introducing a regularizer based on the mutual information (MI) between the sensitive state and the actions at a given timestep. We develop a model-based stochastic gradient estimator for optimizing privacy-constrained policies. We also discuss an alternative MI regularizer that upper-bounds our main MI regularizer and can be optimized in a model-free setting. We contrast our mutual-information formulation of information disclosure with previous work on differentially private RL. Experimental results show that our training method yields policies that hide the sensitive state.
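
To make the regularized objective concrete, the following is a minimal sketch of a reward-minus-MI trade-off in a single-step, tabular setting. It is an illustration under stated assumptions, not the paper's gradient estimator: the function names (mutual_information, regularized_objective), the trade-off weight lam, and the toy reward table are all hypothetical, and the sensitive variable and actions are assumed to be discrete with a known prior.

```python
import numpy as np


def mutual_information(p_u, pi):
    """I(A; U) in nats for a discrete sensitive variable U and action A.

    p_u: shape (n_u,), prior over the sensitive variable (assumed known).
    pi:  shape (n_u, n_a), pi[u, a] = probability of action a given u.
    """
    p_u = np.asarray(p_u, dtype=float)
    pi = np.asarray(pi, dtype=float)
    q_a = p_u @ pi                       # marginal action distribution
    joint = p_u[:, None] * pi            # p(u, a)
    indep = p_u[:, None] * q_a[None, :]  # p(u) * q(a)
    mask = joint > 0                     # 0 * log 0 is treated as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))


def regularized_objective(p_u, pi, reward, lam):
    """Expected reward minus lam times I(A; U) (hypothetical toy objective)."""
    p_u = np.asarray(p_u, dtype=float)
    pi = np.asarray(pi, dtype=float)
    reward = np.asarray(reward, dtype=float)
    expected_reward = float(np.sum(p_u[:, None] * pi * reward))
    return expected_reward - lam * mutual_information(p_u, pi)


# Toy problem: reward 1 when the action matches the sensitive bit.
p_u = np.array([0.5, 0.5])
reward = np.eye(2)
revealing = np.eye(2)          # action copies the sensitive bit
hiding = np.full((2, 2), 0.5)  # action ignores the sensitive bit

mi_revealing = mutual_information(p_u, revealing)
mi_hiding = mutual_information(p_u, hiding)
obj_revealing = regularized_objective(p_u, revealing, reward, lam=2.0)
obj_hiding = regularized_objective(p_u, hiding, reward, lam=2.0)
```

In this toy case the revealing policy earns more reward but leaks the full bit (MI of log 2 nats), while the hiding policy leaks nothing. With a sufficiently large lam the regularized objective prefers the hiding policy, which is the qualitative behavior the abstract describes.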

