Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients

12/30/2020
by Chris Cundy, et al.
As reinforcement learning techniques are increasingly applied to real-world decision problems, attention has turned to how these algorithms use potentially sensitive information. We consider the task of training a policy that maximizes reward while minimizing the disclosure of certain sensitive state variables through its actions. We give examples of how this setting covers real-world privacy problems in sequential decision-making. We solve this problem in the policy gradient framework by introducing a regularizer based on the mutual information (MI) between the sensitive state and the actions at a given timestep. We develop a model-based stochastic gradient estimator for optimizing privacy-constrained policies. We also discuss an alternative MI regularizer that upper-bounds our main MI regularizer and can be optimized in a model-free setting. We contrast previous work on differentially private RL with our mutual-information formulation of information disclosure. Experimental results show that our training method yields policies that hide the sensitive state.
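
To make the regularizer concrete, the following is a minimal single-timestep sketch for a tabular case, not the paper's estimator. It computes the mutual information between a discrete sensitive variable and the action exactly, then subtracts it from the expected reward. The function names, the trade-off weight `lam`, and the "reward minus weighted MI" form of the objective are assumptions made for illustration only.

```python
import math


def mutual_information(p_u, pi):
    """Exact I(A; U) in nats for a discrete sensitive variable U and action A.

    p_u[i]   : prior probability that the sensitive variable takes value i
    pi[i][a] : probability the policy picks action a when U = i
    (Both names are hypothetical; they do not come from the paper.)
    """
    n_actions = len(pi[0])
    # Marginal action distribution q(a) = sum_u p(u) * pi(a | u)
    q = [sum(p_u[i] * pi[i][a] for i in range(len(p_u)))
         for a in range(n_actions)]
    mi = 0.0
    for i, pu in enumerate(p_u):
        for a in range(n_actions):
            p = pi[i][a]
            if pu > 0 and p > 0:  # zero-probability terms contribute nothing
                mi += pu * p * math.log(p / q[a])
    return mi


def regularized_objective(expected_reward, p_u, pi, lam):
    """Assumed objective: expected reward minus lam times I(A; U)."""
    return expected_reward - lam * mutual_information(p_u, pi)


p_u = [0.5, 0.5]
revealing = [[1.0, 0.0], [0.0, 1.0]]  # action is a copy of the sensitive bit
hiding = [[0.5, 0.5], [0.5, 0.5]]     # action is independent of the bit

mi_revealing = mutual_information(p_u, revealing)
mi_hiding = mutual_information(p_u, hiding)
```

In this toy case the revealing policy incurs the maximum penalty for one fair bit (log 2 nats), while the hiding policy incurs none, so a large enough `lam` favors the hiding policy unless the revealing one earns substantially more reward. Exact enumeration like this is only feasible for small discrete spaces, which is why the abstract turns to a stochastic gradient estimator.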

research
08/06/2018

Learning to Share and Hide Intentions using Information Regularization

Learning to cooperate with friends and compete with foes is a key compon...
research
05/28/2019

Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer

In this work, we develop a novel regularizer to improve the learning of ...
research
10/14/2022

Mutual Information Regularized Offline Reinforcement Learning

Offline reinforcement learning (RL) aims at learning an effective policy...
research
01/18/2022

Differentially Private Reinforcement Learning with Linear Function Approximation

Motivated by the wide adoption of reinforcement learning (RL) in real-wo...
research
09/22/2020

Distributed Differentially Private Mutual Information Ranking and Its Applications

Computation of Mutual Information (MI) helps understand the amount of in...
research
02/01/2019

Privacy Preserving Off-Policy Evaluation

Many reinforcement learning applications involve the use of data that is...
research
08/26/2021

Quadratic mutual information regularization in real-time deep CNN models

In this paper, regularized lightweight deep convolutional neural network...
