Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

11/01/2021
by   Tianshi Cao, et al.

Although machine learning models trained on massive data have led to breakthroughs in several areas, their deployment in privacy-sensitive domains remains limited due to restricted access to data. Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead. We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy. DP-Sinkhorn minimizes the Sinkhorn divergence, a computationally efficient approximation to the exact optimal transport distance, between the model and data in a differentially private manner and uses a novel technique for controlling the bias-variance trade-off of gradient estimates. Unlike existing approaches for training differentially private generative models, which are mostly based on generative adversarial networks, we do not rely on adversarial objectives, which are notoriously difficult to optimize, especially in the presence of noise imposed by privacy constraints. Hence, DP-Sinkhorn is easy to train and deploy. Experimentally, we improve upon the state-of-the-art on multiple image modeling benchmarks and show differentially private synthesis of informative RGB images. Project page: https://nv-tlabs.github.io/DP-Sinkhorn.
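To make the central quantity concrete, the following is a minimal sketch of a Sinkhorn divergence between two small point clouds, written in plain Python. It is not the authors' implementation and it leaves out everything specific to DP-Sinkhorn (the generator, the differentially private gradient handling, and the bias-variance control technique). The function names, the squared Euclidean cost, the uniform weights, and the values of `eps` and `n_iters` are all assumptions chosen for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def entropic_ot(xs, ys, eps=1.0, n_iters=200):
    """Entropy-regularized OT cost between two uniform empirical measures.

    Runs log-domain Sinkhorn iterations on the dual potentials f and g
    and returns the dual value <a, f> + <b, g>.
    """
    n, m = len(xs), len(ys)
    log_a, log_b = -math.log(n), -math.log(m)  # uniform weights (assumption)
    cost = [[sq_dist(x, y) for y in ys] for x in xs]
    f = [0.0] * n
    g = [0.0] * m
    for _ in range(n_iters):
        # Update f so that the row marginals of the transport plan match a.
        f = [
            -eps * logsumexp([log_b + (g[j] - cost[i][j]) / eps for j in range(m)])
            for i in range(n)
        ]
        # Update g so that the column marginals of the transport plan match b.
        g = [
            -eps * logsumexp([log_a + (f[i] - cost[i][j]) / eps for i in range(n)])
            for j in range(m)
        ]
    return sum(f) / n + sum(g) / m


def sinkhorn_divergence(xs, ys, eps=1.0, n_iters=200):
    """Debiased divergence: OT_eps(x, y) - 0.5 * OT_eps(x, x) - 0.5 * OT_eps(y, y)."""
    return (
        entropic_ot(xs, ys, eps, n_iters)
        - 0.5 * entropic_ot(xs, xs, eps, n_iters)
        - 0.5 * entropic_ot(ys, ys, eps, n_iters)
    )


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0)]      # stand-in for a batch of real samples
    samples = [(5.0, 5.0), (6.0, 5.0)]   # stand-in for a batch of generated samples
    print(sinkhorn_divergence(data, samples))
    print(sinkhorn_divergence(data, data))
```

The two self-transport terms remove the bias that entropic regularization introduces, so the divergence of a point cloud with itself is zero. In a generative setting, a quantity of this kind would be computed between a batch of generated samples and a batch of data and minimized with respect to the generator parameters.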


