Differentially Private Diffusion Models Generate Useful Synthetic Images

02/27/2023
by Sahra Ghalebikesabi, et al.

The ability to generate privacy-preserving synthetic versions of sensitive image datasets could unlock numerous ML applications currently constrained by data availability. Due to their astonishing image generation quality, diffusion models are a prime candidate for generating high-quality synthetic data. However, recent studies have found that, by default, the outputs of some diffusion models do not preserve training data privacy. By privately fine-tuning ImageNet pre-trained diffusion models with more than 80M parameters, we obtain SOTA results on CIFAR-10 and Camelyon17 in terms of both FID and the accuracy of downstream classifiers trained on synthetic data. We decrease the SOTA FID on CIFAR-10 from 26.2 to 9.8, and increase the accuracy from 51.0% to 88.0%. On synthetic data from Camelyon17, we achieve a downstream accuracy of 91.1%, which is close to the SOTA of 96.5% when training on the real data. We leverage the ability of generative models to create infinite amounts of data to maximise the downstream prediction performance, and further show how to use synthetic data for hyperparameter tuning. Our results demonstrate that diffusion models fine-tuned with differential privacy can produce useful and provably private synthetic data, even in applications with significant distribution shift between the pre-training and fine-tuning distributions.
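
The abstract does not name the private training algorithm. As an illustration only, the sketch below assumes a DP-SGD-style update, the standard way to fine-tune a model with differential privacy: each example's gradient is clipped to a fixed L2 norm, Gaussian noise scaled to that norm is added to the sum, and the noisy average drives the parameter update. The function name `dp_sgd_step` and all of its parameters are hypothetical and do not come from the paper. Privacy accounting (tracking the resulting epsilon) is deliberately left out.

```python
import math
import random


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD-style update (an assumption, not the paper's code).

    params: list of floats (model parameters, flattened).
    per_example_grads: list of gradients, one list of floats per example.
    """
    batch_size = len(per_example_grads)
    summed = [0.0] * len(params)
    for grad in per_example_grads:
        # L2 norm of this example's gradient.
        norm = math.sqrt(sum(g * g for g in grad))
        # Shrink the gradient only if its norm exceeds clip_norm.
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        for i, g in enumerate(grad):
            summed[i] += g * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm
    # is added to each coordinate of the summed, clipped gradient.
    sigma = noise_multiplier * clip_norm
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]
    # Plain gradient-descent step on the noisy average.
    return [p - lr * g for p, g in zip(params, noisy_mean)]


# Usage with made-up numbers: two examples, two parameters.
rng = random.Random(0)
new_params = dp_sgd_step(
    params=[0.0, 0.0],
    per_example_grads=[[3.0, 4.0], [0.3, 0.4]],
    lr=1.0,
    clip_norm=1.0,
    noise_multiplier=1.0,
    rng=rng,
)
```

Clipping bounds how much any single training image can move the parameters, and the noise masks whatever contribution remains. Together these are what make a formal privacy guarantee possible once an accountant is added on top.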

research
10/05/2022

Fine-Tuning with Differential Privacy Necessitates an Additional Hyperparameter Search

Models need to be trained with privacy-preserving learning algorithms to...
research
01/30/2023

Extracting Training Data from Diffusion Models

Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion ha...
research
05/25/2023

Differentially Private Latent Diffusion Models

Diffusion models (DMs) are widely used for generating high-quality image...
research
06/02/2023

Harnessing large-language models to generate private synthetic text

Differentially private (DP) training methods like DP-SGD can protect sen...
research
09/04/2023

FinDiff: Diffusion Models for Financial Tabular Data Generation

The sharing of microdata, such as fund holdings and derivative instrumen...
research
05/23/2023

Selective Pre-training for Private Fine-tuning

Suppose we want to train text prediction models in email clients or word...
research
05/22/2018

Deep Learning with Cinematic Rendering - Fine-Tuning Deep Neural Networks Using Photorealistic Medical Images

Deep learning has emerged as a powerful artificial intelligence tool to ...
