imdpGAN: Generating Private and Specific Data with Generative Adversarial Networks
Generative Adversarial Network (GAN) and its variants have shown promising results in generating synthetic data. However, the issues with GANs are: (i) the learning happens around the training samples and the model often ends up remembering them, consequently, compromising the privacy of individual samples - this becomes a major concern when GANs are applied to training data including personally identifiable information, (ii) the randomness in generated data - there is no control over the specificity of generated samples. To address these issues, we propose imdpGAN - an information maximizing differentially private Generative Adversarial Network. It is an end-to-end framework that simultaneously achieves privacy protection and learns latent representations. With experiments on MNIST dataset, we show that imdpGAN preserves the privacy of the individual data point, and learns latent codes to control the specificity of the generated samples. We perform binary classification on digit pairs to show the utility versus privacy trade-off. The classification accuracy decreases as we increase privacy levels in the framework. We also experimentally show that the training process of imdpGAN is stable but experience a 10-fold time increase as compared with other GAN frameworks. Finally, we extend imdpGAN framework to CelebA dataset to show how the privacy and learned representations can be used to control the specificity of the output.
READ FULL TEXT