Distributed generation of privacy preserving data with user customization

04/20/2019
by   Xiao Chen, et al.

Distributed devices such as mobile phones can produce and store large amounts of data that can enhance machine learning models; however, this data may contain private information specific to the data owner, which prevents its release. We wish to reduce the correlation between user-specific private information and the data while retaining the useful information. Rather than learning a large model to achieve privatization end to end, we decouple the creation of a latent representation from the privatization of the data, which allows user-specific privatization to occur in a distributed setting with limited computation and minimal disturbance to the utility of the data. We leverage a Variational Autoencoder (VAE) to create a compact latent representation of the data; the VAE remains fixed for all devices and all possible private labels. We then train a small generative filter to perturb the latent representation based on individual preferences regarding the private and utility information. The small filter is trained with a GAN-type robust optimization that can take place on a distributed device. We conduct experiments on three popular datasets (MNIST, UCI-Adult, and CelebA) and give a thorough evaluation, including visualizing the geometry of the latent embeddings and estimating the empirical mutual information, to show the effectiveness of our approach.
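The abstract mentions estimating empirical mutual information as part of the evaluation but does not say which estimator is used. As a rough illustration only, the sketch below computes a plug-in (histogram) estimate of mutual information between two discrete variables, for example a private label and an adversary's guess of that label. The function name `empirical_mutual_information` and the choice of a plug-in estimator are assumptions made here, not details taken from the paper.

```python
import numpy as np


def empirical_mutual_information(a, b):
    """Plug-in estimate of I(A; B) in nats from paired discrete samples.

    Hypothetical helper for illustration; the paper's own estimator may differ.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.ndim != 1 or a.shape != b.shape or a.size == 0:
        raise ValueError("a and b must be non-empty 1-D arrays of equal length")

    # Map arbitrary label values to contiguous integer codes.
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    a_idx = a_idx.ravel()
    b_idx = b_idx.ravel()

    # Count co-occurrences, then normalize to an empirical joint distribution.
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)
    joint /= joint.sum()

    # Marginals, and their outer product (the joint under independence).
    p_a = joint.sum(axis=1, keepdims=True)
    p_b = joint.sum(axis=0, keepdims=True)
    independent = p_a @ p_b

    # Sum p(a, b) * log(p(a, b) / (p(a) p(b))) over cells with nonzero mass.
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / independent[mask])))


if __name__ == "__main__":
    private = [0, 0, 1, 1]
    # A guess identical to the private label versus one unrelated to it.
    print(empirical_mutual_information(private, [0, 0, 1, 1]))
    print(empirical_mutual_information(private, [0, 1, 0, 1]))
```

Under this reading, a value near zero between the private label and an adversary's predictions on filtered data would suggest little leakage, while a value near the entropy of the private label would suggest the label is still recoverable. Note that plug-in estimates are biased upward on small samples, so they are best compared across conditions with similar sample sizes.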


