Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models

06/02/2023
by Virginia Fernandez, et al.

Knowledge distillation in neural networks refers to compressing a large model or dataset into a smaller version of itself. We introduce Privacy Distillation, a framework that allows a text-to-image generative model to teach another model without exposing the student to identifiable data. Here, we are interested in the privacy issue faced by a data provider who wishes to share their data via a multimodal generative model. A question that immediately arises is: "How can a data provider ensure that the generative model is not leaking identifiable information about a patient?" Our solution consists of (1) training a first diffusion model on real data; (2) generating a synthetic dataset with this model and filtering out images that carry a re-identification risk; and (3) training a second diffusion model on the filtered synthetic data only. We show that datasets sampled from models trained with Privacy Distillation effectively reduce re-identification risk whilst maintaining downstream performance.
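The privacy guarantee of the pipeline hinges on the filtering in step (2). Below is a minimal sketch of that step, assuming a pretrained re-identification encoder that embeds images into an identity space; the function name, the stand-in encoder, and the 0.5 cosine-similarity threshold are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of the step-(2) privacy filter: drop any synthetic image whose
# identity embedding is too close to that of any real patient image.
# `filter_synthetic`, the toy encoder, and the threshold are assumptions.
import torch
import torch.nn.functional as F

@torch.no_grad()
def filter_synthetic(synthetic_images: torch.Tensor,
                     real_images: torch.Tensor,
                     reid_encoder: torch.nn.Module,
                     threshold: float = 0.5) -> torch.Tensor:
    """Keep only synthetic images that no real patient embedding matches."""
    # Embed both sets and L2-normalise so dot products become cosine similarities.
    syn_emb = F.normalize(reid_encoder(synthetic_images), dim=-1)
    real_emb = F.normalize(reid_encoder(real_images), dim=-1)
    # (num_synthetic, num_real) cosine-similarity matrix.
    sim = syn_emb @ real_emb.T
    # A synthetic image is deemed re-identifiable if it matches ANY real patient.
    max_sim, _ = sim.max(dim=1)
    return synthetic_images[max_sim < threshold]

# Toy usage with a stand-in encoder; a real pipeline would plug in a
# re-identification network trained to verify patient identity.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.LazyLinear(128))
synthetic = torch.rand(64, 1, 64, 64)  # e.g. 64 generated chest X-rays
real = torch.rand(32, 1, 64, 64)       # the provider's real images
safe_subset = filter_synthetic(synthetic, real, encoder)
```

Step (3) would then train the second diffusion model on the kept subset only, so the released model never sees a sample that can be matched back to a real patient.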


Related research

06/08/2023
BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping
Diffusion models have demonstrated excellent potential for generating di...

12/16/2022
Swing Distillation: A Privacy-Preserving Knowledge Distillation Framework
Knowledge distillation (KD) has been widely used for model compression a...

03/08/2023
DiM: Distilling Dataset into Generative Model
Dataset distillation reduces the network training cost by synthesizing s...

05/02/2023
Generalizing Dataset Distillation via Deep Generative Prior
Dataset Distillation aims to distill an entire dataset's knowledge into ...

05/14/2023
On enhancing the robustness of Vision Transformers: Defensive Diffusion
Privacy and confidentiality of medical data are of utmost importance in ...

08/21/2018
Text-to-image Synthesis via Symmetrical Distillation Networks
Text-to-image synthesis aims to automatically generate images according ...

03/16/2022
Learning to Generate Synthetic Training Data using Gradient Matching and Implicit Differentiation
Using huge training datasets can be costly and inconvenient. This articl...
