Safeguarding Data in Multimodal AI: A Differentially Private Approach to CLIP Training

06/13/2023
by Alyssa Huang, et al.

The rapid success of multimodal AI has raised concerns over data privacy in vision-and-language tasks. While CLIP has revolutionized multimodal learning through joint training on images and text, its potential to unintentionally disclose sensitive training information necessitates the integration of privacy-preserving mechanisms. We introduce a differentially private adaptation of the Contrastive Language-Image Pretraining (CLIP) model that effectively addresses privacy concerns while retaining accuracy. Our proposed method, Dp-CLIP, is rigorously evaluated on benchmark datasets encompassing diverse vision-and-language tasks such as image classification and visual question answering. We demonstrate that our approach retains performance on par with the standard, non-private CLIP model. Furthermore, we analyze our proposed algorithm in a linear-representation setting: we derive its convergence rate and show a trade-off between utility and privacy when gradients are clipped per batch and the loss function does not satisfy the smoothness conditions assumed in the literature for the analysis of DP-SGD.
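
To make the mechanism described in the abstract concrete, below is a minimal sketch of one Dp-CLIP-style training step in PyTorch: the aggregated batch gradient of a CLIP contrastive loss is clipped to a fixed norm, and Gaussian noise calibrated to that norm is added before the parameter update. The encode_image/encode_text interface and the clip_norm and noise_multiplier values are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Symmetric InfoNCE loss on matched image-text pairs, as in CLIP.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, labels)
                  + F.cross_entropy(logits.t(), labels))

def dp_clip_step(model, images, texts, optimizer,
                 clip_norm=1.0, noise_multiplier=1.0):
    # One DP training step with per-batch clipping (hypothetical sketch):
    # clip the whole batch gradient, then add Gaussian noise scaled to
    # the clipping norm before the optimizer update.
    optimizer.zero_grad()
    image_emb = model.encode_image(images)  # assumed CLIP-style encoders
    text_emb = model.encode_text(texts)
    loss = clip_contrastive_loss(image_emb, text_emb)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip_norm)
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p.grad += noise_multiplier * clip_norm * torch.randn_like(p.grad)
    optimizer.step()
    return loss.item()

Note that clipping the aggregated batch gradient, as the abstract specifies, differs from the per-sample clipping of standard DP-SGD, and the privacy accounting must be calibrated accordingly.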

Related research

06/08/2023
Differentially Private Image Classification by Learning Priors from Random Processes
In privacy-preserving machine learning, differentially private stochasti...

08/05/2022
DP^2-VAE: Differentially Private Pre-trained Variational Autoencoders
Modern machine learning systems achieve great success when trained on la...

12/01/2022
Differentially Private Learning with Per-Sample Adaptive Clipping
Privacy in AI remains a topic that draws attention from researchers and ...

08/21/2023
Unlocking Accuracy and Fairness in Differentially Private Image Classification
Privacy-preserving machine learning aims to train models on private data...

10/03/2020
Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness
It has been demonstrated that hidden representation learned by a deep mo...

05/10/2022
Privacy Enhancement for Cloud-Based Few-Shot Learning
Requiring less data for accurate models, few-shot learning has shown rob...

02/19/2023
Why Is Public Pretraining Necessary for Private Model Training?
In the privacy-utility tradeoff of a model trained on benchmark language...
