Learning to Generate Image Embeddings with User-level Differential Privacy

11/20/2022
by   Zheng Xu, et al.

Small on-device models have been successfully trained with user-level differential privacy (DP) for next-word prediction and image classification tasks in the past. However, existing methods can fail when directly applied to learn embedding models using supervised training data with a large class space. To achieve user-level DP for large image-to-embedding feature extractors, we propose DP-FedEmb, a variant of federated learning algorithms with per-user sensitivity control and noise addition, to train from user-partitioned data centralized in the datacenter. DP-FedEmb combines virtual clients, partial aggregation, private local fine-tuning, and public pretraining to achieve strong privacy-utility trade-offs. We apply DP-FedEmb to train image embedding models for faces, landmarks, and natural species, and demonstrate its superior utility under the same privacy budget on the benchmark datasets DigiFace, EMNIST, GLD, and iNaturalist. We further illustrate that it is possible to achieve strong user-level DP guarantees of ϵ<2 while controlling the utility drop within 5%.
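The core privacy mechanism the abstract describes, per-user sensitivity control followed by noise addition, can be illustrated with a minimal sketch. The code below is a hypothetical DP-FedAvg-style aggregation step, not the authors' DP-FedEmb implementation: each (virtual) client's model update is clipped to a fixed L2 norm so that one user's contribution to the sum is bounded, and Gaussian noise calibrated to that bound is added before averaging.

```python
import numpy as np

def dp_aggregate(user_updates, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip each user's update to bound sensitivity, then add Gaussian noise.

    user_updates: list of 1-D numpy arrays, one per (virtual) client.
    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = rng or np.random.default_rng(0)
    clipped = []
    for u in user_updates:
        norm = np.linalg.norm(u)
        # Scale down any update whose norm exceeds clip_norm; leave others as-is.
        scale = min(1.0, clip_norm / (norm + 1e-12))
        clipped.append(u * scale)
    total = np.sum(clipped, axis=0)
    # The sum's sensitivity to one user is at most clip_norm, so Gaussian
    # noise with std noise_multiplier * clip_norm yields a DP guarantee
    # governed by noise_multiplier (per standard Gaussian-mechanism analysis).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(user_updates)
```

With `noise_multiplier=0` this reduces to plain clipped averaging, which makes the sensitivity-control step easy to check in isolation before tuning the noise level against a privacy accountant.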


