Private Distribution Learning with Public Data: The View from Sample Compression

08/11/2023 ∙ by Shai Ben-David, et al.

We study the problem of private distribution learning with access to public data. In this setup, which we refer to as public-private learning, the learner is given public and private samples drawn from an unknown distribution p belonging to a class 𝒬, with the goal of outputting an estimate of p while adhering to privacy constraints (here, pure differential privacy) only with respect to the private samples. We show that the public-private learnability of a class 𝒬 is connected to the existence of a sample compression scheme for 𝒬, as well as to an intermediate notion we refer to as list learning. Leveraging this connection, we (1) approximately recover previous results on Gaussians over ℝ^d; and (2) obtain new ones, including sample complexity upper bounds for arbitrary k-mixtures of Gaussians over ℝ^d, results for agnostic and distribution-shift resistant learners, and closure properties for public-private learnability under taking mixtures and products of distributions. Finally, via the connection to list learning, we show that for Gaussians in ℝ^d, at least d public samples are necessary for private learnability, which is close to the known upper bound of d+1 public samples.
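
For readers who want the privacy constraint stated formally, the following is a sketch of the standard definition of pure (ε-)differential privacy, applied only to the private samples as the abstract describes. The notation is an assumption made here for illustration and is not taken from the abstract: A is the learner, \tilde{X} the public samples, X and X' two private datasets, m and n the public and private sample counts, and S a set of possible outputs.

```latex
% Assumed notation (illustrative, not from the abstract):
%   \tilde{X} = (\tilde{X}_1, \dots, \tilde{X}_m)  public samples
%   X = (X_1, \dots, X_n)                          private samples
%   A                                              randomized learner
% Pure epsilon-DP with respect to the private samples only:
% for every fixed public sample \tilde{X}, every pair of private
% datasets X, X' differing in exactly one entry, and every
% measurable set S of outputs,
\[
  \Pr\bigl[ A(\tilde{X}, X) \in S \bigr]
  \;\le\;
  e^{\varepsilon} \,
  \Pr\bigl[ A(\tilde{X}, X') \in S \bigr].
\]
```

Under this reading, changing a public sample is allowed to change the output distribution arbitrarily; only changes to a private sample are constrained.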


