Towards Measuring Membership Privacy

12/25/2017
by Yunhui Long, et al.

Machine learning models are increasingly made available to the masses through public query interfaces. Recent academic work has demonstrated that malicious users who can query such models are able to infer sensitive information about records within the training data. Differential privacy can thwart such attacks, but not all models can be readily trained to achieve this guarantee or to achieve it with acceptable utility loss. As a result, if a model is trained without a differential privacy guarantee, little is known or can be said about the privacy risk of releasing it. In this work, we investigate and analyze membership attacks to understand why and how they succeed. Based on this understanding, we propose Differential Training Privacy (DTP), an empirical metric to estimate the privacy risk of publishing a classifier when methods such as differential privacy cannot be applied. DTP is a measure of a classifier with respect to its training dataset, and we show that calculating DTP is efficient in many practical cases. We empirically validate DTP using state-of-the-art machine learning models such as neural networks trained on real-world datasets. Our results show that DTP is highly predictive of the success of membership attacks, and that reducing DTP therefore also reduces the privacy risk. We advocate for DTP to be used as part of the decision-making process when considering publishing a classifier. To this end, we also suggest adopting the DTP-1 hypothesis: if a classifier has a DTP value above 1, it should not be published.
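To make the leave-one-out idea behind DTP concrete, here is a minimal sketch: retrain the classifier without a target record and compare the two models' predictions. The function name dtp_estimate, the use of scikit-learn's LogisticRegression, the probe set, and the log-ratio scoring are illustrative assumptions for this page, not the paper's exact definition of DTP.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

def dtp_estimate(model, X, y, idx, probe_X=None):
    """Leave-one-out sketch: retrain without record `idx` and report the
    largest absolute log-ratio between the two models' predicted class
    probabilities over the probe points (the target record by default)."""
    full = clone(model).fit(X, y)
    mask = np.ones(len(X), dtype=bool)
    mask[idx] = False
    loo = clone(model).fit(X[mask], y[mask])

    if probe_X is None:
        probe_X = X[idx:idx + 1]      # probe at the target record itself

    eps = 1e-12                       # guard against log(0) on confident models
    p_full = np.clip(full.predict_proba(probe_X), eps, 1.0)
    p_loo = np.clip(loo.predict_proba(probe_X), eps, 1.0)
    return float(np.max(np.abs(np.log(p_full / p_loo))))

# Toy usage on synthetic data: score every training record and apply the
# DTP-1 rule of thumb from the abstract.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

scores = [dtp_estimate(LogisticRegression(max_iter=1000), X, y, i)
          for i in range(len(X))]
print("worst-case per-record score:", max(scores))  # caution if above 1
```

A record whose removal barely changes the model's predictions scores near zero under this sketch, while a record the model has effectively memorized scores high, which mirrors the abstract's claim that lower DTP corresponds to lower membership-attack success.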

Related research

11/21/2019 · Effects of Differential Privacy and Data Skewness on Membership Inference Vulnerability
Membership inference attacks seek to infer the membership of individual ...

05/12/2023 · Comparison of machine learning models applied on anonymized data with different techniques
Anonymization techniques based on obfuscating the quasi-identifiers by m...

02/25/2022 · Does Label Differential Privacy Prevent Label Inference Attacks?
Label differential privacy (LDP) is a popular framework for training pri...

07/21/2023 · Epsilon*: Privacy Metric for Machine Learning Models
We introduce Epsilon*, a new privacy metric for measuring the privacy ri...

07/26/2020 · Anonymizing Machine Learning Models
There is a known tension between the need to analyze personal data to dr...

05/31/2023 · A Note On Interpreting Canary Exposure
Canary exposure, introduced in Carlini et al. is frequently used to empi...

10/07/2021 · The Connection between Out-of-Distribution Generalization and Privacy of ML Models
With the goal of generalizing to out-of-distribution (OOD) data, recent ...
