Synthetic is all you need: removing the auxiliary data assumption for membership inference attacks against synthetic data

07/04/2023
by Florent Guépin, et al.

Synthetic data is emerging as the most promising solution for sharing individual-level data while safeguarding privacy. Membership inference attacks (MIAs) based on shadow modeling have become the standard for evaluating the privacy of synthetic data. These attacks, however, currently assume that the attacker has access to an auxiliary dataset sampled from a distribution similar to that of the training dataset. This is often a very strong assumption, one that would make an attack unlikely to happen in practice. Here we show how this assumption can be removed and how MIAs can be performed using only the synthetic data. More specifically, in three different attack scenarios using only synthetic data, our results demonstrate that MIAs are still successful across two real-world datasets and two synthetic data generators. These results show how the strong hypothesis made when auditing synthetic data releases, namely access to an auxiliary dataset, can be relaxed to perform an actual attack.
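To make the idea of shadow modeling without auxiliary data concrete, the following is a minimal sketch, not the authors' implementation. It assumes one-dimensional numeric records, a hypothetical `toy_generator` that stands in for a real synthetic data generator, a nearest-record distance as the only attack feature, and a simple threshold in place of a trained meta-classifier. All of these names and choices are illustrative assumptions. The one point taken from the abstract is that the shadow datasets are drawn from the released synthetic data rather than from an auxiliary dataset.

```python
import random


def toy_generator(train, n_out, rng):
    """Hypothetical stand-in for a synthetic data generator (an assumption):
    resample training records and add small Gaussian noise."""
    return [rng.choice(train) + rng.gauss(0.0, 0.1) for _ in range(n_out)]


def feature(dataset, target):
    """Attack feature: distance from the target to its nearest record."""
    return min(abs(x - target) for x in dataset)


def shadow_attack(released_synth, target, n_shadows=200, subset_size=50, seed=0):
    """Shadow-modeling MIA that uses only the released synthetic data.

    Returns (predicted_member, threshold, score).
    """
    rng = random.Random(seed)
    feats_in, feats_out = [], []
    for i in range(n_shadows):
        # Shadow training set sampled from the synthetic release itself,
        # where a classic attack would sample from auxiliary data.
        shadow_train = rng.sample(released_synth, subset_size)
        is_member = (i % 2 == 0)
        if is_member:
            shadow_train[0] = target  # "IN" world: target is a training record
        shadow_synth = toy_generator(shadow_train, subset_size, rng)
        bucket = feats_in if is_member else feats_out
        bucket.append(feature(shadow_synth, target))

    # Simplified meta-classifier: threshold halfway between the mean
    # feature values observed in the IN and OUT shadow worlds.
    mean_in = sum(feats_in) / len(feats_in)
    mean_out = sum(feats_out) / len(feats_out)
    threshold = (mean_in + mean_out) / 2.0

    # Apply the learned rule to the actual release.
    score = feature(released_synth, target)
    return score < threshold, threshold, score


if __name__ == "__main__":
    data_rng = random.Random(1)
    released = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
    print(shadow_attack(released, target=4.0))
```

A realistic attack would replace the toy generator with the generator under audit and the threshold with a classifier trained on richer features. The sketch only illustrates where the synthetic release takes the place of the auxiliary dataset.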
