Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning

04/01/2019
by Ahmed Salem, et al.

Machine learning (ML) has progressed rapidly during the past decade, driven largely by the availability of data at unprecedented scale. Because data generation is a continuous process, ML service providers frequently update their models with newly collected data in an online learning scenario. As a consequence, an ML model queried with the same set of data samples at two different points in time will return different results. In this paper, we investigate whether the change in the output of a black-box ML model before and after an update can leak information about the dataset used to perform the update. This constitutes a new attack surface against black-box ML models, and such information leakage severely damages the intellectual property and data privacy of the ML model owner/provider. In contrast to membership inference attacks, we use an encoder-decoder formulation that allows inferring diverse information, ranging from detailed characteristics of the dataset to its full reconstruction. Our new attacks are facilitated by state-of-the-art deep learning techniques. In particular, we propose a hybrid generative model (BM-GAN) that is based on generative adversarial networks (GANs) but includes a reconstructive loss that allows it to generate accurate samples. Our experiments show effective prediction of dataset characteristics and even full reconstruction in challenging conditions.
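The signal the abstract describes, the change in a black-box model's output on a fixed set of queries before and after an update, can be illustrated with a small sketch. This is not the authors' implementation: the toy linear softmax models, the random weight perturbation standing in for an update, and the names `query`, `posterior_difference`, and `probe_set` are all assumptions made for illustration. The encoder-decoder attack model that would consume this signal is not shown.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def query(weights, x):
    """Hypothetical black-box query: only the posterior vector is returned."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return softmax(logits)


def posterior_difference(model_before, model_after, probe_set):
    """Concatenate (after - before) posteriors over all probe samples."""
    diff = []
    for x in probe_set:
        p_before = query(model_before, x)
        p_after = query(model_after, x)
        diff.extend(a - b for a, b in zip(p_after, p_before))
    return diff


rng = random.Random(0)
num_classes, num_features, num_probes = 3, 4, 5

# Toy stand-in for the target model: one weight row per class (assumption).
model_before = [[rng.uniform(-1, 1) for _ in range(num_features)]
                for _ in range(num_classes)]
# Stand-in for an update: a small random perturbation of the weights.
# A real update would come from training on newly collected data.
model_after = [[w + rng.uniform(-0.1, 0.1) for w in row]
               for row in model_before]

# The same probe samples are sent to both versions of the model.
probe_set = [[rng.uniform(-1, 1) for _ in range(num_features)]
             for _ in range(num_probes)]

delta = posterior_difference(model_before, model_after, probe_set)
print(len(delta))
```

In this sketch `delta` has one entry per probe sample and class, and it is all zeros when the model has not changed. Any information about the update set would have to be inferred from a vector of this kind.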

research
05/13/2022

l-Leaks: Membership Inference Attacks with Logits

Machine Learning (ML) has made unprecedented progress in the past severa...
research
06/07/2019

Reconstruction and Membership Inference Attacks against Generative Models

We present two information leakage attacks that outperform previous work...
research
03/28/2020

DaST: Data-free Substitute Training for Adversarial Attacks

Machine learning models are vulnerable to adversarial examples. For the ...
research
05/09/2020

Estimating g-Leakage via Machine Learning

This paper considers the problem of estimating the information leakage o...
research
05/30/2022

White-box Membership Attack Against Machine Learning Based Retinopathy Classification

The advances in machine learning (ML) have greatly improved AI-based dia...
research
02/25/2022

On the Effectiveness of Dataset Watermarking in Adversarial Settings

In a data-driven world, datasets constitute a significant economic value...
research
09/28/2022

MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference

The cost efficiency of model inference is critical to real-world machine...
