Dataset correlation inference attacks against machine learning models

12/16/2021
by Ana-Maria Cretu, et al.

Machine learning models are increasingly used by businesses and organizations around the world to automate tasks and decision-making. Trained on potentially sensitive datasets, machine learning models have been shown to leak information about individuals in the dataset as well as global dataset information. We take research on dataset property inference attacks one step further by proposing a new attack against ML models: a dataset correlation inference attack, in which the attacker's goal is to infer the correlation between input variables of a model. We first show that an attacker can exploit the spherical parametrization of correlation matrices to make an informed guess: using only the correlations between the input variables and the target variable, the attacker can infer the correlation between two input variables much better than a random-guess baseline. We then propose a second attack that exploits access to the machine learning model itself, using shadow modeling to refine the guess. This attack uses Gaussian copula-based generative modeling to generate synthetic datasets spanning a wide range of correlations, on which it trains a meta-model for the correlation inference task. We evaluate our attack against logistic regression and multilayer perceptron models and show that it outperforms the model-less attack. Our results show that the accuracy of the second, machine learning-based attack decreases with the number of variables and converges towards the accuracy of the model-less attack. However, correlations between input variables that are highly correlated with the target variable remain more vulnerable regardless of the number of variables. Our work bridges the gap between what can be considered global leakage about the training dataset and individual-level leakage. When coupled with marginal leakage attacks, it might also constitute a first step towards dataset reconstruction.
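The model-less "informed guess" can be made concrete. For three variables, simply requiring the correlation matrix of (X1, X2, Y) to be positive semi-definite confines corr(X1, X2) to an interval determined by the known correlations with the target, so a guess inside that interval already beats a uniform random guess over [-1, 1]. Below is a minimal sketch of this idea, written directly in terms of the positive-semi-definiteness constraint rather than the spherical parametrization used in the paper; the function names and the midpoint guess are illustrative, not the authors' exact procedure:

```python
import numpy as np

def correlation_bounds(rho_1y: float, rho_2y: float) -> tuple[float, float]:
    # The 3x3 correlation matrix of (X1, X2, Y) is positive semi-definite
    # iff its determinant is non-negative, which pins corr(X1, X2) to
    # [r1*r2 - s, r1*r2 + s] with s = sqrt((1 - r1^2) * (1 - r2^2)).
    slack = np.sqrt((1.0 - rho_1y**2) * (1.0 - rho_2y**2))
    return rho_1y * rho_2y - slack, rho_1y * rho_2y + slack

def informed_guess(rho_1y: float, rho_2y: float) -> float:
    # Guess the midpoint of the feasible interval (equal to rho_1y * rho_2y).
    lo, hi = correlation_bounds(rho_1y, rho_2y)
    return 0.5 * (lo + hi)

# Two inputs strongly correlated with the target leave little room:
lo, hi = correlation_bounds(0.8, 0.7)
print(f"corr(X1, X2) is confined to [{lo:.3f}, {hi:.3f}]")  # [0.132, 0.988]
```

Note how the interval tightens as the input-target correlations grow, which matches the abstract's observation that such correlations are the most vulnerable.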

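The shadow-modeling attack could then look roughly like the sketch below, continuing from the code above: generate a synthetic dataset with a Gaussian copula for each candidate value of the unknown correlation, train a shadow model on it, and fit a meta-model mapping shadow-model parameters to the correlation, here binned into "low" vs. "high" as a stand-in for the paper's classification setup. The dataset size, the binarization of the target, and the use of coefficients as meta-features are all illustrative assumptions:

```python
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def copula_dataset(rho_12, rho_1y, rho_2y, n=2000):
    # Gaussian copula: sample a multivariate normal with the prescribed
    # correlation matrix, then push margins to uniform via the normal CDF.
    corr = np.array([[1.0, rho_12, rho_1y],
                     [rho_12, 1.0, rho_2y],
                     [rho_1y, rho_2y, 1.0]])
    u = norm.cdf(rng.multivariate_normal(np.zeros(3), corr, size=n))
    return u[:, :2], (u[:, 2] > 0.5).astype(int)  # features, binary target

def shadow_features(rho_12, rho_1y, rho_2y):
    # Train one shadow model; its parameters become the meta-features.
    X, y = copula_dataset(rho_12, rho_1y, rho_2y)
    m = LogisticRegression().fit(X, y)
    return np.concatenate([m.coef_.ravel(), m.intercept_])

# Known to the attacker: correlations of each input with the target.
rho_1y, rho_2y = 0.5, 0.3
lo, hi = correlation_bounds(rho_1y, rho_2y)

# Sweep the secret corr(X1, X2) across its feasible interval.
grid = np.linspace(lo + 0.01, hi - 0.01, 50)
features = np.array([shadow_features(r, rho_1y, rho_2y) for r in grid])
labels = (grid > 0.5 * (lo + hi)).astype(int)  # "low" vs "high" bin

meta_model = LogisticRegression().fit(features, labels)
# At attack time, the same features extracted from the target model are
# fed to meta_model.predict to infer the bin of corr(X1, X2).
```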
