An Overview of Privacy in Machine Learning

05/18/2020
by Emiliano De Cristofaro, et al.

Over the past few years, providers such as Google, Microsoft, and Amazon have begun to offer software interfaces that let customers easily embed machine learning tasks into their applications. Organizations can now use Machine Learning as a Service (MLaaS) engines to outsource complex tasks such as training classifiers, performing predictions, and clustering. They can also let others query models trained on their data. Naturally, this approach can also be used (and is often advocated) in other contexts, including government collaborations, citizen science projects, and business-to-business partnerships. However, if malicious users were able to recover the data used to train these models, the resulting information leakage would create serious issues. Likewise, if the inner parameters of a model are considered proprietary, then access to the model should not allow an adversary to learn them. In this document, we set out to review privacy challenges in this space, providing a systematic review of the relevant research literature and exploring possible countermeasures. More specifically, we provide ample background on relevant concepts in machine learning and privacy. We then discuss possible adversarial models and settings, cover a wide range of attacks related to the leakage of private and/or sensitive information, and review recent results attempting to defend against such attacks. Finally, we conclude with a list of open problems that require further work, including the need for better evaluations, more targeted defenses, and study of the relation to policy and data protection efforts.
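As a concrete illustration of the training-data leakage discussed above, the sketch below implements a toy membership inference attack: an adversary who can only query a model's confidence scores guesses whether a given point was part of the training set. The "model" here is a hypothetical 1-nearest-neighbour classifier that memorises its training data (a stand-in for an overfit MLaaS model); the confidence function and threshold are illustrative assumptions, not details from the survey.

```python
import random

random.seed(0)

# Toy "model": a 1-nearest-neighbour classifier over points in the unit
# square. It memorises its training set, so it is maximally confident on
# training points -- the overfitting that membership inference exploits.
train = [(random.random(), random.random()) for _ in range(20)]

def confidence(query):
    """Model 'confidence' for a query: inverse of the distance to the
    nearest training point (1.0 exactly on a memorised training point)."""
    d = min(((query[0] - x) ** 2 + (query[1] - y) ** 2) ** 0.5
            for (x, y) in train)
    return 1.0 / (1.0 + d)

def is_member(query, threshold=0.99):
    """Attacker's membership guess: flag 'member' when the model is
    unusually confident on the query point."""
    return confidence(query) >= threshold

# Training points are memorised, so the attack flags all of them.
members_flagged = sum(is_member(p) for p in train)

# Fresh points drawn from the same distribution are rarely that close
# to a training point, so few (if any) are flagged.
outsiders = [(random.random(), random.random()) for _ in range(20)]
outsiders_flagged = sum(is_member(p) for p in outsiders)

print(members_flagged, outsiders_flagged)
```

The gap between the two counts is what makes the attack work: the more a model's behaviour differs between training and unseen data, the more membership leaks through its outputs. Defenses surveyed in this space (e.g., regularisation or differentially private training) aim to shrink exactly that gap.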

Related research

Privacy-preserving Machine Learning through Data Obfuscation (07/05/2018)
As machine learning becomes a practice and commodity, numerous cloud-bas...

Survey: Leakage and Privacy at Inference Time (07/04/2021)
Leakage of data from publicly available Machine Learning (ML) models is ...

Adversarial Robustness in Unsupervised Machine Learning: A Systematic Review (06/01/2023)
As the adoption of machine learning models increases, ensuring robust mo...

White-box Inference Attacks against Centralized Machine Learning and Federated Learning (12/15/2022)
With the development of information science and technology, various indu...

Preventing Machine Learning Poisoning Attacks Using Authentication and Provenance (05/20/2021)
Recent research has successfully demonstrated new types of data poisonin...

Redactor: Targeted Disinformation Generation using Probabilistic Decision Boundaries (02/07/2022)
Information leakage is becoming a critical problem as various informatio...

k-Anonymity in Practice: How Generalisation and Suppression Affect Machine Learning Classifiers (02/09/2021)
The protection of private information is a crucial issue in data-driven ...
