Prediction Poisoning: Utility-Constrained Defenses Against Model Stealing Attacks

06/26/2019
by Tribhuvanesh Orekondy et al.

With the advances of ML models in recent years, we are seeing an increasing number of real-world commercial applications and services (e.g., autonomous vehicles, medical equipment, web APIs) emerge. Recent advances in model functionality stealing attacks via black-box access (i.e., inputs in, predictions out) threaten the business model of such ML applications, which require significant time, money, and effort to develop. In this paper, we address this issue by studying defenses against model stealing attacks, largely motivated by the lack of effective defenses in the literature. We work towards the first defense that introduces targeted perturbations to the model predictions under a utility constraint. Our approach targets the perturbations towards manipulating the training procedure of the attacker. We evaluate our approach on multiple datasets and attack scenarios across a range of utility constraints. Our results show that it is indeed possible to trade off utility (e.g., deviation from the original prediction, test accuracy) to significantly reduce the effectiveness of model stealing attacks.
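
The perturbation can be viewed as a constrained search over the probability simplex: among posteriors within a utility budget of the true prediction (e.g., a bound on L1 deviation), pick the one that most distorts the gradient signal an attacker would use to train a stolen copy. The sketch below is a hypothetical random-search illustration of this idea, not the paper's actual optimization procedure; the surrogate attacker posterior `p_attacker`, the budget `eps`, and the helper `perturb_prediction` are illustrative names introduced here.

```python
import numpy as np

def perturb_prediction(y, surrogate_probs, eps=0.5, n_candidates=2000, rng=None):
    """Search for a perturbed posterior y_tilde with ||y_tilde - y||_1 <= eps
    that maximally rotates the gradient an attacker would compute from it.

    For softmax cross-entropy, the gradient w.r.t. the attacker's logits is
    (p - y), where p is the attacker's (surrogate) prediction. Candidates are
    scored by the angle between (p - y_tilde) and (p - y). This is a crude
    random-search sketch of the idea, not the paper's method.
    """
    rng = np.random.default_rng(rng)
    k = len(y)
    g_true = surrogate_probs - y                  # gradient under the true posterior
    best, best_angle = y, 0.0
    for _ in range(n_candidates):
        cand = rng.dirichlet(np.ones(k))          # random point on the probability simplex
        if np.abs(cand - y).sum() > eps:          # enforce the L1 utility budget
            continue
        g_pert = surrogate_probs - cand           # gradient the attacker would see instead
        cos = g_pert @ g_true / (np.linalg.norm(g_pert) * np.linalg.norm(g_true) + 1e-12)
        angle = np.arccos(np.clip(cos, -1.0, 1.0))
        if angle > best_angle:
            best, best_angle = cand, angle
    return best

# Example: defender's true posterior vs. a surrogate of the attacker's current prediction
y = np.array([0.70, 0.20, 0.10])
p_attacker = np.array([0.40, 0.35, 0.25])
print(perturb_prediction(y, p_attacker, eps=0.4, rng=0))
```

Tightening `eps` trades a smaller deviation from the original prediction (higher defender utility) against a weaker distortion of the attacker's training signal, which is the utility/security trade-off the paper evaluates.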
